mirror of
https://github.com/fabriziosalmi/patterns.git
synced 2026-06-14 16:34:17 -04:00
Redesign docs with Apple-native theme; verify content; route CI to self-hosted runner-02
- VitePress: custom theme (SF system fonts, glass nav, soft surfaces, pill buttons, light/dark code blocks, refined feature cards, platform showcase + stat strip).
- Replace every emoji across docs and README with inline SVG icons.
- Verify and fix doc accuracy against actual scripts: JSON schema (category+pattern only), env-var configuration for json2*/import_* scripts, owasp2json CLI surface.
- Add public assets (logo.svg, favicon.svg, hero-shield.svg) and Shiki haproxy alias.
- Workflows default to self-hosted runner-02 with a configurable fallback to GitHub runners via the RUNS_ON repo variable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
259
docs/api.md
@@ -1,223 +1,236 @@
# API Reference
# API & Scripts Reference

This page documents the Python scripts that power the Patterns project.
Patterns is a small Python toolchain. Every script does one job and communicates with the rest through plain JSON or files on disk — no shared state, no daemon, no database.

## Core Scripts

### owasp2json.py

Fetches and parses OWASP Core Rule Set patterns from GitHub.

```bash
python owasp2json.py
```text
owasp2json.py ──▶ owasp_rules.json ──▶ json2{nginx,apache,traefik,haproxy}.py
└▶ badbots.py (independent)
```

**Output**: `owasp_rules.json`
All scripts are configured through **environment variables** (not CLI flags) except `owasp2json.py`, which has a small `argparse` interface.
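
As a rough illustration of the environment-variable convention, a converter could resolve its settings as in the sketch below. The `INPUT_FILE` and `OUTPUT_DIR` names and their defaults are the ones documented further down this page; the `load_settings` helper is hypothetical and is not taken from the actual scripts.

```python
import os
from pathlib import Path


def load_settings(environ=None, default_output="waf_patterns/nginx"):
    """Hypothetical helper: resolve converter settings from env vars.

    Falls back to the documented defaults when a variable is unset.
    """
    env = os.environ if environ is None else environ
    return {
        "input_file": Path(env.get("INPUT_FILE", "owasp_rules.json")),
        "output_dir": Path(env.get("OUTPUT_DIR", default_output)),
    }


if __name__ == "__main__":
    # Prints the effective settings for the current shell environment.
    print(load_settings())
```

Passing a plain dict instead of `os.environ` keeps the helper easy to exercise without touching the real environment.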

**Configuration**:
- Uses environment variable `OWASP_REPO` to specify source repository
- Default: `coreruleset/coreruleset`
## Pipeline scripts

**Features**:
- Fetches latest CRS rules from GitHub
- Parses `.conf` files for regex patterns
- Extracts rule metadata (ID, severity, category)
- Outputs structured JSON for conversion scripts
### `owasp2json.py`

Fetches the OWASP Core Rule Set from GitHub and emits a flat JSON rule list.

```bash
python owasp2json.py --ref v4.0 --output owasp_rules.json
```

| Argument / env | Default | Purpose |
|----------------|---------|---------|
| `--output` | `owasp_rules.json` | Output JSON path |
| `--ref` | `v4.0` | Tag prefix to resolve (e.g. `v4.0`, `v3.3`, `dev`) |
| `--dry-run` | off | Fetch and parse without writing |
| `GITHUB_TOKEN` (env) | unset | Raises the GitHub API rate limit while iterating |

The script verifies each blob's SHA against the GitHub-reported value before parsing it.
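
The check can be pictured with the sketch below. It assumes the GitHub-reported value is the standard Git blob object ID, that is, a SHA-1 over a `blob <size>\0` header followed by the raw bytes. `git_blob_sha1` and `verify_blob` are illustrative names, not functions from `owasp2json.py`.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Compute the Git object ID of a blob with the given raw content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def verify_blob(content: bytes, reported_sha: str) -> bool:
    """Return True when the locally computed ID matches the reported one."""
    return git_blob_sha1(content) == reported_sha.strip().lower()


if __name__ == "__main__":
    # Same value that `git hash-object` reports for an empty file.
    print(git_blob_sha1(b""))
```

A mismatch would indicate that the downloaded bytes differ from what the repository tree advertises, so the file should not be parsed.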

---

### json2nginx.py
### `json2nginx.py`

Converts OWASP JSON rules to Nginx WAF configuration.
Converts `owasp_rules.json` into Nginx `map`-based rules.

```bash
python json2nginx.py
INPUT_FILE=custom.json OUTPUT_DIR=/tmp/out python json2nginx.py
```

**Input**: `owasp_rules.json`
**Output**: `waf_patterns/nginx/`
**Generated files** (in `OUTPUT_DIR`):

**Generated Files**:
| File | Purpose |
|------|---------|
| `waf_maps.conf` | Map directives (http block) |
| `waf_rules.conf` | If statements (server block) |
| `README.md` | Integration instructions |
| `waf_maps.conf` | `map` directives — include in the `http` block |
| `waf_rules.conf` | `if` rules — include in the `server` block |
| `<category>.conf` | One file per OWASP category, **for inspection only** |
| `README.md` | In-tree usage notes |

**Environment Variables**:
- `INPUT_FILE` - Path to OWASP JSON (default: `owasp_rules.json`)
- `OUTPUT_DIR` - Output directory (default: `waf_patterns/nginx`)
| Env var | Default |
|---------|---------|
| `INPUT_FILE` | `owasp_rules.json` |
| `OUTPUT_DIR` | `waf_patterns/nginx` |

---

### json2apache.py
### `json2apache.py`

Converts OWASP JSON rules to Apache ModSecurity format.
Converts `owasp_rules.json` into ModSecurity `SecRule` directives, partitioned by attack category.

```bash
python json2apache.py
```

**Input**: `owasp_rules.json`
**Output**: `waf_patterns/apache/`
**Generated files**: one `<category>.conf` per OWASP category (`sqli.conf`, `xss.conf`, `rce.conf`, `lfi.conf`, …) — each contains pure ModSecurity rules ready to `Include`.

**Generated Files**:
- Category-specific `.conf` files (sqli.conf, xss.conf, etc.)
- Each file contains ModSecurity `SecRule` directives
| Env var | Default |
|---------|---------|
| `INPUT_FILE` | `owasp_rules.json` |
| `OUTPUT_DIR` | `waf_patterns/apache` |

---

### json2traefik.py
### `json2traefik.py`

Converts OWASP JSON rules to Traefik middleware configuration.
Converts `owasp_rules.json` into a Traefik file-provider middleware.

```bash
python json2traefik.py
```

**Input**: `owasp_rules.json`
**Output**: `waf_patterns/traefik/`
**Generated files**:

**Generated Files**:
- `middleware.toml` - Traefik middleware configuration
- `README.md` - Integration instructions
- `middleware.toml` — complete WAF middleware definition
- `README.md` — in-tree integration notes

| Env var | Default |
|---------|---------|
| `INPUT_FILE` | `owasp_rules.json` |
| `OUTPUT_DIR` | `waf_patterns/traefik` |

---

### json2haproxy.py
### `json2haproxy.py`

Converts OWASP JSON rules to HAProxy ACL format.
Converts `owasp_rules.json` into HAProxy ACL files.

```bash
python json2haproxy.py
```

**Input**: `owasp_rules.json`
**Output**: `waf_patterns/haproxy/`
**Generated files**:

**Generated Files**:
- `waf.acl` - Main WAF ACL rules
- `README.md` - Integration instructions
- `waf.acl` — one regex per line, designed for `-f /etc/haproxy/waf.acl`
- `README.md` — in-tree integration notes

| Env var | Default |
|---------|---------|
| `INPUT_FILE` | `owasp_rules.json` |
| `OUTPUT_DIR` | `waf_patterns/haproxy/` |

---

### badbots.py
### `badbots.py`

Generates bad bot blocking configurations from public bot lists.
Independently fetches public bad-bot User-Agent lists and emits a `bots.*` file in each platform output directory.

```bash
python badbots.py
```

**Output**: Bot configurations in each `waf_patterns/*/` directory
**Generated files** (per platform):

**Features**:
- Fetches from multiple public bot lists
- Includes fallback sources for reliability
- Generates platform-specific configs
| Platform | File |
|----------|------|
| Nginx | `waf_patterns/nginx/bots.conf` |
| Apache | `waf_patterns/apache/bots.conf` |
| Traefik | `waf_patterns/traefik/bots.toml` |
| HAProxy | `waf_patterns/haproxy/bots.acl` |

---
| Env var | Purpose |
|---------|---------|
| `GITHUB_TOKEN` | Raises the GitHub API rate limit when fetching upstream lists |

## Import Scripts
If a remote source is unreachable, the script falls back to a bundled list.

These scripts help import existing WAF configurations.
## Import / install scripts

### import_nginx_waf.py
The `import_*.py` scripts copy generated files into a server's runtime configuration directory and (optionally) splice an `Include` line into the main config. They are configured **entirely** through environment variables.

Import Nginx WAF patterns from external sources.
### `import_nginx_waf.py`

```bash
python import_nginx_waf.py --source /path/to/external/rules
```
| Env var | Default |
|---------|---------|
| `WAF_DIR` | `waf_patterns/nginx` |
| `NGINX_WAF_DIR` | `/etc/nginx/waf/` |
| `NGINX_CONF` | `/etc/nginx/nginx.conf` |
| `BACKUP_DIR` | `/etc/nginx/waf_backup/` |

### import_apache_waf.py
### `import_apache_waf.py`

Import Apache ModSecurity rules.
| Env var | Default |
|---------|---------|
| `WAF_DIR` | `waf_patterns/apache` |
| `APACHE_WAF_DIR` | `/etc/modsecurity.d/` |
| `APACHE_CONF` | `/etc/apache2/apache2.conf` |
| `BACKUP_DIR` | `/etc/modsecurity.d/backup` |

```bash
python import_apache_waf.py --source /path/to/modsec/rules
```
### `import_traefik_waf.py`

### import_traefik_waf.py
| Env var | Default |
|---------|---------|
| `WAF_DIR` | `waf_patterns/traefik` |
| `TRAEFIK_WAF_DIR` | `/etc/traefik/waf/` |
| `TRAEFIK_DYNAMIC_CONF` | `/etc/traefik/dynamic.toml` |
| `BACKUP_DIR` | `/etc/traefik/waf_backup/` |

Import Traefik middleware configurations.
### `import_haproxy_waf.py`

```bash
python import_traefik_waf.py --source /path/to/traefik/config
```
| Env var | Default |
|---------|---------|
| `WAF_DIR` | `waf_patterns/haproxy` |
| `HAPROXY_WAF_DIR` | `/etc/haproxy/waf/` |
| `HAPROXY_CONF` | `/etc/haproxy/haproxy.cfg` |
| `BACKUP_DIR` | `/etc/haproxy/waf_backup/` |

### import_haproxy_waf.py
::: warning Privileged paths
The defaults point at system directories (`/etc/...`). Run the import scripts as root, or override every env var to point at a sandbox before running them.
:::
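
One way to rehearse an import without touching `/etc` is to point every variable at a throwaway directory first. The sketch below does that for the Nginx variables listed in the table above. `sandbox_env` is a hypothetical helper, the empty placeholder `nginx.conf` assumes the import script expects that file to exist, and the final `subprocess.run` call is left commented out because it assumes `import_nginx_waf.py` sits in the current directory.

```python
import os
import tempfile
from pathlib import Path


def sandbox_env(root: Path, source_dir: str = "waf_patterns/nginx") -> dict:
    """Build an environment that redirects every Nginx import path under root."""
    root = Path(root)
    waf_dir = root / "waf"
    backup_dir = root / "waf_backup"
    conf = root / "nginx.conf"
    waf_dir.mkdir(parents=True, exist_ok=True)
    backup_dir.mkdir(parents=True, exist_ok=True)
    conf.touch()  # assumption: an empty placeholder config is enough for a dry run

    env = dict(os.environ)
    env.update(
        {
            "WAF_DIR": source_dir,
            "NGINX_WAF_DIR": str(waf_dir),
            "NGINX_CONF": str(conf),
            "BACKUP_DIR": str(backup_dir),
        }
    )
    return env


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env = sandbox_env(Path(tmp))
        for key in ("WAF_DIR", "NGINX_WAF_DIR", "NGINX_CONF", "BACKUP_DIR"):
            print(key, "=", env[key])
        # import subprocess, sys
        # subprocess.run([sys.executable, "import_nginx_waf.py"], env=env, check=True)
```

The same approach should carry over to the Apache, Traefik, and HAProxy importers by swapping in their variable names.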

Import HAProxy ACL rules.
## Data format

```bash
python import_haproxy_waf.py --source /path/to/haproxy/acl
```
### `owasp_rules.json`

---

## Data Structures

### owasp_rules.json Format
A flat JSON array. Each item is a single rule with two required fields:

```json
[
{
"id": "942100",
"pattern": "(?i:union.*select)",
"category": "sqli",
"severity": "critical",
"location": "request-uri",
"description": "SQL Injection Attack Detected"
"category": "SQLI",
"pattern": "(?i:union[\\s\\S]+select)"
},
{
"category": "XSS",
"pattern": "(?i:<script[^>]*>)"
}
]
```

**Fields**:
| Field | Type | Description |
|-------|------|-------------|
| `id` | string | OWASP CRS rule ID |
| `pattern` | string | Regex pattern |
| `category` | string | Attack category (sqli, xss, rce, etc.) |
| `severity` | string | critical, high, medium, low |
| `location` | string | Where to match (request-uri, headers, etc.) |
| `description` | string | Human-readable description |
| `category` | string | OWASP CRS category derived from the source filename (e.g. `SQLI`, `XSS`, `RCE`, `LFI`, `RFI`, `BOTS`) |
| `pattern` | string | Regex extracted from the matching `SecRule` directive |
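
A consumer of this file might read it along the lines of the sketch below, which keeps only entries that carry both string fields and groups the patterns by category. `group_by_category` is a hypothetical helper, not part of the toolchain, and the sample data simply mirrors the two-field example shown above.

```python
import json
from collections import defaultdict


def group_by_category(rules):
    """Group rule patterns by category, skipping entries missing either field."""
    grouped = defaultdict(list)
    for rule in rules:
        if not isinstance(rule, dict):
            continue
        category = rule.get("category")
        pattern = rule.get("pattern")
        if isinstance(category, str) and isinstance(pattern, str):
            grouped[category].append(pattern)
    return dict(grouped)


# Raw string so the JSON escapes (\\s) reach the JSON parser unchanged.
SAMPLE = json.loads(r'''
[
  {"category": "SQLI", "pattern": "(?i:union[\\s\\S]+select)"},
  {"category": "XSS", "pattern": "(?i:<script[^>]*>)"}
]
''')

if __name__ == "__main__":
    for category, patterns in group_by_category(SAMPLE).items():
        print(category, len(patterns))
```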

---
The converters validate each pattern with Python's `re.compile` before emitting platform-specific output, so malformed regexes are dropped rather than propagated.
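
The validation step described here could look roughly like the following. This is a minimal sketch built on the standard `re` module; `keep_compilable` is an illustrative name and the real converters may structure the check differently.

```python
import re


def keep_compilable(patterns):
    """Return only the patterns that Python's re module can compile."""
    valid = []
    for pattern in patterns:
        try:
            re.compile(pattern)
        except re.error:
            # Malformed (or unsupported) regex: drop it instead of emitting it.
            continue
        valid.append(pattern)
    return valid


if __name__ == "__main__":
    candidates = [r"(?i:union[\s\S]+select)", "(unclosed", "[a-"]
    print(keep_compilable(candidates))
```

A check of this kind also discards patterns that use regex syntax Python's `re` does not support, even if the target server's engine would accept them.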

## Extending the Project
## Extending the toolchain

### Adding a New Platform
### Adding a new platform

1. Create `json2<platform>.py` based on existing converters
2. Add output directory in `waf_patterns/<platform>/`
3. Update GitHub Actions workflow
4. Add documentation in `docs/`
1. Copy one of the existing `json2<platform>.py` converters as a starting point.
2. Implement `_sanitize_pattern()` for the target syntax (escape rules differ between Nginx, Apache, HAProxy, …).
3. Emit your output under `waf_patterns/<platform>/`.
4. Add a workflow step in `.github/workflows/update_patterns.yml` to package the result.
5. Add a documentation page under `docs/`.
5. Add a documentation page under `docs/`.

### Custom Pattern Sources
### Pinning a different OWASP CRS version

Modify `owasp2json.py` to add new pattern sources:

```python
SOURCES = [
"coreruleset/coreruleset",
"your-org/your-rules",
]
```bash
python owasp2json.py --ref v3.3
```

---
### Pulling rules from a fork

`owasp2json.py` hardcodes the upstream repository constant `coreruleset/coreruleset`. To target a fork, edit `GITHUB_REPO_URL` near the top of the script.

## Dependencies

Listed in `requirements.txt`:

```
requests>=2.28.0
beautifulsoup4>=4.11.0
```

Install with:
Listed in [`requirements.txt`](https://github.com/fabriziosalmi/patterns/blob/main/requirements.txt). Install with:

```bash
pip install -r requirements.txt
```

The pipeline targets Python **3.11+**.