patterns/docs/api.md
Fabrizio Salmi 5c654b3da8 Redesign docs with Apple-native theme; verify content; route CI to self-hosted runner-02
- VitePress: custom theme (SF system fonts, glass nav, soft surfaces, pill buttons,
  light/dark code blocks, refined feature cards, platform showcase + stat strip).
- Replace every emoji across docs and README with inline SVG icons.
- Verify and fix doc accuracy against actual scripts: JSON schema (category+pattern only),
  env-var configuration for json2*/import_* scripts, owasp2json CLI surface.
- Add public assets (logo.svg, favicon.svg, hero-shield.svg) and Shiki haproxy alias.
- Workflows default to self-hosted runner-02 with a configurable fallback to GitHub
  runners via the RUNS_ON repo variable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 08:07:04 +02:00

API & Scripts Reference

Patterns is a small Python toolchain. Every script does one job and communicates with the rest through plain JSON or files on disk — no shared state, no daemon, no database.

owasp2json.py  ──▶  owasp_rules.json  ──▶  json2{nginx,apache,traefik,haproxy}.py
                                       └▶  badbots.py (independent)

All scripts are configured through environment variables (not CLI flags) except owasp2json.py, which has a small argparse interface.

Pipeline scripts

owasp2json.py

Fetches the OWASP Core Rule Set from GitHub and emits a flat JSON rule list.

python owasp2json.py --ref v4.0 --output owasp_rules.json

| Argument / env | Default | Purpose |
| --- | --- | --- |
| --output | owasp_rules.json | Output JSON path |
| --ref | v4.0 | Tag prefix to resolve (e.g. v4.0, v3.3, dev) |
| --dry-run | off | Fetch and parse without writing |
| GITHUB_TOKEN (env) | unset | Raises the GitHub API rate limit while iterating |

The script verifies each blob's SHA against the GitHub-reported value before parsing it.
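The check described above can be sketched as follows. This is a minimal illustration, not the script's actual code: it assumes the "GitHub-reported value" is the standard Git blob SHA-1 (the hash of a `blob <size>\0` header followed by the file bytes), and the function names `git_blob_sha1` and `verify_blob` are hypothetical.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Return the Git object id of a blob holding `content`.

    Git hashes the header b"blob <size>\\0" followed by the raw bytes.
    """
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def verify_blob(content: bytes, expected_sha: str) -> bool:
    """Hypothetical helper: compare downloaded bytes against a reported SHA."""
    return git_blob_sha1(content) == expected_sha.lower()
```

A caller would skip (or abort on) any file for which `verify_blob` returns `False` before handing the bytes to the parser.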


json2nginx.py

Converts owasp_rules.json into Nginx map-based rules.

python json2nginx.py
INPUT_FILE=custom.json OUTPUT_DIR=/tmp/out python json2nginx.py

Generated files (in OUTPUT_DIR):

| File | Purpose |
| --- | --- |
| waf_maps.conf | map directives, included in the http block |
| waf_rules.conf | if rules, included in the server block |
| <category>.conf | One file per OWASP category, for inspection only |
| README.md | In-tree usage notes |

| Env var | Default |
| --- | --- |
| INPUT_FILE | owasp_rules.json |
| OUTPUT_DIR | waf_patterns/nginx |
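Environment-variable configuration with defaults, as used by the converters, might look roughly like the sketch below. The variable names and default values are the ones in the table above; the `load_config` function and its return shape are assumptions made for illustration, not the converter's real code.

```python
import os
from pathlib import Path


def load_config(env=None):
    """Read converter settings from the environment, falling back to defaults.

    `env` is any mapping; it defaults to os.environ. Passing a dict makes the
    function easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    return {
        "input_file": Path(env.get("INPUT_FILE", "owasp_rules.json")),
        "output_dir": Path(env.get("OUTPUT_DIR", "waf_patterns/nginx")),
    }
```

Reading everything in one place keeps the defaults visible and means a caller only has to set the variables it wants to change.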

json2apache.py

Converts owasp_rules.json into ModSecurity SecRule directives, partitioned by attack category.

python json2apache.py

Generated files: one <category>.conf per OWASP category (sqli.conf, xss.conf, rce.conf, lfi.conf, …) — each contains pure ModSecurity rules ready to Include.

| Env var | Default |
| --- | --- |
| INPUT_FILE | owasp_rules.json |
| OUTPUT_DIR | waf_patterns/apache |

json2traefik.py

Converts owasp_rules.json into a Traefik file-provider middleware.

python json2traefik.py

Generated files:

  • middleware.toml: complete WAF middleware definition
  • README.md: in-tree integration notes

| Env var | Default |
| --- | --- |
| INPUT_FILE | owasp_rules.json |
| OUTPUT_DIR | waf_patterns/traefik |

json2haproxy.py

Converts owasp_rules.json into HAProxy ACL files.

python json2haproxy.py

Generated files:

  • waf.acl: one regex per line, designed for -f /etc/haproxy/waf.acl
  • README.md: in-tree integration notes

| Env var | Default |
| --- | --- |
| INPUT_FILE | owasp_rules.json |
| OUTPUT_DIR | waf_patterns/haproxy/ |

badbots.py

Independently fetches public bad-bot User-Agent lists and emits a bots.* file in each platform output directory.

python badbots.py

Generated files (per platform):

| Platform | File |
| --- | --- |
| Nginx | waf_patterns/nginx/bots.conf |
| Apache | waf_patterns/apache/bots.conf |
| Traefik | waf_patterns/traefik/bots.toml |
| HAProxy | waf_patterns/haproxy/bots.acl |

| Env var | Purpose |
| --- | --- |
| GITHUB_TOKEN | Raises the GitHub API rate limit when fetching upstream lists |

If a remote source is unreachable, the script falls back to a bundled list.
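The fallback behaviour can be sketched as below. This is an illustration under stated assumptions rather than the script's implementation: `BUNDLED_BOTS`, `fetch_lines`, and `load_bots` are hypothetical names, the bundled entries are placeholders, and the real script may combine sources differently.

```python
import urllib.request

# Hypothetical placeholder entries; the real bundled list ships with the script.
BUNDLED_BOTS = ["ExampleBadBot", "AnotherBadBot"]


def fetch_lines(url, timeout=10.0):
    """Download a text list and return its non-empty, stripped lines."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        text = resp.read().decode("utf-8", errors="replace")
    return [line.strip() for line in text.splitlines() if line.strip()]


def load_bots(urls, fetch=fetch_lines):
    """Return the first list that downloads successfully, else the bundled one."""
    for url in urls:
        try:
            bots = fetch(url)
        except OSError:  # URLError, HTTPError and socket timeouts all derive from OSError
            continue
        if bots:
            return bots
    return list(BUNDLED_BOTS)
```

Injecting `fetch` as a parameter keeps the network call replaceable, so the fallback path can be exercised offline.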

Import / install scripts

The import_*.py scripts copy generated files into a server's runtime configuration directory and (optionally) splice an Include line into the main config. They are configured entirely through environment variables.
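The copy step can be pictured with the sketch below. It assumes that the backup directory receives a copy of any file about to be overwritten, which is inferred from the presence of a BACKUP_DIR setting and not confirmed by the scripts themselves. The function name `install_with_backup` is hypothetical, and the optional Include splicing is not covered.

```python
import shutil
from pathlib import Path


def install_with_backup(src_dir: Path, dest_dir: Path, backup_dir: Path) -> list[Path]:
    """Copy every *.conf from src_dir into dest_dir, backing up replaced files."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    backup_dir.mkdir(parents=True, exist_ok=True)
    installed = []
    for src in sorted(src_dir.glob("*.conf")):
        dest = dest_dir / src.name
        if dest.exists():
            # Assumed behaviour: keep the previous version before overwriting.
            shutil.copy2(dest, backup_dir / src.name)
        shutil.copy2(src, dest)
        installed.append(dest)
    return installed
```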

import_nginx_waf.py

| Env var | Default |
| --- | --- |
| WAF_DIR | waf_patterns/nginx |
| NGINX_WAF_DIR | /etc/nginx/waf/ |
| NGINX_CONF | /etc/nginx/nginx.conf |
| BACKUP_DIR | /etc/nginx/waf_backup/ |

import_apache_waf.py

| Env var | Default |
| --- | --- |
| WAF_DIR | waf_patterns/apache |
| APACHE_WAF_DIR | /etc/modsecurity.d/ |
| APACHE_CONF | /etc/apache2/apache2.conf |
| BACKUP_DIR | /etc/modsecurity.d/backup |

import_traefik_waf.py

| Env var | Default |
| --- | --- |
| WAF_DIR | waf_patterns/traefik |
| TRAEFIK_WAF_DIR | /etc/traefik/waf/ |
| TRAEFIK_DYNAMIC_CONF | /etc/traefik/dynamic.toml |
| BACKUP_DIR | /etc/traefik/waf_backup/ |

import_haproxy_waf.py

| Env var | Default |
| --- | --- |
| WAF_DIR | waf_patterns/haproxy |
| HAPROXY_WAF_DIR | /etc/haproxy/waf/ |
| HAPROXY_CONF | /etc/haproxy/haproxy.cfg |
| BACKUP_DIR | /etc/haproxy/waf_backup/ |

::: warning Privileged paths
The defaults point at system directories (/etc/...). Run the import scripts as root, or override every env var to point at a sandbox before running them.
:::
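One way to build such a sandbox is sketched below, using the import_nginx_waf.py variables listed above. The helper name `sandbox_env` and the directory layout under the sandbox root are assumptions. Whether the script expects the target config file to exist beforehand is not stated here, so that would need checking before a real run.

```python
import os
from pathlib import Path


def sandbox_env(root: Path) -> dict[str, str]:
    """Return a copy of the environment with every Nginx import path redirected."""
    overrides = {
        "WAF_DIR": "waf_patterns/nginx",          # generated files (default location)
        "NGINX_WAF_DIR": str(root / "waf"),       # instead of /etc/nginx/waf/
        "NGINX_CONF": str(root / "nginx.conf"),   # instead of /etc/nginx/nginx.conf
        "BACKUP_DIR": str(root / "waf_backup"),   # instead of /etc/nginx/waf_backup/
    }
    return {**os.environ, **overrides}
```

The resulting mapping can be passed as `env=` to `subprocess.run([sys.executable, "import_nginx_waf.py"], ...)`, with `root` pointing at a temporary directory.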

Data format

owasp_rules.json

A flat JSON array. Each item is a single rule with two required fields:

[
  {
    "category": "SQLI",
    "pattern": "(?i:union[\\s\\S]+select)"
  },
  {
    "category": "XSS",
    "pattern": "(?i:<script[^>]*>)"
  }
]

| Field | Type | Description |
| --- | --- | --- |
| category | string | OWASP CRS category derived from the source filename (e.g. SQLI, XSS, RCE, LFI, RFI, BOTS) |
| pattern | string | Regex extracted from the matching SecRule directive |
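Because the format is a flat array of two-field objects, consuming it takes very little code. The sketch below groups patterns by category; `group_by_category` is a hypothetical helper, and the sample data simply mirrors the two rules shown above.

```python
import json
from collections import defaultdict


def group_by_category(rules):
    """Map each category to the list of its patterns, preserving input order."""
    grouped = defaultdict(list)
    for rule in rules:
        grouped[rule["category"]].append(rule["pattern"])
    return dict(grouped)


# Same two rules as the sample document (raw string keeps the JSON escapes intact).
sample = json.loads(r"""
[
  {"category": "SQLI", "pattern": "(?i:union[\\s\\S]+select)"},
  {"category": "XSS",  "pattern": "(?i:<script[^>]*>)"}
]
""")
```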

The converters validate each pattern with Python's re.compile before emitting platform-specific output, so malformed regexes are dropped rather than propagated.
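The validation step amounts to a compile-and-filter pass. A minimal sketch, assuming rules shaped like the objects above (the function name `drop_invalid` is hypothetical):

```python
import re


def drop_invalid(rules):
    """Keep only the rules whose pattern compiles under Python's re module."""
    valid = []
    for rule in rules:
        try:
            re.compile(rule["pattern"])
        except re.error:
            # Malformed regex: skip it instead of emitting broken config.
            continue
        valid.append(rule)
    return valid
```

Note that this only proves a pattern is valid for Python's regex engine; a target platform with a different regex dialect may still reject it.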

Extending the toolchain

Adding a new platform

  1. Copy one of the existing json2<platform>.py converters as a starting point.
  2. Implement _sanitize_pattern() for the target syntax (escape rules differ between Nginx, Apache, HAProxy, …).
  3. Emit your output under waf_patterns/<platform>/.
  4. Add a workflow step in .github/workflows/update_patterns.yml to package the result.
  5. Add a documentation page under docs/.

Pinning a different OWASP CRS version

python owasp2json.py --ref v3.3

Pulling rules from a fork

owasp2json.py hardcodes the upstream repository (coreruleset/coreruleset) as a constant. To target a fork, edit GITHUB_REPO_URL near the top of the script.

Dependencies

Listed in requirements.txt. Install with:

pip install -r requirements.txt

The pipeline targets Python 3.11+.