
aurelio-labs/semantic-router — security scan

Repository: aurelio-labs/semantic-router (3.6k★, MIT), a semantic-routing library that uses embedding-based decisions to route LLM calls / function calls / agentic intents.
Commit scanned: df9722ad75f6 (HEAD of main at scan time)
Scan date: 2026-05-27
Disclosure status: Public courtesy issue filed on the semantic-router repo. Scope kept tight (one issue, two concrete clusters with single-PR fixes each) per the methodology refinement after the dstack rejection.

Summary

Severity Count
Critical 0
High 61
Medium 55
Low 0
Info 0 (filtered)

116 total findings. After curation: two concrete clusters worth a single coordinated cleanup each, and a short tail of false positives.

semantic-router is the cleanest “mid-popularity, two-person maintainer team” scan in the series so far: no eval/exec/pickle/exec.Command in the source, no workflow shell-injections, no Dockerfile-as-root, no tarfile.extractall traps. The entire actionable surface lives in two distinct categories: a cluster of 50 SQL identifier-interpolation findings concentrated in one file, and a dependency-drift tail of roughly 30 advisories across two uv.lock files.

Top findings (curated)

1. 50× SQL identifier-interpolation in semantic_router/index/postgres.py

Tool: Semgrep (sqlalchemy-execute-raw-query ×28 + formatted-sql-query ×22; same code lines, different rule families)
Verdict: Same class as on six prior scans, but with the unusual property of being concentrated in a single file.

semantic_router/index/postgres.py is the Postgres-backed implementation of semantic-router’s vector-index interface. The whole file uses the text(f"... {identifier} ...") pattern for DDL and query construction with config-controlled identifiers (index name, embedding dimension, distance operator). At 50 cross-rule findings on the same call sites, this is the largest single-file cluster of this class in the series.
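Because two rule families fire on the same lines, the raw finding count overstates the number of distinct call sites. A minimal sketch of how overlapping findings could be grouped, assuming Semgrep's JSON output shape (a list of results with `check_id`, `path`, and `start.line`); the sample records and line numbers below are made up for illustration and are not taken from the actual report:

```python
from collections import defaultdict


def group_by_call_site(results):
    """Group Semgrep findings by (path, start line) so that several
    rules flagging the same line collapse into one call site."""
    sites = defaultdict(set)
    for finding in results:
        key = (finding["path"], finding["start"]["line"])
        sites[key].add(finding["check_id"])
    return dict(sites)


# Hypothetical sample: two rules on one line, one rule on another.
PATH = "semantic_router/index/postgres.py"
sample = [
    {"check_id": "sqlalchemy-execute-raw-query", "path": PATH, "start": {"line": 10}},
    {"check_id": "formatted-sql-query", "path": PATH, "start": {"line": 10}},
    {"check_id": "sqlalchemy-execute-raw-query", "path": PATH, "start": {"line": 42}},
]

sites = group_by_call_site(sample)
print(len(sample), "findings across", len(sites), "call sites")
```

Grouping this way gives the number of lines that actually need a rewrite, which is the figure that matters when sizing the fix.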

The defensible fix is the same identifier-quoting pattern documented on Upsonic, PraisonAI, airweave, honcho, dstack, and pixeltable: SQLAlchemy's quoted_name() or psycopg's sql.Identifier(). With every call site living in one file, the most pragmatic refactor is a small pair of helpers at the top of postgres.py (e.g. _index_ident(self) and _distance_op(self)) that the other call sites use, turning 50 mechanical fixes into ~5 helper-call rewrites.
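The helper idea can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's code: the function names `quote_ident` and `distance_op`, the metric keys, and the table and column names in the usage line are all hypothetical, and the operator table assumes the backend uses pgvector's distance operators.

```python
import re

# Conservative allowlist: ASCII letters, digits, underscore; no leading digit.
_IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# PostgreSQL's default identifier limit is 63 bytes (NAMEDATALEN - 1).
_MAX_IDENT_LEN = 63

# Assumed mapping from metric name to pgvector operator (keys are hypothetical).
_DISTANCE_OPS = {"l2": "<->", "cosine": "<=>", "inner_product": "<#>"}


def quote_ident(name: str) -> str:
    """Validate an identifier and return it double-quoted for use in SQL."""
    if not _IDENT_RE.fullmatch(name) or len(name) > _MAX_IDENT_LEN:
        raise ValueError(f"invalid SQL identifier: {name!r}")
    return f'"{name}"'


def distance_op(metric: str) -> str:
    """Return the operator for a known metric; reject anything else."""
    try:
        return _DISTANCE_OPS[metric]
    except KeyError:
        raise ValueError(f"unsupported distance metric: {metric!r}") from None


# Usage: identifiers go through the helpers, values stay as bind parameters.
sql = (
    f"SELECT id FROM {quote_ident('route_index')} "
    f"ORDER BY vector {distance_op('cosine')} :query_vector LIMIT 5"
)
print(sql)
```

One design point to check before adopting anything like this: a double-quoted identifier is case-sensitive in PostgreSQL, so quoting a mixed-case name that was previously interpolated unquoted would refer to a different object.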

The realistic exploit window today is the same as on every prior occurrence: the identifier comes from Pydantic-validated config, so SQL injection is gated until/unless a future change lets it come from an untrusted source. The change isn’t urgent; it’s a class-cleanup.
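To make the gating concrete, here is a small sketch of what the interpolation pattern produces once the identifier is no longer validated upstream. The function and the statement it builds are hypothetical stand-ins for the flagged pattern, not code from postgres.py:

```python
def build_drop_statement(index_name: str) -> str:
    """Hypothetical stand-in for the flagged pattern: the identifier is
    interpolated into the SQL text with no quoting or validation."""
    return f"DROP TABLE IF EXISTS {index_name}"


# With a config-validated name the result is what the author intended.
trusted = build_drop_statement("route_index")

# With an attacker-influenced name the text carries a second statement.
hostile = build_drop_statement("route_index; DROP TABLE users")

print(trusted)
print(hostile)
```

Whether the second statement would actually run depends on the driver and how the query is executed, but the string itself is already outside the author's control, which is why the fix belongs at the point where the identifier enters the SQL text.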

2. ~30 dependency advisories across uv.lock and .dagger/uv.lock

Tool: Trivy (high/medium confidence; named advisories)
Verdict: Real. The findings are concentrated in the production lockfile, and a single coordinated uv lock --upgrade pass clears most of them.

The advisory tail by package:

Package Hits Class
aiohttp 8 Multiple advisories — historically aiohttp has had request-smuggling, request-parsing, and DoS classes
urllib3 6 Standard HTTP-client advisory pile
onnx 3 ML-runtime advisories
Pillow 3 Image-parsing CVE classes
pyasn1 2 ASN.1 parsing
transformers 2 Mostly deserialization/parsing in ML preprocessing
requests, python-dotenv, idna 2 each Standard transitive tail

The .dagger/uv.lock is the Dagger CI pipeline’s separate lockfile — same idna / urllib3 / requests shape but a smaller surface (CI tooling, not the runtime). Both lockfiles benefit from a coordinated refresh; the main uv.lock is the priority.

For aiohttp specifically, the 8 advisories are dominated by a handful of CVE classes that all clear with a single bump past the current pin. Same for urllib3 (6 advisories, single bump).
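A per-package tally like the table above can be derived from Trivy's JSON report. The sketch below assumes the usual report layout (a top-level `Results` list whose entries carry a `Vulnerabilities` list with `PkgName` fields); the sample data and advisory IDs are placeholders, not values from this scan:

```python
from collections import Counter


def advisories_per_package(report: dict) -> Counter:
    """Count advisories per package across every target in a Trivy report."""
    counts = Counter()
    for result in report.get("Results", []):
        # "Vulnerabilities" may be missing or null for clean targets.
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln["PkgName"]] += 1
    return counts


# Placeholder report covering two lockfiles; the IDs are made up.
sample_report = {
    "Results": [
        {
            "Target": "uv.lock",
            "Vulnerabilities": [
                {"PkgName": "aiohttp", "VulnerabilityID": "EXAMPLE-0001"},
                {"PkgName": "aiohttp", "VulnerabilityID": "EXAMPLE-0002"},
                {"PkgName": "urllib3", "VulnerabilityID": "EXAMPLE-0003"},
            ],
        },
        {"Target": ".dagger/uv.lock", "Vulnerabilities": None},
    ]
}

counts = advisories_per_package(sample_report)
for package, hits in counts.most_common():
    print(package, hits)
```

Sorting by hit count puts the packages where a single version bump clears the most advisories at the top of the list.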

3-N. Out of scope / by-design

Finding Verdict
non-literal-import in semantic_router/{index/pinecone,routers/base}.py By design — plugin/index discovery

That’s it. semantic-router’s scan has exactly the same shape as the cleanest scans in the series (Giskard, semble, logfire) for everything except the two concrete clusters above — no eval/exec/pickle, no workflow injection, no Dockerfile findings, no agent shell-tool patterns, no secrets in source.

Patterns observed

A two-person maintainer team with a tight scope produces tight scans. semantic-router is a focused library: vector-index implementations + a routing layer + integration adapters. The scan reflects that focus: the only large clusters are in two specific subsystems (the Postgres backend, and the dependency tree). There’s no sprawl, no monorepo drift, no “is this code or example or fixture?” ambiguity. This is the kind of clean result we’d hope to find more often.

The Postgres-file SQL cluster is the cleanest argument yet for a per-subsystem helper refactor over per-site fixes. Prior occurrences (Upsonic, PraisonAI, honcho, pixeltable) had the f-string SQL pattern spread across multiple files, where per-site fixes made sense. Here it’s 50 sites in one file — a _index_ident(self) helper at the top of postgres.py plus identifier-validation in the index-name setter generalizes the fix in a single PR rather than 50. Worth flagging this pragmatic shape to the maintainers explicitly.

The dep-CVE tail is the kind of “schedule a uv lock --upgrade pass” item that monthly automation handles. Dependabot or Renovate configured for both uv.lock files would keep up with it on a regular cadence; the same recommendation we made on Klavis, honcho, and dstack applies, just at a smaller scale.

Notes on the tool

Disclosure timeline

Reproduce

git clone https://github.com/elfrost/ai-patchlab
cd ai-patchlab
pip install -e ".[dev]"
python scanner/run_scan.py \
  --from-git-url "https://github.com/aurelio-labs/semantic-router" \
  --reports-dir reports/aurelio-labs-semantic-router \
  --min-severity medium

External tools (Semgrep, Gitleaks, Trivy, pip-audit) need to be installed separately — see the project README.