[ Memory-corruption zero-days · Automated ]
Meet Lilith. Autonomous zero-day discovery.
Lilith is an autonomous engine that hunts memory-corruption vulnerabilities (buffer overflows, use-after-free, out-of-bounds access) in C/C++ infrastructure code. It orchestrates frontier LLMs across a 20-phase pipeline that runs from threat modeling through adversarial exploration to ASAN-verified proof-of-concept, end to end, with no human in the loop. Stella runs Lilith against your codebase and delivers CVE-ready findings. 16 CVEs assigned across Firefox NSS, wolfSSL, Arm mbedTLS, and strongSwan.
16 CVEs · 4 published · $65K bounties
[01]
16
CVEs Assigned
4 published · 12 pending disclosure
[02]
57
Findings Accepted
[03]
40+
Targets Audited
[How Lilith Works]
From git URL to CVE-ready report. Automatically.
An engagement has four stages. Once scope is agreed, Lilith runs end to end with no human in the loop. Every stage is instrumented and gated, so hallucinated findings are rejected before they reach you.
Scope
You share your target codebase. Stella agrees on scope, timeline, and disclosure path with you, then hands the repo to Lilith.
Lilith explores
Parallel LLM explorers build a threat model, cross-reference the code against protocol specs (TLS, X.509, HPKE, and more), and generate adversarial attack hypotheses.
Lilith verifies
Every candidate is compiled with AddressSanitizer and reproduced on isolated GCP instances. An evidence gate rejects hallucinated stack traces — only machine-verified crashes survive.
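As a rough illustration of what an evidence gate like this could check, the Python sketch below accepts a finding only when the claimed bug class and function actually appear in sanitizer output captured from a real run. It is not Lilith's implementation: the function names, the matching rule, and the sample report (hand-written in the shape AddressSanitizer prints, with made-up addresses, paths, and a hypothetical `parse_record` function) are all assumptions.

```python
import re

# Header line ASAN prints when it detects an error, e.g.
# "==4242==ERROR: AddressSanitizer: heap-buffer-overflow on address ..."
ERROR_RE = re.compile(r"^==\d+==ERROR: AddressSanitizer: ([\w-]+)", re.MULTILINE)

# Symbolized stack frame, e.g. "    #0 0x4f5a2b in parse_record /src/parse.c:42:5"
FRAME_RE = re.compile(r"^[ \t]*#\d+ 0x[0-9a-f]+ in (\S+)", re.MULTILINE)


def parse_asan_report(stderr_text):
    """Return (bug_type, function_names) or None if no ASAN error is present."""
    match = ERROR_RE.search(stderr_text)
    if match is None:
        return None
    # Collect every symbolized function name that appears in the report.
    return match.group(1), FRAME_RE.findall(stderr_text)


def passes_evidence_gate(claimed_bug_type, claimed_function, stderr_text):
    """Accept a claim only if the captured output backs it up (hypothetical rule)."""
    parsed = parse_asan_report(stderr_text)
    if parsed is None:
        return False  # no sanitizer report at all: reject
    bug_type, functions = parsed
    return bug_type == claimed_bug_type and claimed_function in functions


# Hand-written sample in ASAN's report shape; all values are made up.
SAMPLE_STDERR = """\
==4242==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000014 at pc 0x0000004f5a2b bp 0x7ffd0000a000 sp 0x7ffd00009ff8
READ of size 1 at 0x602000000014 thread T0
    #0 0x4f5a2b in parse_record /src/parse.c:42:5
    #1 0x4f6c10 in main /src/poc.c:17:3
SUMMARY: AddressSanitizer: heap-buffer-overflow /src/parse.c:42:5 in parse_record
"""

if __name__ == "__main__":
    print(passes_evidence_gate("heap-buffer-overflow", "parse_record", SAMPLE_STDERR))
    print(passes_evidence_gate("stack-buffer-overflow", "parse_record", SAMPLE_STDERR))
```

The key design point is that the gate reads output from an actual instrumented run rather than trusting a stack trace written by a model, so a claim with no matching sanitizer report is dropped.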
Lilith reports
You receive CVE-ready write-ups — CWE classification, CVSS scoring, runnable proof-of-concept code, and responsible-disclosure guidance — generated autonomously.
[Why Lilith]
What traditional tools miss. What Lilith finds.
Fuzzers can't reason about specifications. Static analyzers can't hypothesize. Manual audits take months. Lilith reasons, hypothesizes, and verifies — in hours, autonomously.
Memory corruption, hunted by reasoning
Lilith specializes in memory-corruption vulnerabilities (buffer overflows, use-after-free, out-of-bounds access) in C/C++ infrastructure code. Parallel LLM explorers read RFCs, build threat models, and generate attack hypotheses that target bugs fuzzers need millions of inputs to stumble into.
Self-verifying proof-of-concept
Every finding is built with AddressSanitizer and reproduced on isolated GCP instances. Lilith's evidence gate rejects hallucinated stack traces automatically, so only crashes that reproduce under ASAN reach your team.
End-to-end autonomy
From git URL to CVE-ready markdown: 20 phases run without human intervention. Explore, exploit, verify, report — all automatic, in hours rather than months. 16 CVEs already assigned across Firefox NSS, wolfSSL, Arm mbedTLS, and strongSwan.
[Proven Results]
Real vulnerabilities, real impact.
Four memory-corruption CVEs are publicly disclosed on CVE.org, credited to Haruto Kimura (Stella). Twelve more are pending coordinated disclosure across Firefox NSS, Arm mbedTLS, strongSwan, and PowerDNS, with bounties already paid by Mozilla, Arm, and others.
- MED · CVE-2026-3849 · CVSS 6.9 · wolfSSL/HPKE (ECH)
Stack buffer overflow
Stack buffer overflow in wc_HpkeLabeledExtract triggered by an oversized ECH config. A malicious ECH config overflows a stack buffer on the client during Encrypted Client Hello. Patched by wolfSSL within hours.
- MED · CVE-2026-2646 · CVSS 5.0 · wolfSSL/Session Parsing
Heap buffer overflow
Heap overflow in wolfSSL_d2i_SSL_SESSION. When SESSION_CERTS is enabled, certificate and session ID lengths are deserialized from untrusted input without bounds validation.
- LOW · CVE-2026-4395 · CVSS 1.3 · wolfSSL/wolfcrypt ECC
Heap buffer overflow
Heap overflow in wc_ecc_import_x963_ex on the KCAPI path. A crafted oversized EC public key point writes attacker-controlled data past the pubkey_raw buffer.
- LOW · CVE-2026-4159 · CVSS 1.2 · wolfSSL/PKCS#7
Out-of-bounds heap read
1-byte OOB heap read in wc_PKCS7_DecodeEnvelopedData triggered by CMS EnvelopedData with zero-length encrypted content. Affects wolfSSL 5.8.4 and earlier.
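The severity labels on the findings above are consistent with the qualitative rating bands in the CVSS specification (Low 0.1–3.9, Medium 4.0–6.9, High 7.0–8.9, Critical 9.0–10.0). A minimal Python sketch of that mapping follows, using the four published scores as inputs. The function name is illustrative, and since this page does not state which CVSS version was used for scoring, treat the band mapping as an assumption about how the labels were derived.

```python
def cvss_severity(score):
    """Map a CVSS base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS scores range from 0.0 to 10.0")
    if score == 0.0:
        return "NONE"
    if score < 4.0:
        return "LOW"
    if score < 7.0:
        return "MEDIUM"
    if score < 9.0:
        return "HIGH"
    return "CRITICAL"


# Scores as listed for the four published CVEs.
PUBLISHED = {
    "CVE-2026-3849": 6.9,
    "CVE-2026-2646": 5.0,
    "CVE-2026-4395": 1.3,
    "CVE-2026-4159": 1.2,
}

if __name__ == "__main__":
    for cve_id, score in PUBLISHED.items():
        print(cve_id, score, cvss_severity(score))
```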
[Approach]
How a Stella audit compares.
Manual audits hand you a 50-page PDF. Stella delivers reproducible findings you can patch and verify on the spot.
Stella Audit
- Cost: Starts at $5K
- Time to Results: 3–6 hours per target
- Proof Quality: ASAN-verified stack traces
- Analysis Method: Spec-driven LLM analysis
- Deliverable: CVE-ready reports with PoC
Manual Audit
- Cost: $50K–$150K per engagement
- Time to Results: 3–6 months
- Proof Quality: Varies by auditor
- Analysis Method: Human intuition
- Deliverable: PDF report