Georgia Tech’s Vibe Security Radar tracks 74 confirmed CVEs from AI-generated code — 35 in March 2026 alone, up from 6 in January. CodeRabbit finds AI produces 1.7× more bugs across 470 repos. Sonatype reports 28% of AI dependency recommendations are hallucinations. An estimated 8,000+ startups need $50K–$500K rebuilds. The code compiles. The tests pass. The intent has drifted. The cascade is accelerating across all six dimensions.
AI coding tools have entered a new phase. They generate code that compiles, passes tests, and ships to production — while embedding vulnerabilities, logic errors, and hallucinated dependencies at a rate that traditional quality systems were never designed to detect. The result is not occasional bugs. It is a structural failure mode: correct code with drifted intent, deployed at scale, faster than any review process can intercept.[1][2]
The evidence arrived from multiple independent sources within weeks of each other. Georgia Tech’s Systems Software & Security Lab launched the Vibe Security Radar, tracking CVEs directly attributable to AI-generated code. Their data shows an exponential curve: 6 confirmed AI CVEs in January 2026, 15 in February, 35 in March. Of the 74 total, 27 were authored by Claude Code, 4 by GitHub Copilot, 2 by Devin. The researchers estimate the true count is 5–10× higher — 400 to 700 cases across the open-source ecosystem — because most AI tool signatures are stripped before commit.[1][3]
**The promise:** 90% of developers use AI. PRs per author up 20%. 10× code velocity. Ship faster. Releases accelerated 75%.

**The reality:** 1.7× more bugs. 2.74× more XSS vulnerabilities. 28% hallucinated packages. Incidents per PR up 23.5%. Change failure rates up 30%.
Simultaneously, CodeRabbit analysed 470 open-source GitHub pull requests and found AI-generated code produces 1.7× more issues overall, with 75% more logic errors, 1.5–2× more security vulnerabilities, and 8× more performance inefficiencies compared to human-authored code. The problems are not just more frequent — they are more severe: AI-authored PRs contain 1.4× more critical issues and 1.7× more major issues.[2][4]
Then Sonatype published the 2026 State of the Software Supply Chain report. Analysing nearly 37,000 real dependency upgrade recommendations across Maven, npm, PyPI, and NuGet, they found that 27.8% of AI-generated dependency recommendations were hallucinations — versions that do not exist in any live repository. Worse, some recommendations pointed to confirmed protestware and packages compromised in known supply chain attacks.[5][6]
- **D5 · Quality Measurement Begins:** The SSLab begins systematically tracking CVEs attributable to AI-generated code by tracing commit histories and co-author metadata across public vulnerability databases.[1]
- **D5 · Empirical Confirmation:** Analysis of 470 GitHub PRs reveals AI-generated code introduces significantly more defects across logic, security, maintainability, and performance categories. AI-authored PRs average 10.83 issues versus 6.45 for human PRs.[2]
- **D5 + D6 · Systemic Weakness:** Testing of 15 applications built by five major vibe coding tools uncovers 69 vulnerabilities, including six critical. Every single application lacked CSRF protection. Every tool introduced SSRF vulnerabilities.[7]
- **D6 · Operational Degradation:** The Engineering in the Age of AI benchmark report surveys 50+ engineering leaders and finds that while PRs per author increased 20%, incidents per PR jumped 23.5% and change failure rates rose approximately 30%. Only 32% of organisations have formal AI governance policies.[8]
- **D6 · Supply Chain Contamination:** The 2026 State of the Software Supply Chain report analyses 37,000 dependency recommendations and finds 27.8% reference non-existent package versions. Some recommendations include confirmed protestware and supply chain attack vectors. Over 1.2 million malicious packages detected in open-source registries.[5]
- **D4 · Governance Framework:** Palo Alto Networks launches a governance framework specifically designed for vibe coding security, acknowledging the emergence of a new threat category requiring dedicated controls.[9]
- **D1 · Customer Data Breach:** A social networking platform built entirely through vibe coding suffers a major data breach. Security firm Wiz discovers a misconfigured database with public read/write access. The founder stated he had not written a single line of code manually.[10]
- **D4 · National Security Response:** UK National Cyber Security Centre CEO Dr. Richard Horne delivers a keynote at RSA Conference calling for immediate vibe coding safeguards. NCSC publishes a blog warning AI-generated code poses “intolerable risks.” The cybersecurity establishment now treats AI code quality as a national security concern.[11]
- **D5 · Ecosystem-Wide Pattern:** Automated scanning of thousands of applications built through AI coding tools reveals widespread vulnerability patterns. The volume confirms that this is not an edge-case problem — it is the default condition of AI-generated production code.[12]

The vibe coding cascade is not a story about bad AI. It is a story about a new failure mode that traditional quality systems were not designed to detect. AI-generated code optimises for local correctness — the function works, the test passes, the build succeeds — while the semantic intent drifts. The code does what it says, but not what was meant. At the scale AI now operates, this gap between syntactic correctness and semantic intent is where vulnerabilities, logic errors, and supply chain contamination originate.
AI-generated PRs contain 1.7× more issues, with 75% more logic errors, 2.74× more XSS vulnerabilities, and 1.88× more improper password handling than human code.[2]
Monthly AI-attributed CVEs grew from 6 in January to 35 in March 2026. Researchers estimate 400–700 real cases exist across the ecosystem, with most AI signatures stripped.[1]
Nearly one in four AI dependency recommendations are hallucinations. Some point to confirmed malware, protestware, and compromised packages. 1.2 million malicious packages detected in registries.[5]
Incidents per PR jumped 23.5% while PRs per author rose 20%. Change failure rates up 30%. More output, more damage per unit of output. Only 32% of organisations have formal AI governance.[8]
An estimated 8,000+ startups need $50K–$500K rebuilds after building production apps with AI assistants. Total estimated remediation: $400M–$4B. Rescue engineering is emerging as a new discipline.[13]
GitClear analysis of 211 million changed lines found code churn (written then reverted within two weeks) nearly doubled between 2020 and 2024, correlating with AI tool adoption. Copy-pasted code rose from 8.3% to 12.3%.[14]
> The attractions of vibe coding are clear. Disrupting the status quo of manually produced software that is consistently vulnerable is a huge opportunity, but not without risk of its own.
>
> — Dr. Richard Horne, CEO, UK National Cyber Security Centre, RSA Conference, March 24, 2026[11]
The cascade originates from Quality (D5) — AI-generated code quality failures at scale — and propagates through Operational (D6, supply chain contamination and pipeline inadequacy), Revenue (D3, remediation costs), Customer (D1, trust erosion and data breaches), Employee (D2, skill atrophy and review fatigue), and Regulatory (D4, emerging governance frameworks). All six dimensions are activated. The volume of AI-generated code outpaces the capacity to review it meaningfully.
| Dimension | Level | Score | Diagnostic Evidence | Pattern |
|---|---|---|---|---|
| Quality (D5) | Origin | 72 | 1.7× more bugs. 35 CVEs in one month. 8× performance issues. AI-generated code introduces more defects across every major quality category. 74 confirmed CVEs directly attributed to AI code, with an estimated 400–700 real cases. 75% more logic errors. 2.74× more XSS vulnerabilities. AI dependency recommendations hallucinate 28% of the time. 69 vulnerabilities across 15 apps built by 5 major AI tools. Every app lacked CSRF protection.[1][2][5] | Semantic Intent Drift |
| Operational (D6) | L1 | 70 | 28% hallucinated dependencies. 1.2M malicious packages. CI/CD not designed for this volume. Supply chain contamination via AI-recommended packages that do not exist or contain malware. 454,648 new malicious packages detected in open-source registries in 2025. Change failure rates up 30%. Pipelines built for human-speed review cannot absorb AI-speed code generation. Code churn doubled.[5][6][8] | Supply Chain Contamination |
| Revenue (D3) | L1 | 68 | $400M–$4B remediation. 8,000+ startup rebuilds needed. Startups that built production apps with AI assistants face $50K–$500K rebuild costs each. First-year maintenance costs estimated at 12% above traditional development. Code that looks finished but cannot support real usage creates a technical debt time bomb. Rescue engineering is emerging as a new discipline.[13][14] | Remediation Wave |
| Customer (D1) | L1 | 65 | Trust erosion accelerating. Developer confidence declining. Stack Overflow’s 2025 survey found only 29% of developers trust the accuracy of AI-generated code, down from 40% previously. Moltbook breach exposed 1.5M API keys and 35,000 emails from a fully vibe-coded platform. End users bear the cost of invisible quality drift they have no visibility into.[10][14] | Trust Erosion |
| Employee (D2) | L2 | 60 | Reviewer fatigue at scale. AI-generated code looks correct, compiles cleanly, and passes superficial checks while hiding subtle logical errors. Median PR size increased 33% in 2025. Engineers shifting from writing to reviewing, but review capacity has not scaled. Refactoring dropped from 25% to under 10% of changed lines. Skill atrophy as developers accept AI output without comprehending functionality.[8][14] | Review Fatigue & Skill Atrophy |
| Regulatory (D4) | L2 | 55 | NCSC CEO calling for vibe coding safeguards at RSA Conference. Palo Alto SHIELD framework launched. NCSC blog warns AI-generated code poses “intolerable risks.” EU Cyber Resilience Act and AI Act converging on proof of provenance. Cambridge/MIT AI Agent Index found only 4 developers publish safety documentation covering autonomy levels. Governance arriving reactively, not proactively.[9][11] | Emerging Governance |
```
-- The Vibe Coding Cascade: Software Engineering Diagnostic
-- Sense -> Analyze -> Measure -> Decide -> Act
FORAGE ai_code_quality_drift
  WHERE ai_cve_monthly_count > 30
    AND ai_bug_multiplier > 1.5
    AND dependency_hallucination_pct > 25
    AND incidents_per_pr_delta > 20
    AND startup_rebuild_count > 5000
    AND national_security_response = true
  ACROSS D5, D6, D3, D1, D2, D4
  DEPTH 3
SURFACE vibe_coding_cascade
DIVE INTO semantic_intent_drift
  WHEN code_correctness = true          -- compiles, passes tests
    AND semantic_intent_preserved = false -- but intent has drifted
    AND review_capacity_exceeded = true   -- volume outpaces review
    AND supply_chain_contaminated = true  -- hallucinated packages in production
TRACE vibe_coding_cascade               -- D5 -> D6+D3+D1 -> D2+D4
EMIT quality_cascade_at_scale
DRIFT vibe_coding_cascade
  METHODOLOGY 85 -- SAST/DAST, code review, dependency pinning, deployment gates all exist
  PERFORMANCE 35 -- 28% hallucinated deps, 32% have governance, review collapsing under volume
FETCH vibe_coding_cascade
  THRESHOLD 1000
ON EXECUTE CHIRP critical "6/6 dimensions, semantic intent drift at scale, CVEs tripling monthly"
SURFACE analysis AS json
```
Runtime: @stratiqx/cal-runtime · Spec: cal.cormorantforaging.dev · DOI: 10.5281/zenodo.18905193
The defining characteristic of this cascade is code that is syntactically correct but semantically wrong. It compiles. It passes tests. It ships. And it contains 1.7× more bugs than human-written code because AI optimises for pattern completion, not intent preservation. Traditional quality gates test whether code does what it says. They do not test whether it does what was meant. This is a new failure mode, and detecting it requires new quality systems.
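The gap between doing what the code says and doing what was meant can be made concrete. A minimal sketch, with hypothetical function names and `example.com` as a stand-in domain: a redirect guard that passes a shallow test while violating its stated intent, next to an intent-preserving version.

```python
from urllib.parse import urlparse

# Hypothetical illustration: both functions are meant to enforce the intent
# "only redirect within our own site". One merely looks correct.

def safe_redirect_drifted(url: str) -> bool:
    # Syntactically plausible and passes the obvious test case,
    # but checks a string prefix, not the actual hostname.
    return url.startswith("https://example.com")

def safe_redirect_intended(url: str) -> bool:
    # Intent-preserving version: compare the parsed hostname exactly.
    host = urlparse(url).hostname or ""
    return host == "example.com" or host.endswith(".example.com")

# The shallow check both versions pass:
assert safe_redirect_drifted("https://example.com/home")
assert safe_redirect_intended("https://example.com/home")

# The intent violation only one version catches: an attacker-controlled
# domain that shares the prefix "https://example.com".
evil = "https://example.com.evil.io/phish"
print(safe_redirect_drifted(evil))   # True: an open redirect ships
print(safe_redirect_intended(evil))  # False: intent preserved
```

Both versions satisfy a quality gate that tests what the code says; only a check expressed at the level of intent ("the destination host must be ours") separates them.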
When 28% of AI-recommended dependencies are hallucinations — including confirmed malware and protestware — the software supply chain is being contaminated at the point of creation, not the point of attack. Sonatype found that AI models confidently recommend packages that do not exist, enabling attackers who register those names. The attack vector is no longer exploitation of existing code. It is the generation of new code that references phantom dependencies.
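One practical counter is to refuse any recommended (package, version) pair that the live registry cannot confirm. The sketch below is illustrative, not an existing tool's API: the registry lookup is stubbed with a plain dict so the example runs self-contained, and `gate_recommendations` is a hypothetical helper. A real gate would query the index itself (PyPI's JSON API, for instance, returns 404 for versions that do not exist).

```python
# Minimal sketch of a dependency gate that rejects AI-recommended
# package versions absent from the registry snapshot it is given.

def gate_recommendations(recs, known_versions):
    """Split (package, version) recommendations into real and phantom."""
    real, phantom = [], []
    for name, version in recs:
        if version in known_versions.get(name, set()):
            real.append((name, version))
        else:
            phantom.append((name, version))
    return real, phantom

# A toy registry snapshot standing in for the live index.
registry = {
    "requests": {"2.31.0", "2.32.3"},
    "flask": {"3.0.3"},
}

recs = [
    ("requests", "2.32.3"),        # exists
    ("flask", "3.9.1"),            # hallucinated version
    ("flask-authz-pro", "1.0.0"),  # hallucinated package entirely
]

real, phantom = gate_recommendations(recs, registry)
print(real)     # [('requests', '2.32.3')]
print(phantom)  # [('flask', '3.9.1'), ('flask-authz-pro', '1.0.0')]
```

The design point is that the check runs before install, at the moment of generation, because by Sonatype's numbers roughly one recommendation in four will fail it.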
PRs per author up 20%. Median PR size up 33%. Incidents per PR up 23.5%. The math does not work. AI generates code faster than humans can meaningfully review it. And the code it generates is harder to review: it looks correct, compiles cleanly, and hides subtle errors that surface only under specific conditions. The human quality gate that prevented production failures for decades is being overwhelmed by volume, not bypassed by malice.
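The arithmetic behind "more output, more damage per unit of output" is multiplicative, not additive, which is easy to verify:

```python
# Back-of-envelope check: throughput growth and defect-density growth
# compound, using the figures reported in the benchmark above.

prs_per_author_growth = 1.20     # PRs per author up 20%
incidents_per_pr_growth = 1.235  # incidents per PR up 23.5%

total_incident_growth = prs_per_author_growth * incidents_per_pr_growth
print(f"{(total_incident_growth - 1) * 100:.1f}% more incidents per author")
# Roughly 48% more incidents per author, from two modest-looking deltas.
```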
At the centre of this cascade is a structural problem that has been formally identified: the decoupling of behavioral intent from code implementation. AI tools translate natural language descriptions into code, but the translation is lossy. Each iteration compounds the drift. The Semantic Intent pattern (semanticintent.dev) addresses this origin directly, proposing structured intent preservation as a first-class engineering concern — evidence that the problem has been formally named and solutions are emerging.
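What "intent as a first-class engineering concern" can look like is sketched generically below. This illustrates the idea with a runtime postcondition attached to the code it governs; the decorator, its name, and the example function are all hypothetical, not semanticintent.dev's actual specification.

```python
import functools

def intent(description, postcondition):
    """Decorator binding a human-readable intent to an executable check,
    so a regenerated implementation that drifts fails loudly."""
    def wrap(fn):
        @functools.wraps(fn)
        def guarded(*args, **kwargs):
            result = fn(*args, **kwargs)
            if not postcondition(result, *args, **kwargs):
                raise AssertionError(
                    f"intent violated in {fn.__name__!r}: {description}"
                )
            return result
        guarded.__intent__ = description
        return guarded
    return wrap

@intent(
    "discounted price is never negative and never exceeds the original",
    lambda result, price, pct: 0 <= result <= price,
)
def apply_discount(price: float, pct: float) -> float:
    return price * (1 - pct / 100)

print(apply_discount(100.0, 20))  # 80.0, intent holds

try:
    apply_discount(100.0, 150)    # drifted edge case: negative price
except AssertionError as e:
    print(e)                       # intent violated, caught at runtime
```

Because the intent travels with the function rather than living in a prompt history, each regeneration is re-checked against what was meant, not just against what the previous draft happened to do.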
One conversation. We’ll tell you if the six-dimensional view adds something new — or confirm your current tools have it covered.