TL;DR
Google’s May 2026 whitepaper, “The New SDLC With Vibe Coding,” argues that software teams should focus less on model choice and more on the systems around AI coding agents. The paper claims generation is no longer the hard part; verification, judgment and direction now define the craft.
Google’s May 2026 whitepaper, “The New SDLC With Vibe Coding,” says the biggest change in software development is not a new programming language or cloud service but a shift from writing code to directing AI systems that generate it. The claim has broad implications for teams already using coding agents in production workflows.
The paper, written by Addy Osmani, Shubham Saboo and Sokratis Kartakis, reports that as of early 2026, 85% of professional developers regularly use AI coding agents, 51% use them daily and about 41% of new code is AI-generated. The paper presents those figures as evidence that AI-assisted development has moved from experiment to routine practice.
Its central argument is that the model itself is only a small part of agent performance. The paper describes a running agent as the combination of a model and a “harness”: prompts, tools, context policies, hooks, sandboxes, sub-agents, observability and the surrounding engineering process. According to the paper’s framing, the harness accounts for roughly 90% of the practical behavior teams experience.
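To make the model/harness split concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper’s implementation: the names `Harness`, `Agent` and `stub_model`, and every field on them, are hypothetical, and the "model" is a stub function standing in for a real model call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical type aliases for this sketch only.
Model = Callable[[str], str]   # text in, text out
Hook = Callable[[str], str]    # transforms text before or after the model runs


@dataclass
class Harness:
    """Everything around the model. All fields are illustrative assumptions."""
    system_prompt: str
    # Tools are listed for completeness; run() below does not call them.
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    max_task_chars: int = 2000   # crude stand-in for a context policy
    pre_hooks: List[Hook] = field(default_factory=list)
    post_hooks: List[Hook] = field(default_factory=list)


@dataclass
class Agent:
    """A running agent: one model plus the harness that wraps it."""
    model: Model
    harness: Harness

    def run(self, task: str) -> str:
        for hook in self.harness.pre_hooks:
            task = hook(task)
        task = task[: self.harness.max_task_chars]   # apply the context policy
        prompt = f"{self.harness.system_prompt}\n\n{task}"
        output = self.model(prompt)
        for hook in self.harness.post_hooks:
            output = hook(output)
        return output


def stub_model(prompt: str) -> str:
    """Placeholder for a real model call; it only reports the prompt length."""
    return f"received {len(prompt)} chars"


# Same model, two harnesses: only the surrounding system changes.
bare = Agent(stub_model, Harness(system_prompt=""))
tuned = Agent(
    stub_model,
    Harness(system_prompt="You are a careful engineer.", pre_hooks=[str.strip]),
)
print(bare.run("  add a unit test  "))
print(tuned.run("  add a unit test  "))
```

The two agents share one model, yet they send it different prompts. That is the sense in which the harness, rather than the model, shapes much of the behavior a team observes.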
The whitepaper also separates casual “vibe coding” from what it calls agentic engineering. In that framing, vibe coding means loose prompting and minimal review, while agentic engineering uses specifications, automated tests, evals, CI gates and human architectural oversight. The paper argues that without both tests and evals, teams may still be relying on fragile AI output even when the prompt appears sophisticated.
The model is only 10%
A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system, and the scaffolding around it does the real work.
The paper is the clearest map yet of how serious AI development works, and it is mostly tool-agnostic. But it is also a Google funnel: the concepts are neutral, while the on-ramps point to Gemini, Jules and the ADK. If the harness is 90% and it is yours, your moat and your costs both live there. So own your scaffolding, route across models, and remember that AI amplifies whatever engineering culture it lands in.
Verification Becomes The Bottleneck
The paper matters because many software teams are already spending heavily on model access while treating verification, context management and workflow design as secondary work. Google’s authors argue that this order is backward: once AI can produce large amounts of code quickly, the cost shifts to checking whether that code is correct, maintainable and secure.
That has budget and operating implications. The source material describes casual AI coding as low upfront cost but high later cost, with possible debt from repeated fix loops, messy generated code and security remediation. By contrast, agentic engineering requires higher setup costs for specs, evals and tooling, but is presented as cheaper over time if it raises first-pass success and routes simpler work to lower-cost models.
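The cost trade-off described above can be sketched as simple arithmetic. The model below is an assumption, not something the paper specifies: it treats each attempt as independent with a fixed success probability, so the expected number of attempts is 1 / p. The function names and all numeric inputs are hypothetical placeholders, not figures from the whitepaper.

```python
import math


def expected_cost_per_change(cost_per_attempt: float, first_pass_rate: float) -> float:
    """Expected spend until one change is accepted.

    Assumes independent attempts with a constant success probability,
    so the expected number of attempts is 1 / first_pass_rate.
    """
    if not 0 < first_pass_rate <= 1:
        raise ValueError("first_pass_rate must be in (0, 1]")
    return cost_per_attempt / first_pass_rate


def break_even_changes(setup_cost: float, casual_cost: float, engineered_cost: float) -> float:
    """Number of changes after which the upfront setup cost is recovered."""
    saving = casual_cost - engineered_cost
    if saving <= 0:
        return math.inf   # the setup never pays for itself
    return setup_cost / saving


# Placeholder inputs in arbitrary cost units; replace with your own measurements.
casual = expected_cost_per_change(cost_per_attempt=1.0, first_pass_rate=0.25)
engineered = expected_cost_per_change(cost_per_attempt=1.0, first_pass_rate=0.5)
print(casual, engineered, break_even_changes(100.0, casual, engineered))
```

Under this toy model, a higher first-pass rate lowers the cost per accepted change, and the setup investment pays off only once enough changes have shipped. Routing simpler work to lower-cost models would reduce `cost_per_attempt` further.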

From Vibes To Agentic Work
The whitepaper builds on the term “vibe coding,” associated in the source material with Andrej Karpathy’s February 2025 description of accepting AI-generated code and feeding errors back until something works. The Google paper treats that style as one point on a spectrum rather than a catch-all label for every use of AI in programming.
At one end are prototypes, disposable scripts and low-risk experiments. At the other are production systems where AI-generated work passes through tests, evals, CI/CD gates and review. The source material says the paper’s strongest practical line is that the difference is not whether AI is used, but how the output is verified.
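The idea that output counts only after it is verified can be sketched as a small gate. This is an illustrative Python sketch, not the paper’s tooling: `run_gate`, `GateResult` and the two checks are hypothetical names, and the `exec` call is tolerable only because the inputs are hard-coded demo strings. Real agent output should run in a sandbox.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Check = Callable[[str], bool]   # takes candidate source code, returns pass/fail


@dataclass
class GateResult:
    accepted: bool
    failed: List[str]   # names of the checks that did not pass


def compiles(source: str) -> bool:
    """Cheapest check: does the candidate parse as Python at all?"""
    try:
        compile(source, "<candidate>", "exec")
        return True
    except SyntaxError:
        return False


def passes_unit_test(source: str) -> bool:
    """Behavioral check: the candidate must define add() and add(2, 3) must be 5."""
    namespace: dict = {}
    try:
        exec(source, namespace)   # demo only; sandbox untrusted code in practice
        return namespace["add"](2, 3) == 5
    except Exception:
        return False


def run_gate(candidate: str, checks: List[Tuple[str, Check]]) -> GateResult:
    """Accept the candidate only if every check passes."""
    failed = [name for name, check in checks if not check(candidate)]
    return GateResult(accepted=not failed, failed=failed)


CHECKS = [("compiles", compiles), ("unit test", passes_unit_test)]

good = "def add(a, b):\n    return a + b\n"
bad = "def add(a, b):\n    return a - b\n"   # parses, but the behavior is wrong

print(run_gate(good, CHECKS))
print(run_gate(bad, CHECKS))
```

Both candidates could come from the same prompt. Only the gate distinguishes the one that is safe to merge, which is the distinction the paper draws between casual and verified use.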
The paper also cites benchmark evidence to support its harness argument. According to the source material, one team moved an agent from outside the Top 30 to the Top 5 on Terminal Bench 2.0 by changing the harness while keeping the same model. It also cites a LangChain experiment that improved an agent score by 13.7 points through changes to prompts, tools and middleware.
“Generation is solved; verification, judgment, and direction are the new craft.”
— Osmani, Saboo and Kartakis, according to the Google whitepaper

Claims Still Need Proof
Several parts of the argument remain claims rather than settled facts. The paper’s 10% model and 90% harness split is described as a rough framing, not a universal measurement. It is also not clear from the source material how consistently that ratio applies across languages, codebases, teams or safety-critical software.
The reported adoption figures, benchmark references and cost claims should be read as source-attributed figures unless independently verified. The source material also notes that while the concepts are broadly tool-agnostic, Google’s recommended paths point toward its own Gemini, Jules and Agent Development Kit ecosystem.

Teams Test The Harness Thesis
The next step for software organizations is likely to be practical validation: measuring whether better specs, evals, tool access, context policies and CI gates improve agent performance more than switching models alone. Teams using AI coding agents will also need to decide which parts of their harness they own internally and which they accept from vendors.
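One way a team might run that validation is a two-by-two comparison: score two models under two harnesses on the same eval set, then compare the average gain from each switch. The sketch below is an assumption about how such a comparison could be computed, not a method from the paper. The function name, the labels and the scores are hypothetical placeholders.

```python
from statistics import mean
from typing import Dict, Tuple

# (model, harness) -> pass rate measured on the same eval set
Scores = Dict[Tuple[str, str], float]


def main_effects(
    scores: Scores,
    models: Tuple[str, str],
    harnesses: Tuple[str, str],
) -> Tuple[float, float]:
    """Return (model_effect, harness_effect) for a 2x2 comparison.

    Each effect is the average change in pass rate when switching from the
    first option to the second, averaged over the other factor.
    """
    base_model, new_model = models
    base_harness, new_harness = harnesses
    model_effect = mean(
        scores[(new_model, h)] - scores[(base_model, h)] for h in harnesses
    )
    harness_effect = mean(
        scores[(m, new_harness)] - scores[(m, base_harness)] for m in models
    )
    return model_effect, harness_effect


# Placeholder pass rates for illustration only; substitute measured values.
placeholder_scores: Scores = {
    ("model_a", "harness_v1"): 0.40,
    ("model_a", "harness_v2"): 0.60,
    ("model_b", "harness_v1"): 0.45,
    ("model_b", "harness_v2"): 0.70,
}

model_effect, harness_effect = main_effects(
    placeholder_scores, ("model_a", "model_b"), ("harness_v1", "harness_v2")
)
print(f"model switch: {model_effect:+.3f}, harness switch: {harness_effect:+.3f}")
```

If the harness effect is consistently larger than the model effect on a team’s own tasks, that would support the paper’s thesis for that team. If not, the 90% framing may not apply to that codebase.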
For readers, the immediate takeaway is that AI coding strategy is moving beyond model selection. The unresolved question is how quickly engineering organizations can build the verification systems needed to make AI-generated code reliable at scale.

Key Questions
What did Google’s whitepaper say about AI coding?
It argued that software development is shifting from manually writing code to expressing intent and directing AI systems, with verification and workflow design becoming central to quality.
What does “the model is only 10%” mean?
It means the paper treats the model as only one part of an agent. The rest is the surrounding harness: prompts, tools, context, tests, evals, sandboxes and observability.
Is vibe coding the same as AI-assisted engineering?
No. In the paper’s framing, vibe coding is the loose, high-risk end of the spectrum. Agentic engineering adds specifications, tests, evals, CI gates and human oversight.
Are the adoption numbers confirmed independently?
The figures cited here are attributed to the Google whitepaper and its source material. Independent confirmation is not included in the provided material.
Why does this matter for software teams?
If the paper’s argument holds, teams may get more value from improving verification and agent setup than from waiting for the next model release.
Source: Thorsten Meyer AI