Most California agencies are no longer asking whether to use AI. Employees are already pasting work into chatbots, and vendors are already embedding models into the products agencies buy. The real challenge has shifted: how do you take the experiments already running across your organization and move them into production, safely, defensibly, and at scale?
This guide lays out a practical playbook for doing exactly that, drawn from a recent conversation between Darwin AI's Chief AI Officer, Dustin Haisler, and Linxar Senior Managing Director, Bharat Bagaria. Watch the full webinar replay here.
California moved first. Executive Order N-12-23 staked out a position on safe, responsible public-sector AI, directing procurement reforms and introducing risk inventories and ethical guardrails. The statewide GenAI policy and the "Choose Your Own GenAI Journey" framework followed, and the legislative pipeline hasn't slowed since: roughly 30 AI-related bills have crossed into a second chamber, reaching from state operations down to local government. Add a federal government signaling it wants a say in which models even reach the public, and agencies face a patchwork: overlapping rules, moving deadlines, and genuine uncertainty about whose law ultimately governs.
It's tempting to treat that uncertainty as a reason to wait for the courts. It isn't. The rules you already operate under (acceptable use, data security, open records) apply to AI right now, which means inaction carries its own liability. The agencies pulling ahead aren't the ones with the cleverest new policy. They're the ones building on the foundation they already have, in a way that can flex as the rules change.
A pilot that works in a demo and a system that works in production are two very different things. Three gaps account for most of the failures in between.
For a lot of teams, "governance" means a document everyone signs and no one reads. Operational governance is different, and it starts with something concrete: an honest, real-time picture of where AI actually lives in your environment. That means not what a survey claims, but what employees and vendors are genuinely doing. As Dustin put it, "You can't govern what you can't see."
From there, the goal is guardrails that work at the level of everyday behavior. His analogy is a useful test: deploying AI today is "like giving every government employee a McLaren F1, and putting them on an open road with no speed limit signs." Good governance doesn't confiscate the car; it sets boundaries clear enough that no one has to look them up in a manual.
In practice, that means intercepting risk in the moment rather than after the fact. Picture an employee pasting a constituent's email into a free chatbot to draft a faster reply: effective tooling redacts the PII before it leaves, flags the unsanctioned tool for IT, and coaches the employee without shaming them or shoving the behavior onto a personal device, which is exactly where over-restriction sends it. And when you do roll out policy, lead with the why; a rule that arrives with no context reads like punishment and kills adoption.
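To make the interception idea concrete, here is a minimal Python sketch of the three moves described above: redact, flag, coach. Everything in it is an assumption for illustration: the `intercept` function, the `SANCTIONED_TOOLS` allowlist, and the three regex patterns are hypothetical, and real tooling would need far broader PII coverage than this.

```python
import re

# Hypothetical patterns for illustration only. Real PII detection has to
# cover names, addresses, case numbers, and much more than three regexes.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
}

# Hypothetical allowlist of tools the agency has approved.
SANCTIONED_TOOLS = {"agency-approved-assistant"}


def intercept(prompt: str, tool: str) -> dict:
    """Redact known PII patterns, flag unsanctioned tools, and build a coaching note."""
    redacted = prompt
    pii_found = []
    for label, pattern in PII_PATTERNS.items():
        # subn returns the new string plus the number of replacements made.
        redacted, count = pattern.subn(f"[{label} REDACTED]", redacted)
        if count:
            pii_found.append(label)

    flag_for_it = tool not in SANCTIONED_TOOLS

    # Coach rather than block: explain what happened and why.
    notes = []
    if pii_found:
        notes.append("We removed personal details before this left the agency.")
    if flag_for_it:
        notes.append("This tool is not yet approved; an approved option is available.")

    return {
        "redacted_prompt": redacted,
        "pii_found": pii_found,
        "flag_for_it": flag_for_it,
        "coaching_note": " ".join(notes),
    }


result = intercept(
    "Reply to jane.doe@example.com, phone 916-555-0100, SSN 123-45-6789.",
    "free-chatbot",
)
print(result["redacted_prompt"])
print(result["coaching_note"])
```

The design choice worth noticing is that the function never refuses the request. It returns a cleaned prompt and an explanation, which is the "coach without shaming" behavior the paragraph describes.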
There's no single right operating model. Agencies succeed with embedded governance, a center of excellence, a dedicated chief AI officer, or, increasingly at the state level, a hybrid of all three. The right structure is adaptive, shaped to your culture rather than stamped from a template.
Social-services eligibility is a useful example because it maps cleanly onto the playbook. Today a caseworker hops across several aging systems to verify citizenship, income, and household details, run fraud checks, and reach a decision. That work can stretch to days per case.
The path to production follows a repeatable arc:
The payoff in one comparable deployment: average processing dropped from about four hours to three, roughly 20% saved per case, which across 200,000 cases produced a three-to-four-times return. Two principles make the difference. Don't pursue AI for its own sake; anchor every use case to a real problem you can measure before and after. And give each effort one clear owner: govern by committee and too many hands turn into groupthink, and nothing ships.
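The before-and-after measurement can be sketched as simple arithmetic. The Python below is a rough illustration, not the model used in that deployment. The hours and case count come from the figures above; `loaded_hourly_cost` and `program_cost` are hypothetical placeholders you would replace with your own numbers. Note that the rounded times work out to a 25% reduction, a bit above the roughly 20% quoted, which is expected when the inputs are themselves approximations.

```python
def measure_use_case(hours_before, hours_after, cases,
                     loaded_hourly_cost, program_cost):
    """Return fraction of time saved, total hours saved, and return multiple."""
    fraction_saved = (hours_before - hours_after) / hours_before
    hours_saved = (hours_before - hours_after) * cases
    dollars_saved = hours_saved * loaded_hourly_cost
    return_multiple = dollars_saved / program_cost
    return fraction_saved, hours_saved, return_multiple


fraction_saved, hours_saved, return_multiple = measure_use_case(
    hours_before=4,             # from the article (approximate)
    hours_after=3,              # from the article (approximate)
    cases=200_000,              # from the article
    loaded_hourly_cost=50.0,    # hypothetical placeholder
    program_cost=3_000_000.0,   # hypothetical placeholder
)
print(f"Time saved per case: {fraction_saved:.0%}")
print(f"Staff hours saved: {hours_saved:,}")
print(f"Return multiple: {return_multiple:.1f}x")
```

The point of writing it down this way is that every input is something you can measure before the pilot starts and again after it ships.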
The most common budgeting mistake is assuming the software is the cost. It isn't; licensing typically runs just 10 to 15% of the total. The real money goes to data cleanup and migration (30–40%, and some California efforts spent two years getting data ready), plus the transformation, governance, and IV&V scaffolding (another 20–25%). Then there are the line items teams routinely overlook: change management, which has to begin before the first proof of concept, and ongoing model monitoring, which is heavier than anything in a traditional application stack. Budget only for the visible 20% and you'll be writing a budget change proposal before the project ships.
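One way to pressure-test a budget is to work backward from the licensing quote. The Python sketch below uses only the percentage ranges cited above; the function names and the sample quote of 150,000 are hypothetical, and the output is a rough planning range rather than an estimate for any real program.

```python
# Share of total program cost by category, as (low, high) fractions,
# taken from the ranges cited in the article.
SHARES = {
    "licensing": (0.10, 0.15),
    "data cleanup and migration": (0.30, 0.40),
    "transformation, governance, IV&V": (0.20, 0.25),
}


def implied_total(licensing_quote):
    """If licensing is 10-15% of the total, return the (low, high) implied total."""
    low_share, high_share = SHARES["licensing"]
    # A larger licensing share implies a smaller total, and vice versa.
    return licensing_quote / high_share, licensing_quote / low_share


def category_ranges(total):
    """Dollar range for each named category at a given total program cost."""
    return {name: (total * low, total * high)
            for name, (low, high) in SHARES.items()}


quote = 150_000  # hypothetical licensing quote
total_low, total_high = implied_total(quote)
print(f"Implied total program cost: {total_low:,.0f} to {total_high:,.0f}")

for name, (low, high) in category_ranges(total_low).items():
    print(f"{name}: {low:,.0f} to {high:,.0f}")
```

Change management and model monitoring are deliberately absent from `SHARES` because the article gives no percentages for them; they still need their own lines in a real budget.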
A small agency without a dedicated AI team is often better positioned than a large one, because it can move. You don't need a chief AI officer; you need a clear owner, a defined process, and a few guardrails. Get resourceful: partner with a local college looking for real-world test cases, or band together with a neighboring city or county to share the load.
Wherever you sit, the first moves are the same: see the AI already in your environment, apply the rules you already have, and bring your people along with honest change management. Inventory what's happening, then pick one use case and run it well.
Darwin AI and Linxar are partnering to help California agencies turn governance from a PDF into operational reality. To go deeper, watch the full webinar replay, then reach out for the follow-up resources from the session: a use case prioritization scorecard, a cost and ROI model, and a first 100-day plan.