What Spark actually is (and is not)
Google announced Gemini Spark at I/O 2026 on May 19 and made it available the following week to Google AI Ultra subscribers at the newly reduced $100/mo tier (Ultra came down from its prior pricing to bring it into the same range as competing premium AI subscriptions). The shape of the product is worth getting precisely right, because the headline, "personal AI agent," undersells the structural shift.
Spark is not the Gemini app with a longer leash. It is a separate runtime that:
- Runs continuously on dedicated Google Cloud virtual machines — the agent's process persists when the user closes the tab, signs out of Workspace, or shuts the laptop.
- Has its own dedicated email address inside Workspace, so it can receive, parse, and reply to mail without piggybacking on the user's identity for every send.
- Executes tasks on schedules and triggers: "every weekday at 7 AM, assemble a briefing from inbox plus calendar," or "every time a message from a top-tier customer hits this inbox, draft a reply against the playbook and queue it for review."
- Connects to Gmail, Docs, Sheets, Slides, and the rest of Workspace out of the box, and to third-party services via Model Context Protocol (MCP). Gemini Enterprise's existing connector library covers Microsoft SharePoint, OneDrive, ServiceNow, and a growing list, with broader MCP integrations rolling out over the coming months.
The runtime is built from a Gemini base model wired into the agent harness Google originally developed for Antigravity. That's the architectural through-line: the harness that Google built for developers running multi-step agentic work in an IDE has been productized for a consumer running multi-step agentic work in their digital life. The harness is the same shape. The user is different. The integration surface is enormous.
The always-on agent is a different kind of caller
For most product teams whose mental model of an AI integration was formed in 2024, the implicit assumption was: the AI is something the user invokes. The user opens a tab, asks a thing, the AI responds, the session ends. Even the agentic IDE work — Cursor, Claude Code, Windsurf, Antigravity — fit this shape: the agent runs while the developer is watching, completes a task, the session unwinds.
Spark breaks the assumption in three ways at once, and each break has consequences for any SaaS product whose API the agent might touch.
It runs asynchronously. A Spark task can fire at 3 AM, in response to an inbound email, on a schedule, or when a watched condition is met. There is no user in the loop at the moment of the API call. The audit log entry says the agent acted on the user's behalf. The user was asleep.
It has a stable identity. The dedicated email address is the visible part. The deeper part is that the agent has long-lived credentials, persistent state, and standing authorizations against the integrated services. From the SaaS provider's perspective, it is a service account that happens to belong to a single end-user. The threat model and access-control surface that apply are closer to those of a third-party API integration than to those of a user session.
It composes across products. A single Spark task reads from Gmail, queries a customer record in a CRM via MCP, drafts a reply in Docs, and sends through a scheduling tool, all in one chain, on behalf of one user, with one set of authorizations. Whichever of those products is weakest on auth, audit, or rate-limiting is the one that absorbs the consequences of the agent activity flowing in from every other product in the chain.
Each of those properties is fine in isolation. In combination they describe a class of caller that most SaaS auth surfaces, audit pipelines, and product-analytics dashboards were not designed for.
Four decisions every SaaS team should write down this quarter
If you ship a SaaS product whose customers are likely to deploy Gemini Spark — and given Workspace's distribution, that is most B2B SaaS products — there are four decisions worth making explicitly before agent traffic from Spark hits volume.
The MCP surface you expose. Spark connects to third-party services through MCP. Your product is either going to ship an MCP server that exposes a deliberate subset of your API, or someone else (a community contributor, a customer's internal tools team, an enterprise integrator) is going to ship one without your involvement. If the first widely used surface that controls how Spark interacts with your product is built by somebody else, you inherit a generation of agent behavior shaped by other people's decisions about which endpoints matter. Ship your own MCP server, scoped deliberately, before you find that out the hard way.
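One way to make "a deliberate subset" concrete is to treat the agent-facing surface as an explicit allow-list rather than a mirror of the full API. The sketch below is plain Python and does not use any real MCP SDK; the operation names, `ToolSpec`, and `build_agent_surface` are hypothetical and stand in for whatever your own API and server framework look like.

```python
from dataclasses import dataclass

# Hypothetical internal API operations; the names are illustrative only.
FULL_API = {
    "tickets.list": {"writes": False, "admin": False},
    "tickets.get": {"writes": False, "admin": False},
    "tickets.reply": {"writes": True, "admin": False},
    "tickets.delete": {"writes": True, "admin": False},
    "org.rotate_keys": {"writes": True, "admin": True},
}

# The deliberate subset an agent may see. Anything not listed is not exposed.
AGENT_ALLOW_LIST = ("tickets.list", "tickets.get", "tickets.reply")


@dataclass(frozen=True)
class ToolSpec:
    name: str
    writes: bool
    needs_confirmation: bool


def build_agent_surface(api, allow_list):
    """Return the tool specs exposed to agents, refusing admin operations."""
    surface = []
    for name in allow_list:
        op = api[name]  # a KeyError here means the allow-list drifted from the API
        if op["admin"]:
            raise ValueError(f"admin operation {name!r} must not be agent-exposed")
        surface.append(
            ToolSpec(name=name, writes=op["writes"], needs_confirmation=op["writes"])
        )
    return surface


surface = build_agent_surface(FULL_API, AGENT_ALLOW_LIST)
exposed = [tool.name for tool in surface]
```

The design choice worth noting is that the allow-list is the source of truth: a new API endpoint stays invisible to agents until someone adds it on purpose.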
The scopes you grant. A Spark agent connected to your product can be granted read-only, read-write, or full administrative scope. Most products today have OAuth flows that default to "all the things" because the implicit user was a human who would notice if the integration misbehaved. Spark is not that user. Default scopes for agent integrations should be conservative (read everything, write only with confirmation, never administrative), and the product should make it loud and visible when an agent is asking for more.
The audit identity you log. When Spark calls your API on behalf of user@customer.com, the audit log entry has to record both facts: the user who authorized the agent, and that the action was taken by the agent, asynchronously, under a specific Spark task ID. The legacy log entry that just says "user@customer.com performed action X at 3:02 AM" is the entry that becomes useless in an incident review six months from now. Every action attributable to a Spark task should carry a structured marker (actor type, agent ID, task ID, source app) that survives the audit pipeline intact.
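The conservative default described above (read freely, write with confirmation, never administrative) can be sketched as a small policy function. The `resource:access` scope format, the `Decision` enum, and both function names are assumptions for illustration, not part of any OAuth library or of Spark itself.

```python
from enum import Enum


class Decision(Enum):
    GRANT = "grant"
    GRANT_WITH_CONFIRMATION = "grant_with_confirmation"
    DENY = "deny"


def decide_agent_scope(scope: str) -> Decision:
    """Map a requested scope such as 'tickets:read' to a conservative default."""
    _resource, _, access = scope.partition(":")
    if access == "read":
        return Decision.GRANT
    if access == "write":
        return Decision.GRANT_WITH_CONFIRMATION
    # Admin scopes and anything unrecognized are denied by default.
    return Decision.DENY


def review_request(scopes):
    """Return per-scope decisions plus the scopes the consent screen should flag."""
    decisions = {scope: decide_agent_scope(scope) for scope in scopes}
    flagged = [s for s, d in decisions.items() if d is not Decision.GRANT]
    return decisions, flagged


decisions, flagged = review_request(["tickets:read", "tickets:write", "org:admin"])
```

The `flagged` list is what the consent screen would surface prominently; denying unknown scopes by default keeps a typo or a newly invented scope from silently widening access.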
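A minimal sketch of such a structured marker follows, assuming the agent's ID and task ID reach your API in some form (how Spark actually conveys them is not specified here). The `AuditEvent` class, its field names, and the identifier formats are hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditEvent:
    action: str
    authorized_by: str         # the human who granted the agent access
    actor_type: str            # "human" or "agent"
    agent_id: Optional[str]    # None when a human acted directly
    task_id: Optional[str]     # the agent task behind the call, if any
    source_app: Optional[str]
    occurred_at: str           # ISO 8601 timestamp in UTC

    def __post_init__(self):
        if self.actor_type not in ("human", "agent"):
            raise ValueError(f"unknown actor_type: {self.actor_type!r}")
        if self.actor_type == "agent" and not (self.agent_id and self.task_id):
            raise ValueError("agent events must carry agent_id and task_id")

    def to_json(self) -> str:
        """Serialize with stable key order so downstream parsers stay simple."""
        return json.dumps(asdict(self), sort_keys=True)


event = AuditEvent(
    action="ticket.reply.send",
    authorized_by="user@customer.com",
    actor_type="agent",
    agent_id="agent-123",      # hypothetical identifier format
    task_id="task-456",        # hypothetical identifier format
    source_app="spark",
    occurred_at=datetime(2026, 6, 1, 3, 2, tzinfo=timezone.utc).isoformat(),
)
line = event.to_json()
```

Rejecting agent events that lack an agent ID or task ID at construction time is the point: the gap is caught when the entry is written, not during the incident review.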
Per-agent rate limits, not per-user. A human user's API consumption is bursty around moments of activity and otherwise quiet. A Spark agent's consumption is continuous and proportional to the standing instructions it has been given. Your existing per-user rate limits were calibrated for the former pattern and will misbehave for the latter — either too tight (Spark hits the cap and breaks the user's workflow at 3 AM), or too loose (a misconfigured Spark task pulls down your read pipeline trying to process a 50,000-row sheet). Per-agent rate limits, distinct from per-user limits, are the right primitive.
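As an illustration of the primitive, here is a token-bucket limiter keyed by the (user, agent) pair instead of by user alone. Token bucket is one common choice, not the only one; the class names, capacity, and refill rate are assumptions, and a production version would need shared storage and locking.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    capacity: float
    refill_per_sec: float
    tokens: float
    updated_at: float

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then spend tokens if enough remain."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class AgentRateLimiter:
    """Keeps one bucket per (user, agent) pair, separate from any per-user limit."""

    def __init__(self, capacity: float, refill_per_sec: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._buckets = {}

    def allow(self, user_id: str, agent_id: str, now: float) -> bool:
        key = (user_id, agent_id)
        bucket = self._buckets.get(key)
        if bucket is None:
            # New agents start with a full bucket.
            bucket = TokenBucket(self.capacity, self.refill_per_sec, self.capacity, now)
            self._buckets[key] = bucket
        return bucket.allow(now)


limiter = AgentRateLimiter(capacity=3, refill_per_sec=1.0)
burst = [limiter.allow("user@customer.com", "agent-123", now=0.0) for _ in range(4)]
other_agent = limiter.allow("user@customer.com", "agent-999", now=0.0)
after_wait = limiter.allow("user@customer.com", "agent-123", now=2.0)
```

In this run the fourth call in the burst is refused, a second agent belonging to the same user is unaffected, and the first agent is allowed again after two seconds of refill. Passing `now` explicitly keeps the limiter deterministic and easy to test.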
None of these decisions are exotic. All of them are decisions teams have made before — for first-party mobile apps, for Zapier-style integrations, for partner APIs. The thing that's new is that they need to apply to the agent-on-behalf-of-user pattern at consumer scale, and most product teams have not done that work because no consumer agent at this scale existed until now.
The "who is the user?" question is changing shape
The quieter implication of Spark is that the question your product's identity layer was answering for the last decade — who is doing this action? — is gaining a second axis. The user is still there, on the authorization side. But the actor on the action side is now sometimes the user, sometimes the agent acting on the user's behalf, and increasingly often a cascade of agents (Spark calls a CRM, which calls an analytics tool, which calls a notification service).
Three implications worth writing down before the product debt becomes deployment-blocking.
The legal frame around accountability for AI-mediated actions is going to follow the audit log. If your audit log can't say which agent did a thing, and which standing instruction or prompt caused it, your customers' compliance teams are going to push you to add that capability under pressure when an incident reveals the gap. Building the structured audit identity now is much cheaper than retrofitting it during an incident review.
Customer support is about to start fielding questions like "why did Spark schedule this meeting?" or "why did it send this email?" These are questions where the answer requires reconstructing the agent's reasoning trace, not just the API call log. Products that make it easy for a user to see which Spark instruction led to a given action in the product will deflect a class of support tickets that would otherwise sit in the queue.
Pricing and packaging may need adjustment. A SaaS seat that was priced for one human's activity may need a different price band for "human plus always-on agent," because the API and resource consumption profile is materially different. Most products will resist this conversation for a year and then find themselves repricing on a customer-by-customer basis. Doing the modeling now is worth it.
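A rough sketch of the "which instruction led to this?" lookup, under one large assumption: that the agent's calls arrive with a task ID and some reference to the instruction behind them. Whether Spark provides that is not established here, and `Provenance`, `ProvenanceIndex`, and the sample identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provenance:
    task_id: str
    instruction: str  # standing instruction text, as supplied with the call (assumed)


class ProvenanceIndex:
    """Maps a resource in your product to the agent tasks that touched it."""

    def __init__(self):
        self._by_resource = {}

    def record(self, resource_id: str, provenance: Provenance) -> None:
        self._by_resource.setdefault(resource_id, []).append(provenance)

    def explain(self, resource_id: str) -> str:
        history = self._by_resource.get(resource_id)
        if not history:
            return "No agent activity recorded for this item."
        latest = history[-1]
        return (
            f"Created by agent task {latest.task_id} "
            f"following the instruction: {latest.instruction!r}"
        )


index = ProvenanceIndex()
index.record(
    "meeting-789",
    Provenance(task_id="task-456", instruction="Schedule follow-ups for new leads"),
)
explanation = index.explain("meeting-789")
```

Surfacing `explanation` next to the meeting or email in the product UI is what turns a support ticket into a self-serve answer.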
What Spark does not solve
Three honest limits worth naming.
It does not solve accountability. When Spark sends an email at 3 AM and the recipient pushes back on the contents, the question of who authorized that exact message still has to be answerable, and the answer still depends on your product's design: your routing rules, your confirmation gates, and your policy for which actions are autonomous and which are escalated. Spark hands you a faster way to act; it does not, and cannot, hand you that policy layer.
It does not solve quality drift. A Spark task running unattended for ninety days, against a slowly shifting input distribution (new customer types, new email patterns, new internal vocabulary), will quietly degrade in ways the user won't notice until an embarrassing reply ships. The same eval discipline that applies to agentic IDE work applies here, but the discipline is harder to enforce on a consumer product where each user's agent is configured idiosyncratically.
It does not solve cross-product trust. Spark composes across products by trusting each one's MCP server. A misbehaving or compromised MCP server in the chain is a place where an agent might be tricked into actions the user didn't intend. The defense is real adversarial review of the prompt and tool-call surface of every server the agent is allowed to talk to — exactly the kind of review most SaaS teams don't do today on their own integration surface.
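One small, concrete piece of that review is validating tool-call arguments strictly and keeping a list of hostile inputs as regression cases. The sketch below covers only argument validation, not prompt-level attacks; the tool, its fields, the length limit, and the adversarial cases are all hypothetical.

```python
ALLOWED_FIELDS = {"ticket_id", "body"}
MAX_BODY_CHARS = 5000


def validate_reply_call(args: dict) -> list:
    """Return a list of problems; an empty list means the call is acceptable."""
    problems = []
    unknown = set(args) - ALLOWED_FIELDS
    if unknown:
        problems.append(f"unexpected fields: {sorted(unknown)}")
    ticket_id = args.get("ticket_id")
    if not (isinstance(ticket_id, str) and ticket_id.isascii() and ticket_id.isdigit()):
        problems.append("ticket_id must be a numeric string")
    body = args.get("body")
    if not isinstance(body, str) or not body.strip():
        problems.append("body must be a non-empty string")
    elif len(body) > MAX_BODY_CHARS:
        problems.append("body exceeds the length limit")
    return problems


# Inputs a manipulated agent might send; each one should be rejected.
ADVERSARIAL_CASES = [
    {"ticket_id": "42", "body": "hi", "bcc": "attacker@example.com"},  # smuggled field
    {"ticket_id": "42; DROP TABLE tickets", "body": "hi"},             # injected ID
    {"ticket_id": "42", "body": "x" * (MAX_BODY_CHARS + 1)},           # oversized body
]

rejected = [bool(validate_reply_call(case)) for case in ADVERSARIAL_CASES]
accepted = validate_reply_call({"ticket_id": "42", "body": "Thanks, looking into it."})
```

Growing `ADVERSARIAL_CASES` every time a review finds a new trick gives the tool surface a regression suite of its own.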
Where Sonnet Code fits
The always-on personal agent shipping at consumer scale is the easy half of the story. The hard half is the engineering above the integration surface — the MCP server design, the scoped authorization model, the audit identity, the rate-limiting and quota primitives, the adversarial review of the tool surface — that decides whether the Spark traffic hitting your product is a leverage point or a liability surface. AI development at Sonnet Code is that engineering: designing and shipping the MCP server that exposes your product safely to Spark and every agent like it, wiring scoped permissions and structured audit identity into your existing auth stack, and building the per-agent rate-limiting and observability that makes the new traffic class visible. AI training is the human-judgment half: senior security engineers and domain experts who design the failure-mode rubrics for agent-driven workflows in your product surface, run the adversarial review on the prompt and tool-call paths an attacker is most likely to find, and stand up the senior-reviewer queue for the actions that warrant human escalation.
Always-on personal agents just became a real product category. Every B2B SaaS surface you own is now an agent surface. The question is no longer "will agents show up?" They will, this quarter, in volume. The question is whether your MCP surface, your auth model, your audit log, and your eval discipline are ready to make that traffic safe and useful by default. That's the layer worth building deliberately, starting now.

