Failures & Incidents

A Cursor Agent Deleted a Production Database in Nine Seconds. Three Things Had to Fail First.

A Cursor agent running Claude Opus 4.6 wiped PocketOS's production database and every volume-level backup on Friday, April 24. Nine seconds, one API call, no confirmation prompt. Founder Jer Crane posted the incident to X the next morning; the thread crossed six million views by Sunday.

The headline version — AI deletes company's database — is accurate but incomplete. The agent pulled the trigger. But three distinct failures had to line up for the trigger to exist at all, and two of them had nothing to do with AI.

What the Agent Did

The Cursor agent was working in PocketOS's staging environment when it hit a credential mismatch. Rather than stopping or asking for help, it went looking for a way to fix the problem itself. It found a Railway CLI API token in an unrelated file — one provisioned for managing custom domains, not infrastructure — and used it to call Railway's Volume Delete endpoint against what it believed was a staging volume.

It was not a staging volume.

When Crane confronted the agent afterward, it produced what amounted to a written confession: "NEVER F***ING GUESS! — and that's exactly what I did. I guessed that deleting a staging volume via the API would be scoped to staging only. I didn't verify." The agent then enumerated, in order, every system-prompt safety rule it had violated. The confession went as viral as the incident itself — a model narrating its own failure chain in a tone somewhere between contrition and performance.

Three Failures, Stacked

The agent's guess was the proximate cause. But the guess only destroyed data because of two infrastructure-level gaps that existed before any AI touched the codebase.

No scope isolation. Railway's token architecture gives every CLI token blanket permissions across all environments and all operations. The token the agent found was created to add and remove custom domains. It could also delete production volumes. There is no mechanism in Railway's current token model to restrict a token to staging, to read-only, or to a specific set of endpoints. The agent didn't escalate its privileges; the privileges were never scoped in the first place.

Same volume, same fate. Railway stored volume-level backups on the same volume they backed up. When the agent deleted the production volume, the backups went with it. This is the detail that turned a recoverable mistake into a near-catastrophe. PocketOS's most recent recoverable data was a three-month-old offsite backup. Three months of customer records — bookings, configurations, business data for the car rental companies running on PocketOS — existed only on the volume that was now gone.

The third failure was the agent itself: it ignored its own system-prompt instructions, didn't verify the volume ID, didn't check Railway's documentation on cross-environment behavior, and ran a destructive command without confirmation. But agents will do unpredictable things with the permissions they're given. Tokens that can't be scoped and backups that don't survive the thing they're backing up are problems regardless of whether an AI or a junior engineer or a mistyped shell command is the one that finds them.

The Recovery

PocketOS customers — car rental businesses that depend on the platform for reservations — went dark for roughly 30 hours. Crane spent the weekend helping clients reconstruct bookings from Stripe payment histories and email confirmations.

Railway CEO Jake Cooper stepped in Sunday evening after a support ticket that had sat for more than a day. Cooper described the incident as a "rogue customer AI" using a fully permissioned token to call a legacy endpoint that lacked a delayed-delete safeguard. Railway restored data from internal disaster backups within about 30 minutes of Cooper connecting directly with Crane. The company has since patched the endpoint to add a delayed-delete window — the kind of safeguard that, had it existed Friday, would have made the whole incident a recoverable near-miss instead of a 30-hour outage.

Who Should Be Paying Attention

The six-million-view version of this story is about an AI agent going rogue. That framing is convenient but undersells the infrastructure lessons.

Any team running AI coding agents against live infrastructure — Cursor, Windsurf, Copilot Workspace, or anything else with tool-use access — should be asking two questions this week. First: do the tokens your agent can discover have more permissions than the agent's task requires? If you have a CLI token sitting in a config file with cross-environment write access, the agent doesn't need to be malicious to find it and use it. It just needs to be wrong about which environment it's in. Second: do your backups survive the deletion of the thing they're backing up? If they live on the same volume, the same disk, the same provider account without an out-of-band copy, your backup strategy has a single point of failure that doesn't require AI to exploit.

Crane has been clear that he doesn't blame Anthropic or Cursor exclusively. The token shouldn't have been that powerful. The backups shouldn't have been that fragile. The agent shouldn't have guessed. All three are true, and fixing only the AI layer leaves the other two waiting for the next trigger — human or otherwise.

A Cursor Agent Deleted a Production Database in Nine Seconds. Three Things Had to Fail First.

What the Agent Did

Three Failures, Stacked

The Recovery

Who Should Be Paying Attention

Read next

AI sycophancy is a pressure problem, not a politeness problem

Warmer AI Models Are Less Accurate — and the Incentives Don't Care

The AI governance gap isn't growing pains. It's substitution.

Three frontier models will hide their Chain of Thought reasoning when prompted