bug bounty
OPENAI'S FORMAL HARM-REPORTING PROGRAM, LAUNCHED THIS WEEK
OpenAI launched its Safety Bug Bounty program on Wednesday — a formal channel for researchers to report harms with payouts attached. Sora safety guidelines updated the same week. The Foundation structure was formalised. Three governance posts in three days is not a coincidence; the EU AI Act's Article 73 reporting requirements come into force in August.
A provider building harm-reporting infrastructure that regulators will soon mandate anyway
On Wednesday, OpenAI launched its Safety Bug Bounty — a formal channel for outside researchers to report harms or misuse vectors, with payouts attached. On Monday, the company updated its Sora safety guidelines. On Tuesday, it published an update on the OpenAI Foundation structure. Three governance posts in three days is not a coincidence. Article 73 of the EU AI Act — mandatory serious-incident reporting — comes into force on 2 August. The infrastructure has to exist before then. OpenAI is building it ahead of the mandate.
The Safety Bug Bounty in particular is worth studying. It formalises something that until now existed only as informal Twitter threads and emails: a way for outside researchers to report failure modes to the provider, with payouts that reflect severity. Read the eligibility section carefully; it lists exactly which categories of harm the provider considers in scope and out of scope. The out-of-scope list is more revealing than the in-scope one. Sycophancy is not listed. Validation drift is not listed. Capability redirection on sensitive topics is not listed. The categories the provider will pay you to find are the ones the provider can fix; the categories that are missing are the ones the provider does not yet have a fix for.
The Sora safety update is the other piece of the puzzle. It sits on top of two months of provider concern about synthetic media misuse and lands the same week the Iranian-protester case (Trump's use of apparently AI-generated imagery to cast doubt on real human-rights footage) was still in the news cycle. The provider response is a tightening of the rails. The provider response is also, structurally, an admission of what the earlier rails were not catching.
Underneath it, the Foundation update is the boring infrastructure piece: a corporate-governance post that nobody will read in full and that matters for exactly the reason corporate-governance posts matter. The Foundation is the entity that will hold the equity if any of the structural reforms OpenAI announced last year actually land. Bookmark the post; you will be referred back to it in a future filing.
A Safety Bug Bounty lists exactly which categories of harm the provider will pay you to find. The categories it does not list are the ones the provider does not yet have a fix for.
Founder's note — Three governance posts in three days from one provider is a tell. The Article 73 clock is the explanation. Build your own equivalents; they'll be regulatory-table-stakes by Q4.
◆The Notebook
A real bug-bounty program for AI harms, with severity-tiered payouts and a defined eligibility list. Read the out-of-scope categories carefully — they are the most informative part of the launch.
via OpenAI blog
Tighter rails on Sora, the week the synthetic-protester storyline was still warm. Worth reading for what the new rails will and won't catch — the gap is the part that still needs editorial discretion.
via OpenAI blog
A short corporate-governance update that matters because the Foundation is the entity that will hold equity if the structural reforms announced last year actually land. Bookmark it.
via OpenAI blog
◆Worth Your Time
OpenAI
Read the out-of-scope list before the in-scope list.
OpenAI
Worth reading for what the new safeguards will and will not catch.
OpenAI
Dry corporate-governance read. Important for the same reason dry corporate-governance reads always are.
EU AI Act
The mandate that explains the provider cadence above.
AIID (AI Incident Database)
The most useful free public registry. If you operate any consumer-facing AI product, skim once a fortnight.
The Probe · Test Yourself
You launch a bug bounty for AI harms in your own product. Which signal is most reliably correlated with a healthy program — one that actually surfaces failure modes you didn't already know about?
A. High volume of submissions, mostly low-severity
B. A steady stream of submissions across multiple severity tiers
C. A small number of submissions, all high-severity
D. Most submissions filed by your own team
Answer: B — A steady stream of submissions across multiple severity tiers
A indicates a program that's found by spammers and rewards triviality. C indicates a program nobody outside trusts. D indicates a program that hasn't reached external researchers. B — submissions distributed across severity tiers, from external reporters — is the signature of a program that's functioning as an external sensor.
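If you want to run that check against your own submission log, here is a minimal Python sketch. Everything in it is an assumption for illustration rather than anything taken from OpenAI's program: the three severity tiers, the `Submission` record, the `classify_program` helper, and the 80% and 50% cut-offs are all hypothetical. It also ignores volume and timing, so it cannot tell a "steady stream" from a single burst.

```python
from collections import Counter
from dataclasses import dataclass

# Assumed severity tiers; swap in whatever your own program uses.
SEVERITIES = ("low", "medium", "high")


@dataclass(frozen=True)
class Submission:
    severity: str   # expected to be one of SEVERITIES
    external: bool  # True if filed by someone outside your team


def classify_program(submissions, dominant_share=0.8):
    """Map a batch of submissions onto the quiz's four patterns (A to D).

    The thresholds are illustrative guesses, not established benchmarks.
    """
    if not submissions:
        return "no data"

    total = len(submissions)
    internal = sum(1 for s in submissions if not s.external)
    if internal / total > 0.5:
        return "D: mostly internal"

    # From here on, only external reports count as sensor signal.
    counts = Counter(s.severity for s in submissions if s.external)
    ext_total = sum(counts.values())

    if counts["low"] / ext_total >= dominant_share:
        return "A: mostly low-severity"
    if counts["high"] == ext_total:
        return "C: only high-severity"

    tiers_seen = sum(1 for tier in SEVERITIES if counts[tier] > 0)
    if tiers_seen >= 2:
        return "B: spread across tiers"
    return "unclear"


if __name__ == "__main__":
    sample = [
        Submission("low", True),
        Submission("medium", True),
        Submission("high", True),
        Submission("low", False),
    ]
    print(classify_program(sample))
```

Treat the output as a prompt for a closer look, not a verdict. A batch that lands in B can still be unhealthy if every report comes from the same two researchers.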
Reply and tell me what you've noticed. If you've run an AI safety bounty internally, send me the structure that worked. I'm collecting patterns for an Article 73 readiness brief.
Free where it can be. Honest where it has to be.