I Missed Another Release
At some point in the last two years, keeping up with what's shipping became a part-time job. A new model drops, you spend an afternoon reading about it, and by the time you look up, two more have been announced. This isn't a complaint. It's documentation.
What shipped
Every benchmark from the prior era became obsolete overnight.
Anthropic enters the race. Constitutional AI framing introduced.
100K context window. Long documents become tractable.
Vision comes to GPT-4. Images in, text out.
OpenAI lets anyone build a chatbot. App store energy.
Google's unified model family. Claimed GPT-4 parity on benchmarks.
OpenAI video generation. Technically impressive, not yet widely accessible.
One million token context window. Long-context reasoning at scale.
Free Gemini API access for prototyping and building. I use this; a minimal sketch of a call follows the list.
Opus beats GPT-4 on most evals. Haiku is fast and cheap.
Omnimodal. Native audio, vision, and text in one model.
Fast and cheap multimodal. Google's answer to GPT-4o mini.
Beat Claude 3 Opus at a fraction of the cost. The benchmark shifted again.
Interactive code, docs, and diagrams rendered live inside the conversation.
Fast, cheap, good enough for most tasks.
OpenAI's reasoning model. Chain-of-thought at inference time.
Google's podcast-from-any-document feature. Went viral immediately.
Anthropic releases screenshot-based computer control. Agents can now operate UIs.
Side-by-side editing for writing and code inside ChatGPT.
AI-native code editor. Changed how a lot of developers work daily.
Anthropic's open standard for tool use. De facto interface for agent integrations; a minimal server sketch follows the list.
Fastest model in the 3.5 family.
Web search built into the chat. One less tab open.
Multimodal, fast, capable of tool use and computer interaction.
OpenAI's reasoning model, full release.
Open-weights reasoning model matching o1 at a fraction of the cost. I used this one.
OpenAI's next reasoning model. Significantly ahead of o1 on hard tasks.
Extended thinking mode. Long chain-of-thought available on demand.
OpenAI's largest non-reasoning model.
Google's strongest model to date. Top of most coding and reasoning benchmarks.
Anthropic's current frontier models.
Fast, efficient, strong across modalities.
Multi-step research agents from Google and OpenAI. Browse, synthesize, report.
Google's photorealistic image generation. Up to 2K resolution.
Google's video generation model. Text-to-video at production quality.
Anthropic's CLI-based coding agent. Full tool use in the terminal. I use this every day.
The failure mode where an agent confidently repeats the same failed approach in a loop. You learn to spot it.
Design-to-code via MCP. Designers annotate components, engineers get mapped implementation.
Google's latest generation. Flash now at frontier-class performance.
Added native SVG generation. Text-to-vector without a toolchain.
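
Two of those entries are things I actually build with, so here is roughly what they look like in practice. First, the free Gemini API: a minimal sketch using the google-generativeai Python SDK. The model name, prompt, and environment variable are placeholders, not a recommendation.

```python
# Minimal sketch of a Gemini API call from Python.
# Assumes the google-generativeai package is installed and an API
# key is exported as GEMINI_API_KEY (the variable name is my choice).
import os

import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])

# Model name is illustrative; swap in whichever model you're prototyping with.
model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content("Summarize this release timeline in one sentence.")
print(response.text)
```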
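
Second, MCP. This is a sketch of a minimal server using the FastMCP helper from the official mcp Python SDK; the tool itself is a made-up example, but the shape is the whole idea: declare a server, decorate functions as tools, run over stdio.

```python
# Minimal sketch of an MCP server exposing one tool.
# Uses FastMCP from the official `mcp` Python SDK; the tool
# below is a made-up example for illustration.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("release-notes")

@mcp.tool()
def count_releases(notes: str) -> int:
    """Count non-empty lines in a block of release notes."""
    return sum(1 for line in notes.splitlines() if line.strip())

if __name__ == "__main__":
    # Serves over stdio by default, which is how most MCP clients connect.
    mcp.run()
```

Point an MCP-capable client at that script and the tool shows up in its tool list; that plumbing being boring is exactly why the standard caught on.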
Reading that back, I notice two things. First: this list isn't complete. I've definitely missed things. Second: I can use almost all of these tools right now, today, to build something real. That second fact is extraordinary, and I don't want the first to overshadow it.
The pace is disorienting. It's also the most interesting moment to be building in that I can imagine. Those two things are both true, and I hold them at the same time most days.