March 30, 2026 09:57 AM
'Mythos' is the name for a new tier of Anthropic models that are larger and more intelligent than Opus. The models get dramatically higher scores on tests of software coding, academic reasoning, and cybersecurity compared to Claude Opus 4.6. Mythos is a large, compute-intensive model that is very expensive to use and serve. Anthropic is working on making the model much more efficient before any general release.
Meta's Avocado model has been pushed back to at least May because it still falls short of leading systems from competitors. The company appears to be running parallel experiments with multiple Avocado variants. The model appears to be able to solve complex math problems that earlier Llama models could not, but other labs solved these problems months earlier. Meta's AI leadership has reportedly discussed temporarily licensing Google's Gemini technology. Some requests within Meta AI are already being routed through Gemini models.
Anthropic is more popular with customers than ever. Claude is gaining paid subscribers in record numbers. Paid subscriptions have more than doubled this year. The majority of new subscribers were in the lowest tier. OpenAI is still gaining new paid subscribers at a rapid rate and remains the biggest consumer AI platform.
AutoBe is an open-source AI agent that takes a single natural language conversation and generates a complete backend. qwen3-coder-next has a 6.75% function calling success rate when asked to generate API data types for a shopping mall backend. AutoBe boosts that success rate to over 99.8%. It uses a harness where type schemas constrain outputs, compilers verify results, and structured feedback compactly pinpoints where and why something went wrong so the agent can correct itself. This post dissects the engineering behind AutoBe.
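A minimal sketch of the constrain, verify, and feed-back loop described above may help make the idea concrete. It is not AutoBe's implementation: the schema format, `validate`, `generate_with_feedback`, and `fake_model` are all hypothetical names invented for illustration, and a canned stub stands in for the real model call.

```python
import json
from typing import Any, Callable

# Hypothetical schema for illustration: field name -> expected Python type.
PRODUCT_SCHEMA = {"id": str, "name": str, "price": float, "tags": list}


def validate(value: Any, schema: dict[str, type]) -> list[dict[str, str]]:
    """Return one compact error record per problem; an empty list means valid."""
    if not isinstance(value, dict):
        return [{"path": "$", "expected": "object", "got": type(value).__name__}]
    errors = []
    for field, expected in schema.items():
        if field not in value:
            errors.append({"path": f"$.{field}", "expected": expected.__name__, "got": "missing"})
        elif not isinstance(value[field], expected):
            errors.append({"path": f"$.{field}", "expected": expected.__name__,
                           "got": type(value[field]).__name__})
    return errors


def generate_with_feedback(
    generate: Callable[[list[dict[str, str]]], str],
    schema: dict[str, type],
    max_attempts: int = 3,
) -> tuple[dict, int]:
    """Call the generator until its output passes validation, feeding errors back."""
    feedback: list[dict[str, str]] = []
    for attempt in range(1, max_attempts + 1):
        raw = generate(feedback)
        try:
            candidate = json.loads(raw)
        except json.JSONDecodeError as exc:
            feedback = [{"path": "$", "expected": "valid JSON", "got": exc.msg}]
            continue
        feedback = validate(candidate, schema)
        if not feedback:
            return candidate, attempt
    raise ValueError(f"still invalid after {max_attempts} attempts: {feedback}")


def fake_model(feedback: list[dict[str, str]]) -> str:
    """Stand-in for an LLM call: wrong on the first try, fixed once it sees feedback."""
    if not feedback:
        return '{"id": "p1", "name": "Mug", "price": "12.5"}'
    return '{"id": "p1", "name": "Mug", "price": 12.5, "tags": []}'


if __name__ == "__main__":
    result, attempts = generate_with_feedback(fake_model, PRODUCT_SCHEMA)
    print(attempts, result)
```

The point of the sketch is that each error record names a path, the expected type, and what was found instead, so the retry prompt can stay short and specific rather than repeating the whole task.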
AI's capability improvements at the frontier have not led to increased inference costs relative to human labor. Despite rising per-task inference costs, current models achieve tasks at roughly 3% of human costs without any upward trend in median cost ratios. Models can continue advancing even under strict cost constraints, enabling profitable automation with AI cost ratios remaining well below human levels.
Coding agents outperform agents in other domains because codebases provide a self-contained environment of critical context, unlike fragmented knowledge work spread across video calls and legacy systems. Enterprise adoption remains stalled by three hard problems: context fragmentation, complex access control, and a rapidly shifting architecture landscape.
Users of Claude Code on the web can now schedule tasks. The tasks run on Anthropic-managed infrastructure, so they keep working even if users turn off their devices. Scheduled tasks are available to all users of Claude Code on the web. Example tasks include reviewing open pull requests each morning, analyzing CI failures overnight and surfacing summaries, syncing documentation after PRs merge, and running dependency audits every week.
lat.md is a spec that agents keep in sync with the codebase to help them understand big ideas and key business logic. It ensures that corner cases have meaningful high-level tests and can speed up coding by saving agents from endless grepping. The spec uses plain Markdown, with wiki links connecting concepts into a navigable graph.
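As a rough illustration of how wiki links in plain Markdown can form a navigable graph, the sketch below parses links out of a few in-memory pages and walks the result. It assumes the common `[[Target]]`, `[[Target|label]]`, and `[[Target#section]]` link forms, which may not match lat.md's actual syntax; the page names and the `build_graph` and `reachable` helpers are hypothetical.

```python
import re

# Assumed wiki-link forms: [[Target]], [[Target|label]], [[Target#section]].
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def build_graph(pages: dict[str, str]) -> dict[str, set[str]]:
    """Map each page name to the set of page names it links to."""
    graph: dict[str, set[str]] = {name: set() for name in pages}
    for name, text in pages.items():
        for match in WIKI_LINK.finditer(text):
            graph[name].add(match.group(1).strip())
    return graph


def reachable(graph: dict[str, set[str]], start: str) -> set[str]:
    """Return every concept reachable from `start` by following links."""
    seen: set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return seen


# Hypothetical spec pages, keyed by concept name.
PAGES = {
    "Checkout": "Creates an order, see [[Orders]] and [[Payments|payment flow]].",
    "Orders": "Refund rules live in [[Payments#Refunds]].",
    "Payments": "No outgoing links.",
    "Shipping": "Depends on [[Orders]].",
}

if __name__ == "__main__":
    graph = build_graph(PAGES)
    print(sorted(reachable(graph, "Checkout")))
```

An agent with such a graph can jump from one concept to the related ones directly instead of searching the whole repository for them.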
Pretext is a fast, accurate, comprehensive text measurement algorithm that can lay out web pages without leaning on DOM measurement and reflow. It was created using AI agent workflows. The particular loop that was used in developing the tool (constrain -> measure -> isolate -> classify -> test -> reject -> keep only what survives broad pressure) made the engineering rigorous. This article analyzes the loop to see what makes it so successful.
All remaining co-founders of xAI have reportedly departed, marking the complete exit of the original founding team.
OpenAI alumni emphasize the significance of creating effective evaluations and benchmarks, noting that the best benchmarks drive collective optimization efforts. Post-training data design and model alignment are critical for unlocking new AI capabilities, particularly in subjective attributes like empathy or creativity. Fast iteration, choosing the right problems, and leveraging internal tooling are key competitive advantages in AI research.
That lab was likely Anthropic, which trained Mythos.
Google rolled out real-time translation through headphones on iOS, expanding support to more countries and 70+ languages while preserving speaker tone and cadence.
AI-powered exploitation can attack systems at a scale never possible before.