March 25, 2026 09:18 AM
Anthropic released Auto Mode in research preview, enabling Claude to autonomously execute actions with built-in safeguards that filter risky behavior and prompt injection.
OpenAI shifted away from its in-chat checkout feature after low adoption, prioritizing product discovery and merchant-directed purchasing flows instead.
Anthropic's Prithvi Rajasekaran developed a multi-agent architecture to improve AI-driven frontend design and full-stack application coding, addressing issues of coherence and self-evaluation. Inspired by GANs, this approach uses planner, generator, and evaluator agents to produce complex, high-quality outputs by decomposing tasks and utilizing structured handoffs. Despite improvements, challenges remain in context management and evaluator tuning, highlighting the ongoing need for adapting harness designs as AI models advance.
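The planner, generator, and evaluator roles with structured handoffs can be pictured roughly as follows. This is only a minimal sketch, not the architecture described in the article: the `Handoff` record, the `run_pipeline` function, the score threshold, and the three stub agents are all hypothetical stand-ins for real model calls.

```python
from dataclasses import dataclass, field


@dataclass
class Handoff:
    """Hypothetical structured record passed between agents."""
    subtask: str
    draft: str = ""
    feedback: list = field(default_factory=list)
    score: float = 0.0


def run_pipeline(goal, plan, generate, evaluate, threshold=0.8, max_rounds=3):
    # Planner: decompose the goal into subtasks, one handoff per subtask.
    handoffs = [Handoff(subtask=s) for s in plan(goal)]
    for h in handoffs:
        for _ in range(max_rounds):
            # Generator: produce (or revise) a draft using accumulated feedback.
            h.draft = generate(h)
            # Evaluator: a separate agent scores the draft and returns a note.
            h.score, note = evaluate(h)
            if h.score >= threshold:
                break
            h.feedback.append(note)
    return handoffs


# Stub agents (assumptions for illustration; real ones would call a model).
def stub_plan(goal):
    return goal.split(" and ")


def stub_generate(h):
    return f"{h.subtask} v{len(h.feedback) + 1}"


def stub_evaluate(h):
    # Accept a draft only after one round of feedback has been applied.
    return (1.0, "") if h.feedback else (0.5, "needs polish")


result = run_pipeline("layout and styling", stub_plan, stub_generate, stub_evaluate)
```

Keeping the evaluator separate from the generator is the point of the design: the draft is judged by an agent that did not write it, which targets the self-evaluation weakness the summary mentions.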
The App Store was a centralized answer to the distribution problem of a new computing platform. The agent era will need a new solution as agents need APIs, not app stores. Apple gained its revenue by forcing every in-app transaction through its payment system. The agent era lacks Apple's lock-in mechanics, so if one platform tries to charge high payment fees, users will just switch to a competitor. This suggests the payment layer will be competitive and low-margin rather than monopolistic.
As of March, Claude 4.6 features a 1M-token context window and four distinct modes: Chat, Cowork, Code, and Projects. The Cowork suite automates workflows via Scheduled Tasks and Connectors, while the Code environment uses a CLAUDE.md hierarchy, the Model Context Protocol (MCP), and Agent Teams for autonomous development. Key upgrades include research previews of Computer Use and deterministic Hooks for programmable guardrails.
Many modern LLM workloads prioritize throughput over per-request latency, yet most LLM systems and deployments today optimize for latency. Ray Data LLM is a library built for large-scale LLM batch inference. Its architecture is optimized for batch workloads and provides scalable execution, high throughput, and fault tolerance. Users can achieve 2x throughput with Ray Data LLM over vLLM's synchronous LLM engine while benefiting from production-scale resiliency.
Ossature is an open-source harness for spec-driven code generation. Developers write specifications describing what their software should do, and Ossature validates them, has an LLM audit them for ambiguities and gaps, produces an editable plan, and then generates code one task at a time. Each task only gets the context it needs. Ossature has verification built into the build loop. If verification fails, a fixer agent gets the error output and tries to repair the code.
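The generate, verify, and repair loop described above might look roughly like this. It is a sketch under stated assumptions and does not use Ossature's actual API: `Task`, `build`, `max_repairs`, and the three stub callables are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task record: each task carries only the context it needs."""
    name: str
    context: str


def build(tasks, generate, verify, fix, max_repairs=2):
    """Generate code one task at a time, verifying each result before moving on."""
    results = {}
    for task in tasks:
        code = generate(task)
        error = verify(code)  # None means verification passed
        attempts = 0
        while error is not None and attempts < max_repairs:
            # The fixer sees the failing code plus the verifier's error output.
            code = fix(code, error)
            error = verify(code)
            attempts += 1
        if error is not None:
            raise RuntimeError(f"task {task.name!r} failed verification: {error}")
        results[task.name] = code
    return results


# Stubs standing in for LLM calls (assumptions for illustration only).
def stub_generate(task):
    return "def add(a, b):\n    return a - b\n"  # deliberately buggy


def stub_verify(code):
    namespace = {}
    exec(code, namespace)
    return None if namespace["add"](2, 3) == 5 else "add(2, 3) != 5"


def stub_fix(code, error):
    return code.replace("a - b", "a + b")


built = build([Task("add", "add two integers")], stub_generate, stub_verify, stub_fix)
```

Bounding the number of repair attempts keeps a task that cannot be fixed from looping forever; the sketch raises instead of silently accepting unverified code.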
Directional updates in RLVR were shown to better identify reasoning-critical tokens, enabling both test-time extrapolation and training-time reweighting to boost accuracy.
TurboQuant is a quantization method that reduces vector memory overhead while preserving performance. This improves key-value cache efficiency and accelerates vector search.
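For readers unfamiliar with vector quantization, the basic memory-versus-accuracy trade can be illustrated with plain uniform 8-bit scalar quantization. This is not TurboQuant's algorithm, which the summary does not describe; it is a generic baseline sketch, and the function names are hypothetical.

```python
import random
from array import array


def quantize(vec):
    """Uniform 8-bit scalar quantization with one float scale per vector."""
    levels = 127  # symmetric signed 8-bit range: -127..127
    scale = max(abs(x) for x in vec) / levels or 1.0  # avoid zero scale
    codes = array("b", (round(x / scale) for x in vec))
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


def approx_dot(codes_a, scale_a, codes_b, scale_b):
    """Inner product computed on integer codes, rescaled once at the end."""
    return sum(a * b for a, b in zip(codes_a, codes_b)) * scale_a * scale_b


random.seed(0)
vec = [random.gauss(0.0, 1.0) for _ in range(64)]
codes, scale = quantize(vec)
restored = dequantize(codes, scale)
max_error = max(abs(x - y) for x, y in zip(vec, restored))
```

Each stored value takes one byte instead of an 8-byte Python float (or a 4-byte float32), and the per-element rounding error is bounded by half the scale.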
Semantic calibration appears to emerge as a byproduct of next-token prediction. Base models are remarkably well-calibrated when using a certain sampling-based notion of semantic calibration. They can meaningfully assess confidence in open-domain question-answering tasks despite not being explicitly trained to do so.
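One plausible reading of a sampling-based confidence measure is: sample several answers, group those that mean the same thing, and take the share of the largest group as the confidence. The sketch below follows that reading and is not the paper's exact definition, which the summary does not give. Here `canonicalize` is a crude stand-in for a real semantic-equivalence check, and all names are hypothetical.

```python
from collections import Counter


def semantic_confidence(samples, canonicalize=str.casefold):
    """Return the most common answer and the fraction of samples agreeing with it."""
    counts = Counter(canonicalize(s.strip()) for s in samples)
    answer, n = counts.most_common(1)[0]
    return answer, n / len(samples)


def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin predictions by confidence and average |confidence - accuracy| per bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in last bin
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            accuracy = sum(1 for _, ok in b if ok) / len(b)
            ece += len(b) / total * abs(avg_conf - accuracy)
    return ece


answer, confidence = semantic_confidence(["Paris", "paris ", "Lyon", "PARIS"])
```

A model is well calibrated in this sense when questions answered with confidence near 0.75 turn out correct about 75% of the time, which drives the calibration error toward zero.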
OpenAI has announced a new $10 billion commitment from a16z, DE Shaw Ventures, MGX, TPG, and T Rowe Price. The fresh capital brings OpenAI's record fundraise to over $120 billion. OpenAI has moderated its spending plans and is now targeting approximately $600 billion in total compute spend through 2030. It is now taking steps to prioritize its most profitable initiatives ahead of an IPO.
US District Judge Rita F. Lin of the Northern District of California said during a court hearing that the US government appeared to be punishing Anthropic by banning the company. The hearing is part of Anthropic's efforts to ease the government ban on the use of the company's AI models. Lin has yet to rule on the matter but expressed serious doubts about the Trump administration's actions in her opening remarks. The government's action has already cost Anthropic hundreds of millions of dollars in canceled contracts and aborted customer agreements.
Lakewatch is a SIEM platform using AI agents for threat detection, alongside acquisitions of Antimatter and SiftD.ai to support secure agent deployment.
The Anthropic Economic Index shows Claude usage has diversified, with a drop in high-value tasks, shifting more to low-wage personal queries.
EVA is a framework that evaluates voice agents on complete, multi-turn spoken conversations using a realistic bot-to-bot architecture.