March 26, 2026 09:20 AM
Google's TurboQuant is a compression algorithm that reduces the memory footprint of large language models while also boosting speed and maintaining accuracy. It shrinks the key-value cache, the store of intermediate attention results that spares the model from recomputing past tokens. Early testing shows TurboQuant delivers an 8x performance increase and a 6x reduction in memory usage with no loss of quality. Compression techniques like TurboQuant could let capable models run directly on edge devices without having to send data to the cloud.
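To make the idea concrete, here is a minimal sketch of KV-cache quantization in general, not TurboQuant's actual algorithm: each cached vector is stored as int8 values plus a per-vector scale, cutting memory roughly 4x versus float32, and is dequantized when read back for attention. All function names here are illustrative.

```python
def quantize_vec(v):
    """Map a float vector to int8 codes in [-127, 127] plus a scale."""
    scale = max(abs(x) for x in v) / 127 or 1.0
    q = [round(x / scale) for x in v]
    return q, scale

def dequantize_vec(q, scale):
    """Recover an approximation of the original floats."""
    return [x * scale for x in q]

# A toy KV cache: quantize each vector as it is appended.
cache = []
for vec in ([0.5, -1.0, 0.25], [2.0, 0.0, -0.5]):
    cache.append(quantize_vec(vec))

# On attention lookup, dequantize; results stay close to the originals.
restored = [dequantize_vec(q, s) for q, s in cache]
```

The real technique adds refinements (per-channel scales, rotations, calibration) to keep the quantization error from accumulating across long contexts, but the memory saving comes from this basic trade of precision for bytes.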
Read More
ARC-AGI-3 was designed to evaluate agentic intelligence via interactive reasoning environments. Beating it will require an AI system to match or exceed human-level efficiency on all environments upon seeing them for the first time. All of the environments are solvable by humans on first contact, with no prior training or instruction, yet all frontier AI reasoning models currently solve under 1%.
Read More
Reflection is a startup leading an effort to create freely available US AI systems. It is one of a handful of Nvidia-linked startups seeking to build a network of open source AI models. The startup is in talks to raise $2.5 billion at a valuation of $25 billion. Investors describe Reflection as the 'DeepSeek of the West' as it offers an alternative to the open source models offered by Chinese companies.
Read More
Manus' co-founders, Xiao Hong and Ji Yichao, have been told not to leave China while authorities review the company's $2.5 billion sale to Meta. Early versions of Manus were created by engineers from a Chinese company. A Singapore-based entity then took over Manus' operations and relocated most of its China-based employees to Singapore, which made it possible for Meta to purchase it. Authorities are concerned that Manus' moves could encourage other Chinese companies to follow suit and relocate out of the country without government vetting.
Read More
Open source models are reaching parity with frontier labs' models, making those labs' equity look overpriced if they are merely utilities. The labs do have enterprise agreements, safety certifications, distribution, research talent, and regulatory positioning, but those assets alone don't explain their moat. People focus on raw capability, but the number that actually matters for valuations is the monetizable spread: the subset of the capability delta that customers will actually pay a premium for. And the monetizable spread is declining faster than the capability spread.
Read More
Quantized models are actually pretty good. Quantizing from 16-bit to 8-bit carries almost no quality penalty; the difference is more noticeable at 4-bit, where a model might perform only about 90% as well as the original. These models are worth experimenting with, as they are much smaller and can run on far more systems. This article explains how model parameters work, what quantization is, how it is applied in practice, and its effects on model accuracy.
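A small sketch (not taken from the article) shows why the bit width matters: uniform quantization rounds each weight onto a grid of 2^bits levels, so 8-bit leaves a fine grid with tiny rounding error while 4-bit leaves a coarse one.

```python
def quantize(weights, bits):
    """Round weights onto a symmetric grid with 2**(bits-1) - 1 levels per side."""
    levels = 2 ** (bits - 1) - 1          # e.g. 127 for int8, 7 for int4
    scale = max(abs(w) for w in weights) / levels
    return [round(w / scale) * scale for w in weights]

weights = [0.013, -0.42, 0.91, -0.07, 0.33]

for bits in (8, 4):
    approx = quantize(weights, bits)
    err = max(abs(a - w) for a, w in zip(approx, weights))
    print(f"{bits}-bit max error: {err:.4f}")
```

With only five toy weights the 8-bit grid keeps every value within a fraction of a percent of its original, while the 4-bit grid visibly snaps small weights to the nearest coarse level, which is the kind of distortion that shows up as the quality gap the article describes.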
Read More
The final training run for a model is only the last step in a long, expensive process. Before that run, companies burn compute on experiments at various scales, synthetic data generation, testing ideas, and training unreleased models, so the full cost of developing a frontier model is much higher than the cost of its final training run. Most of the spend goes to exploration rather than execution, which is why companies that learn from their competition can replicate its results for a fraction of the original cost.
Read More
OpenAI outlined the philosophy and structure behind its Model Spec, a framework defining desired model behavior, safety principles, and how systems should follow instructions and resolve conflicts.
Read More
Manthan Gupta built Auto-Inference-Optimiser to let an AI agent hill-climb on LLM inference speed while keeping quality fixed on Apple Silicon. Argmax sampling and simplifying inference code gave the largest throughput gains, while most tuning knobs and KV cache quantization hurt or had no effect. The project highlights that a disciplined, observable harness is critical for distinguishing real performance wins from noise or benchmark illusions.
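The "argmax sampling" win is easy to illustrate. The sketch below (illustrative only, not the project's actual code) contrasts standard temperature sampling, which exponentiates and normalizes every logit and then draws randomly, with greedy argmax decoding, which simply takes the highest-scoring token and skips that work entirely.

```python
import math
import random

def sample_softmax(logits, temperature=1.0):
    """Standard temperature sampling: softmax over logits, then a random draw."""
    exps = [math.exp(l / temperature) for l in logits]
    r = random.random() * sum(exps)
    for i, e in enumerate(exps):
        r -= e
        if r <= 0:
            return i
    return len(logits) - 1

def sample_argmax(logits):
    """Greedy decoding: no exp, no normalization, no RNG."""
    return max(range(len(logits)), key=lambda i: logits[i])

logits = [1.2, 3.7, -0.5, 2.9]
print(sample_argmax(logits))   # always token 1, the highest logit
```

Greedy decoding trades output diversity for determinism and speed, which is exactly the kind of quality-vs-throughput decision a harness like the one described needs to measure rather than assume.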
Read More
Cognition's Devin is an AI software engineer that can build software from start to finish without human involvement. When it launched in 2024, it was considered a step toward a long-held Silicon Valley dream of a machine that codes for you. Cognition's CEO, Scott Wu, believes that the technology doesn't mean the end of software engineering. Rather than eliminate engineers, Cognition's tools will allow them to focus on the best parts of the job while sparing them from the grunt work that traditionally consumes most of their time.
Read More
OpenAI launched a public bug bounty program targeting AI misuse and safety risks, expanding beyond traditional security vulnerabilities to include abuse scenarios.
Read More
AI companies are shifting from narrow solutions to broad platforms, driven by rapid model changes.
Read More
Harvey raised $200 million in a new round led by GIC and Sequoia, bringing its valuation to $11 billion and total funding past $1 billion.
Read More
Lyria 3 Pro extends the maximum track length to three minutes and adds finer control over song structure and customization.
Read More