Top Stories

Grok 4.1
IMAP

November 18, 2025 11:26 AM

Grok 4.1 is the top model on LMArena. xAI credited its large-scale post-training infrastructure for targeted improvements to emotional intelligence and creative writing, which suggests the company is pursuing the consumer-focused AI companion use cases currently dominated by ChatGPT.

Arm custom chips get a boost with Nvidia partnership

Arm-based Neoverse CPUs can now be paired with Nvidia's GPUs through Nvidia's NVLink Fusion technology. Microsoft, Amazon, and Google are all developing or deploying Arm-based CPUs in their clouds, and the partnership will make it easier for customers of both companies to combine their hardware. The announcement signals that Nvidia has decided to open up its NVLink platform instead of forcing customers to use its own CPUs.

CUDA, Shmuda: Running AlphaFold on a MacBook Air

Apple's MLX framework suits science workloads because it was designed from the ground up to take advantage of Apple's unified memory architecture. Models like AlphaFold are memory-intensive, and almost all protein software is heavily optimized for CUDA. This post shows how a developer got AlphaFold to run on an Apple M-series machine. Arm-based chips like Apple's M series are extremely fast and draw only a fraction of the wattage that a single CUDA card needs, and getting a Mac is a lot easier than getting a Blackwell GPU.

Roadmap for Video-Based World Models

This roadmap outlines the progression of video generation systems into full-fledged simulators that combine internal physics-aware world models with visual rendering. It defines four developmental stages toward interactive and stochastic virtual environments.

RL is even more information inefficient than you thought

A single sample costs far more FLOPs in RL than in supervised learning. RL has to unroll a whole thinking trajectory, tens of thousands of tokens long, to get a single reward signal at the end, whereas pretraining provides a signal on every token trained on. For most of training, the information density per sample is far lower in RL than in supervised learning.
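
A rough way to see the gap is to bound how much a single pass/fail reward can tell the model and spread that over the tokens it took to earn it. The sketch below assumes a binary end-of-trajectory reward and a hypothetical trajectory length of 20,000 tokens; the function names and numbers are illustrative, not taken from the post.

```python
import math


def binary_entropy_bits(p: float) -> float:
    """Entropy, in bits, of a pass/fail reward that passes with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a certain outcome carries no information
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def rl_bits_per_token(pass_rate: float, trajectory_tokens: int) -> float:
    """Upper bound on reward information per generated token."""
    return binary_entropy_bits(pass_rate) / trajectory_tokens


# Hypothetical trajectory length, chosen only for illustration.
TRAJECTORY_TOKENS = 20_000

for pass_rate in (0.001, 0.01, 0.5):
    bits = rl_bits_per_token(pass_rate, TRAJECTORY_TOKENS)
    print(f"pass rate {pass_rate}: at most {bits:.2e} bits per token")
```

Under these assumptions the reward carries at most one bit per trajectory, and that only when the pass rate is near 50%. At low pass rates, which is where a model starts on hard tasks, the bound is much smaller still, while pretraining receives a signal on every token.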

The Agent Labs Thesis

Agent Labs primarily research and sell agents. They put product first and use outcome-based pricing, as opposed to Model Labs, which put models first and price per token. Agent Labs have better cashflow economics, but it might take longer to see exit valuations. The Model Lab mission may be shifting, at least until the next big algorithm shift.

WeatherNext 2: Our most advanced weather forecasting model

WeatherNext 2 is an AI model that delivers efficient, accurate, and high-resolution global weather predictions. It can generate forecasts at up to one-hour resolution. The model's forecast data has been released. WeatherNext 2 will be integrated to help power weather information in Google Maps in the coming weeks.

Build a GPT-5.1 Coding Agent

This guide walks through building a coding agent using GPT-5.1 and the Agents SDK, leveraging tools like shell execution, patch editing, web search, and Context7 MCP for live documentation access.

Intelligence per Watt: Measuring Intelligence Efficiency of Local AI

Frontier models today are served through massive, centralized cloud computing services that are straining electric grids. The performance gap between these large models and much smaller models, the kind that can run on a phone or laptop, is shrinking.

AA-Omniscience

AA-Omniscience is a benchmark for knowledge and hallucination across more than 40 topics. It punishes hallucinations by deducting points when models guess rather than admitting they do not know the answer. The benchmark complements the Artificial Analysis Intelligence Index by measuring both knowledge and the probability of hallucination. This thread contains the benchmark results from testing cutting-edge models against AA-Omniscience.
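
To make the deduction idea concrete, here is a minimal sketch of a penalized score. It assumes +1 for a correct answer, -1 for a wrong one, and 0 for an abstention, scaled to a range of -100 to 100; the function name, outcome labels, and exact weights are assumptions for illustration and may not match the benchmark's actual formula.

```python
from collections import Counter
from typing import Sequence

# Hypothetical outcome labels for this sketch.
VALID_OUTCOMES = {"correct", "incorrect", "abstain"}


def penalized_score(outcomes: Sequence[str]) -> float:
    """Score in [-100, 100]: wrong answers subtract, abstentions are neutral."""
    if not outcomes:
        raise ValueError("need at least one graded answer")
    counts = Counter(outcomes)
    unknown = set(counts) - VALID_OUTCOMES
    if unknown:
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")
    return 100.0 * (counts["correct"] - counts["incorrect"]) / len(outcomes)


# A model that always guesses versus one that abstains when unsure.
guesser = ["correct"] * 6 + ["incorrect"] * 4
cautious = ["correct"] * 5 + ["incorrect"] * 1 + ["abstain"] * 4

print(penalized_score(guesser))   # 20.0
print(penalized_score(cautious))  # 40.0
```

In this toy case the cautious model answers fewer questions correctly but still scores higher, because it avoids the penalty for wrong guesses.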

Replicate is joining Cloudflare

Replicate, a platform for running AI models, is joining Cloudflare. Existing Replicate users will be able to continue working without interruption. Existing Workers AI users will soon receive a massive expansion of the model catalog and the new ability to run fine-tunes and custom models directly. Cloudflare's AI Gateway will provide users with a single control plane for observability, prompt management, A/B testing, and cost analytics across all of their models, whether they're running on Cloudflare, Replicate, or any other provider.

Google to enable research automation on Gemini Enterprise

The unreleased system generates roughly 100 ideas on a topic, then spawns agent teams that compete in a tournament-style bracket for the best result.
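
The bracket idea can be sketched as a single-elimination tournament over candidate ideas. Everything below is hypothetical: `run_bracket`, the `judge` callback, and the toy scores stand in for whatever the real system uses to compare agent teams, which the report does not describe.

```python
from typing import Callable, Dict, List, TypeVar

T = TypeVar("T")


def run_bracket(entrants: List[T], judge: Callable[[T, T], T]) -> T:
    """Single elimination: pair entrants off each round until one remains."""
    if not entrants:
        raise ValueError("need at least one entrant")
    current = list(entrants)
    while len(current) > 1:
        next_round = []
        # Pair neighbours; judge returns the winner of each match.
        for i in range(0, len(current) - 1, 2):
            next_round.append(judge(current[i], current[i + 1]))
        # With an odd count, the last entrant gets a bye to the next round.
        if len(current) % 2 == 1:
            next_round.append(current[-1])
        current = next_round
    return current[0]


# Toy data: 100 ideas with made-up, distinct quality scores.
scores: Dict[str, int] = {f"idea-{i}": (i * 37) % 101 for i in range(100)}


def pick_higher_score(a: str, b: str) -> str:
    """Stand-in judge; a real system would compare the teams' results."""
    return a if scores[a] >= scores[b] else b


winner = run_bracket(list(scores), pick_higher_score)
print(winner, scores[winner])
```

With a deterministic judge like this one, the best-scoring idea always wins; with a noisy judge the bracket only approximates that, which is one reason to treat this as an illustration and not a description of the actual system.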

OpenAI is finally letting employees donate their equity to charity

Current and former OpenAI employees with eligible shares will be able to donate their shares for a short period.

Databricks Seeks $130 Billion Valuation

Databricks is in talks to raise funds at a valuation above $130 billion, though nothing has been signed yet.

New ways to plan travel with AI in Search

Canvas in AI Mode now builds complete travel itineraries by pulling together flight data, hotel comparisons, Maps reviews, and restaurant recommendations.

Apply here

Jacob Turner
