Top Stories

Gemini app rolling out 'Extended' thinking level, new 3rd-party app integrations
IMAP

May 18, 2026 09:57 AM

Google is rolling out a new 'Thinking level' option for Gemini. The option has appeared for some users when they select Fast or Gemini 3.1 Pro. Google is also preparing to add more integrations with third-party apps in Gemini. Support for Canva, Instacart, and OpenTable appears to be coming.

Codex will soon be able to control other desktop devices via Computer Use

OpenAI is working on a capability that lets its coding agent operate macOS applications through Computer Use even when a laptop is locked or asleep. Computer Use currently requires an unlocked, awake session to see the screen, move the cursor, and type. Lifting the restriction will allow users to direct their agents without having to walk back to their machines to log in first. It is unknown when the feature will be released.

ChatGPT Personal Finance

OpenAI released a preview of a new personal finance experience in ChatGPT for Pro users in the US. The feature lets users securely connect financial accounts, view spending dashboards, and ask questions grounded in their financial context and goals.

Tokenomics: the 62.5-minute rule for Claude's cache

If you expect to need a cache again within 62.5 minutes, refresh it. Otherwise, let it expire. The threshold is the same across models and does not depend on the size of the cache. The dollar amounts may change, but the decision point stays the same.
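
One way to arrive at a 62.5-minute threshold is the arithmetic below. It is a sketch under assumptions the summary does not state: a cache write priced at 1.25x the base input rate, a cache read at 0.1x, and a 5-minute cache lifetime that each read resets. The function and parameter names are illustrative only.

```python
# Assumed pricing multipliers relative to the base input-token price.
# These values are assumptions for illustration, not taken from the summary.
CACHE_WRITE_MULT = 1.25   # cost to write (or rewrite) the cache
CACHE_READ_MULT = 0.10    # cost of one read, assumed to reset the lifetime
TTL_MINUTES = 5.0         # assumed cache lifetime between refreshes


def break_even_minutes(write_mult=CACHE_WRITE_MULT,
                       read_mult=CACHE_READ_MULT,
                       ttl_minutes=TTL_MINUTES):
    """Gap length at which keeping the cache warm costs the same as rewriting it."""
    # Keeping the cache alive for T minutes takes about T / ttl refresh reads,
    # each costing read_mult * tokens * price. Rewriting after expiry costs
    # write_mult * tokens * price. Setting the two equal, tokens * price cancels.
    return ttl_minutes * write_mult / read_mult


def keep_warm_cost(gap_minutes, tokens, price_per_token):
    """Cost of refreshing the cache across the gap (continuous approximation)."""
    refreshes = gap_minutes / TTL_MINUTES
    return refreshes * CACHE_READ_MULT * tokens * price_per_token


def rewrite_cost(tokens, price_per_token):
    """Cost of letting the cache expire and writing it again."""
    return CACHE_WRITE_MULT * tokens * price_per_token


def should_refresh(gap_minutes):
    """True when the expected gap is shorter than the break-even point."""
    return gap_minutes < break_even_minutes()
```

Under these assumptions, `should_refresh(30)` is true and `should_refresh(90)` is false. Changing `tokens` or `price_per_token` scales both costs by the same factor, which is consistent with the claim that the decision point does not move with model price or cache size.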

AI economics part 2

AI labs are in an ongoing war over GPU resources. This article looks at supply and demand and at how the infrastructure powering AI today may not be sufficient. Scaling GPUs doesn't scale compute linearly. Efficiency matters more at raw scale given finite supply.

Portability Is a Myth: Why the Best AI Stacks Will Never Be Hardware-Agnostic

AI kernel portability is structurally impossible because TPU's Pallas, NVIDIA's CuTile and CUTLASS, AWS's NKI, AMD's FlyDSL, and Tenstorrent's tt-Metalium each expose hardware-specific concepts that no universal DSL can unify. The evidence: MaxText's MoE grouped matmul ships as 282 lines of Pallas on TPU while flashinfer's equivalent for Blackwell SM100 takes 4 million lines of generated CUDA, with zero shared code because the algorithms themselves diverge across hardware.

How Claude Code works in large codebases: Best practices and where to start

Claude Code is now being used in production across multiple large codebases in organizations with thousands of developers. These environments bring challenges that smaller codebases don't. This article covers patterns that Anthropic has seen that have led to the successful adoption of Claude Code at scale. It looks at how Claude Code has been used in monorepos with millions of lines, legacy systems built over decades, and microservices across separate repositories.

Notes on pretraining parallelisms and failed training runs

Pretraining runs often fail. This article looks at all the ways that things can go wrong and why training is such a precarious operation. The key culprits seem to be breaking causality and adding bias.

Recent Developments in LLM Architectures: KV Sharing, mHC, and Compressed Attention

KV-cache size, memory traffic, and attention cost quickly become the main constraints as reasoning models and agent workflows keep more tokens around for longer. LLM developers are adding a growing number of architecture tricks to reduce costs. Most of the changes look like small tweaks, but some are quite intricate design changes. This article looks at these architecture changes with a focus on what changes inside the transformer block, residual stream, KV cache, and attention computation.

Lighthouse Attention

Lighthouse Attention, a selection-based hierarchical attention, offers up to 17x faster forward and backward passes than standard attention models at large contexts. It utilizes FlashAttention on a dense sub-sequence, maintaining efficiency and compatibility with upstream improvements. By enabling efficient long-context training and retaining dense model competence, Lighthouse Attention achieves 1.4x to 1.7x speedup in pretraining while reducing computational costs.

Runway started by helping filmmakers — now it wants to beat Google at AI

Runway's founders believe that the next form of AI will be built from video and world models that learn how the world works. The company is training models directly on observational data to reach the next frontier of AI. Runway was one of the first to develop AI video generation, but world models are a different race with deep-pocketed competitors. The company has raised $860 million to date, but it is going up against incumbents like OpenAI and Google.

The haves and have nots of the AI gold rush

The AI boom has created a wealth divide, with an estimated 10,000 individuals from companies like OpenAI and Nvidia achieving over $20M in wealth, while others face uncertain futures with stagnant job prospects and layoffs. Software engineers express concerns about their skills becoming obsolete, raising anxiety about career paths. This disparity fuels tension in San Francisco's tech scene as some criticize the dual role of AI as a wealth source and a career threat.

TLDR is hiring a Senior Software Engineer, Applied AI

Learn more

Apple Silicon costs more than OpenRouter

OpenRouter costs about one-third as much and runs at around 2x the speed for comparable models.

Headroom

Headroom compresses everything an agent reads before it reaches the LLM to produce the same answers at a fraction of the tokens.

DeepSeek-V4-Flash means LLM steering is interesting again

Steering is the idea that LLM outputs can be guided by directly manipulating the activations of a model mid-flight.
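
A toy illustration of the idea, not DeepSeek-V4-Flash's mechanism or any library's API: a two-layer network in plain Python where a hypothetical steering vector is added to the hidden activations during the forward pass. The weights, the vector, and the scale `alpha` are all made up for the example.

```python
def matvec(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def relu(vec):
    """Elementwise ReLU."""
    return [max(0.0, v) for v in vec]


def forward(x, w1, w2, steer=None, alpha=0.0):
    """Two-layer forward pass with an optional intervention on the hidden layer."""
    hidden = relu(matvec(w1, x))
    if steer is not None:
        # The steering step: shift the activations along a chosen direction
        # before the rest of the network sees them. Weights are untouched.
        hidden = [h + alpha * s for h, s in zip(hidden, steer)]
    return matvec(w2, hidden)


# Hypothetical weights and input, chosen so the arithmetic is easy to follow.
W1 = [[1.0, 0.0], [0.0, 1.0]]
W2 = [[1.0, -1.0]]
x = [2.0, 1.0]
STEER = [0.0, 1.0]   # made-up direction in hidden space

baseline = forward(x, W1, W2)
steered = forward(x, W1, W2, steer=STEER, alpha=3.0)
```

The same input and the same weights give a different output once the hidden activations are shifted, which is the sense in which steering acts mid-flight rather than through the prompt or through retraining.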

OpenAI Quietly Bought Voice-Cloning Startup Weights.gg, Then Folded the Team

OpenAI acquired the six-person team and its intellectual property, then shut down Weights.gg and dispersed the team across multiple OpenAI groups.

Apply here

Jacob Turner

create your own role

Jacob Turner

Inc.'s Best Bootstrapped businesses

Jacob Turner
