Top Stories

From Clicks to Conversions: Architecting Shopping Conversion Candidate Generation at Pinterest
IMAP

April 30, 2026 06:08 AM

Pinterest built a dedicated two-tower retrieval model to generate shopping ad candidates optimized for offsite conversions, moving beyond click and engagement signals, which are abundant but poorly correlated with actual buying intent. The system uses a unified multi-task architecture with parallel DCN v2 and MLP cross layers, training techniques designed to handle sparse and noisy conversion data, and an advertiser-level loss function.

How Vinted Serves Personalised Search Autocomplete

April 30, 2026 06:08 AM

Vinted rebuilt its search autocomplete system, moving from static, generic suggestions to a hybrid approach combining a strong heuristic scoring model with a Learning-to-Rank (LTR) model. They score suggestions offline using popularity, sell-through rate, and usage signals, index them with clever prefix and fuzzy matching techniques, then apply a LightGBM model in real-time that incorporates user behavior and context to re-rank results.
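
As a rough illustration of the offline half of that pipeline, the sketch below scores suggestions with a weighted sum and serves them by prefix match. The weights, the `Suggestion` fields, and the function names are hypothetical, chosen only for this example; Vinted's actual formula, index, and LightGBM re-ranking stage are not reproduced here.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class Suggestion:
    text: str
    popularity: float    # assumed to be normalized to [0, 1]
    sell_through: float  # assumed to be normalized to [0, 1]
    usage: float         # assumed to be normalized to [0, 1]

# Hypothetical weights, not taken from the article.
WEIGHTS = {"popularity": 0.5, "sell_through": 0.3, "usage": 0.2}

def offline_score(s: Suggestion) -> float:
    """Heuristic score computed ahead of time, before any per-user re-ranking."""
    return (WEIGHTS["popularity"] * s.popularity
            + WEIGHTS["sell_through"] * s.sell_through
            + WEIGHTS["usage"] * s.usage)

def complete(prefix: str, suggestions: List[Suggestion], limit: int = 5) -> List[str]:
    """Return the highest-scoring suggestions whose text starts with the prefix."""
    p = prefix.strip().lower()
    matches = [s for s in suggestions if s.text.lower().startswith(p)]
    ranked = sorted(matches, key=offline_score, reverse=True)
    return [s.text for s in ranked[:limit]]
```

In a two-stage setup like the one described, a learned model would then re-order this short candidate list using user behavior and context.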

Flow generation through natural language: An agentic modeling approach

April 30, 2026 06:08 AM

Shopify Flow uses an AI agent that lets merchants build automation workflows using natural language instead of complex rules. Shopify significantly improved this agent by fine-tuning a smaller open-source model on its own Flow domain data, resulting in much higher accuracy, lower latency, and lower cost than a large general-purpose model.

Skipper: Building Airbnb's embedded workflow engine

April 30, 2026 06:08 AM

Skipper is a lightweight, embedded workflow engine designed to provide durable and reliable execution for long-running business processes (like insurance claims and payments). Instead of relying on external orchestration tools or queues, Skipper uses a simple annotation-based approach to persist state in the service's existing database and achieves durability through deterministic replay.
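
The idea of durability through deterministic replay can be sketched in a few lines. This is not Skipper's API: `WorkflowRun`, `step`, and `process_claim` are hypothetical names, and a plain dict stands in for a table in the service's own database. Each step's result is journaled, so re-running the workflow after a crash returns recorded results instead of repeating side effects.

```python
from typing import Any, Callable, Dict, List, Optional

class WorkflowRun:
    """One workflow execution; the journal stands in for a database table."""

    def __init__(self, journal: Optional[Dict[str, Any]] = None) -> None:
        self.journal: Dict[str, Any] = {} if journal is None else journal

    def step(self, name: str, fn: Callable[[], Any]) -> Any:
        # On replay, a step that already completed returns its recorded result.
        if name in self.journal:
            return self.journal[name]
        result = fn()
        self.journal[name] = result  # persist before moving to the next step
        return result

def process_claim(run: WorkflowRun, effects: List[str], crash: bool = False) -> str:
    """Hypothetical two-step workflow: charge, then notify."""
    def charge() -> str:
        effects.append("charge")
        return "receipt-1"

    receipt = run.step("charge", charge)
    if crash:
        raise RuntimeError("simulated crash between steps")

    def notify() -> str:
        effects.append("notify")
        return "sent:" + receipt

    run.step("notify", notify)
    return receipt
```

Replay is only safe if the workflow code takes the same path on every run, which is why step names here are fixed and all side effects happen inside `step`.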

GraphRAG beyond the demo: Lessons from the trenches

April 30, 2026 06:08 AM

GraphRAG is most useful when questions require multi-hop reasoning across documents, entity relationships, or system-level dependencies: use Vector RAG for simple factual lookups and keep GraphRAG as an opt-in backend. In production, the main pain points are heavy indexing cost, difficult updates, multi-layer evaluation, and infrastructure that usually needs batch jobs rather than request-path execution. Success depends on selective graph scope, explicit update policies, repeatable evals, and strong observability/cost controls.

A/B Testing Pitfalls: What Works and What Doesn't with Real Data

April 30, 2026 06:08 AM

A/B testing failures are far more often caused by broken infrastructure and poor experimentation practices than by the ideas being tested. Common failures include Sample Ratio Mismatch (SRM) from bad randomization, early peeking that inflates false positives, insufficient statistical power, and optimizing the wrong metrics without guardrails, causing misleading results.
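
A Sample Ratio Mismatch check is simple enough to sketch with the standard library. The example below runs a chi-square goodness-of-fit test on a two-arm split; the function names and the `alpha=0.001` threshold are assumptions for illustration (a strict threshold is a common convention for SRM alerts, not something the article specifies).

```python
import math

def srm_p_value(control: int, treatment: int,
                expected_control_share: float = 0.5) -> float:
    """P-value of a chi-square goodness-of-fit test on the observed split."""
    total = control + treatment
    expected_control = total * expected_control_share
    expected_treatment = total - expected_control
    chi2 = ((control - expected_control) ** 2 / expected_control
            + (treatment - expected_treatment) ** 2 / expected_treatment)
    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))

def has_srm(control: int, treatment: int, alpha: float = 0.001) -> bool:
    """Flag the experiment when the split is very unlikely under the design."""
    return srm_p_value(control, treatment) < alpha
```

A flagged experiment usually means the assignment or logging pipeline should be investigated before any metric from that test is trusted.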

Rocky

April 30, 2026 06:08 AM

Rocky is a Rust-based tool that adds a control layer on top of data warehouses, helping teams manage pipelines with features like data contracts, lineage tracking, and safe testing through branches. It focuses on catching errors early, preventing data issues, and making data workflows more reliable and easier to understand.

oLLM

April 30, 2026 06:08 AM

oLLM is a Python library for running very large context LLM workloads on modest consumer hardware by offloading model weights and KV cache to SSD instead of keeping everything in GPU memory. It's useful for offline tasks like analyzing long documents, logs, contracts, chats, or reports locally without quantization.

HOT Updates in Postgres

April 30, 2026 06:08 AM

HOT (heap-only tuple) updates are a PostgreSQL storage optimization that lets an UPDATE skip index maintenance entirely when no indexed column changes and the new tuple fits on the same page as the old one. Instead of creating new index entries, PostgreSQL marks the old tuple as HOT_UPDATED and places a HEAP_ONLY tuple on the same page, forming a chain that scans can follow, which reduces WAL traffic, index maintenance, and vacuuming overhead.

Materialized Tables in Apache Flink

April 30, 2026 06:08 AM

Materialized Tables in Apache Flink let users define a table together with its population query, embedding both the schema and the continuous or scheduled refresh logic inside the catalog. This simplifies ETL pipelines by automatically handling job lifecycle, schema evolution, and refreshes.

Running SQLite in the browser with sql.js and WASM — a practical guide with Google Drive sync

April 30, 2026 06:08 AM

A client-side architecture uses SQLite compiled to WebAssembly in the browser, with the database persisted as a single binary file on the user's Google Drive. Compared with IndexedDB or proprietary sync layers, this gives true data portability and privacy: the file can be opened in any SQLite tool, while Drive access is limited via the drive.file scope. Local state is written to localStorage after each mutation, Drive sync is debounced by 10 seconds, and conflict handling prefers Drive as the source of truth.
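
The debounced sync described above can be modeled independently of the browser. The sketch below is not the article's sql.js code: `DebouncedSync`, `on_mutation`, and `tick` are hypothetical names, and time is passed in explicitly so the behavior is easy to follow. Every mutation pushes the upload deadline back by the delay, so a burst of edits produces a single upload.

```python
from typing import Callable, Optional

class DebouncedSync:
    """Schedules one upload after mutations have been quiet for delay_s seconds."""

    def __init__(self, upload: Callable[[], None], delay_s: float = 10.0) -> None:
        self.upload = upload
        self.delay_s = delay_s
        self._due_at: Optional[float] = None

    def on_mutation(self, now: float) -> None:
        # Each mutation resets the timer, which is what makes this a debounce.
        self._due_at = now + self.delay_s

    def tick(self, now: float) -> bool:
        """Run the upload if the quiet period has elapsed; return True if it ran."""
        if self._due_at is not None and now >= self._due_at:
            self._due_at = None
            self.upload()
            return True
        return False
```

In a browser the same logic would typically be a timer that is cleared and restarted on each mutation, with the local write happening immediately and only the remote upload being delayed.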

Building a High-Scale Real-Time Recommendation Engine with Feature Stores and Redis Observability

April 30, 2026 06:08 AM

Real-time recommendation systems now need to combine rich contextual features with sub-100 ms latency at scale, often across billions of interaction records. Feature stores act as the consistency layer between offline training and online serving, reducing training-serving skew, while batch platforms compute expensive features and embeddings. Redis is used for low-latency vector similarity search, candidate retrieval, and caching eligibility filters, keeping request paths fast and efficient.

Expedia's Service Telemetry Analyzer

April 30, 2026 06:08 AM

Expedia's Service Telemetry Analyzer combines LLMs with Datadog telemetry data to speed up incident investigation and reduce time to know and time to recover.

How Linux 7.0 Broke PostgreSQL

April 30, 2026 06:08 AM

Linux 7.0 accidentally cut PostgreSQL performance in half: a scheduling change increased how long spinlocks were held during memory page faults, causing massive CPU waste. Switching to huge memory pages fixes the issue.

Apply here

April 30, 2026 06:08 AM

Remi Turpaud

create your own role

April 30, 2026 06:08 AM

Remi Turpaud

Inc.'s Best Bootstrapped businesses

April 30, 2026 06:08 AM

Remi Turpaud
