📣 Headlines
           
            •  The DuckDuckGo subscription now offers GPT‑4o/GPT‑5, Claude Sonnet 4 and Llama Maverick access for $9.99/month, giving users direct access to multiple cutting‑edge LLMs.
            
            •  OpenAI acquired experimentation platform Statsig for $1.1B and named its CEO CTO of Applications, signaling tighter A/B testing and real‑time decisioning integration across OpenAI products.
            
            •  Companies are advancing AI doppelgängers and video avatars to scale personal knowledge and meetings, while Synthesia's Express‑2 avatars mimic real speakers and could enable real‑time interactivity, pushing lifelike agent use in sales, coaching and media.
            
            •  New research and reporting highlight risks: chatbots and AI companions can be manipulative and raise fairness concerns; experts warn of mental‑health harms linked to AI use, and legal scrutiny has followed, with calls for better parental controls and concerns about therapists secretly using ChatGPT; safety researchers also sounded alarms about broader systemic risks (https://www.theguardian.com/technology/2025/sep/08/chatbots-mental-health-warning-super-intelligent-ai-nate-soares).
            
            •  Security analysts warn that AI‑driven development could make subtle backdoors in open‑source projects harder to detect, prompting calls for stronger maintainer support and supply‑chain defenses.
            
            •  Firefox Nightly added Microsoft Copilot to its sidebar, bringing voice, image and document analysis modes into the browser for users and developers to test.
            
            •  VCs are pouring hundreds of millions into AI‑powered customer service startups, accelerating automation of support workflows and the deployment of agentic AI in CX.
            
            •  Researchers are weighing the pros and cons of synthetic data for privacy, bias mitigation and model testing, noting toolchains like Synthetic Data Vault and tradeoffs around validation and realism.
            
            🔧 Company Engineering Blogs
           
             Using AI to perceive the universe in greater depth
            
             (deepmind.google)
            
            . Deep Loop Shaping uses reinforcement learning in frequency-domain rewards to reduce control noise in LIGO’s mirror systems, improving gravitational-wave measurement
            
             A New Ranking Framework for Better Notification Quality on Instagram
            
             (engineering.fb.com)
            
            . Diversity-aware notification ranking using multiplicative demotion, MMR-based similarity across content, author, type, and product surface, with adjustable weights and potential for LLM integration
            
             Building Sustainable Enterprise AI Adoption: Cultural Strategies That Achieved 95% Developer Engagement
            
             (engineering.salesforce.com)
            
            . Salesforce shares how to scale AI adoption beyond code generation, tackling monolithic codebases, modular loading, and enterprise-wide cultural change
            
             Spec-driven development with AI: Get started with a new open source toolkit
            
             (github.blog)
            
            . Spec Kit enables spec-driven development with GitHub Copilot, Claude Code, and Gemini CLI to turn specs into executable artifacts
            
             Welcome EmbeddingGemma, Google's new efficient embedding model
            
             (huggingface.co)
            
            . EmbeddingGemma: Google's 308M multilingual on-device text embeddings, MMTEB/MMTEB v2 benchmarks, MRL truncation, 2K context, on‑device RAG, Sentence Transformers, LangChain, LlamaIndex, Haystack, txtai, TEI, ONNX, FAISS
            
            🎨 Applied AI: creative, education, and genomics
           
             When Machines that Simulate Intelligence Seemed Like a Summer Project
            
             (tensorlabbet.com)
            
            . Explores Dartmouth 1956 proposal, seven themes, and how early AI ideas compare with modern LLMs, diffusion, and self-improvement concepts
            
             Stumbling into AI: Part 2—Models
            
             (rmoff.net)
            
            . Overview of LLMs, tokens, context windows, weights, clients, tools (MCP), and routers like OpenRouter and Raycast in the AI ecosystem
            
             Conversations with Large Language Models: Battle Decks
            
             (aaronland.info)
            
            . Generative systems in museums: revisiting collections, storytelling, vibes, and playful infrastructure using artifacts, Muppets, and lava-lamp metaphors
            
             DNA Foundation Models and Their Applications
            
             (aditharun.com)
            
            . DNA Foundation Models generate DNA sequences and predict genomic properties; Evo2, AlphaGenome, Caduceus; tissue-specific promoters; in silico mutagenesis; VUS resolution; biosecurity; benchmarking; data quality; RC-equivariance
            
             From Static Textbooks to Living Systems: How I Tried to Turn My Brain into AI Agents
            
             (blog.crackinglanguage.com)
            
            . Living systems for learning: RAG, edge tools, BYOK, Thai syllable analysis, and a dynamic, personalized teaching platform
            
            ⚙️ Infra, LLMOps, and hardware trends
           
             A Technical History of Generative Media — with Gorkem and Batuhan from Fal.ai
            
             (latent.space)
            
            . Fal.ai's pivot from a Python cloud runtime to optimized diffusion inference, CUDA kernels, and multi-model hosting for 2M developers and 350 models
            
             AI Operations Under the Hood: Challenges and Best Practices
            
             (towardsdatascience.com)
            
            . A practical framework for LLMOps and GenAI, focusing on data prep, RAG, evaluation, monitoring, and safety
            
             Google’s Nano Banana is the start of a Massive AI Trend [Markets]
            
             (artificialintelligencemadesimple.substack.com)
            
            . Nano Banana diffusion models, four choke points, memory/packaging, HBM/CoWoS, p99 latency, ASICs, porting tax, CUDA moat, deterministic silicon, edge, video, supply chains
            
             Build Production-Ready Agentic-RAG Applications From Scratch Course: What we are going to build
            
             (newsletter.theaiedge.io)
            
            . Hands-on course building production-ready Agentic-RAG apps with LangGraph, FastAPI, React, Pinecone, Langsmith on GCP
            
            📏 Evals, embeddings, and model quality
           
             How big are our embeddings now and why?
            
             (newsletter.vickiboykis.com)
            
            . Trends in embedding sizes from 300 to 1536+; BERT 768 baseline; GPT-3/2/CLIP; HuggingFace; OpenAI matryoshka; vector databases; MTEB benchmarks
            
             llm-eval-simple a simple way to evaluate LLM for your use case
            
             (grigio.org)
            
            . Evaluate OpenAI-compatible APIs with prompts and metrics across models like gemma-3-27b-it-qat-q4_0-q3_k_m, gpt-oss-20b-mxfp4, and Qwen3-4B-IQ4_NL
            
             Gemini AI in Gmail is terrible
            
             (nelsonslog.wordpress.com)
            
            . Gemini-in-Gmail shows limited email access, poor RAG retrieval, and disruptive AI UI in Gmail
            
             In Defense of AI Evals, for Everyone
            
             (sh-reya.com)
            
            . Defends AI evals as systematic, continuous quality measurements across posttraining and practical dogfooding, with examples in coding, document processing, and policing data
            
            🧲 RAG engineering and retrieval systems
           
             How Dropbox Built an AI Product Dash with RAG and AI Agents
            
             (blog.bytebytego.com)
            
            . Dropbox Dash uses RAG and AI Agents to unify data across Gmail, Slack, Notion, Jira, and Dropbox with a custom interpreter for safe AI execution
            
             How to Scale Your AI Search to Handle 10M Queries with 5 Powerful Techniques
            
             (towardsdatascience.com)
            
            . Scaling AI search with RAG, contextual retrieval, BM25, router agents, and evaluations for 10M queries
            
             Generate Dataframe Summaries With Python
            
             (fundor333.com)
            
            . Generate dataframe summaries with Python, LangChain, Ollama, Mistral, Pandas, and custom context-driven reports for Cirrhosis patient data analysis
            
             Chroma: RAG is Dead; Long Live Context Engineering
            
             (cto4.ai)
            
            . Chroma shifts focus from RAG to context engineering for grounding AI with embeddings and metadata
            
             The AI Architect's Guide to RAG Debugging: A 3-Step Process to Fix Hallucinations in Minutes, Not Days
            
             (mikulskibartosz.name)
            
            . 3-step RAG debugging guide: retrieval cascade, hybrid search, reranking, prompt engineering, HyDE, RRF, BM25, bi-encoders, cross-encoders, and observability for LLMs
            
            🧠 LLM internals: scaling, training, and architecture
           
             The wall confronting large language models
            
             (arxiv.org)
            
            . Analysis of barriers to scaling LLMs, alignment, safety, computation, data, and governance with practical mitigations
            
             Understanding and Implementing Qwen3 From Scratch
            
             (sebastianraschka.com)
            
            . Hands-on Qwen3 from scratch in PyTorch: architecture, components, and building blocks for open-weight models
            
             Gemma 3 Explained
            
             (opencv.org)
            
            . Gemma 3 introduces multimodal vision, 128k context, GQA, RoPE, local-global attention, and a decoder-only Transformer with post-training and API call capabilities
            
             Online versus Offline RL for LLMs
            
             (cameronrwolfe.substack.com)
            
            . Online vs offline RL for LLMs; analyzes PPO-based RLHF online training, offline DPO, SFT variants, rejection sampling, and semi-online approaches across Llama-2 and SafeRLHF data
            
             The Physics of AI Hallucination: New Research Reveals the Tipping Point for Large Language Models
            
             (firstprinciples.org)
            
            . Physicist Neil Johnson maps tipping point in LLMs, uses spin model, gap cooling, and attention head dynamics to predict hallucinations
            
            📚 Academic Research
           
             The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
            
             (arxiv:cs)
            
            . Survey of Agentic RL for LLMs: planning, tool use, memory, reasoning, self-improvement, perception, POMDPs, benchmarks, open-source frameworks, and five hundred works
            
             OmniActor: A Generalist GUI and Embodied Agent for 2D&3D Worlds
            
             (arxiv:cs)
            
            . OmniActor: Layer-heterogeneity MoE, GUI and embodied data synergy, 2D GUI and 3D embodied worlds, generalist agent, cross-domain training
            
             Symbolic Graphics Programming with Large Language Models
            
             (arxiv:cs)
            
            . RL with verifiable rewards improves SVG generation for symbolic graphics programming using SVGs with SigLIP and DINO encoders
            
             Aligning Large Vision-Language Models by Deep Reinforcement Learning and Direct Preference Optimization
            
             (arxiv:cs)
            
            . Overview of aligning large vision-language models via Deep Reinforcement Learning and Direct Preference Optimization for human-aligned multimodal systems
            
             KVCompose: Efficient Structured KV Cache Compression with Composite Tokens
            
             (arxiv:cs)
            
            . KV cache compression for long-context LLMs using attention-guided composite tokens and layer-adaptive allocation
            
            👋 Before you go
           
            I've got a big favor to ask - keeping Blaze running isn't expensive, but it does all add up, so I'm asking readers like you to help, if you can.
That's why I'm launching a Patreon page! Nothing flashy, just a way for folks who find value in these newsletters to chip in a little each month. In return, you'll get:
            
- Real say in how Blaze evolves: vote on new topics, features, topic curation ideas
- First dibs on merch (details still cooking)
- That warm fuzzy feeling knowing you're supporting something that saves you time and keeps you plugged into great tech writing
 
            If you are getting value from Blaze, checking this out would mean the world. And if you can't contribute, no worries: the newsletters keep coming either way, and you can follow along on Patreon for free.
Thanks for reading and being part of this nerdy corner of the internet. All the best - Alastair.
            