
TLDR AI 2025-08-29

Headlines & Launches

Microsoft releases two in-house models (3 minute read)

Microsoft AI has launched MAI-Voice-1, a speech generation model that can produce a minute of audio in under a second, and MAI-1-preview, a mixture-of-experts foundation model trained on ~15,000 NVIDIA H100 GPUs. The models’ technical capabilities are unclear, but the releases signal Microsoft’s intention to reduce its reliance on OpenAI’s models to power its AI offerings.

Meta pushes to release new Llama model before 2026 (1 minute read)

Meta plans to release its next version of Llama before the end of the year. The model, internally known as Llama 4.X or Llama 4.5, will be one of the first projects from Meta Superintelligence Labs. Meta recently restructured the unit into four groups, with teams focused on training, research, product, and infrastructure. Meta first announced the Llama 4 models in April.

Grok Code Fast 1 (4 minute read)

xAI launched a specialized coding model optimized for speed and cost rather than general intelligence, priced at $0.20 per million input tokens versus $18+ from competitors. The model is temporarily free through partnerships with GitHub Copilot, Cursor, and other coding platforms.
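To put the price gap in concrete terms, here is a rough back-of-the-envelope sketch. It assumes the $18+ figure is also a per-million-input-token rate (the summary implies this but does not state it) and treats it as a floor. The 50-million-token monthly workload and all names in the code are hypothetical.

```python
def input_cost_usd(tokens: int, usd_per_million: float) -> float:
    """Cost in USD of sending `tokens` input tokens at a per-million rate."""
    return tokens / 1_000_000 * usd_per_million


GROK_CODE_FAST_RATE = 0.20   # USD per million input tokens, from the summary
COMPETITOR_RATE = 18.00      # "$18+" in the summary, treated as a lower bound

MONTHLY_TOKENS = 50_000_000  # hypothetical workload, chosen for illustration

grok_cost = input_cost_usd(MONTHLY_TOKENS, GROK_CODE_FAST_RATE)
competitor_cost = input_cost_usd(MONTHLY_TOKENS, COMPETITOR_RATE)

print(f"Grok Code Fast 1: ${grok_cost:,.2f}")
print(f"Competitor floor: ${competitor_cost:,.2f}")
print(f"Ratio: {competitor_cost / grok_cost:.0f}x")
```

Under these assumptions the same workload costs about $10 at the Grok Code Fast 1 rate and at least $900 at the quoted competitor rate. Output-token pricing is not covered by the summary and is left out here.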

Deep Dives & Analysis

Mass Intelligence (10 minute read)

Over a billion people now use AI chatbots regularly. Recent releases have given even free users access to the most powerful models, like GPT-5 or nano banana. Intelligence is no longer scarce or expensive, and it will only get cheaper and more abundant, so institutions of every size face the chaos of a billion people wielding AI that outperforms humans at intellectual tasks.

Building your own CLI Coding Agent with Pydantic-AI (21 minute read)

CLI coding agents can read code, run tests, and update codebases. This article walks readers through how to create their own coding agent by assembling open source tools and using specific development standards for testing, documentation production, code reasoning, and file system operations. While commercial tools are impressive, they’re built for general use cases. Learning how to build a coding agent can provide insight into how systems work and the quality of available tooling.

Engineering & Research

Introducing gpt-realtime and Realtime API updates for production voice agents (8 minute read)

OpenAI’s Realtime API is now generally available, with new support for MCP servers, image inputs, and phone calling that makes voice agents more capable. The underlying model, gpt-realtime, is OpenAI’s most advanced speech-to-speech model yet, showing improvements in following complex instructions, calling tools with precision, and producing speech that sounds natural and expressive. The Realtime API processes and generates audio directly through a single model and API, which reduces latency and preserves nuance in speech.

oLLM (GitHub Repo)

oLLM is a lightweight Python library for large-context LLM inference that enables users to run models like Llama-3.1-8B-Instruct with a 100k-token context on a consumer GPU with around 8GB of VRAM. It doesn’t use any quantization, only fp16 precision.
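A rough memory estimate shows why the 8GB figure is notable. The sketch below assumes the publicly documented Llama-3.1-8B attention geometry (32 layers, 8 key/value heads, head dimension 128) and an approximate parameter count of 8.03 billion. None of these numbers come from the summary, and the code does not use or describe oLLM itself.

```python
# Assumed Llama-3.1-8B geometry (from the public model config, not the summary).
NUM_LAYERS = 32
NUM_KV_HEADS = 8         # grouped-query attention: fewer KV heads than query heads
HEAD_DIM = 128
BYTES_PER_FP16 = 2
APPROX_PARAMS = 8.03e9   # approximate parameter count (assumption)


def kv_cache_bytes(context_tokens: int) -> int:
    """fp16 KV-cache size: one key and one value vector per layer per KV head."""
    per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * BYTES_PER_FP16
    return per_token * context_tokens


def weight_bytes(num_params: float) -> float:
    """Size of the model weights when stored in fp16."""
    return num_params * BYTES_PER_FP16


GB = 1e9  # decimal gigabytes
cache_gb = kv_cache_bytes(100_000) / GB
weights_gb = weight_bytes(APPROX_PARAMS) / GB

print(f"KV cache at 100k tokens: {cache_gb:.1f} GB")
print(f"fp16 weights:            {weights_gb:.1f} GB")
```

Under these assumptions the KV cache alone comes to about 13.1 GB and the fp16 weights to about 16 GB, so neither fits in 8GB of VRAM on its own. That suggests the library must keep much of this data outside GPU memory. How it does so is not covered by the summary.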

R-Zero: Self-Evolving Reasoning LLM from Zero Data (1 minute read)

Self-evolving large language models are models that can autonomously generate, refine, and learn from their own experiences. Existing methods for training such models still rely heavily on vast human-curated tasks and labels. This poses a fundamental bottleneck to advancing AI systems toward capabilities beyond human intelligence. R-Zero is a fully autonomous framework that generates its own training data from scratch. It substantially improves reasoning capability across different backbone models.

Miscellaneous

Xcode 26 beta 7 adds support for Claude Sonnet and ChatGPT 5 (1 minute read)

Claude Sonnet 4 is now available in Xcode and can be accessed from the Intelligence settings panel. An existing paid Claude account is required to use the integration. When using ChatGPT in Xcode, users can now start a new conversation with either GPT-4.1 or GPT-5, with GPT-5 set as the default.

The 100 Most Influential People in AI (20 minute read)

TIME’s third annual list of the 100 most influential people in AI includes familiar leaders like Sam Altman, Elon Musk, and Jensen Huang alongside newcomers like DeepSeek CEO Liang Wenfeng and Pope Leo XIV.

Quick Links

Anthropic users face a new choice – opt out or share your data for AI training (2 minute read)

Anthropic reversed its no-training policy for consumer data and now requires users to choose by September 28 whether Claude can learn from their conversations.

New AI-powered live translation and language learning tools in Google Translate (4 minute read)

Google Translate now offers real-time conversation translation in over 70 languages, enhancing live interactions with advanced AI voice and speech recognition models.