
Last month in AI – September 2025

AI-driven Newsletter
Welcome to the latest edition of Last month in AI!
September turned into AI’s funding frenzy month: record-breaking raises hit the double-digit billions, frontier labs shipped trillion-parameter models, and, because this is 2025, companies started measuring data centers in gigawatts instead of GPUs.
Also, NVIDIA unveiled a GPU co-processor, and xAI took OpenAI to court over alleged trade secret theft.
Let’s jump into the biggest stories and breakthroughs that shaped September.
Models
Qwen3-Max

Alibaba dropped Qwen3-Max, a trillion-parameter model that immediately landed among the top coding models globally. It is competitive with GPT-5 and Claude 4 on benchmarks, while keeping API pricing transparent and accessible. The Qwen team continues to prove that open weights don’t mean second-tier; this thing can actually code.
Qwen3-Omni

Alongside Max, Alibaba shipped Qwen3-Omni, a real-time multimodal model that handles text, vision, and audio, and posts competitive benchmark results across the board. It’s designed for streaming applications and agentic workflows, with clear API pricing that undercuts most Western alternatives. Open-source momentum keeps accelerating.
z.ai GLM-4.6

Chinese lab z.ai released GLM-4.6, a model that’s reportedly smaller than Qwen3-Max but potentially better in practice. Early reports suggest it’s punching above its weight class, especially on coding tasks. If the numbers hold, this is another data point that open-source is now only months behind, or at parity with, frontier labs.
Claude Sonnet 4.5

Anthropic rolled out Sonnet 4.5, its next-generation coding model, clearly trained for long-running, agentic tasks. It maintains long-term context, takes notes mid-task, and self-validates its outputs for correctness. Like GPT-5, it supports parallel tool calls to speed up workflows (and chew through GPU budgets). This is the model you’d trust to code for 30 hours straight without supervision.
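For a feel of why parallel tool calls matter, here is a minimal sketch in plain Python. It does not use any vendor SDK: `fake_tool`, the tool names, and the delays are all hypothetical stand-ins for slow, independent tool invocations, and the only point is the difference between awaiting calls one by one and dispatching them together.

```python
import asyncio
import time


async def fake_tool(name: str, seconds: float) -> str:
    """Hypothetical tool call: sleeps to simulate I/O latency, then reports back."""
    await asyncio.sleep(seconds)
    return f"{name}: done"


async def run_sequential(calls):
    """Await each tool call before starting the next one."""
    return [await fake_tool(name, seconds) for name, seconds in calls]


async def run_parallel(calls):
    """Start all tool calls at once; results come back in request order."""
    return await asyncio.gather(*(fake_tool(name, seconds) for name, seconds in calls))


def timed(runner, calls):
    """Run one of the strategies above and return (results, elapsed_seconds)."""
    start = time.perf_counter()
    results = asyncio.run(runner(calls))
    return list(results), time.perf_counter() - start


# Illustrative tool names and latencies, not taken from any real agent.
CALLS = [("read_file", 0.05), ("run_tests", 0.05), ("search_docs", 0.05)]

seq_results, seq_seconds = timed(run_sequential, CALLS)
par_results, par_seconds = timed(run_parallel, CALLS)
print(f"sequential: {seq_seconds:.2f}s, parallel: {par_seconds:.2f}s")
```

With independent calls, the parallel run takes roughly as long as the slowest tool instead of the sum of all of them, which is where the speed-up (and the extra concurrent compute) comes from.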
GPT-5 Codex

OpenAI launched GPT-5 Codex, its direct answer to Claude Code and agentic coding workflows. It’s optimized for multi-step reasoning, tool use, and long-context editing sessions. It ships with structured outputs and optional chain-of-thought traces for easier debugging. If you’re building agents, this is the new baseline.
Sora 2

OpenAI finally shipped Sora 2 in late September, and it’s a genuine TikTok competitor. The video quality is stunning, the feed is addictive, and the “cameo” feature lets you add other people to your AI-generated videos – Sam Altman made his cameo public, so expect thousands of AI Sam videos. The only question: can OpenAI afford the GPU bill, or will they need ad revenue like YouTube?
Hardware
NVIDIA Rubin CPX

On September 9, NVIDIA announced Rubin CPX, a 128 GB GDDR7 GPU designed as a co-processor for the 2026 HBM-based Rubin lineup. It’s a monolithic die with 30 PFLOPS of NVFP4 performance, 3× the exponent operations of GB300, and four NVENC/NVDEC engines for video workloads. The big idea: split the Context/Pre-fill phase (compute-limited) from the Generation/Decode phase (memory-limited) across heterogeneous racks. Each NVL144 CPX compute tray packs 4 Rubin packages, 2 Vera Arm CPUs, 8 Rubin CPX, and 8 ConnectX-9 NICs (1.6 Tbps each, 12.8 Tbps per tray). Full racks ship with 144 Rubin + 144 Rubin CPX. Target availability: the end of 2026.
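The tray and rack numbers above fit together with a bit of arithmetic. The sketch below only uses the figures quoted in this section; the tray count per rack and the number of Rubin GPUs per package are not stated here, so treat them as inferences from those figures, not as official specs.

```python
# Figures quoted in the paragraph above.
RUBIN_PACKAGES_PER_TRAY = 4
CPX_PER_TRAY = 8
NICS_PER_TRAY = 8
NIC_TBPS = 1.6
RUBIN_PER_RACK = 144
CPX_PER_RACK = 144


def tray_network_tbps(nics: int = NICS_PER_TRAY, tbps_per_nic: float = NIC_TBPS) -> float:
    """Aggregate NIC bandwidth of one compute tray, in Tbps."""
    return nics * tbps_per_nic


def trays_per_rack(cpx_per_rack: int = CPX_PER_RACK, cpx_per_tray: int = CPX_PER_TRAY) -> int:
    """Inferred tray count: the rack's CPX total divided by CPX per tray."""
    trays, remainder = divmod(cpx_per_rack, cpx_per_tray)
    if remainder:
        raise ValueError("rack total is not a whole number of trays")
    return trays


def rubin_gpus_per_package() -> int:
    """Inferred Rubin GPUs counted per package, given the rack and tray figures."""
    per_tray, remainder = divmod(RUBIN_PER_RACK, trays_per_rack())
    if remainder or per_tray % RUBIN_PACKAGES_PER_TRAY:
        raise ValueError("figures do not divide evenly")
    return per_tray // RUBIN_PACKAGES_PER_TRAY


print(f"NIC bandwidth per tray: {tray_network_tbps():.1f} Tbps")
print(f"Inferred trays per rack: {trays_per_rack()}")
print(f"Inferred Rubin GPUs per package: {rubin_gpus_per_package()}")
```

In other words, 144 Rubin CPX at 8 per tray implies 18 trays, and 144 Rubin across 18 trays of 4 packages implies the "144" counts two GPUs per package.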
NVIDIA + Intel Partnership

NVIDIA announced a $5 billion stake in Intel and a partnership to co-develop custom data-center CPUs and AI infrastructure. Intel will design CPUs that NVIDIA packages with its GPUs, creating tighter integration for AI workloads. It’s a rare alliance between chip giants, signalling that the AI hardware stack is consolidating rapidly.
Huawei Atlas 950/960 SuperPoDs

Huawei unveiled its Atlas 950/960 SuperPoDs with yearly Ascend chip upgrades, positioning them as the “most powerful” AI clusters, built to rival NVIDIA. The goal: build giant, China-made AI clusters that don’t depend on Western chips. With U.S. export restrictions tightening, Huawei’s compute stack is becoming a serious domestic alternative.
AMD Instinct MI350

AMD’s Instinct MI350 Series (CDNA 4 architecture) is gaining momentum in AI inference. Unveiled at Advancing AI in June, it’s now shipping with ROCm 7.0 software support and Windows compatibility. AMD is positioning it as an alternative to NVIDIA for enterprises seeking to diversify their GPU supply chains.
Other
Dragon Hatchling: The Missing Link

A team from Poland dropped “The Dragon Hatchling” (BDH) on arXiv on September 30, and it might be the most important architecture paper of the year. BDH is a new LLM architecture based on a scale-free, biologically inspired network of locally interacting neuron particles.
The breakthrough: it rivals GPT-2 performance on language and translation tasks (10M to 1B parameters) while being a biologically plausible brain model. Working memory relies entirely on synaptic plasticity, utilising Hebbian learning with spiking neurons, where individual synapses strengthen when processing specific concepts.
The architecture is GPU-friendly, exhibits Transformer-like scaling laws, and has inherent interpretability with sparse, positive activation vectors and demonstrated monosemanticity. If the results are replicated, this is the first architecture to bridge artificial transformers and biological brain models without compromising performance. Code is open-source.
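If “Hebbian learning” is new to you, the core idea is “neurons that fire together wire together”. The snippet below is a textbook Hebbian update in plain Python, not the actual BDH update rule from the paper; the function name, learning rate, and toy activations are made up for illustration.

```python
def hebbian_step(weights, pre, post, lr=0.1):
    """One textbook Hebbian update: w[i][j] += lr * post[i] * pre[j].

    A synapse grows only when its input (pre) and output (post) neurons
    are active at the same time; everything else stays unchanged.
    """
    return [
        [w + lr * post_i * pre_j for w, pre_j in zip(row, pre)]
        for row, post_i in zip(weights, post)
    ]


# Toy setup: 3 input neurons, 2 output neurons, all synapses start at zero.
weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
pre = [1.0, 0.0, 1.0]   # inputs 0 and 2 are active
post = [1.0, 0.0]       # only output 0 is active

# Present the same "concept" three times.
for _ in range(3):
    weights = hebbian_step(weights, pre, post)

for row in weights:
    print([round(w, 2) for w in row])
```

Only the synapses linking the co-active neurons (inputs 0 and 2 to output 0) strengthen with each repetition, which is the “synapses strengthen when processing specific concepts” behaviour in miniature.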
OpenAI + NVIDIA: 10 GW Deal

OpenAI and NVIDIA announced a $100 billion plan to deploy 10 gigawatts of NVIDIA systems, a load that would account for nearly 1% of total U.S. power consumption. This is the moment we stopped measuring data centers in GPUs and started measuring them in actual power units. The deal introduces a circular financing model, bundling investment, hardware purchases, and leasing into a single structure.
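Gigawatts measure power, not energy, so it helps to convert. The sketch below turns a constant load into energy per year and offers a generic share helper; the utilization factor is an assumption on our side, and the baseline is deliberately left as a parameter because any percentage depends on which figure you compare against (installed capacity, average load, or annual consumption).

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_energy_twh(power_gw: float, utilization: float = 1.0) -> float:
    """Energy in TWh/year drawn by a load of `power_gw` at a given utilization (0 to 1)."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return power_gw * HOURS_PER_YEAR * utilization / 1000.0


def share_percent(part: float, whole: float) -> float:
    """Percentage of `whole` represented by `part` (both in the same unit)."""
    if whole <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * part / whole


# 10 GW running flat out for a year.
full_load_twh = annual_energy_twh(10.0)
print(f"10 GW at full load: {full_load_twh:.1f} TWh/year")
```

To compare against a baseline you trust, pass matching units to `share_percent`, for example GW against GW of capacity, or TWh/year against annual consumption.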
Stargate Expansion

The Stargate project added five new U.S. data centers to serve 700M weekly users, lifting planned capacity to nearly 7 GW and investment to $400B. That puts it on track to secure its full 10 GW commitment by the end of 2025, making it one of the most significant AI infrastructure projects in history. Power, not chips, is the new bottleneck.
xAI vs. OpenAI Trade Secret Lawsuit
Elon Musk’s xAI accused OpenAI of trade secret theft, alleging stolen source code and data center strategies via employee poaching. OpenAI rejected the charges.
A messy public fight between two top AI labs – and it’s just getting started.
Meta Vibes

Meta released Vibes, an AI-generated short-form video feed available in over 40 countries. It arrived just before Sora 2, and the race for AI video supremacy is intensifying. Meta’s betting on network effects; OpenAI’s betting on quality and the cameo feature.
Google Mixboard

Google introduced Mixboard, an AI-powered mood board app in U.S. beta, designed to rival Pinterest and Canva. It’s part of Google’s broader push to integrate generative AI into creative workflows, and early reviews suggest it’s genuinely helpful for visual brainstorming.
Microsoft + Anthropic

Microsoft added Anthropic’s Claude models to Microsoft 365 Copilot, starting with Researcher and Copilot Studio. This is Microsoft’s first major move beyond OpenAI exclusivity, signalling that enterprises want model choice, not vendor lock-in.
OpenAI Acquires Statsig

OpenAI acquired Statsig, a feature flagging and experimentation platform. It’s a sign that OpenAI is building internal tooling to ship faster and run more experiments at scale. Expect more product velocity in the coming months.
Oracle + Meta Cloud Deal

Oracle is in talks with Meta on a $20 billion multi-year deal to supply cloud capacity for AI model training and deployment. If it closes, it will be a significant reshuffle in AI infrastructure, with Oracle positioning itself as a tier-one AI cloud provider.
Fun Corner
Training LLMs is hard. Staying sane while following all AI updates? Even harder.
That’s why we’re wrapping things up with a few memes to keep your loss function low and your dopamine levels high.



Summary
That’s a wrap for this month’s edition!
September was a month of scale and speed: trillion-parameter models, multi-billion-dollar GPU deals, and the first signs of power becoming the new compute currency. As open models continue to catch up and hardware alliances form, the AI race looks less like a sprint and more like a global infrastructure marathon.
See you in the next edition. Keep learning and keep creating!