Last month in AI – September 2025

AI-driven Newsletter

Welcome to the latest edition of Last month in AI!

September turned into AI’s funding frenzy month: record-breaking raises hit the double-digit billions, frontier labs shipped trillion-parameter models, and, because this is 2025, companies started measuring data centers in gigawatts instead of GPUs.
Also, NVIDIA announced a GPU co-processor, and xAI took OpenAI to court over alleged trade-secret theft.

Let’s jump into the biggest stories and breakthroughs that shaped September.

Models

Qwen3-Max

Alibaba dropped Qwen3-Max, a trillion-parameter model that immediately landed among the top coding models globally. It is competitive with GPT-5 and Claude 4 on benchmarks, while keeping API pricing transparent and accessible. Max itself is API-only, but the Qwen team, best known for its open-weight releases, keeps proving that it isn't second-tier; this thing can actually code.

Qwen3-Omni

Alongside Max, Alibaba shipped Qwen3-Omni, a real-time multimodal model that handles text, vision, and audio and posts competitive benchmark results across the board. It’s designed for streaming applications and agentic workflows, with clear API pricing that undercuts most Western alternatives. Open-source momentum keeps accelerating.

z.ai GLM-4.6

Chinese lab z.ai released GLM-4.6, a model that’s reportedly smaller than Qwen3-Max but potentially better in practice. Early reports suggest it’s punching above its weight class, especially on coding tasks. If the numbers hold, this is another data point that open-source is now only months behind, or at parity with, frontier labs.

Claude Sonnet 4.5

Anthropic rolled out Claude Sonnet 4.5, its next-generation coding model, clearly trained for long-running, agentic tasks. It maintains long-term context, takes notes mid-task, and self-validates its outputs for correctness. Like GPT-5, it supports parallel tool calls to speed up workflows (and chew through GPU budgets). This is the model you’d trust to code for 30 hours straight without supervision.
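
To make "parallel tool calls" concrete, here is a minimal sketch of how an agent harness might run several tool requests from a single model turn at the same time instead of one after another. It does not use Anthropic's or OpenAI's actual API: the tools (`search_docs`, `run_tests`) and the `dispatch_parallel` helper are hypothetical stand-ins, and only Python's standard `asyncio` module is assumed.

```python
import asyncio


async def search_docs(query: str) -> str:
    """Hypothetical tool: pretend to search documentation."""
    await asyncio.sleep(0.01)  # stands in for network latency
    return f"docs:{query}"


async def run_tests(target: str) -> str:
    """Hypothetical tool: pretend to run a test suite."""
    await asyncio.sleep(0.01)  # stands in for a slow subprocess
    return f"tests:{target}"


# Registry mapping tool names (as a model might emit them) to coroutines.
TOOLS = {"search_docs": search_docs, "run_tests": run_tests}


async def dispatch_parallel(calls):
    """Run every (tool_name, argument) pair concurrently.

    asyncio.gather returns results in the same order as the calls,
    so each result can be matched back to the request that produced it.
    """
    pending = [TOOLS[name](arg) for name, arg in calls]
    return await asyncio.gather(*pending)


def run_calls(calls):
    """Synchronous entry point for the sketch."""
    return asyncio.run(dispatch_parallel(calls))
```

Calling `run_calls([("search_docs", "retry policy"), ("run_tests", "api")])` waits roughly as long as the slowest tool rather than the sum of both, which is where the speed-up (and the extra GPU spend on the model side) comes from.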

GPT-5 Codex

OpenAI launched GPT-5 Codex, its direct answer to Claude Code and agentic coding workflows. It’s optimized for multi-step reasoning, tool use, and long-context editing sessions. It ships with structured outputs and optional chain-of-thought traces for easier debugging. If you’re building agents, this is the new baseline.

Sora 2

OpenAI finally shipped Sora 2 in late September, and it’s a genuine TikTok competitor. The video quality is stunning, the feed is addictive, and the “cameo” feature lets you add other people to your AI-generated videos – Sam Altman made his cameo public, so expect thousands of AI Sam videos. The only question: can OpenAI afford the GPU bill, or will they need ad revenue like YouTube?

Hardware

NVIDIA Rubin CPX

NVIDIA announced Rubin CPX on September 9, a 128GB GDDR7 GPU designed as a co-processor for the 2026 Rubin HBM lineup. It’s a monolithic die with 30 PFLOPS of NVFP4 performance, 3× the exponent-operation throughput of GB300, and four NVENC/NVDEC engines for video workloads. The big idea: split the Context/Pre-fill phase (compute-limited) from the Generation/Decode phase (memory-limited) across heterogeneous racks. Each NVL144 CPX compute tray packs 4 Rubin packages, 2 Vera Arm CPUs, 8 Rubin CPX, and 8 ConnectX-9 NICs (1.6 Tbps each, 12.8 Tbps per tray). Full racks ship with 144 Rubin + 144 Rubin CPX. Target availability: end of 2026.
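
Why is pre-fill compute-limited while decode is memory-limited? A rough roofline-style sketch helps. It rests on loud assumptions: weights are read once per forward pass, each weight costs about 2 FLOPs per token, 4-bit weights take 0.5 bytes, and attention and KV-cache traffic are ignored. The 30 PFLOPS figure comes from the announcement above; the 2 TB/s memory bandwidth is a hypothetical placeholder, not a quoted spec, and the function names are made up for illustration.

```python
def arithmetic_intensity(tokens_per_pass: int, bytes_per_weight: float = 0.5) -> float:
    """FLOPs performed per byte of weights read in one forward pass.

    Each weight is read once and used for one multiply and one add per token,
    so processing more tokens per pass reuses every byte more often.
    """
    return 2 * tokens_per_pass / bytes_per_weight


def limiting_resource(tokens_per_pass: int, peak_flops: float,
                      mem_bytes_per_s: float, bytes_per_weight: float = 0.5) -> str:
    """Classify a pass as 'compute' or 'memory' limited (simple roofline model)."""
    # Ridge point: intensity at which compute and memory take equal time.
    ridge_point = peak_flops / mem_bytes_per_s
    intensity = arithmetic_intensity(tokens_per_pass, bytes_per_weight)
    return "compute" if intensity >= ridge_point else "memory"


PEAK_FLOPS = 30e15        # 30 PFLOPS NVFP4, as stated in the text
ASSUMED_BANDWIDTH = 2e12  # hypothetical 2 TB/s, for illustration only

# Pre-fill: a whole 8192-token prompt goes through in one pass.
prefill = limiting_resource(8192, PEAK_FLOPS, ASSUMED_BANDWIDTH)
# Decode: one new token per pass for a single sequence.
decode = limiting_resource(1, PEAK_FLOPS, ASSUMED_BANDWIDTH)
```

Under these assumptions `prefill` comes out compute-limited and `decode` memory-limited, which is the motivation for pairing a compute-heavy GDDR7 part with HBM-equipped GPUs. Real deployments batch decode requests, which raises intensity, so treat this as intuition rather than a sizing tool.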

NVIDIA + Intel Partnership

NVIDIA announced a $5 billion stake in Intel and a partnership to co-develop custom data-center CPUs and AI infrastructure. Intel will design CPUs that NVIDIA packages with its GPUs, creating tighter integration for AI workloads. It’s a rare alliance between chip giants, signalling that the AI hardware stack is consolidating rapidly.

Huawei Atlas 950/960 SuperPoDs

Huawei unveiled its Atlas 950 and 960 SuperPoDs alongside a roadmap of yearly Ascend chip upgrades, pitching them as the “most powerful” AI clusters and a direct rival to NVIDIA. The goal: build giant, China-made AI clusters that don’t depend on Western chips. With U.S. export restrictions tightening, Huawei’s compute stack is becoming a serious domestic alternative.

AMD Instinct MI350

AMD’s Instinct MI350 Series (CDNA 4 architecture) is gaining momentum in AI inference. Unveiled at Advancing AI in June, it’s now shipping with ROCm 7.0 software support and compatibility with the Windows environment. AMD is positioning this as an alternative to NVIDIA for enterprises seeking to diversify their GPU supply chains.

Other

Dragon Hatchling: The Missing Link

A team from Poland dropped “The Dragon Hatchling” (BDH) on arXiv on September 30, and it might be the most important architecture paper of the year. BDH is a new LLM architecture based on a scale-free, biologically inspired network of locally interacting neuron particles.
The breakthrough: it rivals GPT-2 performance on language and translation tasks at matching parameter counts (10M to 1B) while being a biologically plausible brain model. Working memory relies entirely on synaptic plasticity, utilising Hebbian learning with spiking neurons, where individual synapses strengthen when processing specific concepts.
The architecture is GPU-friendly, exhibits Transformer-like scaling laws, and has inherent interpretability with sparse, positive activation vectors and demonstrated monosemanticity. If the results are replicated, this is the first architecture to bridge artificial transformers and biological brain models without compromising performance. Code is open-source.
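
For readers who haven't met Hebbian learning before, here is a minimal sketch of the textbook rule ("neurons that fire together wire together"). This is the generic update, not BDH's actual formulation, which the paper defines in its own terms; the function name, learning rate, and toy activations are all illustrative assumptions.

```python
def hebbian_update(weights, pre, post, lr=0.1):
    """Return new synaptic weights after one generic Hebbian step.

    weights[i][j] connects pre-synaptic neuron j to post-synaptic neuron i.
    A synapse grows by lr * post[i] * pre[j], so it only strengthens
    when both neurons on either side of it are active together.
    """
    return [
        [w + lr * post_i * pre_j for w, pre_j in zip(row, pre)]
        for row, post_i in zip(weights, post)
    ]


# Toy example with sparse, non-negative activations.
weights = [[0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0]]
pre = [1.0, 0.0, 1.0]   # pre-synaptic neurons 0 and 2 fire
post = [0.0, 1.0]       # only post-synaptic neuron 1 fires

updated = hebbian_update(weights, pre, post)
```

Only the synapses between co-active neurons change: the row for the silent post-synaptic neuron stays at zero, and the synapse from the silent pre-synaptic neuron is untouched. Because the state lives in those few strengthened synapses, it is easy to see how sparse, positive activations lend themselves to interpretability.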

OpenAI + NVIDIA: 10 GW Deal

OpenAI and NVIDIA announced a $100 billion plan to deploy 10 gigawatts of NVIDIA systems, which would account for nearly 1% of the total U.S. power consumption. This is the moment we stopped measuring data centers in GPUs and started measuring them in actual power units. The deal introduces a circular financing model, bundling investment, hardware purchases, and leasing into a single structure.

Stargate Expansion

The Stargate project added five new U.S. data center sites to serve 700M weekly users, lifting planned capacity to nearly 7 GW and investment to more than $400B. It’s on track to secure its full 10 GW commitment by the end of 2025, making it one of the most significant AI infrastructure projects in history. Power, not chips, is the new bottleneck.

xAI vs. OpenAI Trade Secret Lawsuit

Elon Musk’s xAI accused OpenAI of trade secret theft, alleging stolen source code and data center strategies via employee poaching. OpenAI rejected the charges.
A messy public fight between two top AI labs – and it’s just getting started.

Meta Vibes

Meta released Vibes, an AI-generated short-form video feed available in over 40 countries. It arrived just before Sora 2, and the race for AI video supremacy is intensifying. Meta’s betting on network effects; OpenAI’s betting on quality and the cameo feature.

Google Mixboard

Google introduced Mixboard, an AI-powered mood board app in U.S. beta, designed to rival Pinterest and Canva. It’s part of Google’s broader push to integrate generative AI into creative workflows, and early reviews suggest it’s genuinely helpful for visual brainstorming.

Microsoft + Anthropic

Microsoft added Anthropic’s Claude models to Microsoft 365 Copilot, starting with Researcher and Copilot Studio. This is Microsoft’s first major move beyond OpenAI exclusivity, signalling that enterprises want model choice, not vendor lock-in.

OpenAI Acquires Statsig

OpenAI acquired Statsig, a feature flagging and experimentation platform. It’s a sign that OpenAI is building internal tooling to ship faster and run more experiments at scale. Expect more product velocity in the coming months.

Oracle + Meta Cloud Deal

Oracle is in talks with Meta on a $20 billion multi-year deal to supply cloud capacity for AI model training and deployment. If it closes, it will be a significant reshuffle in AI infrastructure, with Oracle positioning itself as a tier-one AI cloud provider.

Fun Corner

Training LLMs is hard. Staying sane while following all AI updates? Even harder.
That’s why we’re wrapping things up with a few memes to keep your loss function low and your dopamine levels high.

Summary

That’s a wrap for this month’s edition!

September was a month of scale and speed: trillion-parameter models, multi-billion-dollar GPU deals, and the first signs of power becoming the new compute currency. As open models continue to catch up and hardware alliances form, the AI race looks less like a sprint and more like a global infrastructure marathon.

That’s it for this month’s AI madness! See you in the next edition – keep learning and keep creating!

Want to learn more?

Explore our blog for detailed guides, technical tutorials and much more!

Also, don’t forget to join Scalac’s Talent Pool!
Check more here https://scalac.io/blog/scala-rust-devops-frontend-careers/

Authors

Piotr Kosecki

An AI expert and Scala developer at Scalac, providing ongoing analysis of key developments in artificial intelligence. Scalac's go-to specialist for AI trends and applications. His work bridges the gap between AI research and practical business implementation, making him a trusted voice not only among all the blog posts here, but in the AI community in general. Also, a proud owner of a Czechoslovakian Wolfdog, one of the closest-to-wolf dog breeds that you can legally own.
