
Tech giant Alibaba recently released its own AI reasoning model, QwQ-32B, which sent Alibaba’s Hong Kong-listed shares soaring by 8% and showed that the market still has quite an appetite for new AI models.
The new model doesn’t quite match the current top-of-the-leaderboard models, OpenAI’s o3 or Anthropic’s Claude 3.7 Sonnet, but QwQ-32B performs comparably to its domestic rival DeepSeek’s R1, and what it lacks in “smarts” it more than makes up for in efficiency: Alibaba’s new AI uses much less computing power during operation. It’s dirt cheap to run.
The Problem with Large Language Models
Large language models are very expensive to operate: OpenAI reportedly spends $100,000 to $700,000 per day, or $3 to $21 million per month, on the cloud infrastructure that underpins its popular chatbot.
Computing costs are the biggest barrier to AI deployment at scale, and QwQ-32B’s ability to deliver comparable results with significantly less computational overhead is a compelling value proposition for companies looking to implement AI tech, such as Overchat, which runs specialized AI tools like Rephrase AI, trained specifically to rephrase sentences.
For example, DeepSeek-R1 operates with 671 billion parameters (with 37 billion activated), while QwQ-32B posts similar benchmark scores with just 32 billion parameters and needs only 24 GB of VRAM on a GPU, versus roughly 1500 GB of VRAM for R1.
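Those VRAM figures are easy to sanity-check with a back-of-envelope calculation. The sketch below is a rough estimate only: the 1.2x overhead factor for activations and KV cache is an assumption, and real memory use varies by inference runtime, context length, and quantization scheme.

```python
def vram_estimate_gb(params_billion: float, bits_per_weight: int,
                     overhead: float = 1.2) -> float:
    """Rough VRAM needed to serve a model: weight memory times an
    assumed overhead factor for activations and KV cache."""
    weight_gb = params_billion * bits_per_weight / 8  # 1B params at 8 bits = 1 GB
    return weight_gb * overhead

# QwQ-32B quantized to 4 bits fits on a single 24 GB consumer GPU:
print(round(vram_estimate_gb(32, 4), 1))    # 19.2

# DeepSeek-R1 at 16-bit precision lands on the order of 1,500 GB:
print(round(vram_estimate_gb(671, 16), 1))
```

The arithmetic makes the headline comparison concrete: quantizing a 32B-parameter model to 4 bits brings it under the 24 GB limit of a single consumer GPU, while a 671B-parameter model at full precision requires a multi-GPU server rack.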
Scott Singer, a visiting scholar at the Carnegie Endowment for International Peace, notes that this development “reflects the broader competitiveness of China’s frontier AI ecosystem.” While Western competitors are still trying to reach AGI, and seemingly getting nowhere, Chinese researchers have instead focused on optimization, delivering a model that does 90% of what top-shelf options deliver at a fraction of the cost. Is that what the market needs?
Implications for the Western AI Market
For companies like OpenAI, Anthropic, and Google, Alibaba’s approach sounds like a wake-up call. Silicon Valley companies have poured billions of dollars into AGI research, training ever-bigger models, seemingly only to hit a plateau of diminishing returns, where more parameters no longer translate directly into better performance.
OpenAI’s GPT-4.5, the company’s biggest and most expensive model to date, for example, boasts higher emotional intelligence but doesn’t score as well in benchmarks as the company’s own reasoning offerings.
At the same time, Sam Altman has backtracked on promises regarding GPT-5, an upcoming and much-anticipated model initially hinted to be a game changer but now presented as a router sitting on top of OpenAI’s other models, like a glorified autoselect. A neat feature, but far from AGI.
The efficiency gains demonstrated by QwQ-32B align with research from Epoch AI, which estimates that AI systems are becoming three times more efficient each year thanks to algorithm improvements.
The Chip Dependency Factor
Despite these efficiency gains, “high-end computing chips remain crucial for advanced AI development,” Singer warns. While U.S. export restrictions on such chips create difficulties for Chinese AI companies like Alibaba and DeepSeek, they also force Chinese developers to innovate around those limitations.
DeepSeek’s CEO has cited access to chips, rather than money or talent, as the company’s biggest bottleneck. This constraint paradoxically drives the push for greater efficiency in Chinese models. If Chinese companies can maintain competitive performance while requiring fewer high-end chips, they may ultimately develop more commercially viable AI solutions than their Western counterparts, who have unrestricted access to computing resources.
Technical Innovation and Efficiency
Alibaba’s QwQ-32B represents a continuation of existing AI trends, where systems consistently improve in performance while becoming cheaper to operate. The same Epoch AI research estimates that computing power used for AI training has been increasing by more than 4x annually, even as algorithm improvements make each unit of compute go three times further every year.
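Compounded, those two trends multiply: roughly 4x more training compute per year times a 3x algorithmic efficiency gain works out to about 12x more effective capability annually. The sketch below is a simplification; treating the two factors as cleanly multiplicative is an assumption, and the 4x and 3x figures are the Epoch AI estimates reported here, not precise constants.

```python
# Effective capability growth when compute scaling and algorithmic
# efficiency improvements compound (simplified multiplicative model).
COMPUTE_GROWTH = 4.0    # training compute grows ~4x per year (Epoch AI estimate)
ALGO_EFFICIENCY = 3.0   # algorithms stretch compute ~3x further per year

def effective_growth(years: int) -> float:
    """Effective capability multiplier after a given number of years."""
    return (COMPUTE_GROWTH * ALGO_EFFICIENCY) ** years

print(effective_growth(1))  # 12.0  -> one year of both trends combined
print(effective_growth(3))  # 1728.0 -> three years compound dramatically
```

Under this toy model, efficiency-focused players like Alibaba are effectively riding the 3x algorithmic curve while spending far less on the 4x compute curve.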
QwQ (pronounced like “quill”) joins a new generation of “reasoning models” that some consider a paradigm shift in AI. Where previous improvements came from scaling computing power and training data, this new approach emphasizes giving already-trained models more time to process queries. As the Qwen team explains, “when given time to ponder, to question, and to reflect, the model’s understanding of mathematics and programming blossoms like a flower opening to the sun.”
Notably, Alibaba has released QwQ as “open weight,” meaning the model weights can be downloaded and run locally, even on high-end laptops. Interestingly, a preview version released last November generated considerably less attention than the official launch.
As Singer observes, “The stock market is generally reactive to model releases and not to the trajectory of the technology,” which is expected to continue rapid improvement on both sides of the Pacific. “The Chinese ecosystem has a bunch of players in it, all of whom are putting out models that are very powerful and compelling, and it’s not clear who will emerge, when it’s all said and done, as having the best model.”
For users of multi-AI platforms like Overchat, these developments mean an increasingly diverse ecosystem of AI models, with varying specialties and capabilities, to choose from as the global AI race continues to accelerate.