The launch of DeepSeek's fourth-generation model family represents one of the most consequential competitive developments Alphabet has yet faced in the current AI cycle, and I do not use that language lightly. Across 142 claims consolidated from early March through early May 2026, a clear picture emerges: a Chinese competitor, forged under the pressure of export controls and supply-chain restriction, has delivered a model family that achieves near-parity on reasoning and coding benchmarks, undercuts Western pricing by 65 to 90 percent, and operates on a one-million-token context window — all while migrating its entire training and inference stack to Huawei Ascend hardware.
This is asymmetric competition of the kind I have seen before, though in different industries. It resembles the moment when a smaller, hungrier steel mill, operating with lower labor costs and newer furnaces, begins to match the output of the established trusts while undercutting them on price. The technology changes; the dynamics rhyme.
Let us examine the layers of this development in order of strategic importance.
Architecture and Scale: The MoE Philosophy
DeepSeek V4 is a Mixture-of-Experts architecture delivered in two primary variants. The V4 Pro model contains 1.6 trillion total parameters, with approximately 49 billion active parameters per inference 11,13. This is more than double the 671 billion total parameters of DeepSeek's prior V3.2 model 10,13. The smaller V4 Flash variant contains 284 billion total parameters with only 13 billion active — a 4.6 percent activation ratio that lays bare the efficiency philosophy at the heart of the architecture 6,9,13.
Both variants support a one-million-token context window, placing them among the largest in the industry 5,6,7,9,13. DeepSeek claims 97 percent recall at that full context length, enabled by an "Engram" long-memory architecture 11. A novel Hybrid Attention mechanism combines Compressed Sparse Attention and Heavily Compressed Attention 5,6,9, while FP4/FP8 mixed-precision training reduces memory requirements by 9.5x to 13.7x versus V3.2 9. The adoption of FP4 precision effectively halves the memory needed to store model weights compared to FP8 alone 9. DeepSeek also developed a custom Muon optimizer to improve training convergence and stability 9.
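Numbers of this kind deserve to be checked rather than admired. The short sketch below recomputes the activation ratios and the weight-storage implications of FP4 versus FP8 from the figures cited above; the byte-per-parameter arithmetic is my own simplification and ignores KV cache, activations, and optimizer state.

```python
# Back-of-envelope arithmetic for the V4 parameter and precision claims above.
# Parameter counts are taken from the cited claims; everything else
# (bytes per weight, ignoring KV cache / activations / optimizer state)
# is a simplifying assumption for illustration only.

VARIANTS = {
    "V4 Pro":   {"total_params": 1.6e12, "active_params": 49e9},
    "V4 Flash": {"total_params": 284e9,  "active_params": 13e9},
}

def weight_memory_gb(total_params: float, bits_per_weight: int) -> float:
    """Memory needed to hold the weights alone at a given precision, in GB."""
    return total_params * bits_per_weight / 8 / 1e9

for name, v in VARIANTS.items():
    ratio = v["active_params"] / v["total_params"]
    fp8 = weight_memory_gb(v["total_params"], 8)
    fp4 = weight_memory_gb(v["total_params"], 4)
    print(f"{name}: activation ratio {ratio:.1%}, "
          f"weights ~{fp8:,.0f} GB at FP8 vs ~{fp4:,.0f} GB at FP4")

# Expected output (approximate):
#   V4 Pro:   activation ratio 3.1%, weights ~1,600 GB at FP8 vs ~800 GB at FP4
#   V4 Flash: activation ratio 4.6%, weights ~284 GB at FP8 vs ~142 GB at FP4
```

The arithmetic makes two things plain: FP4 halves weight storage relative to FP8 by construction, and the low activation ratio governs per-token compute far more than it governs resident memory.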
Let me be precise about the significance: the jump from 671 billion to 1.6 trillion total parameters, roughly 900 billion added, might read as a story about scale. It is not, or not chiefly. The architectural innovations (the Engram memory system, hybrid attention, aggressive quantization, and the homegrown optimizer) collectively signal that DeepSeek has prioritized inference-time efficiency and context-length capability as core differentiators rather than pursuing raw parameter count alone. This is the difference between building a larger blast furnace and improving the Bessemer process. They have chosen process innovation.
Pricing: A Weaponized Cost Structure
The pricing claims constitute the most directly material set of insights for Alphabet, and they warrant close attention.
- V4 Pro: $1.74 per million input tokens, $3.48 per million output tokens 9,13
- V4 Flash: $0.14 per million input tokens (uncached), $0.28 per million output tokens 9,13
- V4 Flash on Microsoft Foundry: $1.03 per million input tokens, $4.12 per million output tokens 8
Against OpenAI's GPT-5.5 pricing of $5 per million input tokens and $30 per million output tokens, DeepSeek V4 Pro represents a 65 percent reduction on inputs and an 88 percent reduction on outputs 9. The V4 Flash variant at $0.28 per million output tokens compares to "two dollars or more" for equivalent Western models 6.
This is not static pricing. Within days of launch, DeepSeek implemented a 75 percent price cut on V4-Pro alongside a 90 percent discount on input cache pricing, with promotional rates in effect until May 5 12,14. This pattern — launch, then immediately slash — signals a deliberate cost-leadership strategy aimed at gaining market share through price disruption 14. DeepSeek explicitly positions V4 Pro as undercutting Gemini 3.1 Pro, GPT-5.5, Claude Opus 4.7, and GPT-5.4 on output token pricing 13.
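I have always insisted on doing the cost arithmetic myself before accepting a salesman's figures. The sketch below recomputes the headline discounts from the per-million-token prices cited above and applies the reported 75 percent promotional cut; whether that cut applies equally to input and output tokens is my assumption, made for illustration only.

```python
# Recomputing the headline discount figures from the cited list prices.
# Prices are USD per million tokens, as quoted in the claims above.

GPT_5_5  = {"input": 5.00, "output": 30.00}
V4_PRO   = {"input": 1.74, "output": 3.48}
V4_FLASH = {"input": 0.14, "output": 0.28}

def reduction(ours: float, theirs: float) -> float:
    """Fractional price reduction relative to the comparison model."""
    return 1 - ours / theirs

for kind in ("input", "output"):
    r = reduction(V4_PRO[kind], GPT_5_5[kind])
    print(f"V4 Pro vs GPT-5.5 {kind}: {r:.0%} cheaper")
# -> input: 65% cheaper, output: 88% cheaper (matching the cited figures)

# V4 Flash output vs the "two dollars or more" Western comparison cited above:
print(f"V4 Flash output vs $2.00: {reduction(V4_FLASH['output'], 2.00):.0%} cheaper")

# The reported 75% promotional cut, applied (illustratively) to both V4 Pro prices:
promo = {k: v * 0.25 for k, v in V4_PRO.items()}
print(f"Promotional V4 Pro: ${promo['input']:.3f} in / ${promo['output']:.2f} out per M tokens")
```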
When I built my steel operations, I understood that whoever could produce at the lowest cost and had the nerve to price accordingly would own the market. DeepSeek's leadership appears to have absorbed the same lesson. At one point during the reporting window, DeepSeek was cited as having a 100x computational efficiency advantage over certain competitors 1. That claim depends heavily on baseline definitions and may have been pre-V4 benchmarking, but it contextualizes how the company can sustain such aggressive pricing.
Performance Benchmarks: Near-Parity with a Knowledge Gap
Multiple corroborated sources indicate that DeepSeek V4 achieves near-parity with frontier models on reasoning and coding benchmarks 2,13. A specific claim with moderate corroboration states that both V4 Flash and V4 Pro perform comparably to GPT-5.4 on coding competition benchmarks 13.
However, there is a consistent and acknowledged gap on knowledge-based benchmarks, with V4 trailing GPT-5.4 and Gemini 3.1 Pro by approximately three to six months 6,13. Multiple independent reports confirm this self-assessment from DeepSeek's own technical report 6,13.
There is a notable contradiction regarding modality that deserves mention. Multiple sources state the V4 models are text-only, with no image, audio, or video understanding or generation 13. Yet two sources from April 18 describe V4 as "natively multimodal, supporting text, image, and video generation" 11. Because the text-only claims are both more recent and more numerous, the more credible reading is that the released V4 variants are text-only. That is a meaningful limitation versus Gemini's native multimodality, and a temporary advantage Alphabet should exploit without delay.
The competitive picture is nuanced but clear: on the most commercially valuable use cases — coding and reasoning — DeepSeek has closed to near-parity. On breadth of knowledge and multimodal capability, it lags. But that lag is measured in months, not years, and the stated intent to close it with the next funding round means this window will not remain open indefinitely.
The Hardware Migration: Huawei Ascend Becomes a Viable Stack
Perhaps the most strategically significant insight for Alphabet is the validation of the Huawei Ascend stack as a viable AI training and inference alternative. This development has the weight of a new railroad line opening in territory previously served only by a single carrier.
The V4 model was originally scheduled for a mid-February 2026 release but experienced repeated delays, ultimately launching in late April 3,11. These delays are consistently attributed to the complexity of migrating from Nvidia CUDA to Huawei CANN/Ascend hardware, requiring extensive code rewriting, debugging, and precision alignment 11. The migration was driven by U.S. export controls on advanced semiconductors 4,6.
DeepSeek trained V4 on Huawei Ascend 950 chips (specifically the 950PR variant) and validated it to run on both Nvidia GPUs and Huawei accelerators 6,9,11. The company reports a 35x inference speed improvement after Ascend optimization 11 and claims 2.87 times single-card performance on Huawei Ascend versus Nvidia H20 11. The compute requirement for V4 was reduced to approximately 27 percent of previous models for equivalent context length 6. Training was conducted on 33 trillion tokens 9.
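For a sense of scale, the following back-of-envelope estimate applies the common "FLOPs ≈ 6 × active parameters × training tokens" heuristic to the cited figures; the heuristic ignores attention cost, routing overhead, and recomputation, so treat the result as an order of magnitude, not a measurement.

```python
# Rough training-compute estimate for V4 Pro using the common
# FLOPs ≈ 6 * N_active * D heuristic (N_active = active parameters per token,
# D = training tokens). Ignores attention cost, expert routing overhead,
# and activation recomputation; order-of-magnitude only.

N_ACTIVE = 49e9   # active parameters per token (cited above)
TOKENS   = 33e12  # training tokens (cited above)

flops = 6 * N_ACTIVE * TOKENS
print(f"Estimated training compute: ~{flops:.1e} FLOPs")  # ~9.7e+24 FLOPs
```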
I am skeptical of cherry-picked benchmarks, and the 35x figure should be treated as a best-case number until independently verified. But the directional signal is unmistakable: export controls, while disruptive, are not stopping Chinese AI progress. They are accelerating the development of a parallel hardware ecosystem. This has multi-year implications for any competitive thesis predicated on semiconductor controls creating durable advantages for American AI labs.
The Open-Weight and Funding Dimensions
The model weights are fully open-source under Apache 2.0 and available on Hugging Face 6,9; the model is accessible via DeepSeek's API and web service, and through Microsoft Foundry (Azure AI Foundry) per Microsoft's launch announcement 8,9. DeepSeek's positioning of V4 Pro as the largest open-weight model in the market 13 directly challenges the closed-source strategy of Google's Gemini. Open-weight models enable customization, fine-tuning, and local deployment in ways that API-only models cannot match. If DeepSeek combines open-weight availability with near-frontier performance, it could become the default choice for enterprises seeking control over their AI infrastructure, a segment Google is also pursuing through Vertex AI.
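To make concrete what open-weight availability buys an enterprise, here is a minimal sketch of the standard Hugging Face loading pattern. The repository name is hypothetical, and a checkpoint of this size would in practice require sharded, multi-node serving; the point is the access pattern, which no API-only model permits.

```python
# Minimal sketch of local, open-weight access via the Hugging Face stack.
# NOTE: the repository id below is hypothetical (illustrative only), and a
# frontier-scale MoE checkpoint would need multi-GPU / multi-node sharding
# in practice; this simply shows the access pattern API-only models preclude.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "deepseek-ai/DeepSeek-V4-Flash"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,  # custom MoE / attention code shipped with the weights
    device_map="auto",       # shard across available accelerators
    torch_dtype="auto",
)

prompt = "Summarize the trade-offs of Mixture-of-Experts inference."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```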
However, DeepSeek is not yet a self-sustaining enterprise. The development costs and delays were substantial enough that the company is reportedly seeking its first-ever external funding round at a target valuation of $10 billion, with internal funds said to be insufficient for the next phase of trillion-parameter models, agent development, and multimodal expansion 11. This funding need introduces execution risk into the narrative of a well-capitalized competitor.
Implications for Alphabet
Let me state the conclusions plainly, as I would to my board.
First, the competitive gap is narrowing faster than anticipated. Near-parity on reasoning and coding benchmarks — the most commercially valuable use cases — compresses the timeline for Alphabet's moat. The 1.6-trillion-parameter MoE architecture, Engram memory, and hybrid attention mechanisms represent genuine architectural advances, not mere replication. Google must accelerate its next-generation Gemini development cycle accordingly.
Second, the pricing disruption is structural, not promotional. At 65 to 88 percent below Western competitors, with a 75 percent price cut within days of launch, DeepSeek's cost leadership strategy will compress margins across the AI industry. Google must decide: compete on price, accepting margin compression, or differentiate on capabilities that DeepSeek cannot currently match, namely native multimodality, search integration, enterprise trust and security, and the integration depth that comes from owning the stack from chip to application. The latter strategy is more sustainable, but it requires demonstrable value justification to customers who will inevitably ask why they should pay roughly three to eight times more.
Third, the Huawei Ascend migration succeeded, resetting the semiconductor-control thesis. Despite significant delays, code rewrites, and precision-alignment challenges, DeepSeek successfully deployed V4 on Huawei Ascend 950 hardware. U.S. export controls are accelerating the development of a Chinese AI hardware ecosystem rather than halting progress. Alphabet should assume that future Chinese AI competitors will not be constrained by Nvidia GPU availability in any meaningful way.
Fourth, funding and modality constraints create a window of opportunity, but a narrowing one. DeepSeek's reported $10 billion funding round and acknowledgment of insufficient internal funds signal that the company is not yet financially self-sustaining. Combined with V4's text-only limitation, this suggests a 6 to 12 month window where Google's multimodal capabilities and financial resources provide a meaningful competitive advantage. However, the Apache 2.0 open-source release and the rapid iteration cycle — from V3.2 to V4 in months — indicate that DeepSeek's trajectory is steep, and well-capitalized competitors should assume continued acceleration.
The question Alphabet must answer is not whether DeepSeek will be a competitor. It is already one. The question is whether Google will respond with the speed, discipline, and strategic clarity that this moment demands — or whether it will treat this as a temporary disturbance in a market it assumes it owns.
History suggests the latter path ends poorly.
Sources
1. Anthropic reveals $30bn run rate and plans to use 3.5GW of new Google AI chips - 2026-04-07
2. #AI #Deepseek is better than #US #AI models like #chatGPT tweakers.net/nieuws/24716... trained on #H... - 2026-04-24
3. r/Stocks Daily Discussion & Technicals Tuesday - Apr 07, 2026 - 2026-04-07
4. 🤖 DeepSeek's new models are so efficient they'll run on a toaster ... by which we mean Huawei's NPUs... - 2026-04-25
5. DeepSeek AI Releases DeepSeek-V4: Compressed Sparse Attention and Heavily Compressed Attention Enabl... - 2026-04-24
6. Anthropic's Export-Control Case Raises Conflict of Interest Concerns | John Lu posted on the topic | LinkedIn - 2026-04-19
7. 2026-05-01 Briefing - alobbs.com - 2026-05-01
8. Introducing DeepSeek V4 Flash and V4 Pro in Microsoft Foundry | Microsoft Community Hub - 2026-04-30
9. DeepSeek's new models offer big inference cost savings - 2026-04-24
10. DeepSeek V4 could turn Huawei's domestically produced NPUs into one of the world's most efficient AI systems - 2026-04-24
11. DeepSeek Reluctantly Opens to External Capital After 3 Years: $10B Valuation Amid Mounting Pressures... - 2026-04-18
12. ⚡️ $GOOG on alert. DeepSeek cuts prices by 75% on the new V4-Pro AI model until May 5... - 2026-04-27
13. DeepSeek previews new AI model that ‘closes the gap’ with frontier models - 2026-04-24
14. DeepSeek Disrupts AI Pricing with 75% Cut | Ashwin Binwani posted on the topic | LinkedIn - 2026-04-27