Huawei’s reported 2026 AI-chip revenue surge changes the China market, but not in the strongest bullish sense. Reuters reported on May 1, 2026 that Huawei expected AI-chip revenue to rise at least 60% to about $12 billion, based on orders already received.[1] That supports the view that domestic demand for non-Nvidia accelerators is now large and urgent.[1][7]
It does not prove that Huawei has solved the harder constraints that determine 2026 share: wafer output, yields, HBM and packaging, software migration, and sustained high-utilization operation at scale.[1][8][9][10] For investors and enterprise buyers, the practical conclusion is narrower: Huawei is now the anchor domestic accelerator supplier in China, especially for sanctioned, state-linked, and inference-heavy deployments, but the evidence still falls short of broad one-for-one displacement of Nvidia in top-end training.[2][3][4][5][6][7]
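The headline figures imply a rough 2025 baseline. A minimal back-of-envelope sketch, taking Reuters' "at least 60%" growth and "about $12 billion" figures at face value (both are approximations from the report, not exact disclosures):

```python
# Back out the implied 2025 baseline from the reported 2026 figures [1].
# Assumption: growth is exactly 60% and 2026 revenue is exactly $12B;
# both are rounded figures from Reuters, so treat the result as indicative only.
reported_2026_revenue_usd = 12e9   # "about $12 billion"
minimum_growth_rate = 0.60         # "at least 60%"

implied_2025_baseline = reported_2026_revenue_usd / (1 + minimum_growth_rate)
print(f"Implied 2025 AI-chip revenue baseline: ${implied_2025_baseline / 1e9:.1f}B")
# → Implied 2025 AI-chip revenue baseline: $7.5B
```

Since the growth figure is a floor, the true 2025 baseline could be lower than $7.5 billion if actual growth exceeds 60%.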
This report is organized around buyable 2026 reality: verified shipments, active cloud deployment, performance evidence with source quality separated, manufacturing constraints, software maturity, customer adoption, sanctions exposure, and procurement practicality.[1]
The 2026 China market is easier to analyze if products are split into three buckets: verified shipping or cloud-exposed products, verified named deployments, and announced or source-reported ambitions.[3][2][4][1][16][11][12]
| Vendor / product | 2026 status | What is actually verified | What is still mostly ambition or incomplete proof |
|---|---|---|---|
| Huawei Ascend 910/910B in Atlas servers and Huawei Cloud ModelArts | Verified shipping / cloud-exposed | Huawei documents the Atlas 800T A2 training server built on Kunpeng 920 and Ascend 910, and Huawei Cloud ModelArts offers Ascend AI compute with a public buy path and API-visible Ascend flavors. | Public evidence does not show broad named private-sector chip counts, utilization, or delivery lead times. |
| Huawei Ascend 910C and CloudMatrix 384 | Verified shipment and deployment, but supply-constrained | Reuters reported 910C mass shipments beginning in 2025; China Telecom disclosed a live commercial 384-card 910C supernode; Reuters said CloudMatrix 384 was operational on Huawei Cloud. | Private-sector breadth, output volume, and ease of third-party procurement remain unclear. |
| Huawei Ascend 950PR / 950DT / Atlas 950 | Mostly source-reported orders or roadmap | Reuters and the FT reported customer interest, sample activity, and order momentum around 950PR. | Independent performance proof, delivered volume, and broad commercial availability remain thin; 950DT and Atlas 950 are still roadmap items in the reviewed record. |
| Baidu Kunlun P800 | Verified live deployment and cloud/appliance availability | Reuters reported a live 30,000-chip P800 cluster; China Telecom listed Kunlunxin P800 in its domestic-compute service center; Baidu Cloud markets P800-based DeepSeek appliances. | Merchant-card availability and independent benchmark transparency remain limited. |
| Cambricon MLU370-class | Verified commercial shipment | Cambricon’s annual report shows large cloud-side revenue and scaled sector deployments; its MLU370 page describes a current cloud inference card. | Public proof of frontier training clusters or cloud instances at Huawei/Baidu specificity is limited. |
| Hygon DCU 8000 | Verified commercialization; operator-channel availability | Hygon discloses DCU product positioning and software stack; China Telecom says its services can support Hygon DCU. | Reviewed sources do not show a named live cloud deployment or independent cluster benchmark comparable to Huawei or Baidu. |
| Moore Threads, Biren | Insufficient evidence for core buyer table | Serious product and financing activity exists in Reuters and company materials. | The reviewed source set does not provide enough named production deployment or buyable public-cloud evidence for high-confidence 2026 procurement ranking. |
Huawei and Baidu are the only vendors in this source set with clear public evidence of large current training-capable clusters or cloud-scale deployment.[4][5][2][16][11][12] Cambricon and Hygon are commercially real, but with weaker public proof on named large-model cloud deployments.[11][12][13][14][15]
Huawei is clearly beyond the slideware stage. Huawei support documentation describes the Atlas 800T A2 as an AI training server built on Kunpeng 920 and Ascend 910 processors for public cloud, carriers, government, universities, finance, and large-scale data-center clusters.[3] The same documentation shows a 4U system with up to four Kunpeng 920 CPUs and dedicated 200GE interfaces, which is consistent with an actual server product rather than a roadmap placeholder.[17]
Huawei Cloud provides stronger evidence than server brochures because it exposes Ascend compute as a service. ModelArts AI Compute Service offers a public “Buy Now” path and claims support for trillion-parameter training, 1,000-plus-card uninterrupted training, migration tools, and mainstream open-source models.[2] Huawei Cloud API references also expose Ascend NPU flavors, including 910B-related nomenclature, which is direct evidence that at least some Ascend capacity is cloud-visible to outside customers.[19][20][21]
The strongest named Ascend deployment is China Telecom’s Shaoguan cluster. China Telecom said in April 2025 that the world’s first commercial Ascend supernode had formally launched and was already in commercial operation in its Greater Bay Area computing cluster.[4] In August 2025, China Telecom disclosed that this live commercial supernode used 384 Ascend 910C accelerator cards tightly coupled into one computing unit and reported DeepSeek 671B inference throughput of 2,122 tokens per second per card under its stated service targets.[5]
Reuters also reported that Huawei’s CloudMatrix 384, another 384-chip Ascend 910C system, was operational on Huawei Cloud by mid-2025.[18] That matters more than a keynote reveal because it places Ascend 910C in a real service environment, not only in exhibition hardware.[18]
The caution is that Huawei’s next generation is still much less verified than its current deployments. Reuters reported in March 2026 that ByteDance and Alibaba planned to place orders for the newer 950PR, and Reuters and the FT reported that Huawei expected most 2026 orders to center on that chip.[6][1][7] But the public record still verifies 910/910B servers, Huawei Cloud Ascend services, the China Telecom 910C cluster, and CloudMatrix operation more strongly than it verifies delivered 950PR scale.[3][17][2][4][5][18][6][1]
Baidu is the strongest non-Huawei domestic training case because it is both the chip supplier and the cloud operator. Reuters reported in April 2025 that Baidu had successfully switched on a cluster of 30,000 self-developed third-generation P800 Kunlun chips. Reuters also said the cluster could train DeepSeek-like models with hundreds of billions of parameters or let 1,000 customers fine-tune billion-parameter models at the same time, and that Chinese banks and internet companies had adopted the chips.[16] China Telecom’s Greater Bay Area domestic-compute adaptation center later listed Kunlunxin P800 among the domestic AI compute services it could provide to enterprise customers, and Baidu Cloud markets a DeepSeek appliance based on a single-node 8-card P800 configuration.[15][29]
The limitation is evidence quality on comparative performance. Baidu has strong proof of scale and operation, but the reviewed public record still lacks independent benchmark data showing Nvidia-equivalent training efficiency, model quality at equal cost, or power-normalized results.[16]
Cambricon has the clearest evidence of large commercial shipment outside Huawei and Baidu, but the evidence points more to cloud-side inference and enterprise deployment than to frontier training. Its 2025 annual report says cloud product-line revenue reached RMB 6.477 billion, up 455.34% year over year.[11] The same report says Cambricon products achieved scaled deployment in telecom, finance, and internet sectors and passed demanding customer validation on generality, stability, and usability.[11]
Cambricon’s public product evidence is also concrete. Its official MLU370-S4/S8 page describes a high-density cloud inference card built on the Siyuan 370 chip, fabricated on TSMC 7nm, with 72 TFLOPS FP16/BF16, up to 48 GB LPDDR5, and 75 W power.[28] That is useful evidence of a real, shipping inference product, but it is not evidence that Cambricon has matched Nvidia or Huawei in public large-model training clusters.[28][11][16]
Hygon is credible because its official materials describe both hardware continuity and a software migration strategy. Its 2025 annual-report summary says the 8000-series DCU line supports scientific computing, AI, big-data processing, deep-learning training and inference, and large-model scenarios.[12] The same source says Hygon’s DCU uses a GPGPU architecture with a self-developed DTK stack that adapts to multiple APIs and compilers, supports common libraries, and can comprehensively adapt mainstream domestic and foreign large models.[12]
In a December 2025 investor Q&A, Hygon said DCU operator coverage exceeded 99%, was CUDA-compatible, and could cover the range from inference on smaller models to training on models with hundreds of billions of parameters.[13] China Telecom materials show that Hygon compute can be procured through service channels, but the reviewed sources do not show a named live cloud deployment at Huawei or Baidu specificity.[14][15][16][5]
Moore Threads and Biren remain watch-list names rather than core 2026 procurement choices on the public record reviewed here.[22][23][24][25][26][27]
The public performance record for Chinese accelerators in 2026 is still dominated by vendor or operator disclosures rather than independent benchmark programs such as MLPerf.[5][16][28][13] That means buyers should place more weight on deployment form factor, disclosed cluster size, cloud availability, and software fit than on raw TOPS or vendor equivalence claims.[2][5][16][28]
For Huawei, the best semi-independent operational datapoint is China Telecom’s disclosed throughput on a live 384-card 910C supernode: 2,122 tokens per second per card on DeepSeek 671B under stated service-quality targets.[5] Reuters separately reported that Huawei positioned the 910C as roughly H100-class by combining two 910B processors into one package, doubling compute and memory capacity versus 910B.[30] That is useful as a market signal, but it remains a source-reported architectural comparison, not a transparent independent benchmark result.[30]
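As a sanity check on scale rather than a comparative benchmark, the operator's disclosed per-card figure can be multiplied out to an implied aggregate for the 384-card system (this assumes the per-card number holds uniformly across all cards, which the disclosure does not confirm):

```python
# Implied aggregate throughput of China Telecom's disclosed supernode [5]:
# 384 Ascend 910C cards, 2,122 tokens/second/card on DeepSeek 671B inference.
cards = 384
tokens_per_second_per_card = 2_122

aggregate_tokens_per_second = cards * tokens_per_second_per_card
print(f"Implied supernode aggregate: {aggregate_tokens_per_second:,} tokens/s")
# → Implied supernode aggregate: 814,848 tokens/s
# Caveat: per-card throughput under the operator's stated service-quality
# targets is not directly comparable to vendor peak claims or other clusters.
```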
For Baidu, the strongest evidence is scale rather than benchmark transparency. Reuters verified a live 30,000-chip P800 cluster and quoted Baidu’s claims that it can train DeepSeek-like models with hundreds of billions of parameters.[16] That shows serious deployment capability, but not whether P800 matches Nvidia or Huawei on time-to-train, networking efficiency, software stability under long runs, or output quality per watt.[16]
For Cambricon, the best public numbers are product-sheet specifications. The MLU370-S4/S8 page discloses 72 TFLOPS FP16/BF16, up to 48 GB LPDDR5, and 75 W for a cloud inference card.[28] Those numbers are relevant for inference-card selection, but they are not a substitute for large-model cluster evidence.[28][11]
For Hygon, the public record leans heavily on software compatibility claims rather than external performance proof. Management said operator coverage exceeded 99% and that the stack was CUDA-compatible, but the reviewed sources do not show independent benchmark filings or named production clusters of comparable scale.[13][16][5]
The strongest evidence against an unrestricted Huawei bull case is still supply. Reuters reported in November 2024 that Ascend 910C yields at SMIC's N+2 process were around 20%, versus the 70%-plus usually needed for commercial viability, and that even 910B yields were only around 50%, forcing Huawei to cut production targets and delay orders.[8] Reuters then reported in June 2025 that U.S. Commerce official Jeffrey Kessler assessed Huawei's Ascend production capacity for 2025 at or below 200,000 chips, with most or all of that output staying inside China.[9]
Those limits matter because Huawei’s visible system strategy is chip-hungry. China Telecom’s live supernode uses 384 Ascend 910C cards, and Reuters said CloudMatrix 384 also incorporates 384 910C chips.[5][18] A small number of showcase clusters can support revenue growth and political signaling without implying broad merchant availability.[9][5][18]
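A rough ceiling calculation illustrates the tension between the reported capacity assessment and the 384-chip system design. The sketch assumes, unrealistically, that every produced chip goes into supernodes (cloud capacity, Atlas servers, and spares also draw on the same pool), so the real number of deployable systems is lower:

```python
# Upper bound on 384-card supernodes under the reported capacity ceiling.
# Assumptions: <= ~200,000 Ascend chips per year per the Reuters-reported
# assessment [9]; 384 chips per supernode per [5][18]; zero chips diverted
# to other products, which overstates what is actually achievable.
annual_chip_ceiling = 200_000
chips_per_supernode = 384

max_supernodes = annual_chip_ceiling // chips_per_supernode
print(f"Upper bound on 384-card supernodes per year: {max_supernodes}")
# → Upper bound on 384-card supernodes per year: 520
```

Even this optimistic bound supports only a few hundred supernodes a year, consistent with the report's view that showcase clusters can coexist with tight merchant availability.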
Huawei has signaled progress on memory and packaging, but the proof is still incomplete. Reuters reported in September 2025 that Huawei said it now had proprietary high-bandwidth memory for the 950 family, yet Reuters also noted that Huawei did not disclose who would manufacture the 950 chips and that analysts still pointed to SMIC and Chinese equipment suppliers.[31][32] That suggests a path toward reducing foreign memory dependence, but it is not the same as proven high-yield volume output in 2026.[31][32]
Cambricon faces a different but still material constraint profile. Its MLU370 page cites TSMC 7nm, and its annual report says the company operates a fabless model that depends on outsourced wafer manufacturing and packaging.[28][11] That is a structural sanctions and foundry-access risk even if Cambricon has demonstrated meaningful commercial shipment.[11][28]
Sanctions still cut both ways in China. They restrict Nvidia’s availability and push domestic substitution, but they also constrain domestic vendors’ access to advanced manufacturing tools, memory, and packaging, which is one reason Chinese buyers often face a tradeoff between domestic availability and technical maturity.[30][8][9][7]
Software remains Nvidia’s strongest moat in China. The FT reported in April 2026 that Huawei’s CANN software had drawn criticism from users for making software development more laborious than Nvidia’s CUDA, even as Huawei worked to improve portability.[7] Reuters reinforced that point in March 2026 by reporting that ByteDance and Alibaba were preparing to order the newer 950PR partly because it was more compatible with Nvidia’s CUDA software ecosystem and better suited to inference.[6]
Huawei’s answer is a full-stack platform rather than a merchant-GPU ecosystem. Huawei’s 2024 annual report says Ascend AI had been widely adopted in internet, telecom, and finance, and that more than 50 partners were already working with the upgraded MindIE inference engine.[10] Huawei Cloud’s ModelArts page also emphasizes migration tooling and support for mainstream open-source models, which lowers friction for customers willing to adopt Huawei Cloud or Huawei-managed systems.[2] The practical implication is that Huawei software is strongest when bought as part of an integrated cloud or appliance path, not as a drop-in replacement for CUDA-centered workflows.[7][6][10][2]
Baidu’s software position is strong inside Baidu’s own stack and weaker as a general merchant alternative. Its verified P800 deployments are tied to Baidu Cloud and Baidu-managed appliance offerings, which helps operational integration but reduces portability for buyers who want hardware independence.[16][29][33]
Cambricon and Hygon present two different domestic migration stories. Cambricon has a mature internal stack, with its annual report saying it supports TensorFlow, PyTorch, and Paddle through framework adaptation and its product materials marketing NeuWare and MagicMind.[11][28] Hygon is more explicit on CUDA-style migration: its annual-report summary and investor Q&A say the DTK stack adapts to multiple APIs and compilers, operator coverage exceeds 99%, and the platform is CUDA-compatible.[12][13] Even so, neither Cambricon nor Hygon has public ecosystem gravity close to CUDA, which preserves Nvidia’s operational advantage when buyers can still obtain Nvidia hardware.[11][12][13][7][31]
Huawei’s adoption case is strongest in state-linked and operator-backed environments. The named, operating proof includes China Telecom’s Shaoguan supernode and Huawei Cloud’s exposed Ascend services, while Huawei’s annual report adds broader but less granular adoption claims in internet, telecom, and finance.[4][5][2][10] Reuters nevertheless reported in March 2026 that Huawei had struggled to persuade major private-sector tech firms to adopt 910C in large quantities, with ByteDance and Alibaba instead waiting for 950PR.[6] That is one of the clearest signs that revenue momentum does not automatically equal broad competitive closure versus Nvidia.[6]
Baidu’s adoption proof is unusually concrete for a domestic vendor because the same company controls the chip, the cloud environment, and part of the application stack. Reuters said banks and internet companies had adopted Kunlun P800, and Baidu’s 30,000-chip cluster demonstrates that at least one large-scale training environment is live.[16] Procurement, however, is likely to be concentrated in Baidu-managed cloud or appliance-style offerings rather than open channel card sales.[16][29][15]
Cambricon is easier to underwrite for enterprise inference than for frontier model training. Its annual report documents scaled deployments across telecom, finance, and internet sectors, and large cloud-side revenue suggests real shipment volume.[11] Hygon is plausible for enterprise buyers that value domestic CPU-plus-accelerator integration and CUDA-adjacent migration, but the public record still supports channel availability more strongly than named scaled deployments.[12][13][14][15]
For 2026 procurement, the most realistic domestic buying paths are managed ones: Huawei Cloud ModelArts and Atlas systems for Ascend, Baidu Cloud instances and P800-based appliances for Kunlun, China Telecom's domestic-compute service channels for Huawei, Baidu, and Hygon hardware, and enterprise channels for Cambricon inference cards. One caveat applies across all of these paths: public evidence is strongest for existence and selected deployments, not for broad, immediately available inventory at Nvidia-like scale.[9][2][16][11][12]
Huawei’s reported revenue jump materially changes the 2026 market in one respect: it confirms that domestic accelerator demand is large enough for Huawei to be the central non-Nvidia supplier in China.[1][2][5] It does not materially change the market in the stronger sense of proving that Huawei, or Chinese accelerators collectively, have erased Nvidia’s lead in frontier training, software productivity, or operational confidence.[7][6][8][9][31]
| Use case | Best Chinese options in 2026 | Substitution outlook versus Nvidia | Why |
|---|---|---|---|
| State-backed cloud inference, sovereign deployments, telecom/SOE AI services | Huawei Ascend 910B/910C via Huawei Cloud, Atlas, China Telecom supernode paths | Strong substitute | Verified cloud exposure, named commercial deployment, and sanctions-driven procurement preference favor Huawei. |
| Domestic cloud or appliance-based training where buyers accept vendor lock-in | Huawei Ascend; Baidu Kunlun P800 | Partial substitute | Both have verified large-scale deployment evidence, but portability and benchmark transparency remain weaker than Nvidia. |
| Enterprise inference in telecom, finance, internet, and regulated sectors | Cambricon; Huawei; selected Hygon deployments | Practical substitute in many cases | Commercial shipment and sector deployment evidence are strong enough for many inference workloads. |
| Migration-sensitive enterprise AI where CUDA-like compatibility matters | Hygon DCU; Huawei 950PR when available | Partial substitute | Hygon’s DTK/CUDA-compatibility story and Reuters’ reporting on 950PR compatibility improvements matter, but large-scale proof is thinner. |
| Frontier private-sector training for leading model labs | Huawei or Baidu only where domestic procurement is mandatory | Weak substitute | Huawei’s private-sector 910C adoption gap, limited independent performance proof, supply constraints, and CUDA friction leave Nvidia ahead where obtainable. |
| Lowest operational-risk choice for teams already built on CUDA | Nvidia | Nvidia retains lead | Software maturity, training efficiency, ecosystem depth, and operating familiarity still dominate when hardware can be sourced. |
The 2026 buyer conclusion is pragmatic. Chinese accelerators are now credible substitutes for more inference workloads than many foreign investors assumed a year ago, especially when procurement runs through Huawei Cloud, Baidu Cloud, China Telecom, or integrated domestic appliances.[2][5][16][29][11][12] For training, the substitution boundary is tighter. Huawei and Baidu can support serious domestic training, but broad one-for-one replacement of Nvidia for frontier private-sector training is still not verified in the public record reviewed here.[6][16][2][5] Nvidia still holds four defensible advantages in China in 2026: better top-end training performance and efficiency, the dominant CUDA software base, lower operational risk for existing teams, and greater ecosystem depth.[31][7][6]
The bottom line for buyers and investors is that Huawei’s revenue momentum is strategically important, but it changes 2026 market share expectations more than it changes the technical hierarchy.[1][7][6][9][5]