Beyond CUDA: The impact of DeepSeek V4 running on Ascend on AI hardware architecture

15 May

Over the past decade, CUDA has been almost synonymous with the underlying infrastructure for AI training. By tightly coupling its GPUs with its software stack, NVIDIA built a complete development toolchain and ecosystem and came to dominate AI training. That has changed significantly in the past two years: platforms such as Huawei Ascend, AWS Trainium, and the AMD MI series have matured rapidly, each with a usable software stack and hardware offering. TrendForce data shows that the share of locally sourced chips in the AI server market rose to 40% in 2025. IDC predicts that global AI infrastructure investment will reach approximately US$75 billion in 2025. The successful operation of DeepSeek V4 on the Ascend platform is further evidence that the hardware dependence of AI training is shifting from a single platform to a multi-platform ecosystem.

The AI chip ecosystem is moving from unipolar to diverse

The historical position of the CUDA ecosystem rests on two foundations: the parallel computing capability of GPU hardware and the completeness of the supporting software stack. Since the release of CUDA 1.0, NVIDIA has invested continuously in its developer ecosystem, so that most AI frameworks and algorithm libraries support the CUDA backend first. This accumulated hardware-software co-design is the barrier that other platforms must clear to enter AI training.

However, this pattern is breaking down. Huawei Ascend has steadily built engineering capability for training and inference through the CANN (Compute Architecture for Neural Networks) software stack and Ascend series chips; AWS Trainium and Inferentia provide cost-optimized options for AI workloads in the cloud; AMD MI300X offers differentiated advantages in memory capacity and interconnect bandwidth. As these platforms mature, NVIDIA is no longer the only hardware option for AI training.

The operation of DeepSeek V4 on the Ascend platform is a landmark event. As a model at the trillion-parameter scale, its training and inference place extremely high demands on the compute, device memory, and interconnect bandwidth of the underlying hardware. Completing engineering verification on the Ascend platform shows that the platform's software stack and hardware have reached the threshold for supporting large-scale model training. For AI infrastructure planners, this means the evaluation scope for hardware selection needs to widen: in addition to NVIDIA's solutions, platforms such as Ascend and AMD also need to be included in technical evaluation and cost comparison.

Judging from the range of processor platforms in use, a multi-architecture landscape has taken shape. The ARM architecture covers processor platforms such as TI, NXP, RK, and HiSilicon. The GPU direction covers solutions such as Jetson, Horizon, and DeepEdge10. The FPGA direction includes programmable logic devices such as ZYNQ7020. With multiple architectures running in parallel, hardware design and adaptation face new requirements.

[Figure: Diverse AI chip ecosystem]

The impact of diverse ecosystems on hardware architecture design

Different AI chips differ significantly at the hardware interface level. In terms of form factor, GPU accelerator cards usually use PCIe gold-finger edge connectors with auxiliary power, while some ASIC chips may use custom on-board packaging or dedicated interconnect interfaces. In terms of interconnect protocols, NVIDIA uses NVLink and NVSwitch for high-speed communication between GPUs, Huawei uses the HCCS (Huawei Cache Coherence System) protocol, and AMD relies on Infinity Fabric. These protocols differ in data rate, topology, and latency characteristics, which directly affects the design of server motherboards and backplanes.

Differences in cooling interfaces also carry through to PCB design. An air-cooled solution requires reserving mounting space for heatsink fasteners and planning the airflow path, while a liquid-cooled solution requires planning the cold plate interface locations and pipe routing. The TDP (Thermal Design Power) of different chips varies widely. The move from several hundred watts to the kilowatt range directly affects the PCB's thermal via density, copper pour strategy, and the temperature rating of the dielectric material.

The core challenge in server motherboard design is how to accommodate the differing needs of different chips within limited board space. AI server motherboards usually use a multilayer design of 16 to 32 layers, built from high-speed laminates such as TU933 alongside standard FR-4. The minimum drilled hole diameter is 0.2 mm, the inner-layer hole-to-trace spacing is at least 7 mil, and trace width/spacing is held at 4.5/2.5 mil. The finished board is usually around 450 x 380 mm, and that limited area has to fit multiple accelerator card slots, high-speed interconnect channels, power distribution networks, and cooling interfaces.
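The figures above mix metric and imperial units (mm for hole diameter and board size, mil for spacing and trace geometry), so it can help to put them on one scale. The following is a minimal sketch of that arithmetic; the dictionary keys are illustrative names, not fields from any particular EDA tool.

```python
# 1 mil = 0.001 inch = 0.0254 mm (exact by definition of the inch).
MM_PER_MIL = 0.0254


def mil_to_mm(mil: float) -> float:
    """Convert a length in mils to millimetres."""
    return mil * MM_PER_MIL


def mm_to_mil(mm: float) -> float:
    """Convert a length in millimetres to mils."""
    return mm / MM_PER_MIL


# Values quoted in the text; the key names are hypothetical.
board_rules_mil = {
    "trace_width": 4.5,
    "trace_spacing": 2.5,
    "hole_to_trace": 7.0,
}

board_rules_mm = {
    name: round(mil_to_mm(value), 4) for name, value in board_rules_mil.items()
}
min_drill_mil = round(mm_to_mil(0.2), 2)

if __name__ == "__main__":
    for name, value in board_rules_mm.items():
        print(f"{name}: {value} mm")
    print(f"min drill: {min_drill_mil} mil")
```

Seen in millimetres, the 2.5 mil spacing is well under a tenth of a millimetre, which is why etching capability becomes a limiting factor on boards of this class.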

High-speed signal integrity is a key difficulty in motherboard design. Different interconnect protocols have different signal rate requirements. PCIe Gen5 reaches 32 GT/s, while some dedicated interconnects may need to support SerDes rates of 56 Gbps or even 112 Gbps. Higher signal rates directly tighten the accuracy required of impedance control: impedance deviation needs to be held within ±10% in mass production and within ±5% at the sample stage. Back drilling is used to remove residual via stubs, which would otherwise cause reflections on high-speed signals, and its depth control accuracy directly affects signal quality. Precision control of plated slots and press-fit holes, and resin plugging of high-density vias, are likewise core technical capabilities in manufacturing AI server motherboards.
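To make the numbers in this paragraph concrete, here is a rough sketch of three back-of-envelope checks: whether a measured impedance falls inside a tolerance band, the Nyquist frequency of a link, and the quarter-wave resonance of a leftover via stub. Everything not in the text is an assumption made for illustration: the 100 ohm target, the dielectric constant of 3.3, the stub lengths, and the choice of NRZ or PAM4 signaling. The stub formula is the standard first-order quarter-wave estimate and is no substitute for a field solver.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def within_tolerance(measured_ohm: float, target_ohm: float, tol_pct: float) -> bool:
    """True if the measured impedance is within +/- tol_pct of the target."""
    return abs(measured_ohm - target_ohm) <= target_ohm * tol_pct / 100.0


def nyquist_ghz(rate_gbps: float, bits_per_symbol: int = 1) -> float:
    """Nyquist frequency in GHz: half the symbol rate.

    bits_per_symbol is 1 for NRZ and 2 for PAM4 (assumed modulation).
    """
    return rate_gbps / bits_per_symbol / 2.0


def stub_resonance_ghz(stub_len_mm: float, dk: float) -> float:
    """First-order quarter-wave resonance of an open via stub, in GHz.

    f = c / (4 * L * sqrt(Dk)), where Dk is the effective dielectric constant.
    """
    stub_len_m = stub_len_mm / 1000.0
    return C_M_PER_S / (4.0 * stub_len_m * math.sqrt(dk)) / 1e9


if __name__ == "__main__":
    # Hypothetical measurement: 93 ohm against an assumed 100 ohm target.
    print("mass production (+/-10%):", within_tolerance(93.0, 100.0, 10.0))
    print("sample stage (+/-5%):", within_tolerance(93.0, 100.0, 5.0))
    print("PCIe Gen5 Nyquist (NRZ), GHz:", nyquist_ghz(32.0))
    print("112 Gbps Nyquist (PAM4 assumed), GHz:", nyquist_ghz(112.0, 2))
    # Assumed stub lengths before and after back drilling, with Dk = 3.3.
    print("1.5 mm stub resonance, GHz:", round(stub_resonance_ghz(1.5, 3.3), 1))
    print("0.25 mm stub resonance, GHz:", round(stub_resonance_ghz(0.25, 3.3), 1))
```

The comparison shows why back-drill depth matters: shortening the stub pushes its resonance further above the signal's Nyquist frequency, so the notch it creates falls outside the band the link actually uses.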

The need for flexibility in system-level architecture is becoming increasingly prominent. A modular design allows core computing modules to be reused and adapted to different chip platforms by swapping the accelerator card. A configurable power distribution network can adjust the power delivery scheme to the power consumption characteristics of each chip. Multi-protocol interconnect backplanes need signal routing, impedance matching, and crosstalk isolation planned for each interconnect protocol at the design stage. These system-level design decisions directly affect the long-term upgradability and operating costs of AI infrastructure.

[Figure: Modular design architecture]

Adaptation strategy for hardware supply chain

A diversified AI chip ecosystem places higher adaptation demands on the hardware supply chain. PCB design specifications, manufacturing process requirements, and verification standards differ from one chip platform to another, so the supply chain needs cross-platform technical depth and engineering experience.

Multi-platform PCB design capability is the foundation of adaptation work. Different chips place different requirements on PCB layer count, laminate material, impedance control, and thermal design. The design team needs to be familiar with each platform's design specifications and to cover the full flow from design input to Gerber output, including schematic design, layout planning, signal integrity simulation, and power integrity analysis. Standardized processes and checklists need to be established for each platform.

Rapid prototyping and verification capabilities can significantly shorten the hardware adaptation cycle for new platforms. When switching chip platforms or bringing in a new one, the hardware team needs to complete design modifications, prototyping, functional verification, and reliability testing in a short time. This rapid response rests on a DFM (Design for Manufacturing) rule base that is accumulated and continuously iterated: 3.27 million certified materials and 2,368 DFM rules, which catch a large number of manufacturing risks at the design stage and reduce the number of trial-and-error iterations.

The value of one-stop service lies in integrating the entire chain of PCB design, manufacturing, and assembly, which reduces the communication and coordination costs of multi-platform development. Under the traditional model, design, manufacturing, and assembly may be handled by different suppliers, and a problem at any stage requires coordination across suppliers. The one-stop model puts the entire process behind a single interface. Manufacturing feedback is available during the design stage, design data can be retrieved quickly during manufacturing, and the bill of materials and quality records from earlier stages are used directly during assembly. In this mode, the modular reuse rate can reach 68% and the EMC test pass rate reaches 98.5%, which significantly improves the efficiency of multi-platform adaptation.

From a supply chain perspective, the ability to adapt to a diverse ecosystem rests on long-term technical accumulation and engineering experience. Twenty-seven years of industry experience and a service record of more than 18,000 customers mean that a wide range of technical problems and boundary conditions have been encountered across a large number of projects. That experience has been distilled into design specifications, process parameters, and quality standards. For teams planning multi-platform AI infrastructure, choosing hardware partners with multi-chip adaptation experience and one-stop service capabilities can effectively reduce adaptation costs and shorten time to market.

[Figure: High-speed signal analysis]

Hardware design methodology for diverse ecosystems

Modular design is an effective strategy for dealing with a diverse ecosystem. The core computing module, power module, interconnect module, and cooling module are designed independently and combined through standardized interfaces. When a new chip platform has to be supported, only the corresponding computing module needs to be replaced or adjusted, and the other modules can reuse existing designs. This keeps the adaptation workload for a new platform within a reasonable range and avoids designing from scratch with every platform switch.
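The module-swap idea can be sketched as a small data model. This is only an illustration under stated assumptions: the four module kinds come from the text, but the class names, the module names, and the interface identifier "mezz-v1" are hypothetical, and a real design would track far more than one interface string per module.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Module:
    kind: str        # "compute", "power", "interconnect" or "cooling"
    name: str        # hypothetical design name
    interface: str   # hypothetical standardized interface identifier


@dataclass(frozen=True)
class BoardDesign:
    compute: Module
    power: Module
    interconnect: Module
    cooling: Module

    def modules(self) -> tuple:
        return (self.compute, self.power, self.interconnect, self.cooling)


def adapt_to_platform(base: BoardDesign, new_compute: Module) -> BoardDesign:
    """Swap only the compute module; reject it if the interface does not match."""
    if new_compute.interface != base.compute.interface:
        raise ValueError(
            f"interface mismatch: {new_compute.interface} != {base.compute.interface}"
        )
    return replace(base, compute=new_compute)


def reuse_ratio(old: BoardDesign, new: BoardDesign) -> float:
    """Fraction of modules carried over unchanged from old to new."""
    pairs = zip(old.modules(), new.modules())
    return sum(1 for a, b in pairs if a == b) / len(old.modules())


base = BoardDesign(
    compute=Module("compute", "platform-a-compute", "mezz-v1"),
    power=Module("power", "48v-pdn", "pwr-v1"),
    interconnect=Module("interconnect", "backplane-x", "bp-v1"),
    cooling=Module("cooling", "cold-plate", "cool-v1"),
)
ported = adapt_to_platform(base, Module("compute", "platform-b-compute", "mezz-v1"))

if __name__ == "__main__":
    print(f"modules reused: {reuse_ratio(base, ported):.0%}")
```

The interface check is the point of the sketch: reuse only works when the replacement module honors the same standardized interface, so a mismatch is surfaced immediately rather than discovered at bring-up.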

Design-manufacturing collaboration is the key to avoiding engineering risks. The DFM rule base brings manufacturing process constraints forward into the design stage, so the design team gets real-time feedback during placement and routing: whether the trace spacing is within etching process capability, whether the via density exceeds what the drilling equipment can handle, and whether the copper thickness distribution is uniform. These rules come from statistical analysis of a large amount of production data and from the results of process experiments, and they can prevent potential manufacturing defects at the design source.
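A rule base of this kind can be pictured as a list of named predicates evaluated against design parameters. The sketch below is a toy under loud assumptions: the three rules mirror the examples in the text, but the thresholds (other than the 2.5 mil spacing quoted earlier in the article), the parameter names, and the use of top/bottom copper coverage as a stand-in for copper uniformity are all invented for illustration and do not reflect any real manufacturer's limits.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Design = Dict[str, float]


@dataclass(frozen=True)
class DfmRule:
    name: str
    passes: Callable[[Design], bool]
    message: str


# Hypothetical thresholds, for illustration only.
RULES: List[DfmRule] = [
    DfmRule(
        "min_trace_spacing",
        lambda d: d["trace_spacing_mil"] >= 2.5,
        "trace spacing is below etching capability",
    ),
    DfmRule(
        "max_via_density",
        lambda d: d["vias_per_cm2"] <= 60,
        "via density exceeds drilling capability",
    ),
    DfmRule(
        "copper_balance",
        lambda d: abs(d["top_copper_pct"] - d["bottom_copper_pct"]) <= 15,
        "copper distribution is unbalanced",
    ),
]


def check_design(design: Design, rules: List[DfmRule] = RULES) -> List[str]:
    """Return the names of all rules the design violates."""
    return [rule.name for rule in rules if not rule.passes(design)]


good = {"trace_spacing_mil": 3.0, "vias_per_cm2": 40,
        "top_copper_pct": 55, "bottom_copper_pct": 50}
bad = {"trace_spacing_mil": 2.0, "vias_per_cm2": 75,
       "top_copper_pct": 80, "bottom_copper_pct": 50}

if __name__ == "__main__":
    print("good design violations:", check_design(good))
    print("bad design violations:", check_design(bad))
```

Keeping rules as data rather than hard-coded checks is what lets a rule base grow to thousands of entries and be re-tuned as production data accumulates.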

Future trends point toward a higher degree of hardware abstraction and more software-defined capability. A hardware abstraction layer hides differences in the underlying hardware behind standardized driver interfaces, allowing upper-layer software to access the computing resources of different chip platforms in a uniform way. Software-defined hardware lets the behavior of the hardware be adjusted through configuration and programming, for example through the reconfigurability of FPGAs or the configurable parameters of ASICs. Progress in these directions will further reduce the complexity of adaptation in a diverse ecosystem.
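The abstraction-layer idea can be illustrated with a tiny interface plus a registry. This is a conceptual sketch only: "platform_a" and "platform_b" are hypothetical placeholders, not real vendor backends, and both simply run a pure-Python matrix multiply where a real backend would call into the vendor's driver or runtime.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Type

Matrix = List[List[float]]


class Accelerator(ABC):
    """Uniform interface that upper-layer software codes against."""

    name: str = "abstract"

    @abstractmethod
    def matmul(self, a: Matrix, b: Matrix) -> Matrix:
        """Multiply two matrices on this backend."""


def _reference_matmul(a: Matrix, b: Matrix) -> Matrix:
    # Plain row-by-column product; stands in for a vendor kernel.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class PlatformA(Accelerator):
    name = "platform_a"  # hypothetical backend

    def matmul(self, a: Matrix, b: Matrix) -> Matrix:
        return _reference_matmul(a, b)


class PlatformB(Accelerator):
    name = "platform_b"  # hypothetical backend

    def matmul(self, a: Matrix, b: Matrix) -> Matrix:
        return _reference_matmul(a, b)


_BACKENDS: Dict[str, Type[Accelerator]] = {
    cls.name: cls for cls in (PlatformA, PlatformB)
}


def get_backend(key: str) -> Accelerator:
    """Look up a backend by name; callers never import a vendor class directly."""
    if key not in _BACKENDS:
        raise KeyError(f"unknown backend: {key}")
    return _BACKENDS[key]()


if __name__ == "__main__":
    a, b = [[1, 2], [3, 4]], [[5, 6], [7, 8]]
    for key in ("platform_a", "platform_b"):
        print(key, get_backend(key).matmul(a, b))
```

Because the calling code depends only on the `Accelerator` interface and a string key, moving a workload to a different platform becomes a configuration change rather than a code change, which is the reduction in adaptation complexity the paragraph describes.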

Heterogeneous computing requirements in edge AI scenarios are also growing rapidly. Edge AI chips such as DeepEdge10 provide 16 TOPS of compute and support deploying inference models at the edge. Edge scenarios have their own requirements for PCB miniaturization, low power consumption, and high reliability, and need to be treated differently from data center scenarios, which demand high compute and high interconnect bandwidth. AI applications now span many fields, including large model training and inference, autonomous driving, robotics, and smart agriculture and animal husbandry. Each field has different hardware needs, and the supply chain needs to build design capabilities and process reserves for each scenario.

IDC predicts that global AI infrastructure investment will reach approximately US$120 billion in 2028, and hardware demand across a diverse ecosystem will continue to expand. For hardware supply chain planners, building multi-platform adaptation capabilities early, accumulating cross-platform engineering experience, and establishing a modular design system are reasonable steps for keeping pace with this round of investment.
