How to Optimize SSD Performance for AI & Edge Computing
September 12, 2023
The rise of artificial intelligence (AI), machine learning (ML), the Internet of Things (IoT), 5G, and other technologies requiring the storage, processing, and analysis of massive amounts of data at high speeds is prompting the movement of data away from the cloud to the edge.
Edge computing takes place on or near the devices themselves. This significantly cuts down latency and processing time, decreases bandwidth requirements, and offers offline access to data where network access may be limited or unavailable.
SSD Market Expansion, Emerging Applications, and Demands
Among data storage types, solid-state drives (SSDs) have been steadily increasing in usage. A study by Market Research Future projects that the SSD market will reach around USD 86.5 billion by 2030, growing at a compound annual growth rate (CAGR) of 15.2%. The growing adoption is attributed to “increasing demand for high-performance storage solutions in data-intensive applications,” especially in industries that require fast and reliable storage.
Gartner, meanwhile, predicts that SSDs will account for 32% of shipped HDD+SSD exabytes in 2026. In terms of SSD type, the following chart shows how shipments of Industrial, Enterprise Server, and Enterprise Industrial SSDs will grow between 2022 and 2027.
SSD Market Growth Forecast
With regard to interfaces, PC SSDs with PCIe Gen 4 are expected to reach 90%+ in the fourth quarter of this year, with crossovers to Gen 5 expected by mid-2024.
Server SSD growth was depressed from the second half of 2022 into 2023, mainly due to “anemic” hyperscale demand, while Serial ATA (SATA) remained resilient thanks to server OEMs and hyperconverged infrastructure (HCI). The transition to PCIe Gen 4 is ongoing, but the transition to Gen 5 is expected to be slow, with a hyperscale-dependent crossover expected by late 2025 to mid-2026.
As for storage/boot SSDs, a large portion is expected to stay with SATA. The PCIe Gen 4 transition is still under way, with adoption lagging servers by 12 months or more. The Gen 5 transition is still 2-3 years away and will likely coincide with the ramp of the E3.S and E3.L form factors.
The following graph depicts the shipment growth of PC, server, and storage/boot PCIe NVMe SSDs from 2021 to 2024.
NVMe Attached Rate
The Rise of 5G/AI/IoT/Edge Computing: Thermal Considerations
With data moving from centralized (data centers or cloud storage) to distributed (edge storage), it is not uncommon for connected devices and systems to be installed or deployed in unpredictable environments where temperature shifts are extreme, vibrations/shocks are considered part of normal operations, and other difficult operating conditions are part and parcel of daily use.
Amidst these challenging conditions where large amounts of data are constantly being stored, accessed, and processed, can today’s data storage devices handle such rigorous operating environments while maintaining high reliability, high endurance, and sustained performance?
The Need for Wide Temp Solutions in 5G Infrastructure
WE BUILD WITH YOU: ATP ODT+SI+ Thermal Solutions and Strategies
As PCIe Gen 4 SSDs continue to ramp up in industrial applications and with the increasing adoption of AI and edge computing enabled by 5G technologies, SSDs should deliver not only higher performance and capacities, but also effective thermal solutions for systems that have limited airflow and are deployed in harsh environments.
For ATP Electronics, a very important challenge is how to deliver high quality SSDs to fulfill the growing demands of AI, edge computing, and other 5G-powered applications.
Signal Integrity Simulation from Design Stage
3D NAND technology is moving from 64 layers to 176 or 232+ layers, and NAND interface transfer rates have tripled from 533 MT/s to 1,600 MT/s (1.6 GT/s). With various applications taking advantage of the advancements in 3D NAND architecture and increased transfer rates, huge amounts of data are transferred and stored daily in storage devices built from multiple NAND chips. Engineers need to pay close attention to signal interference between chips and data paths.
Now more than ever, maintaining signal integrity in the data path design on the printed circuit board (PCB) is critical. When each signal trace on the PCB of a higher-capacity storage device is optimized, SI simulation results show better signal integrity at the sampling time, and improved designs for different form factors can be offered more readily.
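To see why higher transfer rates tighten the signal integrity budget, it helps to convert the transfer rates quoted above into the time available per transfer (the unit interval). The short Python sketch below does only that arithmetic; the function name is illustrative, and the calculation ignores real-world effects such as jitter and setup/hold margins.

```python
def unit_interval_ps(transfer_rate_mt_s: float) -> float:
    """Return the time per transfer (unit interval) in picoseconds.

    1 MT/s is 1e6 transfers per second, so the unit interval is
    1 / (rate * 1e6) seconds, which equals 1e6 / rate picoseconds.
    """
    return 1e6 / transfer_rate_mt_s


if __name__ == "__main__":
    # The two rates mentioned in the text: 533 MT/s and 1.6 GT/s (1600 MT/s).
    for rate in (533, 1600):
        print(f"{rate} MT/s -> {unit_interval_ps(rate):.0f} ps per transfer")
```

At 1,600 MT/s each transfer has roughly one third of the time it had at 533 MT/s, so the same amount of crosstalk or reflection consumes a much larger share of the sampling window.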
Signal Integrity Simulation
Cadence and Ansys Simulation for Heat Environment Adaptivity
ATP uses Cadence and Ansys thermal simulation tools to speed up the design process. These are pure hardware simulation tests based on full-speed operation (worst-case scenario). Performing component-level IR drop analysis (signal integrity) and thermal simulation during the product design phase reveals the heat distribution in each PCB layer and indicates which areas are at risk of heat accumulation. Based on the results, adjustments are made to the circuit layout, wire thickness, the quantity and/or position of through-holes, and other variables so that, even when SSD temperature is elevated, sufficient cooling can be implemented to help distribute the heat. The system’s mechanical design is then checked to determine whether the housing needs to be customized or adjusted.
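As a rough illustration of why wire thickness is one of the variables adjusted after IR drop analysis, the sketch below estimates the resistance, voltage drop, and resistive heating of a single copper trace. It is a back-of-the-envelope model, not a substitute for the simulation tools named above: the trace dimensions and current are hypothetical, and the resistivity is the approximate textbook value for copper at 20°C.

```python
COPPER_RESISTIVITY_OHM_M = 1.68e-8  # approximate value for copper at 20 °C


def trace_resistance_ohm(length_mm: float, width_mm: float, thickness_um: float) -> float:
    """Resistance of a rectangular trace: R = rho * L / (w * t)."""
    cross_section_m2 = (width_mm * 1e-3) * (thickness_um * 1e-6)
    return COPPER_RESISTIVITY_OHM_M * (length_mm * 1e-3) / cross_section_m2


def ir_drop_and_heat(current_a: float, resistance_ohm: float) -> tuple[float, float]:
    """Return (voltage drop in V, dissipated power in W) for a given current."""
    return current_a * resistance_ohm, current_a ** 2 * resistance_ohm


if __name__ == "__main__":
    # Hypothetical power trace: 50 mm long, 0.5 mm wide, carrying 2 A.
    for thickness_um in (35.0, 70.0):  # assumed copper thicknesses
        r = trace_resistance_ohm(50.0, 0.5, thickness_um)
        drop_v, heat_w = ir_drop_and_heat(2.0, r)
        print(f"{thickness_um:.0f} um: R={r * 1e3:.1f} mOhm, "
              f"drop={drop_v * 1e3:.1f} mV, heat={heat_w * 1e3:.1f} mW")
```

In this simplified model, doubling the copper thickness halves the trace resistance, and with it both the voltage drop and the heat generated in the trace.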
A sample case illustrated here is for U.2 SSDs. By using a different mechanical design, we decreased the temperature by 7°C, cooling the system and maintaining higher sustainable performance.
Adjust/Customize Housing Design
Combined with the Thermal Throttling mechanism, the adjusted housing design helped contain the SSD’s performance drop by balancing performance against environmental temperature.
Apply the Thermal Throttling Mechanism
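The general idea behind thermal throttling can be sketched as a small control loop: step performance down when the drive gets too hot, and step it back up only after it has cooled by a clear margin, so the drive does not oscillate between states. The Python sketch below is a generic illustration with hypothetical thresholds and throttle levels; it does not describe ATP's firmware or any specific controller.

```python
def next_throttle_level(temp_c: float, level: int,
                        throttle_at_c: float = 85.0,  # hypothetical threshold
                        resume_at_c: float = 75.0,    # hypothetical threshold
                        max_level: int = 3) -> int:
    """Return the next throttle level (0 = full speed) using hysteresis.

    The level rises one step at or above throttle_at_c, falls one step at or
    below resume_at_c, and is held in the band between the two thresholds.
    """
    if temp_c >= throttle_at_c and level < max_level:
        return level + 1
    if temp_c <= resume_at_c and level > 0:
        return level - 1
    return level


if __name__ == "__main__":
    level = 0
    # Made-up temperature readings, one per control interval.
    for temp_c in [70, 80, 86, 88, 84, 78, 74, 72]:
        level = next_throttle_level(temp_c, level)
        print(f"{temp_c} C -> throttle level {level}")
```

The gap between the two thresholds is the design choice that matters here: a wider gap gives steadier performance at the cost of staying throttled longer, while better heat dissipation keeps the drive below the upper threshold in the first place.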
Heatsink Options for Effective Heat Dissipation
The following graphs show how an 8 mm heatsink design on an NVMe M.2 2280 SSD helped lower temperature, versus PCB only.
Test conditions: Ta: 55°C, airflow: 600 LFM*, read%
*LFM: Linear Feet per Minute
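A common first-order way to reason about a heatsink is the steady-state relation T = Ta + P × Rθ, where Rθ is the thermal resistance from the component to ambient air. The sketch below applies that relation using the 55°C ambient temperature from the test conditions; the power draw and both thermal resistance values are hypothetical placeholders, not measurements from the graphs, and in practice Rθ also depends on airflow.

```python
def steady_state_temp_c(ambient_c: float, power_w: float,
                        thermal_resistance_c_per_w: float) -> float:
    """First-order steady-state temperature: T = Ta + P * R_theta."""
    return ambient_c + power_w * thermal_resistance_c_per_w


if __name__ == "__main__":
    AMBIENT_C = 55.0  # Ta from the stated test conditions
    POWER_W = 4.0     # hypothetical SSD power draw

    # Hypothetical component-to-ambient thermal resistances in °C/W.
    configurations = {"PCB only": 7.0, "with heatsink": 5.0}

    for name, r_theta in configurations.items():
        temp_c = steady_state_temp_c(AMBIENT_C, POWER_W, r_theta)
        print(f"{name}: {temp_c:.1f} C")
```

Under this model, every reduction in thermal resistance translates directly into a lower operating temperature at the same power, which in turn leaves more headroom before thermal throttling engages.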
Conclusion
As AI, 5G, IoT, and edge computing continue to grow, applications driven by these technologies require higher performance and capacity, as well as reliable operation in extreme temperatures and harsh environments. PCIe Gen 4 is ramping up in these demanding applications as data moves from centralized locations to the edge, delivering lower latency and higher bandwidth. To ensure sustainable performance, it is important to have a thermal plan for PCIe Gen 4 SSDs, especially if they are installed in systems with limited airflow and deployed in harsh environments.
Based on a presentation at Flash Memory Summit 2023