
How server SSD endurance workload specifications are like tires

Anthony Constantine | December 2024

When it comes to SSDs used in servers, it’s an understatement to say data loss is bad. To help prevent such loss, SSDs are rated in drive writes per day (DWPD) to tell purchasers how much write traffic each SSD can sustain. This approach seems simple, but the DWPD value depends on the workload, and there are a great many real-world workloads.
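
For readers new to the metric, DWPD is commonly derived from a drive's rated total bytes written (TBW), its capacity and its warranty period. Here is a minimal Python sketch of that arithmetic. The function name and the drive figures are illustrative assumptions, not values from any datasheet or from JESD219.

```python
def dwpd(tbw_tb: float, capacity_tb: float, warranty_years: float) -> float:
    """Drive writes per day implied by a rated TBW over the warranty period."""
    warranty_days = warranty_years * 365
    return tbw_tb / (capacity_tb * warranty_days)


# Hypothetical drive: 3.84 TB capacity, 7,008 TB written, 5-year warranty.
print(round(dwpd(7008, 3.84, 5), 2))
```

The result only holds for the workload used to establish the TBW rating, which is why the choice of reference workload matters.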

In 2010, JEDEC published “Solid-State Drive (SSD) Endurance Workloads” (JESD219). Its intent was to address this uncertainty by defining a standardized workload so SSD manufacturers could advertise their DWPD based on this workload. The problem was solved, right?

Well, yes and no. Having a single reference workload is great because every SSD is baselined against it. The challenge is that operating environments and SSDs have evolved over the last 15 years, and this evolution has called into question what the endurance workload should look like.
 

Enterprise workloads in 2010 vs. today
 

Most enterprise storage workloads in 2010 were built around HDDs. SSDs were still new to the scene, and operating systems and applications were designed around HDD accesses. The workloads had to account for a notable number of accesses under 4KB, a majority of writes at 4KB, and a small share of writes at the largest size, 64KB.

The figure below is a snapshot of what we measure from common workloads today. As you can see, no accesses are under 4KB. Writes are spread across more sizes, with the mix depending on the workload. Finally, a few of the workloads show large accesses greater than 64KB.

File systems have also changed over time to accommodate SSD accesses. The most common Linux file systems today (EXT4, XFS and BTRFS) all default to a 4KB block size, so a newly created file system on Linux will typically use 4KB blocks.
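
To see what a given system reports, here is a minimal Python sketch that reads the block size for the file system holding a path. It assumes a Unix-like host, since `os.statvfs` is not available on Windows, and the helper name `fs_block_size` is hypothetical. The `f_bsize` field is the block size the operating system reports, which may not match the on-disk format for every file system.

```python
import os


def fs_block_size(path: str = "/") -> int:
    """Return the block size, in bytes, reported for the file system holding path."""
    return os.statvfs(path).f_bsize


print(fs_block_size("/"))
```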

[Figure: Transaction enterprise chart]

IU size and WAF
 

My colleague Luca Bert wrote a blog last year that highlights today’s problem. The DRAM-to-NAND ratio is becoming a limitation in SSDs, so larger indirection unit (IU) sizes are being considered for much larger-capacity devices. As part of his analysis, he showed the influence of different IU sizes on the write amplification factor (WAF) across several workloads. He also noted that JESD219 is an anomaly, especially as larger IU sizes are factored in.
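
To illustrate why small writes interact badly with larger IUs, here is a rough Python sketch of a lower bound on WAF. It rests on simplifying assumptions that are mine, not from the blog referenced above: every host write is aligned, is never coalesced with its neighbors, and forces the drive to program whole IUs, and garbage collection is ignored. The function name and both workload mixes are hypothetical and are not taken from JESD219 or from measured data.

```python
import math


def waf_floor(mix: dict[int, float], iu_bytes: int) -> float:
    """Lower-bound WAF when every write is rounded up to whole IUs.

    mix maps write size in bytes to its fraction of write commands.
    Assumes aligned, uncoalesced writes and ignores garbage collection.
    """
    host = sum(size * share for size, share in mix.items())
    nand = sum(math.ceil(size / iu_bytes) * iu_bytes * share
               for size, share in mix.items())
    return nand / host


# Two hypothetical mixes, for illustration only.
legacy_mix = {512: 0.10, 4096: 0.70, 65536: 0.20}     # includes sub-4KB writes
modern_mix = {4096: 0.40, 16384: 0.30, 131072: 0.30}  # nothing below 4KB

for iu in (4096, 16384):
    print(iu, round(waf_floor(legacy_mix, iu), 3), round(waf_floor(modern_mix, iu), 3))
```

Under these assumptions, a mix dominated by writes smaller than the IU is penalized more as the IU grows, because each small write still costs a full IU of NAND programming. Real drives behave differently once caching and garbage collection come into play.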
 

SSD and tires?
 

I was talking with a co-worker about endurance workloads, and he made an analogy about tires that led me to do some digging. In the U.S., tire companies rate their tires based on a test loop against a reference tire defined by the Department of Transportation. But what happens if tires meant primarily for normal driving are taken to the racetrack or off road? They likely will not last as long, since the tire ratings are based on typical driving conditions. For SSDs, we should take a similar approach and define a workload based on representative operating conditions, not on conditions that are outdated or fringe uses.
 

Where we go from here
 

A difficult part of working on standards is deciding whether a standard should be updated, left untouched or retired. JESD219 is showing its age, and it’s becoming less representative of a “common workload.”

But JESD219 still has value in terms of its objective — that is, to have a common workload that the industry can use to gauge SSD endurance. From that perspective, the best course of action is to improve the specification. The following actions would improve this standard for enterprise workloads:

  1. Transactions below 4KB need to become a much lower percentage of the overall payload distribution.
  2. The endurance workload needs to account for and scale based on the IU size.
  3. The maximum payload size needs to scale beyond 64KB.


Implementing these three actions does not mean 100% accuracy for every workload. But given that we need to update the metric to be more representative of today’s workloads, this update will certainly benefit SSD consumers and producers for the next 15 (or more) years.

*Credit to Sampath Ratnam and John Maroney for contributing to this blog.

Anthony Constantine is a Distinguished Member of the Technical Staff in Micron’s SSD Business Unit responsible for Storage Standards at Micron. He authored and contributed to several specifications, serves on the SNIA board, and co-chairs the SNIA SFF TWG.