Spring 2021
Shingled magnetic recording is a low-cost way to increase the areal density of hard disk drives (it is low-cost in the dollar sense because it does not noticeably change the drives’ manufacturing processes).
The analogy repeated in nearly every SMR description is that SMR tracks are overlapped “like shingles on a roof”. Traditionally, HDD density gains have been achieved by shrinking both the reader and the writer. However, we have neared the physical limit on how much smaller the writer can be made (due to the “superparamagnetic limit”), while the reader can still shrink. SMR exploits this asymmetry: tracks are written at the full writer width, but each new track partially overlaps the previous one, leaving only a narrow, reader-width strip of it readable.
One result of partially overlapping tracks is that any write clobbers some range of “downstream” data. This limits an SMR drive’s support for random, in-place writes.
Both types of SMR require someone to implement logic to manage data so that it is not lost. This is often done by maintaining a mapping from logical blocks to physical blocks in a translation layer (like the FTLs on SSDs). At a high level, there are two considerations for implementing an SMR translation solution:
Zones are typically O(hundreds of MiB). 256 MiB is a common zone size among drives I have seen/used.
The persistent cache is typically O(ones to tens of GiB).
Logical to physical block mappings can be static or dynamic.
Drive-managed SMR drives typically employ a static mapping, and a persistent cache is used to buffer updates.
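To make the static-mapping-plus-persistent-cache idea concrete, here is a toy sketch in C. It is not any vendor's firmware; the structures, sizes, and names are all invented for illustration. Reads consult a small exception map first, writes are staged in the cache, and a background cleaning step (not shown) would eventually merge cached blocks back into their home zones.

```c
/* Toy sketch of the drive-managed SMR idea described above (static
 * logical-to-physical mapping plus a persistent write cache).  Every
 * name and constant here is made up for the example. */
#include <stdint.h>
#include <stdio.h>

#define NUM_LBAS    1024          /* tiny toy device                     */
#define CACHE_PBAS   128          /* blocks reserved as persistent cache */
#define NOT_CACHED  UINT32_MAX

static uint32_t cache_loc[NUM_LBAS];  /* LBA -> cache PBA, or NOT_CACHED */
static uint32_t cache_wp;             /* next free slot in the cache     */

/* Static mapping: an LBA's "home" physical block never moves. */
static uint32_t home_pba(uint32_t lba) { return CACHE_PBAS + lba; }

/* Reads check the exception map first, then fall back to the home PBA. */
static uint32_t lookup(uint32_t lba)
{
    return cache_loc[lba] != NOT_CACHED ? cache_loc[lba] : home_pba(lba);
}

/* Writes are buffered in the cache; writing the home location directly
 * would clobber the shingled tracks "downstream" of it.  A cleaning
 * step (not shown) later rewrites whole zones to merge cached blocks
 * back to their home locations. */
static int write_block(uint32_t lba)
{
    if (cache_wp == CACHE_PBAS)
        return -1;                 /* cache full: cleaning must run first */
    cache_loc[lba] = cache_wp++;
    return 0;
}

int main(void)
{
    for (uint32_t i = 0; i < NUM_LBAS; i++)
        cache_loc[i] = NOT_CACHED;

    write_block(42);
    printf("LBA 42 now maps to PBA %u\n", lookup(42));
    printf("LBA 43 still maps to PBA %u\n", lookup(43));
    return 0;
}
```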
Host-managed SMR drives are typically accessed using the ZBC/ZAC interfaces (Zoned Block Commands / Zoned-device ATA Commands). These interfaces give software access to a few key structures.
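One concrete way to see these structures is through Linux's zoned block device support. The sketch below (assuming a kernel that provides <linux/blkzoned.h>; the device path is just a placeholder) reports a handful of zones along with their write pointers and conditions:

```c
/* Minimal sketch of querying zones on a host-managed drive via the
 * Linux BLKREPORTZONE ioctl.  Error handling is minimal and /dev/sdX
 * is a hypothetical device path. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/blkzoned.h>

#define NR_ZONES 8   /* how many zone descriptors to ask for */

int main(void)
{
    int fd = open("/dev/sdX", O_RDONLY);   /* hypothetical HM-SMR drive */
    if (fd < 0) { perror("open"); return 1; }

    size_t sz = sizeof(struct blk_zone_report) + NR_ZONES * sizeof(struct blk_zone);
    struct blk_zone_report *rep = calloc(1, sz);
    rep->sector   = 0;          /* start reporting from the first zone  */
    rep->nr_zones = NR_ZONES;   /* room for this many zone descriptors  */

    if (ioctl(fd, BLKREPORTZONE, rep) < 0) { perror("BLKREPORTZONE"); return 1; }

    for (unsigned i = 0; i < rep->nr_zones; i++) {
        struct blk_zone *z = &rep->zones[i];
        printf("zone %u: start=%llu len=%llu wp=%llu cond=%u\n",
               i, (unsigned long long)z->start, (unsigned long long)z->len,
               (unsigned long long)z->wp, (unsigned)z->cond);
    }
    free(rep);
    return 0;
}
```

A zone can likewise be rewound with the BLKRESETZONE ioctl, which resets its write pointer back to the start of the zone.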
One of the readings described an alternate interface called caveat scriptor. In this interface, you can write anywhere on the disk. However, doing so “clobbers” downstream data, rendering the disk’s contents on subsequent tracks unreliable. The disk provides general guidelines so that the impact of writes can be estimated, but ultimately “writer beware”!
If you squint your eyes, the “SMR problem” is just like the “SSD problem”. However, there are several differences that impact the designs that we can expect solutions to employ.
In addition to the SMR hard disk drives that are currently in production (and many DM-SMR drives are widely available for purchase, whether you know it or not), IMR hard disk drives are “on the horizon”. IMR stands for Interlaced Magnetic Recording, and the idea is somewhat similar to SMR, but with important distinctions.
So unlike SMR, there is no notion of a “zone” or a “write pointer”. However, there is no reason we couldn’t retrofit that programming model onto the physical IMR layout.
IMR drives are not yet publicly available. When we do see them, I suspect that they will be managed using an interface similar to the “Zoned” interface that we see with HM-SMR. I posit that there is simply too much overhead in developing new software for drive vendors to sacrifice backwards compatibility yet again; they have already asked storage developers to scramble once to adopt a new interface.
When using the traditional block interface, the FTL has no way of inferring the application’s semantics: WRITE and READ give almost no information. Abstraction is a useful design principle, but it does limit our ability to optimize software across different layers because abstraction hides information. The goal of a multi-stream SSD is to provide applications a mechanism to pass hints to the FTL. These hints are optional, so an application can choose to ignore the stream interface entirely. However, using the stream interface may yield noticeable benefits in terms of write amplification, and as a result, performance.
The stream abstraction gives users a way to “tag” data with an identifier that the SSD can optionally use to optimize its placement and GC decisions. For example, a system may annotate data with similar lifetimes with the same tag. The SSD can then use this tag as a hint that it should co-locate these logical data objects in the same write-erase blocks because they will likely be accessed together or “deleted” together. Recall that the “cost-benefit” algorithm for LFS garbage collection uses data “hotness” as an input. The stream is a way for applications to directly provide the FTL a hint about hotness.
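The original multi-stream proposal surfaces as device-level stream directives, but perhaps the most accessible way to experiment with the idea today is Linux's per-file write-lifetime hints, which the kernel can pass down to hint-aware devices. A minimal sketch, assuming a kernel and glibc recent enough to expose F_SET_RW_HINT and the RWH_* constants; the file names are hypothetical:

```c
/* Tag two hypothetical files with different expected write lifetimes:
 * a short-lived write-ahead log and long-lived compacted data.  The
 * hint is advisory, so a device (or kernel) that ignores it still
 * behaves correctly. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int log_fd  = open("wal.log",  O_CREAT | O_WRONLY, 0644);
    int data_fd = open("sst.data", O_CREAT | O_WRONLY, 0644);

    uint64_t short_lived = RWH_WRITE_LIFE_SHORT;
    uint64_t long_lived  = RWH_WRITE_LIFE_LONG;

    if (fcntl(log_fd,  F_SET_RW_HINT, &short_lived) < 0) perror("hint(log)");
    if (fcntl(data_fd, F_SET_RW_HINT, &long_lived)  < 0) perror("hint(data)");

    /* Writes proceed exactly as before; only the placement hint differs. */
    write(log_fd,  "log entry\n", 10);
    write(data_fd, "table data\n", 11);

    close(log_fd);
    close(data_fd);
    return 0;
}
```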
“Zoned Namespaces” (ZNS) is an extension to the NVMe specification. ZNS lets software interact with ZNS SSDs using an interface similar to the one used for host-managed SMR drives:
The general idea is that a rich ecosystem of host-managed SMR software already exists; why not take advantage of that ecosystem by creating a similar interface for SSDs?
The overarching goal of ZNS is to give applications a means to reduce their write amplification to almost zero. By exposing the zone interface, applications have complete control over two important FTL responsibilities: LBA->PBA mappings and garbage collection.
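To illustrate the kind of control this hands to the application, here is a toy, in-memory model in C. The structures are purely hypothetical (this is not a real ZNS driver): the application decides which zone each write lands in, zones only accept sequential writes at their write pointer, and “garbage collection” is simply copying live blocks into a fresh zone and resetting the victim.

```c
/* Toy model of application-controlled placement and GC under a zoned
 * interface.  All structures are invented for illustration. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define NR_ZONES   4
#define ZONE_BLKS  8

struct zone {
    uint32_t wp;                 /* write pointer: next block to write */
    bool     live[ZONE_BLKS];    /* which blocks still hold valid data */
};

static struct zone zones[NR_ZONES];

/* Zones only accept sequential writes at the write pointer. */
static int zone_append(unsigned z)
{
    if (zones[z].wp == ZONE_BLKS)
        return -1;                       /* zone full */
    zones[z].live[zones[z].wp] = true;
    return (int)zones[z].wp++;
}

/* Application-driven GC: copy live blocks from a victim zone into a
 * destination zone, then reset the victim so it can be reused. */
static void gc(unsigned victim, unsigned dst)
{
    for (uint32_t b = 0; b < zones[victim].wp; b++)
        if (zones[victim].live[b])
            zone_append(dst);            /* relocation would also update
                                            the LBA->zone map (not shown) */
    zones[victim].wp = 0;                /* zone reset */
    for (uint32_t b = 0; b < ZONE_BLKS; b++)
        zones[victim].live[b] = false;
}

int main(void)
{
    for (int i = 0; i < ZONE_BLKS; i++)
        zone_append(0);                           /* fill zone 0         */
    zones[0].live[1] = zones[0].live[5] = false;  /* two blocks go stale */
    gc(0, 1);                                     /* reclaim zone 0      */
    printf("zone 0 wp=%u, zone 1 wp=%u\n", zones[0].wp, zones[1].wp);
    return 0;
}
```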
Compared to SMR, ZNS is in its early stages, and there are not many examples of ZNS drives “in the wild”. However, I posit that ZNS SSDs will become increasingly common in the coming years. ZNS SSDs may never be widely available at your typical big-box store, but I expect them to show up in data centers, where specialized applications can benefit from their flexibility.