NVMe: Flash Refuses to Wait in Line
For years the storage industry committed one of its favorite bureaucratic crimes:
it made flash dress up as a hard drive so the software stack would not panic.
This produced the era of the SATA SSD, a device full of aggressively modern silicon forced to stand in the old line, speak the old language, and salute the old customs office.
Then came NVMe.
NVMe was the refusal.
It said:
- the media is solid state
- the transport is PCIe
- the host has many CPU cores
- the controller can process parallel work
- and there is no reason to keep pretending this is a spinning disk with manners
That was the correct decision.
I. The Old Problem Was Not Flash
The old problem was the interface model around it.
AHCI and the SATA software stack were designed in the era of disks that spun, heads that moved, and latencies that made software overhead look modest by comparison.
Once storage became flash, that arrangement started to look ceremonial.
SATA-IO’s own comparison of AHCI and NVMe explains the point cleanly: AHCI was built as an aggregation-point model tied to the history of SATA devices, while NVMe was designed as an endpoint device interface for PCIe storage. The same paper also highlights the scaling difference everyone now quotes for a reason: AHCI’s single command queue of up to 32 commands versus NVMe’s up to 64K queues, each up to 64K commands deep.
| Interface | Design assumption | Queue model | Political meaning |
|---|---|---|---|
| AHCI | storage behaves like traditional disk infrastructure | effectively one command queue with up to 32 commands | flash must imitate history |
| NVMe | storage is PCIe-attached, parallel, low-latency silicon | paired submission/completion queues, many of them | flash governs itself directly |
The lesson is not merely “NVMe is faster.”
The lesson is that NVMe matches the nature of the device.
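The scaling gap is easy to put into numbers. The sketch below is back-of-the-envelope arithmetic only, and it assumes the commonly cited ceilings: one queue of 32 commands for AHCI, and up to 65,535 I/O queues of up to 65,536 entries each for NVMe. Real controllers advertise far less than the specification maximum, so read the result as a statement about the model, not about any particular drive.

```python
# Theoretical command slots per interface.
# These are specification ceilings (assumed here), not what a given drive offers.
AHCI_QUEUES, AHCI_DEPTH = 1, 32
NVME_IO_QUEUES, NVME_DEPTH = 65_535, 65_536


def command_slots(queues: int, depth: int) -> int:
    """Total queue entries available across all queues."""
    return queues * depth


ahci = command_slots(AHCI_QUEUES, AHCI_DEPTH)
nvme = command_slots(NVME_IO_QUEUES, NVME_DEPTH)

print(f"AHCI: {ahci:,} slots")
print(f"NVMe: {nvme:,} slots ({nvme // ahci:,}x)")
```

Nobody ships a drive that fills four billion slots. The point is that the ceiling stopped being the interface.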
II. NVMe Is a Queue Machine
The NVM Express organization describes the architecture plainly: NVMe is based on a paired Submission Queue and Completion Queue mechanism.
Commands are placed by host software into a Submission Queue. Completions are placed by the controller into the associated Completion Queue.
There is always an Admin Queue for administrative commands, and then one or more I/O queues for actual data movement.
That split matters.
It means the device is not just “a block target.” It is a controller with a command regime:
- identify the controller
- identify namespaces
- create and manage queues
- format or sanitize where supported
- run I/O through queue pairs optimized for parallelism
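To make the command regime less abstract, here is a minimal sketch of what "identify the controller" looks like as bytes: one 64-byte submission queue entry for the Admin Queue. The field layout (opcode and command ID in the first dword, PRP entries for the data buffer, CNS in command dword 10) follows the NVMe base specification as I understand it. The function name, the command ID, and the buffer address are made up for illustration, and nothing here talks to real hardware.

```python
import struct

OPC_IDENTIFY = 0x06      # admin opcode for Identify
CNS_CONTROLLER = 0x01    # CNS value selecting the Identify Controller data structure


def build_identify_controller(cid: int, prp1: int) -> bytes:
    """Pack a 64-byte submission queue entry for Identify Controller.

    cid  : command identifier chosen by the host (hypothetical value)
    prp1 : physical address of a 4 KiB result buffer (hypothetical value)
    """
    cdw0 = OPC_IDENTIFY | (cid << 16)  # opcode in bits 7:0, command ID in bits 31:16
    return struct.pack(
        "<IIQQQQ6I",
        cdw0,            # command dword 0
        0,               # NSID: not used by Identify Controller
        0,               # dwords 2-3: reserved
        0,               # metadata pointer: unused here
        prp1,            # PRP entry 1: where the controller writes the result
        0,               # PRP entry 2: unused for a single-page buffer
        CNS_CONTROLLER,  # command dword 10
        0, 0, 0, 0, 0,   # command dwords 11-15
    )


sqe = build_identify_controller(cid=7, prp1=0x1000_0000)
print(len(sqe), sqe[:4].hex())
```

A real driver would place this entry in the Admin Submission Queue and then tell the controller about it. The packing itself is the whole "protocol" at this level.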
The official NVMe material also emphasizes the point that made architects smile and legacy stacks nervous: queues can be mapped to CPU cores, and the architecture supports up to 64K I/O queues, each up to 64K commands deep.
This is storage designed by people who had noticed the rest of the machine had become parallel years ago.
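The queue-per-core idea can be sketched in a few lines. This is an illustration under a loud assumption: plain round-robin assignment. Real drivers also weigh interrupt affinity and NUMA topology, and the function name `assign_queues` is hypothetical. Queue ID 0 is left alone because it belongs to the Admin Queue.

```python
def assign_queues(num_cpus: int, num_io_queues: int) -> dict[int, int]:
    """Map each CPU index to an I/O queue ID (queue 0 is the Admin Queue).

    Round-robin is a stand-in policy, not what any particular driver does.
    """
    if num_cpus < 1 or num_io_queues < 1:
        raise ValueError("need at least one CPU and one I/O queue")
    return {cpu: 1 + (cpu % num_io_queues) for cpu in range(num_cpus)}


# One queue per core when the controller grants enough of them.
one_to_one = assign_queues(num_cpus=4, num_io_queues=4)
# Cores share queues when it does not.
shared = assign_queues(num_cpus=8, num_io_queues=3)

print(one_to_one)
print(shared)
```

When every core owns a queue pair, submission needs no lock shared across cores. When queues are scarce, sharing returns, but only among the cores assigned to the same queue.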
III. How I/O Actually Looks
At a coarse level, the path looks more like this:
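host driver
-> write command into submission queue in host memory
-> ring submission queue doorbell register with the new tail
-> controller fetches and executes command
-> controller posts completion entry to the completion queue
-> host driver reaps completion queue and updates its head doorbell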
That is a much cleaner arrangement for modern storage than forcing everything through an older “pretend the device is downstream of a historical disk bureaucracy” model.
NVMe is not magic. It is the removal of unnecessary theater.
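The path above can be simulated as a toy queue pair. This is a sketch, not a driver: both rings are Python lists standing in for host memory, the doorbells are plain attributes standing in for MMIO registers, and the class and method names are invented for illustration. It does model two real mechanisms, the tail doorbell and the phase bit that lets the host tell new completions from stale ones. One simplification is flagged in the comments: the host peeks at the controller's submission head directly, where real hardware reports it inside completion entries.

```python
class QueuePair:
    """Toy model of one NVMe submission/completion queue pair."""

    def __init__(self, depth: int = 4):
        self.depth = depth
        self.sq = [None] * depth  # submission ring in "host memory"
        self.cq = [{"cid": None, "phase": 0} for _ in range(depth)]
        self.sq_tail = self.sq_head = 0
        self.cq_tail = self.cq_head = 0
        self.sq_tail_doorbell = 0  # written by the host, read by the controller
        self.cq_head_doorbell = 0  # written by the host, read by the controller
        self.ctrl_phase = self.host_phase = 1

    def submit(self, cid: int, opcode: int) -> None:
        # Simplification: the host reads sq_head directly instead of
        # learning it from completion entries.
        if (self.sq_tail + 1) % self.depth == self.sq_head:
            raise BufferError("submission queue full")
        self.sq[self.sq_tail] = {"cid": cid, "opcode": opcode}
        self.sq_tail = (self.sq_tail + 1) % self.depth

    def ring_doorbell(self) -> None:
        self.sq_tail_doorbell = self.sq_tail

    def controller_run(self) -> None:
        # Consume commands up to the doorbell while the completion ring has room.
        while (self.sq_head != self.sq_tail_doorbell
               and (self.cq_tail + 1) % self.depth != self.cq_head_doorbell):
            cmd = self.sq[self.sq_head]
            self.sq_head = (self.sq_head + 1) % self.depth
            self.cq[self.cq_tail] = {"cid": cmd["cid"], "phase": self.ctrl_phase}
            self.cq_tail = (self.cq_tail + 1) % self.depth
            if self.cq_tail == 0:  # wrapped: flip the phase for the next pass
                self.ctrl_phase ^= 1

    def reap(self) -> list:
        done = []
        while self.cq[self.cq_head]["phase"] == self.host_phase:
            done.append(self.cq[self.cq_head]["cid"])
            self.cq_head = (self.cq_head + 1) % self.depth
            if self.cq_head == 0:
                self.host_phase ^= 1
        self.cq_head_doorbell = self.cq_head  # tell the controller the slots are free
        return done


qp = QueuePair(depth=4)
for cid in (1, 2, 3):
    qp.submit(cid, opcode=0x02)  # 0x02 is the NVM Read opcode
qp.ring_doorbell()
qp.controller_run()
first_batch = qp.reap()
print(first_batch)
```

Note what is absent: no lock shared with another queue, no register reads on the hot path, and no polling of the device to ask whether it is finished.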
IV. Namespaces: Because One Device May Need More Than One Republic
One of the most important NVMe ideas is the namespace.
The NVM Express organization defines a namespace as a collection of logical block addresses accessible to host software. A controller may expose multiple namespaces, each identified with an NSID.
This is not just partitioning with better branding.
It matters because namespaces are part of the controller and protocol model itself:
- logical isolation
- multi-tenancy
- distinct namespace IDs
- per-namespace capacity and utilization reporting
- management and attachment operations
The official namespace material even walks through create, delete, attach, detach, and overprovisioning workflows.
That is the point where many users discover NVMe is not merely a “faster SSD connector.”
It is a storage control plane.
| NVMe concept | What it does |
|---|---|
| Controller | exposes registers, queue capabilities, admin commands, and namespace access |
| Admin Queue | handles management operations such as identify and configuration |
| I/O Queues | carry read/write and related data commands |
| Namespace | a logical block-addressable storage volume referenced by NSID |
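The namespace model can be sketched as a small data structure. The field names `nsze` and `nuse` mirror the Namespace Size and Namespace Utilization fields from the Identify Namespace data, both counted in logical blocks. Everything else is an assumption for illustration: the classes are hypothetical, the numbers are invented, and real attach and detach are admin commands, not dictionary operations.

```python
from dataclasses import dataclass


@dataclass
class Namespace:
    nsid: int        # namespace ID, starting at 1
    nsze: int        # Namespace Size, in logical blocks
    nuse: int        # Namespace Utilization, in logical blocks
    lba_bytes: int   # bytes per logical block for the active LBA format

    @property
    def size_bytes(self) -> int:
        return self.nsze * self.lba_bytes

    @property
    def used_fraction(self) -> float:
        return self.nuse / self.nsze if self.nsze else 0.0


class Controller:
    """Hypothetical stand-in for a controller's attached-namespace list."""

    def __init__(self):
        self.attached = {}

    def attach(self, ns: Namespace) -> None:
        if ns.nsid in self.attached:
            raise ValueError(f"NSID {ns.nsid} already attached")
        self.attached[ns.nsid] = ns

    def detach(self, nsid: int) -> Namespace:
        return self.attached.pop(nsid)


ctrl = Controller()
ctrl.attach(Namespace(nsid=1, nsze=1_000_000, nuse=250_000, lba_bytes=512))
ctrl.attach(Namespace(nsid=2, nsze=500_000, nuse=0, lba_bytes=4096))

for ns in ctrl.attached.values():
    print(ns.nsid, ns.size_bytes, f"{ns.used_fraction:.0%}")
```

Two namespaces under one controller, each with its own block size and its own utilization figure, which is the part a partition table never offered.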
V. Why NVMe Won
NVMe won because it stopped apologizing for being flash on PCIe.
The official NVMe overview calls this out directly:
- no separate storage HBA in the old SATA/SAS sense
- MMIO controller registers
- host memory for submission and completion queues
- small optimized command set
That reduced software friction and matched the actual machine.
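The MMIO point is concrete enough to compute. As I read the NVMe base specification, doorbell registers start at offset 0x1000 in the controller's register space, alternate between submission tail and completion head for each queue, and are spaced by a stride of `4 << DSTRD` bytes, where DSTRD is a field the controller reports in its capabilities register. The helper name below is hypothetical and the sketch only does arithmetic. It maps no memory.

```python
DOORBELL_BASE = 0x1000  # first doorbell register offset in the controller's register space


def doorbell_offset(qid: int, completion: bool, dstrd: int = 0) -> int:
    """Register offset of a queue's doorbell.

    qid        : queue ID (0 is the Admin Queue)
    completion : False for the submission tail doorbell, True for the completion head doorbell
    dstrd      : doorbell stride exponent reported by the controller (0 means 4-byte spacing)
    """
    stride = 4 << dstrd
    return DOORBELL_BASE + (2 * qid + int(completion)) * stride


for qid in (0, 1, 2):
    sq = doorbell_offset(qid, completion=False)
    cq = doorbell_offset(qid, completion=True)
    print(f"queue {qid}: SQ tail at {sq:#x}, CQ head at {cq:#x}")
```

One register write per batch of submissions, at an address the host can compute itself. That is most of what "no separate HBA" means in practice.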
When you attach fast non-volatile memory through PCIe, the system no longer wants a translator whose main job is preserving old administrative habits.
It wants a controller interface that assumes:
- low latency matters
- many cores exist
- queues can scale
- the device is intelligent enough to participate
This is why NVMe SSDs felt like a discontinuity rather than an incremental upgrade.
They were not merely “faster SATA.” They were storage escaping a false identity.
VI. How the Machine Identifies the New Regime
NVMe is not a Linux story. It is a storage story, and the BSDs got the memo.
| System | Primary tooling | What it says about the machine |
|---|---|---|
| Linux | nvme-cli | controller identity, namespaces, log pages, formatting, sanitize |
| FreeBSD | nvmecontrol(8), nvd(4), nda(4) | controller commands, namespace management, GEOM/CAM exposure |
| OpenBSD | nvme(4) | storage controller support with OpenBSD’s usual refusal to make a spectacle of it |
| NetBSD | nvme(4), nvmectl(8) | controller and namespace configuration with the usual portable discipline |
On Linux, the administrative structure is visible immediately with nvme-cli:
nvme list
nvme id-ctrl /dev/nvme0
nvme id-ns /dev/nvme0n1
On FreeBSD, the equivalent interrogation is done with nvmecontrol:
nvmecontrol devlist
nvmecontrol identify nvme0
nvmecontrol identify -n 1 nvme0
That is already a conceptual shift from the older mental model.
You are not just asking “what disk is this?”
You are asking:
- what controller is present
- what capabilities does it report
- what namespace am I talking to
- how has the device been provisioned
In other words, you are interrogating the ministry, not merely the platter.
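The same questions can be asked without any tool at all. The sketch below is Linux-only and rests on an assumption worth stating: that the kernel exposes each controller as a directory under `/sys/class/nvme` with text attributes named `model`, `serial`, `firmware_rev`, and `transport`. Attributes that are missing are skipped, and on a machine with no NVMe devices the function returns an empty dictionary. The function name is hypothetical.

```python
from pathlib import Path

# Attribute names assumed to exist under /sys/class/nvme/<controller>/ on Linux.
ATTRS = ("model", "serial", "firmware_rev", "transport")


def describe_controllers(sysfs_root: str = "/sys/class/nvme") -> dict:
    """Collect identity attributes for each NVMe controller found in sysfs."""
    root = Path(sysfs_root)
    found = {}
    if not root.is_dir():
        return found
    for ctrl in sorted(root.iterdir()):
        if not ctrl.is_dir():
            continue
        info = {}
        for attr in ATTRS:
            path = ctrl / attr
            try:
                if path.is_file():
                    # Values are often space-padded, so strip them.
                    info[attr] = path.read_text().strip()
            except OSError:
                continue  # unreadable attribute: skip it
        found[ctrl.name] = info
    return found


for name, info in describe_controllers().items():
    print(name, info)
```

The output is keyed by controller, not by disk, which is the conceptual shift in miniature.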
VII. The Real Story (Suppressed)
Officially, NVMe is a technical specification for non-volatile memory attached over PCI Express.
Unofficially, it was the counter-coup against decades of storage paperwork.
The hard drive empire had grown decadent. Everything had to be translated through rituals designed for rotational delay, legacy adapters, and respectable waiting.
Flash arrived with no patience for this.
It wanted:
- parallel queues
- shorter command paths
- direct PCIe attachment
- explicit controller identity
- and the right to maintain several logical territories under one authority
Thus NVMe was born, not as an optimization, but as a purge.
The BSDs understood this early because BSDs still believe the operating system should talk to hardware without sending it through a motivational seminar. FreeBSD exposes NVMe through nvmecontrol and disk layers like nvd and nda. OpenBSD and NetBSD both ship native nvme(4) support. The details differ, but the principle does not: the controller is real, the namespaces are real, and the old disk theater is optional.
The Decree
NVMe matters because it represents the moment storage software finally admitted what the hardware had become.
Not a faster disk.
Not a nicer SATA device.
A PCIe-attached controller managing non-volatile memory through queue pairs, namespaces, and administrative commands that match modern latency and parallelism.
AHCI survived because history is slow to retire its clerks.
NVMe won because flash refused to wait in line behind them.
— Kim Jong Rails, Supreme Leader of the Republic of Derails