The Intel Optane Memory (SSD) Preview: 32GB of Kaby Lake Caching
by Billy Tallis on April 24, 2017 12:00 PM EST
Last week, we took a look at Intel's first product based on their 3D XPoint non-volatile memory technology: the Optane SSD DC P4800X, a record-breaking flagship enterprise SSD. Today Intel launches the first consumer product under the Optane brand: the Intel Optane Memory, a far smaller device at roughly one-twentieth the price. Despite having "Memory" in its name, this consumer Optane Memory product is not an NVDIMM, nor is it in any other way a replacement for DRAM (those products will be coming to the enterprise market next year, even though the obvious name is now taken). Optane Memory is also not a suitable replacement for mainstream flash-based SSDs, because it is only available in 16GB and 32GB capacities. Instead, Optane Memory is Intel's latest attempt at an old idea that is great in theory but has struggled to catch on in practice: SSD caching.
Optane is Intel's brand name for products based on the 3D XPoint memory technology they co-developed with Micron. 3D XPoint is a new class of non-volatile memory that is not a variant of flash memory, the current mainstream technology for solid state drives. NAND flash memory—be it older planar NAND or newer 3D NAND flash—has fundamental limits to performance and write endurance, and many of the problems get worse as flash is shrunk to higher densities. 3D XPoint memory takes a radically different approach to non-volatile storage, and it makes different tradeoffs between density, performance, endurance and cost. Intel's initial announcement of 3D XPoint memory technology in 2015 came with general order of magnitude comparisons against existing memory technologies (DRAM and flash). Compared to NAND flash, 3D XPoint is supposed to be on the order of 1000x faster with 1000x higher write endurance. Compared to DRAM, 3D XPoint memory is supposed to be about 10x denser, which generally implies it'll be cheaper per GB by about the same amount. Those comparisons were about the raw memory itself and not about the performance of an entire SSD, and they were also projections based on memory that was still more than a year from hitting the market.
3D XPoint memory is not intended or expected to be a complete replacement for flash memory or DRAM in the foreseeable future. It offers substantially lower latency than flash memory but at a much higher price per GB. It still has finite endurance that makes it unsuitable as a drop-in replacement for DRAM without some form of wear-leveling. The natural role for 3D XPoint technology seems to be as a new tier in the memory hierarchy, slotting in between the smaller but faster DRAM and the larger but slower NAND flash. The Optane products released this month are using the first-generation 3D XPoint memory, along with first-generation controllers. Future generations should be able to offer substantial improvements to performance, endurance and capacity, but it's too soon to tell how those characteristics will scale.
The Intel Optane Memory is an M.2 NVMe SSD using 3D XPoint memory instead of NAND flash memory. 3D XPoint allows the Optane Memory to deliver far higher throughput than any flash SSD of equivalent capacity, and lower read latency than a NAND flash SSD of any capacity. The Optane Memory is intended both for OEMs to integrate into new systems and as an aftermarket upgrade for "Optane Memory ready" systems: those that meet the system requirements for Intel's new Optane caching software and have motherboard firmware support for booting from a cached volume. However, the Optane Memory can also be treated as a small and fast NVMe SSD, because all of the work to enable its caching role is performed in software or by the PCH on the motherboard. 32GB is even (barely) enough to be used as a Windows boot drive, though doing so would not be useful for most consumers.
Intel Optane Memory uses a PCIe 3.0 x2 link, while most M.2 PCIe SSDs use the full 4 lanes the connector is capable of. The two-lane link allows the Optane Memory to use the same B and M connector key positions that are used by M.2 SATA SSDs, so there's no immediate visual giveaway that Optane Memory requires PCIe connectivity from the M.2 socket. The Optane Memory is a standard 22x80mm single-sided card but the components don't come close to using the full length. The controller chip is far smaller than a typical NVMe SSD controller, and the Optane Memory includes just one or two single-die packages of 3D XPoint memory. The Optane Memory module has labels on the front and back that contain a copper foil heatspreader layer, positioned to cool the memory rather than the controller. There is no DRAM visible on the drive.
Intel Optane Memory Specifications
| Capacity | 16 GB | 32 GB |
|---|---|---|
| Form Factor | M.2 2280 B+M key | M.2 2280 B+M key |
| Interface | PCIe 3.0 x2 | PCIe 3.0 x2 |
| Memory | 128Gb 20nm Intel 3D XPoint | 128Gb 20nm Intel 3D XPoint |
| Sequential Read | 900 MB/s | 1350 MB/s |
| Sequential Write | 145 MB/s | 290 MB/s |
| Random Read | 190k IOPS | 240k IOPS |
| Random Write | 35k IOPS | 65k IOPS |
| Read Latency | 7 µs | 9 µs |
| Write Latency | 18 µs | 30 µs |
| Active Power | 3.5 W | 3.5 W |
| Idle Power | 1 W | 1 W |
| Endurance | 182.5 TB | 182.5 TB |
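The table lists a 128Gb die, and the hardware description mentions one or two single-die packages. A quick sketch of that arithmetic follows; it assumes the full die capacity is exposed to the user with no spare dies, which the specifications do not confirm, and the function name is purely illustrative.

```python
import math

DIE_GBIT = 128  # per-die capacity from the specifications table


def dies_needed(capacity_gb: float, die_gbit: float = DIE_GBIT) -> int:
    """Dies required for a given capacity, assuming no spare dies (an assumption)."""
    die_gb = die_gbit / 8  # 8 bits per byte: 128 Gb -> 16 GB
    return math.ceil(capacity_gb / die_gb)


for capacity in (16, 32):
    print(f"{capacity} GB module: {dies_needed(capacity)} die(s)")
```

Under that assumption the 16GB module needs one die and the 32GB module needs two, which lines up with the package count described above.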
The performance specifications of Intel Optane Memory have been revised slightly since the announcement last month, with Intel now providing separate performance specs for the two capacities. Given the PCIe x2 link it's no surprise to see that sequential read speeds are substantially lower than we see from other NVMe SSDs, with 900 MB/s for the 16GB model and 1350 MB/s for the 32GB model. Sequential writes of 145 MB/s and 290 MB/s are far slower than consumer SSDs are usually willing to advertise, but are typical of the actual sustained sequential write speed of a good TLC NAND SSD. Random read throughput of 190k and 240k IOPS is in the ballpark for other NVMe SSDs. Random write throughput of 35k and 65k IOPS is also below the peak speeds advertised by most consumer SSDs, but on par with mainstream TLC and MLC SSDs respectively for actual performance at low queue depths.
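To put the x2 link in perspective, here is a rough sketch of the raw PCIe 3.0 link ceiling. It uses the standard PCIe 3.0 figures of 8 GT/s per lane with 128b/130b encoding and ignores packet and protocol overhead, so the real usable ceiling is somewhat lower. The function name is an illustrative choice.

```python
def pcie3_raw_bandwidth_mb_s(lanes: int) -> float:
    """Raw PCIe 3.0 link bandwidth in MB/s, before packet/protocol overhead."""
    transfers_per_s = 8e9      # 8 GT/s per lane
    encoding = 128 / 130       # 128b/130b line encoding
    bytes_per_s = transfers_per_s * encoding / 8 * lanes
    return bytes_per_s / 1e6


x2 = pcie3_raw_bandwidth_mb_s(2)
x4 = pcie3_raw_bandwidth_mb_s(4)
print(f"x2 link: {x2:.0f} MB/s raw, x4 link: {x4:.0f} MB/s raw")

# Rated sequential reads from the table, as a share of the raw x2 ceiling
for model, read_mb_s in (("16GB", 900), ("32GB", 1350)):
    print(f"{model}: {read_mb_s} MB/s is {read_mb_s / x2:.0%} of the x2 link")
```

By this estimate the x2 link tops out a little under 2 GB/s before overhead, about half of what an x4 drive has available, and the 32GB model's rated sequential read uses roughly two thirds of it.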
Really it's the latency specifications where Optane Memory shines: the read latencies of 7µs and 9µs for the 16GB and 32GB models respectively are slightly better than even the enterprise Optane SSD DC P4800X, while the write latencies of 18µs and 30µs are just 2-3 times slower. The read latencies are completely untouchable for flash-based SSDs. The write latencies can be matched by other NVMe controllers, but only because they cache write operations instead of performing them immediately.
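One way to relate the latency and IOPS figures is Little's law: throughput equals the number of requests in flight divided by the latency of each. The sketch below assumes the rated latency stays constant at the load where the rated IOPS were measured, which is a simplification, since Intel's test conditions are not given here. The function names are illustrative.

```python
def qd1_iops(latency_us: float) -> float:
    """IOPS achievable with exactly one request in flight at the given latency."""
    return 1e6 / latency_us


def implied_queue_depth(iops: float, latency_us: float) -> float:
    """Average requests in flight needed to reach `iops` (Little's law)."""
    return iops * latency_us / 1e6


# Rated random read IOPS and read latency from the specifications table
for model, iops, latency_us in (("16GB", 190_000, 7), ("32GB", 240_000, 9)):
    print(
        f"{model}: about {qd1_iops(latency_us):,.0f} IOPS at QD1, "
        f"rated IOPS implies about {implied_queue_depth(iops, latency_us):.2f} in flight"
    )
```

Under that assumption, the rated random read throughput corresponds to only one or two requests in flight, which is consistent with a drive whose strength is low queue depth performance.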
The power consumption and endurance specifications don't look as impressive. 3.5W active power is lower than many M.2 PCIe SSDs and low enough that thermal throttling is unlikely to be a problem. The 1W idle power is unappealing, if not a bit problematic. Many M.2 NVMe SSDs will idle at 1W or more if the system is not using PCIe Active State Power Management and NVMe Power States. The Optane Memory doesn't even support the latter and will apparently draw the full 1W even in a well-tuned laptop. Since these power consumption numbers are typically going to be in addition to the power consumption of a mechanical hard drive, an Optane caching configuration is not going to offer decent power efficiency.
Meanwhile write endurance is rated at the same 100GB/day or 182.5 TB total for both capacities. Even though a stress test could burn through all of that in a week or two, 100GB/day is usually plenty for ordinary consumer use. However, a cache drive will likely experience a higher than normal write load as data and applications will tend to get evicted from the cache only to be pulled back in the next time they are loaded. More importantly, Intel promised that 3D XPoint would have on the order of 1000x the endurance of NAND flash, which should put these drives beyond the write endurance of any other consumer SSDs even after accounting for their small capacity.
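The endurance figures above can be checked with some simple arithmetic. This sketch uses only the rated numbers from the table (182.5 TB total, 100 GB/day, and the rated sequential write speeds) and assumes decimal units and a stress test that writes continuously at the full rated speed. The function names are illustrative.

```python
RATED_TB = 182.5      # total rated write endurance, both capacities
DAILY_GB = 100        # rated writes per day


def rated_days() -> float:
    """Days of writing 100 GB/day before reaching the rated total."""
    return RATED_TB * 1000 / DAILY_GB


def drive_writes_per_day(capacity_gb: float) -> float:
    """Rated daily writes expressed as full-drive writes per day (DWPD)."""
    return DAILY_GB / capacity_gb


def days_to_exhaust(write_mb_s: float) -> float:
    """Days of continuous writing at `write_mb_s` to reach the rated total."""
    seconds = RATED_TB * 1e12 / (write_mb_s * 1e6)
    return seconds / 86_400


print(f"Rated period: {rated_days():.0f} days ({rated_days() / 365:.0f} years)")
for model, capacity_gb, write_mb_s in (("16GB", 16, 145), ("32GB", 32, 290)):
    print(
        f"{model}: {drive_writes_per_day(capacity_gb):.3f} DWPD, "
        f"exhausted in about {days_to_exhaust(write_mb_s):.1f} days of sustained writes"
    )
```

The rating works out to five years of 100 GB/day, or roughly 3 and 6 drive writes per day for the 32GB and 16GB models. Sustained writes at the rated sequential speed would reach the total in about 7 and 15 days respectively, which matches the "week or two" estimate.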
YazX_ - Monday, April 24, 2017
"Since our Optane Memory sample died after only about a day of testing"
Chaitanya - Monday, April 24, 2017
And it is supposed to have endurance rating 21x larger than a conventional NAND SSD.
Sarah Terra - Monday, April 24, 2017
Funny yes, but teething issues aside the random write performance is several orders of magnitude faster than all existing storage mediums. This is the number one metric I find that plays into system responsiveness, boot times, and overall performance, and the most ignored metric by all Meg's to date. They all go for sequential numbers, which don't mean jack except when doing large file copies.
ddriver - Monday, April 24, 2017
So let's summarize:
1000 times faster than NAND - in reality only about 10x faster in hypetane's few strongest points, 2-6x better in most others, maximum throughput lower than consumer NVMe SSDs, intel lied about speed about 200 times LOL. Also from Tom's review, it became apparent that until the cache of comparable enterprise SSDs fills up, they are just as fast as hypetane, which only further solidifies my claim that xpoint is NO BETTER THAN SLC, because that's what those drives use for cache.
1000 times the endurance of flash - in reality like 2-3x better than MLC. Probably on par with SLC at the same production node. Intel lied about 300-500 times.
10 times denser than flash - in reality it looks like density is actually way lower than flash. 400 gigs in what.. like 14 chips was it? Samsung has planar flash (no 3d) that has more capacity in a single chip.
So now they step forward to offer this "flash killer" as a puny 32 gb "accelerator" which makes barely any to no improvement whatsoever and cannot even make it through one day of testing.
That's quite exciting. I am actually surprised they brought the lowest capacity 960 evo rather than the p600.
Consumer grade software already sees no improvement whatsoever from going sata to nvme. It won't be any different for hypetane. Latency at low queue depth access is good, but that's mostly the controller here, in this aspect NAND SSDs have a tremendous headroom for improvement. Which is what we are most likely going to see in the next generation from enterprise products, obviously it makes zero sense for consumers, regardless of how "excited" them fanboys are to load their gaming machines with terabytes of hypetane.
Last but not least - being exclusive to intel's latest chips is another huge MEH. Hypetane's value is already low enough at the current price and limited capacity, the last thing that will help adoption is having to buy a low value intel platform for it, when ryzen is available and offers double the value of intel offerings.
Drumsticks - Monday, April 24, 2017
Your bias is showing.
1000x -> Harp on it all you want, but that number was for the architecture not the first generation end product. It represents where we can go, not where we are. I'll also note that Toms gave it their editor approved award - "As tested today with mainstream settings, Optane Memory performed as advertised. We observed increased performance with both a hard disk drive and an entry-level NVMe SSD. The value proposition for a hard drive paired with Optane Memory is undeniable. The combination is very powerful, and for many users, a better solution than a larger SSD."
1000 times the endurance of flash -> You can concede that 3D XPoint density isn't as good as they originally envisioned, but it's still impressive, gen1, and has nowhere to go but up. It's not really worse than other competing drives per drive capacity - this cache supports like 3 DWPD basically. The MX300 750GB only supports like .3 DWPD. 10x better is still good.
10 times denser than flash -> DRAM, not Flash. And it's going to be much denser than DRAM.
Barely any to no improvement -> LOL, did you look at the graphs? Those lines at the bottom and on the left were 500GB and 250GB Sata and NVMe drives getting killed by Optane in a 32GB configuration. 3D XPoint was designed for low queue depth and random performance - i.e. things that actually matter, where it kills its competition. Even sequential throughput, which is far from its design intention, generally outperforms consumer drives.
So, Optane costs, in an enterprise SSD, 2-3x more than other enterprise drives, for record breaking low queue depth throughput that far surpasses its extra cost, while providing 10-80x less latency. In a consumer drive, Optane regularly approaches an order of magnitude faster than consumer drives in only a 32GB configuration.
If Optane is only as fast as SLC, I'd love to understand why the P4800X broke records as pretty much the fastest drive in the world, barring unrealistically high queue depths.
This 32GB cache might be a stopgap, and less compelling of a product in general because of its capacity, but that you could deny the potential that 3D XPoint holds is absolutely laughable. The random performance and low queue depth performance is undeniably better than NAND, and that's where consumer performance matters.
ddriver - Monday, April 24, 2017
"I'd love to understand why the P4800X broke records"
Because nobody bothered to make a SLC drive for many many years. The last time there were purely SLC drives on the market it was years ago, with controllers completely outdated compared to contemporary standards.
SLC is so good that today they only use it for cache in MLC and TLC drives. Kinda like what intel is trying to push hypetane as. Which is why you can see SSDs hitting hypetane IOPs with inferior controllers, until they run out of SLC cache space and performance plummets due to direct MLC/TLC access.
I bet my right testicle that with a comparable controller, SLC can do as well and even better than hypetane. SLC PE latencies are in the low hundreds of NANOseconds, which is substantially lower than what we see from hypetane. Endurance at 40 nm is rated at 100k PE cycles, which is 3 times more than what hypetane has to offer. It will probably drop as process node shrinks but still.
"10x better is still good"
Yet the difference between 10x and 1000x is 100x. Like imagine your employer tells you he's gonna pay you 100k a year, and ends up paying you a 1000 bucks instead. Surely not something anyone would object to LOL.
I am not having problems with "10x better". I am having problems with the fact it is 100x less than what they claimed. Did they fail to meet their expectations, or did they simply lie?
I am not denying hypetane's "potential". I merely make note that it is nothing radically better than nand flash that has not been compromised for the sake of profit. xpoint is no better than SLC nand. With the right controller, good old, even ancient and almost forgotten SLC is just as good as intel and micron's overhyped love child. Which is kinda like reinventing the wheel a few thousand years later, just to sell it at a few times what its actually worth.
My bias is showing? Nope, your "intel inside" underpants are ;)
Reflex - Monday, April 24, 2017
SLC has severe limits on density and cost. It's not used because of that. Even at the same capacity as these initial Optane drives it would likely cost considerably more, and as Optane's density increases there is no ability to mitigate that cost with SLC, it would grow linearly with the amount of flash. The primary mitigations already exist: MLC and TLC. Of course those reduce the performance profile far below Optane and decrease its ability to handle wear. Technically SLC could go with a stacked die approach, as MLC/TLC are doing, however nothing really stops Optane from doing the same, making that at best a neutral comparison.
ddriver - Monday, April 24, 2017
SLC is half the density of MLC. Samsung has 2 TB of MLC worth in 4 flash chips. Gotta love 3D stacking. Now employ epic math skills and multiply 4 by 0.5, and you get a full TB of SLC goodness, perfectly doable via 3D stacked nand.
And even if you put 3D stacking aside, which if I am not mistaken the sm961 uses planar MLC, 2 chips on each side for a full 1 TB. Cut that in half, you'd get 512 GB of planar SLC in 4 modules.
Now, I don't claim to be that good in math, but if you can have 512 GB of SLC nand in 4 chips, and it takes 14 for a 400 GB of xpoint, that would make planar SLC OVER 4 times denser than xpoint.
Thus if at planar dies SLC is over 4 times better, stacked xpoint could not possibly be better than stacked SLC.
Severe limits my ass. The only factor at play here is that SSDs are already faster than needed in 99% of the applications. Thus the industry would rather churn MLC and TLC to maximize the profit per grain of sand being used. The moment hypetane begins to take market share, which is not likely, they can immediately launch SLC enterprise products.
Also, it should be noted that there is still ZERO information about what the xpoint medium actually is. For all we know, it may well be SLC, now wouldn't that be a blast. Intel has made a bunch of claims about it, none of which seemed plausible, and most of which have already turned out to be a lie.
ddriver - Monday, April 24, 2017
*multiply 2 by 0.5
Reflex - Monday, April 24, 2017
You can 3D stack Optane as well. That's a wash. You seem very obsessed with being right, and not with understanding the technology.