PMDK FAQ

PMDK is becoming more popular, will it dominate the application development in the area of NVM? Let’s start with resource links, papers, applications and some numbers etc.

Official sites

Research papers

Databases with persistent memory

SQL Server’s Hybrid Buffer Pool
Oracle’s Persistent Memory Accelerator
Exadata with PMEM, an Epic Journey
SAP HANA and Persistent Memory
Under the Hood of an Exadata Transaction, CMU Database Group - Vaccination Database Tech Talks (2021)

Exadata uses about 99% of the PMEM as cache, and 1% as WAL. With some engineering tricks like DDIO off for RDMA/PMEM, NT store to bypass L3 cache, IO directory cache settings, etc.

Numbers

Latency of different devices
```
Mem:  30ns
AEP: 100ns
SSD: 100us
HDD:  10ms
```
See also https://docs.pmem.io/getting-started-guide/what-is-pmdk

Capacity of different devices

Mem:   32GiB
AEP:  512GiB
SSD:    4TiB
HDD:    8TiB

Atomic unit size

Per processor’s design it is 8 bytes or 64 bits. Flush of memory can make sure the 8 bytes are atomic, while more size should be take care by the software.

The block device mode can support sector-size atomic write. Larger size such as page size in file systen may lead to torn pages.

See libpmemobj.

APIs

pmem_map_file

if (flags & PMEM_FILE_CREATE) {                                          
  /*                                                                     
   * Always set length of file to 'len'.                                 
   * (May either extend or truncate existing file.)                      
   */                                                                    
  if (os_ftruncate(fd, (os_off_t)len) != 0) {                            
    ERR("!ftruncate");                                                   
    goto err;                                                            
  }                                                                      
  if ((flags & PMEM_FILE_SPARSE) == 0) {                                 
    if ((errno = os_posix_fallocate(fd, 0,                               
	    (os_off_t)len)) != 0) {                                      
      ERR("!posix_fallocate");                                           
      goto err;                                                          
    }                                                                    
  }                                                                      
} else {

namespace mode

There is a table of diferent modes: https://docs.pmem.io/ndctl-users-guide/managing-namespaces

All kinds of flush ops

CLWB, CLFLUSH, CLFLUSHOPT

lscpu can tell us if the ops are supported or not.

A special instruction for NVM: PCOMMIT is deprecated: https://software.intel.com/en-us/blogs/2016/09/12/deprecate-pcommit-instruction

I will write a blog on these flush primitives latter.

Written on March 22, 2019