Linearly Compressed Pages: A Main Memory Compression Framework with Low Complexity and Low Latency.
Student Research Competition at International Conference on Parallel Architectures and Compilation Techniques, Minneapolis, USA, September 2012
Abstract
Main memory is a critical resource in modern systems. Its capacity must be sufficiently provisioned to prevent the target application's working set from overflowing into the backing store (e.g., hard disk, flash storage), which is orders of magnitude slower than DRAM. Adding more DRAM to meet increasing main memory demand is undesirable, because it significantly increases both the system's cost and its power consumption. One potential way of addressing the DRAM capacity problem is data compression. Data compression has previously been used to increase effective cache size [2, 3, 5, 6, 7] and DRAM capacity [1, 4], and to decrease bandwidth consumption [6]. One of the primary concerns with data compression is that decompression lies on the critical path of accessing any compressed data. To counter this problem, prior work [2, 6, 7] has proposed specialized compression algorithms that exploit regular patterns present in in-memory data, and has shown that such specialized algorithms achieve high compression ratios while incurring relatively low decompression latencies (1-5 cycles).

One of the primary reasons why applying these techniques directly to main memory is difficult is the virtual-to-physical page mapping. When a virtual page is mapped to a compressed physical page, the virtual page offset of a cache line can differ from the corresponding physical page offset, since each physical (compressed) cache line is expected to be smaller than the virtual (uncompressed) cache line. In fact, where a compressed cache line resides within a main memory page depends on the sizes of the compressed cache lines that precede it in the page. As a result, accessing a cache line within a compressed page in main memory requires an additional layer of indirection to compute the location of the cache line. This indirection not only increases complexity and cost, but can also increase the latency of main memory accesses and thereby degrade system performance.

Our goal in this work is to build a main memory compression framework that neither incurs this latency penalty nor requires power-inefficient hardware. To this end, we propose a new approach to compressing pages, which we call Linearly Compressed Pages (LCP). The key idea of LCP is that if all the cache lines within a page are compressed to the same size, then the location of a cache line within the compressed page is simply the product of the cache line's index and the compressed cache line size. Based on this idea, each compressed page has a target compressed cache line size. Cache lines that cannot be compressed to this target size are called exceptions. All exceptions, along with the metadata required to locate them, are stored separately within the same compressed page. We adapt two previously proposed compression algorithms (Frequent Pattern Compression [2] and Base-Delta-Immediate Compression [6]) to fit the requirements of LCP.
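The sketch below illustrates the address computation that the LCP idea enables. It is a minimal illustration, not the paper's exact page layout: the metadata structure (exception bitmap, exception slot array, region offset) and the 4 KB page / 64-byte cache line parameters are assumptions made for this example. The point it demonstrates is that, for non-exception lines, locating a compressed cache line requires only a multiply and an add, with no per-line indirection table.

```c
#include <stdint.h>
#include <stdbool.h>

/* Illustrative parameters -- assumed, not taken from the paper. */
#define LINES_PER_PAGE   64   /* 4 KB page / 64 B cache lines      */
#define UNCOMPRESSED_SZ  64   /* uncompressed cache line size in B */

/* Hypothetical per-page metadata: which lines are exceptions and
 * where each exception is stored within the page. */
typedef struct {
    uint64_t exception_bitmap;                /* bit i set => line i is an exception */
    uint8_t  exception_slot[LINES_PER_PAGE];  /* slot of line i in exception region  */
    uint32_t exception_region_offset;         /* byte offset of exception storage    */
    uint32_t target_compressed_size;          /* per-page target size, e.g. 16 B     */
} lcp_page_meta_t;

/* Compute the byte address of cache line `index` within a compressed page.
 * Key LCP property: for lines that meet the target size, the offset is
 * simply index * target_compressed_size. */
uint64_t lcp_line_address(uint64_t page_base,
                          const lcp_page_meta_t *meta,
                          unsigned index)
{
    bool is_exception = (meta->exception_bitmap >> index) & 1u;

    if (!is_exception) {
        /* Common case: one multiply-and-add locates the line. */
        return page_base + (uint64_t)index * meta->target_compressed_size;
    }

    /* Exception: the line is stored uncompressed in a separate region
     * of the same page, located via the (assumed) per-page metadata. */
    return page_base + meta->exception_region_offset
                     + (uint64_t)meta->exception_slot[index] * UNCOMPRESSED_SZ;
}
```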