CNSG: show

Scalable replay-based replication for fast databases

Dai Qin, Angela Demke Brown, Ashvin Goel

Proceedings of the VLDB Endowment, vol. 10, no. 13, pp. 2025 - 2036., September 2017

Abstract

Primary-backup replication is commonly used for providing fault tolerance in databases. It is performed by replaying the database recovery log on a backup server. Such a scheme raises several challenges for modern, high-throughput multicore databases. It is hard to replay the recovery log concurrently, and so the backup can become the bottleneck. Moreover, with the high transaction rates on the primary, the log transfer can cause network bottlenecks. Both these bottlenecks can significantly slow the primary database. In this paper, we propose using record-replay for replicating fast databases. Our design enables replay to be performed scalably and concurrently, so that the backup performance scales with the primary performance. At the same time, our approach requires only 15-20% of the network bandwidth required by traditional logging, reducing network infrastructure costs significantly.

Manuscript

Bibtex