Surveying the Landscape: An In-Depth Analysis of Spatial Database Workloads

Bogdan Simion, Suprio Ray, Angela Demke Brown

20th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL GIS 2012), Redondo Beach, California, US, November 2012

 

Abstract

Spatial databases are increasingly important for a wide variety of real-world applications, such as land surveying, urban planning, cartography and location-based services. However, spatial database workload properties are not well understood. For example, it is unknown to what degree one spatial application resembles another in terms of resource demand, or how the demand will change as more concurrent queries (i.e., more users) are added. We show that spatial workloads have a different CPU execution profile from well-studied decision support workloads, as represented by TPC-H. We present a framework to automatically classify spatial queries and characterize spatial workload mixes. We first analyze the resource consumption (i.e., computation and I/O) of a representative set of spatial queries, which are then classified into five distinct categories. Next, we create five homogeneous spatial workloads, each composed of queries from one of these classes. We then vary database-specific parameters (e.g., the buffer pool size) and workload-specific parameters (e.g., the query mix) to characterize a workload in terms of CPU utilization and I/O activity trends. We study workloads simulating real-world spatial database applications and show how our framework can classify them and predict resource utilization trends under various settings. This can provide clues to the database administrator regarding which resources are heavily contended and can guide resource upgrades. We further validate our approach by applying it to a much larger dataset and to a second DBMS.

 
