Indexing PostGIS Databases and Spatial Query Performance Evaluations

  • Nguyen, T. T. T


Geodatabases (also commonly known as geospatial databases) are central elements in spatial data infrastructures. The primary advantage over file-based data storage of spatial databases (access via GIS), is that they are structured to encompass existing capabilities of relational database management systems, including support for SQL (Structured Query Language) and the ability to generate complex geospatial queries. PostGIS is an extension to the PostgreSQL object-relational database system which allows GIS objects to be stored in the database. PostGIS comprises functions for basic analysis of GIS objects and more importantly, it also supports the spatial indexing schemes. Indexes are extremely important for large spatial tables, because they allow for quick retrieval of records during query. PostGIS is frequently used during analysis of large data sets if examination of spatial indexes is a particularly essential task. Reported here are results of indexing the PostGIS databases by adopting an R-Tree-over-GiST (Generalized Search Tree) scheme and evaluation of the performance of indexed and un-indexed spatial queries with respect to database size. Experiments were carried out with a huge amount of spatial data obtained from the ESRI website, and using the pgAdmin III tool, which is a comprehensive PostgreSQL database design and management system. Experimental results demonstrate approximately linear increase in spatial index building time as the size of tables increases, but, as the database size increases, processing time is very much greater if the spatial queries are not indexed. However, regardless of the size of the geodatabase, performance of spatial queries is highly sensitive to choice of geometric parameters that the queries refer to.