Today’s Tip
Create indexes for all spatial and attribute tables
Discussion
Today’s tip arises as a result of setting up layers using a Foreign Data Wrapper in Postgres. The layers were initially loading quite slowly, and I realized that I had not added a UniqueID, which I mentioned in TOTD 002, or any sort of an index.
Before we get into it, let’s talk about what an index is. Wikipedia says here, that "An index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and storage space to maintain the index data structure."
In simpler terms, an index is essentially a table that contains the values you wish to index, and assigns each of them either an integer, or a binary number. In some cases, the values are sorted prior to the index values being created, to increase performance. Another table is then created which relates these integer values with the data table rows they correspond to.
Here is a sample data table:
Data Table Row State County City 1 Washington King Seattle 2 Washington Snohomish Everett 3 Washington King Bellevue 4 Washington King Renton 5 Washington Thurston Olympia
Here is the index built on that data table:
Index Index-Data Link County ID Row IndexID King 1 1 1 Snohomish 2 2 2 Thurston 3 3 1 4 1 5 3
When you have an index and perform a search, instead of performing a text search, or searching for a complex number, the search is performed on an integer or a binary number, which is generally faster.
The example above is vastly simplified. When you move into a spatial data, the indexing ability, and thus the benefits, can be much greater. With most spatial indexes, the features are placed into smaller and smaller groupings of cells. Since the index tracks what objects are in a particular cell, it is easy to know what other objects are in the same cell, for comparison purposes. When you perform a spatial search, your search parameters are compared to the index. This comparison can narrow, and thus speed the search by excluding those cells not touched by the search extents.
Most operations you do with a database involve some form of search. This search can take many forms including:
- Finding particular row or attribute for editing
- Identifying particular features for intersection or merging in a spatial function
- Identify function to retrieve attributes for particular feature on map
- Sub-query of features by attribute or by spatial relation to other features.
The common theme among all of these is that they are searching for particular attributes or locations of features. When indexes have been created, some level of this information is already being maintained, whether the list of attributes in a field, or the coordinates of all features in a layer. If they didn’t exist, determining this information would have to be repeated with every query performed.
So, if you deal with databases, or spatial data, do yourselves a favor: Check all of your tables, and make sure that indexes have been created for all spatial fields, and any attributes where you would expect regular searches to be performed. You may very well see a dramatic increase in performance.
Let me know what features or attributes you always make sure to index.