What do you need to start implementing a GIS?

This question came about when I read a post some time ago that laid out a fairly detailed, enterprise-level, open-source GIS tech stack.
It was very comprehensive, from data to software to web access and apps. I think that comprehensiveness made it overly complex, especially for anyone just getting started. I saw a couple of comments from people who said they had a hard time getting all of that laid out while trying to get their program off the ground.

The question, then, is what do you need to get started? I suppose your priorities will depend on where you are working. At the same time, asking that question is how you end up down the rabbit hole of complexity. When you are just getting started, you want to keep it simple.

Let’s look at the basics. If you are doing GIS, there are essentially two things you need:

  1. Mapping Software
  2. Data

If you don’t have mapping software, you can’t do any mapping. If you don’t have any data, you don’t have anything to map. That means the web doesn’t matter, apps don’t matter, etc. Those things are all the next step once you have mapping software and data to work with. Let’s look at each.

Mapping Software

There are really two ways to go for mapping software. You either purchase an Esri license and work with ArcGIS Pro and that software stack, or you go open-source, download QGIS, and get started. This is up to your preference. If I’m getting started and want to get off the ground quickly, and/or if budget is an issue, I would go the open-source route. I would even go so far as to say that unless there is an immediately obvious reason to do so, I would not choose to go with Esri licensing. You can go most or all of the way using open-source software, including accessing Esri-formatted layers, at little to no cost. This argument may shift as you move toward more advanced configurations, but that is a topic for another day. Let’s move on to what you will actually use the software to view.

Data

Actually, you need a database

You need to start with a database. When I say that, I mean a full, server-installed RDBMS that supports multi-user access and editing controls. My preference for this is PostgreSQL with the PostGIS extension. Your IT staff may have a comment on this, but it is a fairly easy case to make, since PostgreSQL is open-source and has a large, well-supported development community.
This may ruffle some feathers as it isn’t the simplest thing to get started with, but bear with me. If you are starting to implement some sort of GIS, you are likely planning for it to be around a while. On the software side, it is easy to start with one piece of the puzzle and build as you go: QGIS for instance, then some web server for spatial data, web applications, etc. On the data side, once you start to build a data library and start using that data for projects, it gets much more difficult to change whatever structure you had in the beginning. We’ve all dealt with the dreaded red exclamation point. For the newbies, this means that a particular layer has lost its data source and the data can’t be found. The usual cause is that the data has been moved or renamed. Either way, the more projects you have using different data layers, the harder it is to face repathing data every time you open a project after you reorganize your data library.
That was all a long-winded way of saying that you need to take the time at the beginning to sort out how you want to store your data. The absolute last thing you should do is base your data structure on shapefiles. The main reasons, among many, are that they are not suited for multi-user access, and that the format is very limited in its attribute names (field names are capped at 10 characters, for example) and data types.
If you aren’t going to use shapefiles, then you need another format, which will likely be a database of some sort. You could use Esri Personal Geodatabases or File Geodatabases, but they each have two strikes against them. First, they are proprietary, and you need Esri licensing to edit the data contained within. Second, they are again not suitable for multi-user applications.
A quick note about what I mean by multi-user. If you are getting started, you may be the only one managing and editing data. At the same time, you may have someone else who wants to view a set of layers for some work task they have. Even though they are not editing the data, just accessing that data will place a lock in a File or Personal GDB, preventing any editing from taking place. Save yourself a lot of hassle, heartache and headache, and just don’t do it.
Crossing those storage types off the list leaves more pure database options, ranging from small to large. A GeoPackage is an OGC-compliant SQL database contained in a single file. It is based on the SQLite format. This could be good for a short-term solution, but it has some limitations, mainly around user access controls. They just aren’t as robust as those of a full RDBMS, and again, this is one area where planning for the future is important.
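To make the "SQLite database in a single file" point concrete, here is a minimal sketch that opens a file with Python's built-in sqlite3 module and checks the application_id stored in its header. The function name is my own, and the check is only a quick sniff test under the assumption that the file was written by a standards-following tool. It does not validate the file as a GeoPackage.

```python
import sqlite3
from pathlib import Path

# application_id values written into the SQLite header by GeoPackage files:
# "GPKG" (GeoPackage 1.2 and later) and "GP10" (GeoPackage 1.0).
GEOPACKAGE_APPLICATION_IDS = {0x47504B47, 0x47503130}


def looks_like_geopackage(path):
    """Return True if `path` is a SQLite file carrying a GeoPackage application_id.

    Hypothetical helper for illustration; it is a quick check, not a validator.
    """
    # Open read-only so a mistyped path does not silently create an empty file.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True)
    except sqlite3.Error:
        return False  # file is missing or cannot be opened
    try:
        (app_id,) = conn.execute("PRAGMA application_id").fetchone()
    except sqlite3.DatabaseError:
        return False  # file exists but is not a SQLite database
    finally:
        conn.close()
    return app_id in GEOPACKAGE_APPLICATION_IDS
```

Notice that nothing in there involves a username or password. Anyone who can read the file can read all of the data in it, which is the access control limitation I mean.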
What that leaves is a full RDBMS, like PostgreSQL, with the PostGIS extension to support spatial data operations and functionality. Just to mention others, Oracle and MS SQL Server each also have spatial data support, but they are both licensed products. The main points for using the full RDBMS are the access controls and the robustness of a server-based installation. If you are going to prep for a more complex system down the road, you need to be able to control who has access to what data, including the ability to view, edit and possibly create or delete. This control is built into the database and just has to be configured as you add or create more data. Because the database runs as a server service, it is easier to restart it or recover when something goes wrong, while keeping the data protected for the most part.
Okay, you may ask, you’ve said what is needed, but how does the RDBMS translate to getting started?

How to get started

In this case, you get started by taking a step back and doing some planning. When you are implementing an RDBMS, you will need to build some structure in ahead of time. To do this, you need to have an idea of the types of data you may have, how you want to have them organized, and who you want to grant access to and at what level. Once you’ve thought that through a bit, you can design the database. This is the structure of how the schemas, tables, attributes, etc., will be laid out. You don’t have to actually implement the entire design before you start, but having the design in place will give you a guide as you start to import data into the database and build links between tables.
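As a rough illustration of what "having the design in place" can look like, here is a small sketch that writes the design down as plain Python data and turns it into PostgreSQL/PostGIS statements. The schema, table and column names (utilities, hydrants, and so on) are made up for the example, and the names are assumed to be trusted, simple identifiers since nothing here quotes or escapes them. Treat it as one possible way to keep a design on paper, not a recommended tool.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    name: str
    geometry_type: str  # PostGIS geometry type, e.g. "Point" or "MultiPolygon"
    srid: int           # spatial reference ID, e.g. 4326 for WGS 84
    attributes: dict = field(default_factory=dict)  # column name -> SQL type


@dataclass
class Schema:
    name: str
    tables: list = field(default_factory=list)


def design_to_sql(schemas):
    """Turn a list of Schema objects into a list of SQL statements."""
    statements = ["CREATE EXTENSION IF NOT EXISTS postgis;"]
    for schema in schemas:
        statements.append(f"CREATE SCHEMA IF NOT EXISTS {schema.name};")
        for table in schema.tables:
            qualified = f"{schema.name}.{table.name}"
            columns = ["id bigserial PRIMARY KEY"]
            columns += [f"{col} {sql_type}" for col, sql_type in table.attributes.items()]
            columns.append(f"geom geometry({table.geometry_type}, {table.srid})")
            statements.append(f"CREATE TABLE {qualified} ({', '.join(columns)});")
            # A GiST index on the geometry column speeds up spatial queries.
            statements.append(
                f"CREATE INDEX {table.name}_geom_idx ON {qualified} USING GIST (geom);"
            )
    return statements


# Hypothetical design: one schema with a single point layer.
EXAMPLE_DESIGN = [
    Schema(
        "utilities",
        [Table("hydrants", "Point", 4326, {"install_date": "date", "status": "text"})],
    )
]

for statement in design_to_sql(EXAMPLE_DESIGN):
    print(statement)
```

The point is less the code than the habit: the design lives in one place, and you can add a schema or table to it whenever a new dataset shows up, without having built everything on day one.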
Now that you have the design, you need the database. So, the next step is to download and install the database. The location would preferably be on a dedicated server which will become your GIS server. You could also put it on a local workstation with plans to move it somewhere eventually, but that is another discussion. Get the database installed. You will need to do some reading to understand how to set up users, roles, and access to different elements of the database. But that isn’t a huge lift, just a bit of reading. Then, you are ready to go.
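To give a feel for what that reading leads to, here is a hedged sketch of the view-versus-edit split expressed as standard PostgreSQL role and GRANT statements, generated from Python. The role names, the schema name and the two access levels are assumptions for the example. Your own setup will likely need more than this.

```python
# Hypothetical access levels mapped to standard PostgreSQL table privileges.
ACCESS_LEVELS = {
    "viewer": ("SELECT",),
    "editor": ("SELECT", "INSERT", "UPDATE", "DELETE"),
}


def access_statements(schema, role, level):
    """Build SQL that creates a group role and grants it access to one schema.

    Names are assumed to be trusted, simple identifiers (no quoting is done).
    """
    privileges = ACCESS_LEVELS[level]
    statements = [
        # NOLOGIN makes this a group role; individual logins get added to it.
        f"CREATE ROLE {role} NOLOGIN;",
        f"GRANT USAGE ON SCHEMA {schema} TO {role};",
        f"GRANT {', '.join(privileges)} ON ALL TABLES IN SCHEMA {schema} TO {role};",
    ]
    if "INSERT" in privileges:
        # Inserting into tables with serial columns also needs sequence access.
        statements.append(
            f"GRANT USAGE ON ALL SEQUENCES IN SCHEMA {schema} TO {role};"
        )
    return statements


# Example: a read-only role and an editing role on a made-up "utilities" schema.
for role, level in [("gis_viewer", "viewer"), ("gis_editor", "editor")]:
    for statement in access_statements("utilities", role, level):
        print(statement)
```

One thing to keep in mind: GRANT ... ON ALL TABLES only covers tables that exist when it runs. For tables you add later, PostgreSQL has ALTER DEFAULT PRIVILEGES, which is part of the reading mentioned above.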

What about all the other things?

I can hear you thinking to yourself, “What about all the other things on that list?” Well, it is just a list. Just because it is comprehensive, and even well thought out, doesn’t mean you need all of it to start. Getting started with anything new can feel overwhelming, especially something like a GIS, where the ideal is that it is very extensive and comprehensive. Trying to include all of the more complex additional applications at the very beginning just makes it more so.
Keep it simple. You’ll thank me later.

What about the future of GIS?

There is a lot of admittedly relevant discussion about GIS, and geospatial, and what the future of spatial data should look like. Is the current database, data, layer and software structure outdated? Probably, but I don’t think the new structure has really been determined or developed yet. In the meantime, organizations, whether public or private, still have needs which can only be met with spatial data and analysis. If you start now, and get your data into some defined structure in a database, you can work in the present, but also have that data ready to move into whatever future structure is developed.

Now, get started.

Find your mapping software of choice, do a bit of data structure development, and set up a database. Collect some data, do some analysis, and make some maps. Keep it simple, and add complexity as the need arises and as you have the capacity to manage it effectively.
