23 July, 2020

Using open spatial data: key things to consider

Spatial data can be expensive and time-consuming for your team to capture, clean and ingest into geospatial software. However, open spatial data (OSD) is fostering innovation through free and unrestricted access to geospatial data collected by a community of geospatial analysts.

These days, businesses commonly supplement their own business information with OSD to help make decisions, tell stories effectively and get buy-in from stakeholders. This means that GIS professionals need to know how to work with OSD, particularly when it comes to integration with existing GIS systems. 

Technological advancements in GIS have led to an increasing number of spatial technologies accepting OSD. However, OSD is not readily available in a format that is compatible with most mapping software. 

The process of preparing the data to be ready for geospatial use is typically considered the most laborious part of the OSD process. If you’re new to working with OSD, we’ve narrowed down the key things you need to consider to help you source, prepare, clean and ingest open spatial data into the mapping software of your choice. 

 

Checklist for validating OSD sets: 

Find: How do I find and verify a credible source? 

To understand the integrity of a data set when sourcing data, ask three questions: Who created it? Why was it created? Is it fit for the purpose of your project? If the answer is ‘I don’t know’ or ‘no’, then it’s time to keep looking.

When finding data, users should start with one known base data source. For example, aerial imagery or road centre networks can be used to help orientate the user with familiar surroundings. 

When an original data source has been found, keep an untouched copy of it to use as a point of truth. Along with any global ID field in the data, this copy gives you something to compare manipulated data against in the future.
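One way to keep that point of truth honest is to record a checksum of the original file alongside where it came from. The sketch below uses only the Python standard library. The sidecar file name (`.provenance.json`) and the field names in the record are assumptions made for this example, not an established convention.

```python
import hashlib
import json
from datetime import date
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_provenance(original_path, source_url, creator):
    """Write a small JSON sidecar describing the untouched original.

    The sidecar name and record fields are hypothetical choices for this sketch.
    """
    original_path = Path(original_path)
    record = {
        "file": original_path.name,
        "source_url": source_url,
        "creator": creator,
        "retrieved": date.today().isoformat(),
        "sha256": sha256_of_file(original_path),
    }
    sidecar = original_path.with_name(original_path.name + ".provenance.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return record


def is_unchanged(path, record):
    """True if the file still matches the checksum stored in the record."""
    return sha256_of_file(path) == record["sha256"]
```

Calling `write_provenance` once when you download a dataset, and `is_unchanged` before you rely on the stored copy later, tells you whether your reference file has been altered since you sourced it.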

Each Australian state has an open data portal that hosts open government data accessible by the public. Here are the links to each state portal:

Western Australia - https://data.wa.gov.au/
Northern Territory - https://data.nt.gov.au/
Queensland - https://www.data.qld.gov.au/
New South Wales - https://data.nsw.gov.au/
Victoria - https://www.data.vic.gov.au/
South Australia - https://data.sa.gov.au/
Tasmania location-based information - https://www.thelist.tas.gov.au/app/content/data

The Australian NationalMap draws in spatial data from across Australia into one central platform here - https://nationalmap.gov.au/.

 

Prepare: How do I prepare this data for cleaning?

If you have chosen an OSD set to use, the next step is to prepare it for cleaning. 

Firstly, look at the coordinate system of the OSD and understand whether the data was sourced locally or internationally. Check that the data was captured in the correct coordinate system for the area.

This may seem mundane, but it prevents larger issues down the track. It matters for spatial accuracy, especially when combining data from various sources or overlaying less familiar spatial data. Knowing the coordinate system helps users understand data that they’ve sourced from the public domain, and keeping metadata ensures they keep track of what data is in use in their project.
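A quick sanity check on raw coordinates can catch the most common coordinate system problems before any cleaning starts. The sketch below is a rough illustration only: the bounding box for Australia is an approximate extent chosen for this example, and the function cannot identify which coordinate system the data actually uses. It only flags values that do not look like longitude and latitude for the expected area.

```python
# Approximate extent of Australia in degrees, assumed for this example:
# (min_lon, min_lat, max_lon, max_lat)
AUSTRALIA_BBOX = (112.0, -44.0, 154.0, -9.0)


def in_bbox(x, y, bbox):
    """True if the point (x, y) falls inside the bounding box."""
    min_x, min_y, max_x, max_y = bbox
    return min_x <= x <= max_x and min_y <= y <= max_y


def diagnose_coordinates(points, bbox=AUSTRALIA_BBOX):
    """Classify a list of (x, y) pairs against an expected lon/lat extent."""
    if not points:
        return "no coordinates"
    if all(in_bbox(x, y, bbox) for x, y in points):
        return "plausible longitude/latitude"
    # The same points fit once x and y are exchanged.
    if all(in_bbox(y, x, bbox) for x, y in points):
        return "axes look swapped (latitude, longitude)"
    # Values beyond +/-180 or +/-90 cannot be degrees.
    if any(abs(x) > 180 or abs(y) > 90 for x, y in points):
        return "values exceed degree ranges; likely a projected system"
    return "degrees, but outside the expected area"
```

A result other than "plausible longitude/latitude" is a prompt to go back to the dataset’s metadata and confirm its coordinate system before overlaying it on anything else.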

Users also need to know the data formats they’re working with and have an idea of where their OSD is going to be ingested. This is important to consider as there are a lot of options available and some spatial software is locked to particular formats.

The two formats commonly used for OSD are vector data and raster data. Vector data represents features as points, lines and polygons. Raster data are digital images made up of pixels that each contain information, for example crop health or land temperature.
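To make the difference concrete, here is a small sketch of how each type might look as plain Python data. The vector feature follows the GeoJSON layout. The raster dictionary is a simplified structure invented for this example (real raster formats carry more information), and the road name and cell values are made up.

```python
# Vector: a discrete feature with geometry and attributes (GeoJSON-style).
road = {
    "type": "Feature",
    "geometry": {
        "type": "LineString",
        "coordinates": [[115.85, -31.95], [115.86, -31.96], [115.88, -31.96]],
    },
    "properties": {"name": "Example Road"},  # hypothetical attribute
}

# Raster: a grid of cell values plus the information needed to locate it.
raster = {
    "origin": (115.80, -31.90),  # x, y of the top-left corner
    "cell_size": 0.05,           # cell width and height in degrees
    "values": [                  # rows run north to south
        [21.5, 22.0, 22.4],
        [21.9, 22.6, 23.1],
    ],
}


def raster_value_at(grid, x, y):
    """Return the value of the cell containing (x, y), or None if outside."""
    origin_x, origin_y = grid["origin"]
    col = int((x - origin_x) // grid["cell_size"])
    row = int((origin_y - y) // grid["cell_size"])
    rows = grid["values"]
    if 0 <= row < len(rows) and 0 <= col < len(rows[row]):
        return rows[row][col]
    return None
```

The vector feature stores exact coordinates for one object, while the raster stores a value for every cell in its extent, which is why the two are handled by different tools and file formats.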

 

Clean: What do I need to clean? 

Cleaning data is the most laborious task on this checklist; however, it is also the most critical piece of the puzzle.

To clean the data, you should follow these important steps and considerations:

  1. Establish and clip an area of interest: This will help your data processing run faster. Imagine this as a way of trimming the fat in your dataset to only include spatial information that is relevant to your project.
  2. Repair the geometry of the dataset: This includes removing null features, features that add no value and end bits that don’t have a spatial component, as well as joining lines.
  3. Remove duplicates: When you’re working with data, it’s best practice to keep it clean and tidy.
  4. Repair topological features that show overlaps and gaps in the surface geometry.
  5. Record metadata information so that you can attribute sources for copyright reasons. No business wants a lawyer knocking at their door!
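Steps 1 to 3 can be sketched in a few lines for the simplest case. The example below assumes the data is a list of GeoJSON-style Point features and that the area of interest is a rectangular bounding box. Both are simplifications for illustration. Lines and polygons, and the topological repairs in step 4, need a proper GIS library or desktop tool.

```python
import json


def point_in_bbox(coords, bbox):
    """True if an (x, y) pair lies inside (min_x, min_y, max_x, max_y)."""
    x, y = coords[0], coords[1]
    min_x, min_y, max_x, max_y = bbox
    return min_x <= x <= max_x and min_y <= y <= max_y


def clean_point_features(features, area_of_interest):
    """Apply steps 1 to 3 to a list of GeoJSON Point features."""
    cleaned = []
    seen = set()
    for feature in features:
        geometry = feature.get("geometry")
        # Step 2 (in part): drop features with no spatial component.
        if not geometry or not geometry.get("coordinates"):
            continue
        # Other geometry types are out of scope for this sketch and are skipped.
        if geometry.get("type") != "Point":
            continue
        # Step 1: keep only features inside the area of interest.
        if not point_in_bbox(geometry["coordinates"], area_of_interest):
            continue
        # Step 3: skip exact duplicates (same geometry and same attributes).
        key = json.dumps(
            [geometry["coordinates"], feature.get("properties")], sort_keys=True
        )
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(feature)
    return cleaned
```

Note that this treats two features as duplicates only when both their coordinates and their attributes match exactly. Near-duplicates, such as the same point captured twice with slightly different coordinates, need a tolerance-based comparison instead.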

 

Ingest: How do I ingest my OSD into my preferred mapping software? 

As some spatial technologies are locked, it’s important to understand what format your technology is compatible with. 

There are free tools such as ET GeoWizards which can help you manipulate data into a format compatible with your mapping software, without needing to purchase proprietary licensing or install additional software. Here is a link to find out more: https://www.ian-ko.com/ETGeoWizards.html. Paid versions offer more tools and are significantly cheaper than the higher-tier licensing from proprietary vendors.
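Simple conversions can also be scripted. As an illustration of what a format conversion involves (this is not how ET GeoWizards works internally), the sketch below turns a CSV of point locations into GeoJSON, a format many mapping tools accept. The column names `longitude` and `latitude` are assumptions for this example and should be changed to match your data.

```python
import csv
import io


def csv_to_geojson(csv_text, lon_field="longitude", lat_field="latitude"):
    """Convert CSV text with coordinate columns into a GeoJSON FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            lon = float(row.pop(lon_field))
            lat = float(row.pop(lat_field))
        except (KeyError, TypeError, ValueError):
            continue  # skip rows without usable coordinates
        features.append(
            {
                "type": "Feature",
                # GeoJSON stores positions as [longitude, latitude].
                "geometry": {"type": "Point", "coordinates": [lon, lat]},
                # Remaining columns become the feature's attributes.
                "properties": row,
            }
        )
    return {"type": "FeatureCollection", "features": features}
```

Passing the result to `json.dumps` and saving it with a `.geojson` extension gives you a file that can be loaded into most mapping software. Rows with missing or non-numeric coordinates are silently skipped here, so in real use you may prefer to log them for review.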

WMS (Web Map Service), WFS (Web Feature Service) and WCS (Web Coverage Service) are three web service standards from the Open Geospatial Consortium (OGC). Between them, they allow users to ingest spatial information as vector, raster or coverage data.

WMS is the most commonly known of the three and is generally used to deliver map images, but it can also be used with vector format data. By comparison, WFS serves the geographic features themselves, which gives the client more ability to interact with the data, such as editing features or performing geographic calculations.
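The difference shows up in the requests a client sends. The sketch below builds a WMS `GetMap` URL (which asks for a rendered image) and a WFS `GetFeature` URL (which asks for the features themselves) using the standard key-value parameters from the OGC specifications. The server address and layer names in the usage note are hypothetical, and the image size defaults are arbitrary choices for this example.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layer, bbox, width=800, height=600, crs="EPSG:4326"):
    """Build a WMS 1.3.0 GetMap URL, which returns a rendered map image.

    bbox is passed through unchanged. In WMS 1.3.0 with EPSG:4326 the axis
    order is latitude first: (min_lat, min_lon, max_lat, max_lon).
    """
    params = {
        "service": "WMS",
        "version": "1.3.0",
        "request": "GetMap",
        "layers": layer,
        "styles": "",  # empty value requests the server's default style
        "crs": crs,
        "bbox": ",".join(str(value) for value in bbox),
        "width": width,
        "height": height,
        "format": "image/png",
    }
    return base_url + "?" + urlencode(params)


def wfs_getfeature_url(base_url, type_name, count=100):
    """Build a WFS 2.0.0 GetFeature URL, which returns the features themselves."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,
        "count": count,  # limit on the number of features returned
    }
    return base_url + "?" + urlencode(params)
```

For example, `wfs_getfeature_url("https://example.com/ows", "demo:roads", count=10)` produces a URL asking a hypothetical server for up to ten road features. Check the server’s `GetCapabilities` response for the layer names and output formats it actually supports.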

Being able to connect to data without storing it (which can be costly and in some cases impossible due to hardware limitations), and then manipulate, investigate and combine it with your own data and research, is invaluable.

 

What next? 

The data is now prepared, cleaned and packaged in a compatible format. So where can you ingest it? There is an abundance of open spatial technologies that are accessible and free to use as well.

For a ranking by GIS Geography of the top 13 free GIS software, check out this list: https://gisgeography.com/free-gis-software/

The advantage of using open source spatial technologies is that the software has been developed in an open and collaborative development environment and should include source code. You are then able to change or distribute the software as you wish.

Open source software has had a bad name in the past due to its usability. Older versions of this software often required programming and scripting knowledge. The latest versions have brought a big upgrade in the user interface, with many functions and geoprocessing tools packaged up in easy-to-use menus.

When using this open software it is important to assess the results to ensure your outputs are correct. This is the same for proprietary software too.
 

Did you find this article useful? 

If you need training in open source GIS, check out our QGIS training courses here (https://ngis.com.au/Training/Browse-Courses/QGIS-Courses).
