Make static maps digital in a few hours with Machine Learning
Some time ago, I was working on a project that needed geospatial information on the lithology (the study of rocks) of the Island of San Miguel in the Azores, Portugal. When I asked for the data, to my surprise, I was sent a photocopy of the original geological map in JPG format.
This email attachment shattered my digital age expectations. I was expecting to open a digital geological map; I hadn’t even considered a photocopied one.
Looking at the JPG file took me back to my 20s, when I was a University student exploring the new world of geospatial analysis and the long process of digitising traditional cartographic maps. I was stressed seeing this JPG file for two reasons:
The image was not geo-localised
It was not a digital categorical map, thus it couldn’t provide useful information (yet)
Digitising a traditional cartographic map by hand is time-consuming and labour-intensive. However, I found a faster and more efficient approach.
To get the project underway, I began by georeferencing the map image using Esri’s ArcGIS in conjunction with classic GIS techniques to provide geospatial context within the image.
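For readers who want to see what georeferencing does under the hood, here is a minimal sketch, not the ArcGIS workflow itself. It assumes a simple affine model fitted by least squares to a handful of ground control points. The function names and every coordinate in the example are invented for illustration.

```python
import numpy as np

def fit_affine(pixel_xy, map_xy):
    """Fit a least-squares affine transform from pixel (col, row) to map (x, y)."""
    pixel_xy = np.asarray(pixel_xy, dtype=float)
    map_xy = np.asarray(map_xy, dtype=float)
    # Design matrix with one row per control point: [col, row, 1]
    design = np.column_stack([pixel_xy, np.ones(len(pixel_xy))])
    coeffs, *_ = np.linalg.lstsq(design, map_xy, rcond=None)
    return coeffs  # shape (3, 2): one column for x, one for y

def pixel_to_map(coeffs, pixel_xy):
    """Apply the fitted transform to one or more pixel coordinates."""
    pixel_xy = np.atleast_2d(np.asarray(pixel_xy, dtype=float))
    return np.column_stack([pixel_xy, np.ones(len(pixel_xy))]) @ coeffs

# Hypothetical control points: image corners matched to made-up map
# coordinates (10 m pixels, north at the top of the image).
pixel_points = [(0, 0), (1000, 0), (0, 800), (1000, 800)]
map_points = [(600000, 4200000), (610000, 4200000),
              (600000, 4192000), (610000, 4192000)]

coeffs = fit_affine(pixel_points, map_points)
centre = pixel_to_map(coeffs, (500, 400))
print(centre)
```

A scanned photocopy is often slightly warped, in which case a higher-order polynomial or a spline transform may fit the control points better than an affine one.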
Then it was time to begin the manual process of digitising the map. After spending the better part of an hour drawing polygons around the 21 different lithological classes indicated in the original map, I had a ‘Eureka’ moment: I could use Machine Learning (ML) algorithms to classify the map and recover those boundaries.
Using my experience in building ML algorithms to classify images using Google Earth Engine, I was confident those algorithms were solid enough to produce some useful and reliable results for my map boundaries. Here was my output:
Once I had located and mapped training areas for the 21 classes around the island, I sampled the RGB bands of the original image within them. These are, of course, digital number values rather than reflectances. The sampled training data was then used to train a classifier built from a total of 100 decision trees.
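The training step described above can be sketched outside Google Earth Engine as well. The example below is an illustration under stated assumptions, not the original script. It uses scikit-learn's `RandomForestClassifier` as a stand-in for an ensemble of 100 decision trees, and a small synthetic three-colour image in place of the scanned map. The function name, the toy colours and the training squares are all invented.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def classify_map(rgb, training_mask, n_trees=100, seed=0):
    """Classify every pixel of an RGB image from a few labelled areas.

    rgb:           (H, W, 3) array of digital numbers
    training_mask: (H, W) integer array, 0 = unlabelled, 1..K = class id
    """
    pixels = rgb.reshape(-1, 3)
    labels = training_mask.reshape(-1)
    labelled = labels > 0
    # Assumption: a random forest stands in for the 100-tree classifier
    model = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    model.fit(pixels[labelled], labels[labelled])
    return model.predict(pixels).reshape(training_mask.shape)

# Synthetic "map": three vertical colour bands with some scanner-like noise
rng = np.random.default_rng(0)
colours = np.array([[200, 60, 60], [60, 200, 60], [60, 60, 200]])
truth = np.zeros((30, 30), dtype=int)
truth[:, 10:20] = 1
truth[:, 20:] = 2
noise = rng.integers(-15, 16, size=(30, 30, 3))
rgb = np.clip(colours[truth] + noise, 0, 255).astype(np.uint8)

# One small hand-drawn training square per class (class ids start at 1)
training_mask = np.zeros((30, 30), dtype=int)
training_mask[2:6, 2:6] = 1
training_mask[2:6, 12:16] = 2
training_mask[2:6, 22:26] = 3

classified = classify_map(rgb, training_mask)
accuracy = (classified == truth + 1).mean()
print(f"share of pixels matching the synthetic truth: {accuracy:.2%}")
```

On a real scan, colours are far less cleanly separated than in this toy image, so the size and placement of the training areas matter much more.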
The classification output showed that the approach performed well. A job that would have taken a couple of weeks to complete manually was finished in about three hours.
Although the newly classified map contained the 21 classes representing different geological characteristics, I noticed it also contained noise in the form of single, isolated pixels. To remove this noise, I applied morphological operators, which considerably improved the results. See the before and after below.
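The text does not say which morphological operators were applied, so the following is only one plausible option: a focal mode (majority) filter, which replaces each pixel with the most common class in its neighbourhood. This is a minimal NumPy sketch; the function name, window size and toy raster are assumptions made for illustration.

```python
import numpy as np

def majority_filter(classes, n_classes, radius=1):
    """Replace each pixel with the most frequent class in its square window.

    classes:   (H, W) integer array with class ids 0..n_classes-1
    radius:    1 gives a 3x3 window, 2 gives 5x5, and so on
    Ties go to the lowest class id, because argmax returns the first maximum.
    """
    h, w = classes.shape
    # Repeat edge pixels so that border windows are full size
    padded = np.pad(classes, radius, mode="edge")
    counts = np.zeros((n_classes, h, w), dtype=np.int32)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            shifted = padded[dy:dy + h, dx:dx + w]
            for c in range(n_classes):
                counts[c] += (shifted == c)
    return counts.argmax(axis=0)

# Toy classified raster: two regions plus one isolated, misclassified pixel
noisy = np.zeros((7, 7), dtype=int)
noisy[:, 4:] = 1
noisy[3, 1] = 2

cleaned = majority_filter(noisy, n_classes=3)
print(cleaned)
```

A larger radius removes bigger specks but also rounds off thin units and sharp contacts, so it is worth comparing the cleaned map against the scan before settling on a window size.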