Does anyone remember the old TV ad campaign “Is it live, or is it Memorex?” It’s a great reminder that even though we continually strive to make technology reflect reality, it doesn’t always match reality. At least not…exactly.

And if you’ve never read the quote from Alice in Wonderland author Lewis Carroll about the paradox of map scale, it’s right up there with the unavoidable temptation to open Google Earth and zoom into your own backyard until the image breaks down. We have an ongoing debate in our house about whether one of those indistinct white pixels in the grass is actually the family dog.

In our work as mappers, scale remains critically important, even in this high-resolution digital age, because it magnifies the impact that accuracy has on how we use geodata. Whether it’s helping self-driving cars avoid hitting stop signs or updating flood maps to show new 100-year flood zones, the accuracy of our geodata has a fundamental impact on how we use it.

The accuracy of geodata has a fundamental impact on how we use it.

One of the great promises of our current data rich era is the potential to go back and look at how things have been changing over time. For example, here is a snapshot of the reservoir near my house showing how sedimentation has affected vegetation and water volume in the reservoir over the past 20 years.

Having the data registered in the same geospatial coordinate system (and in the same units!) is a huge benefit for beginning to quantify changes in things like the surface water area or sediment volume, allowing us to calculate changes over time with some degree of confidence.

Here is another series of images showing maps of submerged aquatic vegetation (SAV) in a shallow marine environment from 2000 to 2014.

Notice any issues? While we intuitively understand that the source data (year of the imagery) is going to be different in each frame, it’s pretty clear that the method/technique used to derive SAV area was also different each time, and the underlying accuracy of each dataset is going to be different. In this case, the accuracy of each dataset derives from the scale and the resolution/precision of the analysis. Each of these four datasets, from the source imagery through the derived analysis, has an estimated accuracy. And in this case, comparing the data in order to infer something about trends or change is going to be problematic at best. One option would be to go back to the beginning and process all four sets of imagery in exactly the same way.

Comparing datasets of different accuracy and precision
is problematic at best.

This issue arises quite frequently when talking about volume calculations using high-resolution data such as airborne LiDAR or data from drones. UAV sensor packages with integrated PPK/RTK postprocessing can generate incredibly precise datasets. But these datasets still have inherent error that must be accounted for when comparing calculations or surfaces generated from two different points in time. And when trying to minimize error, don’t forget that unless your project area is completely flat, any horizontal error in your data will also translate into vertical error.

As an example, if a topo surface is vertically accurate to one inch (~2.5 cm), over an entire acre that translates to about 134 cubic yards of dirt. Or roughly 10 truckloads’ worth of error. Compare that to another topo surface with the same error, and you very well could be 20 truckloads off.
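The truckload arithmetic above is easy to verify. A minimal sketch (the 13-cubic-yard truck capacity is an assumed round number for illustration, not a figure from the article):

```python
# Back-of-the-envelope check of the volume-error example:
# a one-inch vertical error spread over one acre.
SQFT_PER_ACRE = 43_560       # definition of an acre
FT_PER_INCH = 1 / 12
CUFT_PER_CUYD = 27
TRUCK_CUYD = 13              # assumed capacity of one dump truck (illustrative)

error_cuft = SQFT_PER_ACRE * (1 * FT_PER_INCH)   # 1 inch over 1 acre
error_cuyd = error_cuft / CUFT_PER_CUYD
truckloads = error_cuyd / TRUCK_CUYD

print(f"{error_cuyd:.0f} cubic yards, ~{truckloads:.0f} truckloads")
# → 134 cubic yards, ~10 truckloads
```

Doubling that (two surfaces, each carrying the same error) is where the 20-truckload swing comes from.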

The implications of accuracy when calculating change are not trivial.

Triangular Irregular Networks (TINs) are commonly used as a data structure to render and manipulate point clouds and other point data. TIN processes that compare two separate surfaces are extremely valuable for large-scale (small-area) comparisons of topography and for performing volumetric (cut/fill) calculations. But the implicit assumptions are that the area is small and that the data being compared are of the same quality (e.g., collected using the same instruments/sensors and with the same control). Hence any inherent error in either of the datasets is often ignored. This may be acceptable for small projects, but it starts to be problematic as the size of the study area increases.
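The core of a TIN-style cut/fill calculation is just summing prism volumes: triangle footprint area times the mean elevation difference at its vertices. A minimal sketch for gridded points, splitting each grid cell into two triangles (function name and grid layout are my own, for illustration; note it treats both surfaces as error-free, which is exactly the assumption discussed above):

```python
import numpy as np

def grid_tin_volume_change(z_before, z_after, cell=1.0):
    """Net cut/fill volume between two surfaces sampled on the same grid:
    split each cell into two triangles (a minimal TIN) and sum
    prism volumes (triangle area x mean vertex elevation difference)."""
    dz = np.asarray(z_after, float) - np.asarray(z_before, float)
    rows, cols = dz.shape
    tri_area = 0.5 * cell * cell
    vol = 0.0
    for i in range(rows - 1):
        for j in range(cols - 1):
            # lower-left and upper-right triangles of the cell
            vol += tri_area * (dz[i, j] + dz[i + 1, j] + dz[i, j + 1]) / 3
            vol += tri_area * (dz[i + 1, j] + dz[i + 1, j + 1] + dz[i, j + 1]) / 3
    return vol

z_before = np.zeros((3, 3))
z_after = np.full((3, 3), 0.5)                 # uniform 0.5 ft of fill
vol = grid_tin_volume_change(z_before, z_after, cell=10.0)  # 10-ft grid
print(vol)  # 200.0 cubic ft: 20 ft x 20 ft footprint x 0.5 ft
```

Notice there is no error term anywhere in that math: any bias in either surface flows straight into the volume.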

TIN based comparisons also do not lend themselves easily to evaluating or characterizing discrete changes at specific locations. Using a raster approach not only allows for evaluation of individual locations (cells or clusters of cells), but is useful in evaluating spatial patterns to help understand how or why things are changing, such as may be desired in the SAV example above.
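One common raster technique is to difference the two surfaces cell by cell and flag only the cells whose change exceeds the combined uncertainty of the two datasets. A minimal sketch, assuming independent per-surface errors expressed as RMSE (the function name and the 1.96 default, corresponding to a 95% confidence level, are my choices for illustration):

```python
import numpy as np

def significant_change(raster_a, raster_b, sigma_a, sigma_b, k=1.96):
    """Per-cell difference of two rasters, masking cells whose change
    is within the propagated uncertainty of the pair.
    sigma_a, sigma_b: vertical RMSE of each surface (same units as z).
    k: multiplier on the combined sigma (1.96 ~ 95% confidence)."""
    dz = raster_b - raster_a
    # combined uncertainty of a difference of two independent surfaces
    sigma_dod = np.sqrt(sigma_a**2 + sigma_b**2)
    return np.where(np.abs(dz) > k * sigma_dod, dz, np.nan)

a = np.zeros((2, 2))
b = np.array([[0.30, 0.05],
              [0.05, -0.40]])
out = significant_change(a, b, sigma_a=0.05, sigma_b=0.05)
# only the 0.30 and -0.40 cells survive; the 0.05 changes are noise
```

The surviving cells (and NaN holes) can then be mapped to study the spatial pattern of change, which is exactly what the SAV comparison above would need.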

Finally, error in airborne geodata collections is often NOT uniformly distributed across the project area, so making universal, systematic adjustments to one dataset with the goal of getting it to better “match” another is just a hot mess waiting to happen. Those kinds of adjustments are the purview of boresight, calibration, and postprocessing, and should not be applied after the fact to geospatial data in order to get two datasets to better align with each other. Field data from ground control and check point/calibration surveys can be used to better characterize and account for error, giving us confidence where we need it.
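Characterizing error from check points is straightforward: sample the surface at each independent check point and compute the vertical RMSE of the residuals. A minimal sketch (function name is mine; this reports RMSE only, not the full accuracy-at-95%-confidence statistic a formal report would include):

```python
import numpy as np

def vertical_rmse(surface_z, checkpoint_z):
    """Vertical RMSE of a surface against independent check points.
    surface_z: surface elevations sampled at the check point locations.
    checkpoint_z: surveyed elevations at those same locations."""
    resid = np.asarray(surface_z, float) - np.asarray(checkpoint_z, float)
    return float(np.sqrt(np.mean(resid**2)))

# surveyed check points vs. the surface at the same spots (made-up values)
rmse = vertical_rmse([101.2, 98.7, 104.1], [101.1, 98.9, 104.0])
```

Reporting this per subarea, rather than one number for the whole project, is one way to expose the non-uniform error distribution described above instead of papering over it with a blanket shift.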

Making systematic adjustments to data after-the-fact
is just a hot mess waiting to happen.

The bottom line is that we live in an exciting time where big data analytics and artificial intelligence are hot-button topics when it comes to geospatial data and remote sensing. Using the vast amount of geodata available to us to analyze changes over time can contribute significantly to achieving efficiencies, saving costs, and improving our knowledge of complex systems. But as geospatial professionals, whether surveyors, engineers, or GISPs, we need to be smart about the limits of that data and help those around us get it right.

We need to be smart and help those around us get it right.

Benjamin Houston is a licensed professional engineer in New York and the founding officer of GroundPoint Engineering, a federal and NYS certified SDVOB firm.