The first part of this article was mostly me whining about the problems associated with mapping change. This part focuses on solutions.
One of the fundamental principles introduced earlier was the idea that we need to account for “allowable error” in the data. We can’t just ignore it and hope for the best.
The figure below demonstrates how two different datasets measuring the same flat surface can give different results and yet both be “accurate”. Dataset #2 is clearly “more accurate”, but that doesn’t mean Dataset #1 is wrong. Note that NEITHER dataset offers the “true” location of the surface. Each provides an accurate measurement within the specifications of the respective dataset.
If these two datasets were collected 10 years apart and we simply compared them directly, the assumption would be that the “measured difference” represents some sort of change.
Which, of course…would be wrong.
Whether the measurements are made at the same time by different tools, or are separated by months or years, the surface is still unchanged. So how do we reconcile this? Especially if we know that at least some change has occurred?
Unfortunately for us, and for the well–wishing uninitiated, we can only see change if it’s larger than the accuracy limits of the data.
Any measured difference between the two surfaces shown above that is less than the allowable error in the data cannot be distinguished from possible error. Such differences must be considered part of the normal and acceptable variability in each dataset, and not actual changes in the landscape. This is true even if we KNOW there have been changes in the landscape. To say for certain that a measured difference is due to change, that difference must exceed the allowable error.
Horizontal Error Impacts Vertical Error
And of course, not all terrain is completely flat! Slight horizontal shifts between two elevation surface models will result in apparent vertical differences, even if the original vertical error were theoretically zero. So horizontal error, even within acceptable ranges, can add to the vertical error. The additional vertical error can be calculated using basic trig:
y = x (tan α)
where:
x = allowable horizontal error,
y = calculated potential vertical error, and
α = slope angle in radians
In flat areas, where α is really small, the added vertical error is effectively zero. In steep terrain, the impact of horizontal error on vertical error is potentially much more problematic.
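To get a feel for the numbers, here is a minimal Python sketch of the formula above. The function name and the 1-meter horizontal error are illustrative assumptions, not values from any dataset discussed here. The slope is entered in degrees for convenience and converted to radians before the tangent is taken.

```python
import math

def added_vertical_error(horizontal_error, slope_deg):
    """Potential vertical error (same units as horizontal_error) on a slope.

    Implements y = x * tan(alpha). The slope is given in degrees and
    converted to radians for math.tan.
    """
    if not 0 <= slope_deg < 90:
        raise ValueError("slope must be in [0, 90) degrees")
    return horizontal_error * math.tan(math.radians(slope_deg))

# Hypothetical 1.0 m allowable horizontal error across a range of slopes.
for slope in (1, 5, 15, 30, 45):
    print(f"{slope:>2} deg slope: {added_vertical_error(1.0, slope):.3f} m")
```

On a 45 degree slope the added vertical error is as large as the horizontal error itself, which is why steep terrain deserves extra caution.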
So where do we start? The latest ASPRS accuracy standards (ASPRS 2014) align the old National Map Accuracy Standard (NMAS 1947) to new vertical and horizontal accuracy classes. They also establish a relationship between the resolution of the data (pixel size, or cell size) and horizontal accuracy. For example, the BEST horizontal accuracy that can be obtained on any image (or raster) is equivalent to one pixel, or one cell. Anything smaller than that (sub-pixel) is impossible to measure. As a general rule, the practical limit of horizontal accuracy is more like two pixels.
For both horizontal and vertical accuracy, the ASPRS standards report accuracy at the 95% confidence level for each accuracy class. This effectively means that in order to meet the standard for a particular accuracy class, 95% of the measured error values need to be smaller than the limit established for that class. Any error measurements up to that threshold (and a few even larger) are considered acceptable. Error can fall in either the positive or the negative direction, which is why, when comparing two datasets, we have to add their respective errors together to fully appreciate the possible or “allowable” error in the comparison. One might err high, and the other low. Both would still be acceptable.
The Root Mean Square Error (RMSE), by contrast, is more analogous to the “average” error, and is not a threshold value. Contrary to many misconceptions, RMSE is NOT an upper limit, or a “not to exceed” value for error, although it is commonly misused as such.
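The distinction is easier to see with numbers. The sketch below computes RMSE from a handful of made-up checkpoint errors, then scales it to a 95% confidence value using the 1.96 multiplier commonly applied to vertical RMSE when errors are assumed to be normally distributed. The checkpoint values and function names are hypothetical.

```python
import math

def rmse(errors):
    """Root mean square of a list of signed errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def vertical_accuracy_95(rmse_z):
    """Vertical accuracy at 95% confidence, assuming normally distributed errors."""
    return 1.96 * rmse_z

# Made-up checkpoint errors in meters (measured minus surveyed elevation).
checkpoint_errors = [0.05, -0.12, 0.08, -0.03, 0.15, -0.09, 0.02, -0.11]

rmse_z = rmse(checkpoint_errors)
limit_95 = vertical_accuracy_95(rmse_z)
exceed_rmse = sum(1 for e in checkpoint_errors if abs(e) > rmse_z)

print(f"RMSEz: {rmse_z:.3f} m")
print(f"95% accuracy: {limit_95:.3f} m")
print(f"{exceed_rmse} of {len(checkpoint_errors)} errors exceed the RMSE")
```

Several of the individual errors are larger than the RMSE, which is exactly what you would expect from an average-like statistic. None of them is larger than the 95% value.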
Using these standards, we can combine the anticipated “allowable” vertical error from two different datasets and establish a minimum threshold for detecting any changes between them.
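As a rough illustration, the sketch below adds the allowable vertical error of two datasets, plus the slope-dependent contribution of each dataset's horizontal error (y = x tan α). Summing the horizontal contribution of both datasets this way is one plausible reading of the approach, not a documented formula, and every number is a placeholder rather than a value from a real accuracy report.

```python
import math

def total_allowable_error(vert_err_a, vert_err_b, horiz_err_a, horiz_err_b, slope_deg):
    """Minimum detectable elevation change between two datasets at one location.

    Sums the allowable vertical error of both datasets, then adds the
    vertical error induced by each dataset's horizontal error on the
    local slope (y = x * tan(alpha)).
    """
    tan_slope = math.tan(math.radians(slope_deg))
    return vert_err_a + vert_err_b + (horiz_err_a + horiz_err_b) * tan_slope

# Placeholder values in meters: a finer dataset (a) and a coarser one (b).
flat = total_allowable_error(0.10, 0.20, 1.0, 2.0, slope_deg=0)
steep = total_allowable_error(0.10, 0.20, 1.0, 2.0, slope_deg=30)

print(f"Threshold on flat ground: {flat:.2f} m")
print(f"Threshold on a 30 degree slope: {steep:.2f} m")
```

The same pair of datasets ends up with a much larger detection threshold on the slope than on flat ground, so a single site-wide threshold would either hide real change in flat areas or report false change in steep ones.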
A Case Study
GroundPoint recently conducted a change detection analysis for a project site in Western New York State. The area had recent 2020 LiDAR, along with legacy LiDAR data dating back to 2008. The purpose of the analysis was to identify locations of sediment sources and observable erosional changes that occurred over the past 12 years. Table 1 shows how the standards described above were used to assess allowable error for both datasets.
Table 1. Comparison of the ASPRS Accuracy Classes applied to two different datasets.
Note that the horizontal accuracy, for example, cannot be higher than the resolution of the data. If information were available documenting the horizontal accuracy of the LiDAR data for each raw point cloud, that could be used to further refine the horizontal accuracy of the change detection, and the impact of horizontal error on vertical error, etc. Lacking detailed metadata or accuracy assessment reports, we assume the data has the BEST POSSIBLE accuracy at the resolution provided.
With a relatively straightforward model in ArcGIS, Total Allowable Error can be calculated on a cell-by-cell basis and compared with the measured differences between two elevation models.
Example ArcGIS Model for calculating differences between elevation datasets, adjusting for the total allowable error. The result is a raster where only cells with differences greater than the total allowable error are preserved.
Any cells with measured elevation differences (either positive or negative) smaller than the total allowable error are removed, and the remaining cell values are adjusted accordingly. The result is a raster where:
- Cells with measured differences less than the Total Allowable Error are set to “NoData” (removed)
- Cells with measured differences larger than the Total Allowable Error are adjusted to remove the error in each cell from the measured difference.
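Outside of ArcGIS, the same logic can be sketched in a few lines of plain Python. This is not the ArcGIS model itself. It is a minimal stand-in that treats small grids as nested lists, uses None for NoData, and assumes that "adjusted" means shrinking each surviving difference toward zero by the total allowable error in that cell. The grids and values are made up.

```python
def adjust_for_error(diff_grid, error_grid):
    """Drop differences within the allowable error and shrink the rest.

    diff_grid  : measured elevation differences, positive or negative
    error_grid : total allowable error for each cell (non-negative)
    Returns a grid of the same shape, with None where no change is detectable.
    """
    adjusted = []
    for diff_row, err_row in zip(diff_grid, error_grid):
        out_row = []
        for diff, err in zip(diff_row, err_row):
            if abs(diff) <= err:
                out_row.append(None)        # indistinguishable from error
            elif diff > 0:
                out_row.append(diff - err)  # positive difference, reduced by the error
            else:
                out_row.append(diff + err)  # negative difference, reduced by the error
        adjusted.append(out_row)
    return adjusted

# Made-up 2 x 3 grids, in meters.
differences = [[0.10, -0.80, 1.50],
               [-0.25, 0.40, -2.00]]
allowable = [[0.30, 0.30, 0.50],
             [0.30, 0.50, 0.50]]

for row in adjust_for_error(differences, allowable):
    print(row)
```

Because the allowable error is supplied per cell, a difference of 0.40 m can be dropped in one location while a similar difference elsewhere survives, depending on the local error budget.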
The final figure below shows an example of the raw elevation differences displayed in orange, and the differences adjusted for error shown in grey.
Example of a point demonstrating the raw difference in elevation (CDM2020_2008) and the difference adjusted for Total Allowable Error at that same location (Merge_2008).
With so much great geospatial data at our fingertips, using that data to assess change over time is going to be an increasingly important and valuable activity. But we can’t do it without accounting for inherent (and known!) errors in the data. Failing to do so will lead us toward false assumptions and incorrect conclusions about changes that may or may not actually be taking place!