The map was produced following the methods described at -
https://aussoilsdsm.esoil.io/slga-version-2-products/soil-colour
Soil colour is arguably one of the most obvious and easily observed soil morphological characteristics. Soil scientists use soil colour to differentiate genetic soil horizons and to classify soil types, e.g. in the Australian Soil Classification.
In Australia, earlier mapping of soil colour was performed by Viscarra Rossel et al. (2010), but that work was limited to surface soils, produced maps at 5 km spatial resolution, and used only a relatively small collection of vis-NIR spectra (from which colour was inferred) to develop spatial soil colour models.
Through data discovery via the Australian Soil Data Federator, we were able to compile over 300 000 soil colour field observations (dry soil condition) collected across Australia. About 160 000 were for topsoils and about 140 000 were for subsoils. Rather than relying exclusively on vis-NIR spectra, a logical line of investigation is to exploit this comparatively larger field-observed dataset.
Colour Space Conversions
Field classifications of soil colour are recorded almost exclusively using the Munsell HVC (Hue, Value, Chroma) colour system. Munsell HVC soil colour descriptions are not conducive to quantitative studies (Robertson 1977). Using a lookup table, we converted the observations from the Munsell HVC colour space to the CIELAB colour space. CIELAB describes any colour by three variables: L*, a*, and b*. L* represents the lightness of the colour (L* = 0 yields black and L* = 100 indicates diffuse white), a* its position between red/magenta and green (negative values indicate green and positive values indicate magenta), and b* its position between yellow and blue (negative values indicate blue and positive values indicate yellow).
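The lookup step described above can be sketched as follows. This is a minimal illustration in Python, not the R code used for the products: the function names `parse_munsell` and `munsell_to_lab` are hypothetical, the parser handles only chromatic notations of the form "10YR 4/3", and the single table entry is a placeholder rather than a real Munsell-to-CIELAB value.

```python
import re

# Matches chromatic Munsell notations such as "10YR 4/3" or "2.5Y 5/2".
_MUNSELL = re.compile(
    r"^\s*(\d+(?:\.\d+)?)\s*([A-Z]{1,2})\s+(\d+(?:\.\d+)?)\s*/\s*(\d+(?:\.\d+)?)\s*$"
)


def parse_munsell(notation):
    """Split a Munsell string into a (hue, value, chroma) key."""
    m = _MUNSELL.match(notation.upper())
    if m is None:
        raise ValueError(f"unrecognised Munsell notation: {notation!r}")
    # Normalise the hue number so "10YR" and "10.0YR" give the same key.
    hue = f"{float(m.group(1)):g}{m.group(2)}"
    return hue, float(m.group(3)), float(m.group(4))


def munsell_to_lab(notation, table):
    """Return (L*, a*, b*) for a Munsell notation, or None if the chip is absent."""
    return table.get(parse_munsell(notation))


# Hypothetical placeholder entry, NOT a real Munsell-to-CIELAB conversion value.
demo_table = {("10YR", 4.0, 3.0): (41.0, 6.0, 18.0)}
print(munsell_to_lab("10yr 4/3", demo_table))
```

In practice the table would hold one row per Munsell chip, so an observation whose notation is missing from the table returns `None` and can be screened out before modelling.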
Digital soil mapping
Random Forest machine learning was used to model the L*, a*, and b* target variables independently, each as a function of a suite of available national-extent environmental covariates. We also investigated various options for modelling the target variables jointly, given the covarying relationships among the colour variables, but none matched the prediction skill of the independent approach. The L* variable was modelled as a categorical variable, while a* and b* were modelled as continuous variables. For both the topsoil and subsoil models, a subset (n = 10 000) was withheld from each available dataset before any modelling, solely to evaluate the goodness of fit of the fitted models, akin to an out-of-bag model evaluation.
After modelling, the combined L*, a*, and b* predictions were post-processed to find the nearest Munsell HVC colour chip using Euclidean distance.
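The withholding of an evaluation set before modelling can be sketched as below. This Python sketch covers only the split and two simple goodness-of-fit measures; the Random Forest fitting itself is not shown. The function names are hypothetical, and the choice of overall accuracy for the categorical L* target and RMSE for the continuous a* and b* targets is an assumption, since the source does not state which statistics were used.

```python
import math
import random


def split_holdout(records, n_holdout, seed=0):
    """Randomly withhold n_holdout records; return (training, holdout)."""
    if n_holdout > len(records):
        raise ValueError("hold-out size exceeds number of records")
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(records)       # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    return shuffled[n_holdout:], shuffled[:n_holdout]


def rmse(observed, predicted):
    """Root mean square error, for continuous targets (assumed metric)."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def accuracy(observed, predicted):
    """Fraction of exact matches, for categorical targets (assumed metric)."""
    hits = sum(o == p for o, p in zip(observed, predicted))
    return hits / len(observed)
```

Because the held-out records are removed before any model is fitted, statistics computed on them are not influenced by the training process.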
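The nearest-chip matching can be sketched as a minimum Euclidean distance search. This Python sketch assumes the distance is measured in CIELAB space (where it corresponds to the CIE76 colour difference); the source does not state the space explicitly. The function name `nearest_chip`, the chip names, and their coordinates are hypothetical placeholders, not real Munsell chips.

```python
import math


def nearest_chip(lab, chips):
    """Return the name of the chip whose (L*, a*, b*) is closest to lab."""
    return min(chips, key=lambda name: math.dist(lab, chips[name]))


# Hypothetical placeholder chips; a real table would map every Munsell HVC
# chip to its CIELAB coordinates.
chips = {
    "chip_A": (40.0, 5.0, 15.0),
    "chip_B": (60.0, 10.0, 30.0),
    "chip_C": (30.0, 2.0, 5.0),
}

# A predicted colour made of independently modelled L*, a*, b* values.
print(nearest_chip((42.0, 6.0, 17.0), chips))
```

A linear scan like this is adequate for a few thousand chips per pixel lookup; for national-extent rasters a vectorised or tree-based nearest-neighbour search would likely be preferable.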
For colour visualisation of the soil colour maps, predictions were transformed to the RGB colour space using the same lookup table as for the conversion from Munsell HVC to CIELAB.
All processing for the generation of these products was undertaken using the R programming language (R Core Team 2020).
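The products used the lookup table for this step. Purely as an alternative illustration, the sketch below applies the standard analytic CIELAB to sRGB conversion instead, assuming a D65 reference white and 8-bit sRGB output. The function name `lab_to_srgb` is hypothetical, and out-of-gamut values are simply clipped.

```python
def _f_inv(t):
    """Inverse of the CIELAB companding function."""
    delta = 6.0 / 29.0
    return t ** 3 if t > delta else 3.0 * delta ** 2 * (t - 4.0 / 29.0)


def _encode(c):
    """Clip a linear channel to [0, 1] and apply the sRGB transfer function."""
    c = min(max(c, 0.0), 1.0)
    return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1.0 / 2.4) - 0.055


def lab_to_srgb(L, a, b):
    """Convert CIELAB (D65 white assumed) to an 8-bit sRGB triple."""
    # CIELAB -> XYZ, scaled by the D65 reference white.
    fy = (L + 16.0) / 116.0
    fx = fy + a / 500.0
    fz = fy - b / 200.0
    x = 0.95047 * _f_inv(fx)
    y = 1.00000 * _f_inv(fy)
    z = 1.08883 * _f_inv(fz)
    # XYZ -> linear sRGB using the standard sRGB matrix.
    r_lin = 3.2406 * x - 1.5372 * y - 0.4986 * z
    g_lin = -0.9689 * x + 1.8758 * y + 0.0415 * z
    b_lin = 0.0557 * x - 0.2040 * y + 1.0570 * z
    return tuple(round(255 * _encode(c)) for c in (r_lin, g_lin, b_lin))


print(lab_to_srgb(50.0, 40.0, 30.0))  # positive a* and b*: a reddish colour
```

A lookup table and the analytic conversion need not agree exactly, so this sketch should not be expected to reproduce the RGB values in the published maps.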
Code -
https://github.com/AusSoilsDSM/SLGA
Observation data -
https://esoil.io/TERNLandscapes/Public/Pages/SoilDataFederator/SoilDataFederator.html
Covariate rasters -
https://esoil.io/TERNLandscapes/Public/Pages/SLGA/GetData-COGSDataStore.html