We used Digital Soil Mapping (DSM) technologies, combined with real-time collation of soil attribute data from TERN's recently developed Soil Data Federation System, to produce a map of Australian Soil Classification Soil Order classes with quantified estimates of mapping reliability at a 90 m resolution.
Credit
We at TERN acknowledge the Traditional Owners and Custodians throughout Australia, New Zealand and all nations. We honour their profound connections to land, water, biodiversity and culture and pay our respects to their Elders past, present and emerging.
This work was jointly funded by CSIRO, Terrestrial Ecosystem Research Network (TERN) and the Australian Government through the National Collaborative Research Infrastructure Strategy (NCRIS).
We are grateful to the custodians of the soil site data in each state and territory for providing access to the soil site data, and all of the organisations listed as collaborating agencies for their significant contributions to the project and its outcomes.
CSIRO maintains the data and makes it available through the Australian Soil Resource Information System.
Purpose
The map gives an estimate of the spatial distribution of soil types across Australia.
Lineage
The map was produced as per methods described at - https://aussoilsdsm.esoil.io/slga-version-2-products/australian-soil-classification-map
Soil classification data was extracted from the SoilDataFederator (SDF) - https://esoil.io/TERNLandscapes/Public/Pages/SoilDataFederator/SoilDataFederator.html
A total of 195,383 observations with an Australian Soil Classification (ASC), a Principal Profile Form (PPF) classification or a Great Soil Group (GSG) classification were extracted. Of these, 130,570 had an ASC directly assigned by a pedologist. The remaining 64,813 observations had either a PPF or a GSG assigned to them by pedologists. The PPF and GSG classifications were then transformed to an ASC.
The 90 m raster covariate data was obtained from TERN's publicly available raster covariate stack - https://esoil.io/TERNLandscapes/Public/Products/TERN/Covariates/Mosaics. A parsimonious set of these covariates was used in the modelling.
We used the R "ranger" Random Forest package to implement a machine learning model as per standard Digital Soil Mapping (DSM) methodologies.
The observed geographic locations in the ASC data set were used to extract cell values from the raster covariate stack using the R "raster" package. This data set was then divided into training and external validation sets using a 90%/10% split. The training data was then bootstrap sampled 50 times to create 50 bootstrap training sets. These training sets were then used to generate 50 Random Forest model realisations.
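The split and resampling step can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the R code used for the product. The function name, the random seed, the toy records and the choice to draw each bootstrap set at the same size as the training set are all assumptions made for the example.

```python
import random

def split_and_bootstrap(records, holdout_fraction=0.10, n_bootstraps=50, seed=42):
    """Hold out a validation set, then draw bootstrap resamples of the training set.

    Hypothetical helper for illustration only; the seed and sizes are assumptions.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_fraction)
    validation = shuffled[:n_holdout]
    training = shuffled[n_holdout:]
    # Each bootstrap set is sampled with replacement and has the same size
    # as the training set (an assumption; the source does not state the size).
    bootstraps = [rng.choices(training, k=len(training)) for _ in range(n_bootstraps)]
    return training, validation, bootstraps

# Toy stand-in for the site observations: (site_id, ASC order) pairs.
records = [(i, "Vertosol" if i % 3 == 0 else "Chromosol") for i in range(1000)]
training, validation, bootstraps = split_and_bootstrap(records)
print(len(training), len(validation), len(bootstraps))
```

Each of the 50 bootstrap sets would then be used to fit one model realisation, while the held-out 10% is never seen during fitting.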
Using the CSIRO Pearcey High Performance Compute (HPC) cluster, the Random Forest models were evaluated against the input covariate raster data stack. This was done for each 90 m raster cell across the nation for each of the 50 bootstrapped model realisations. The modal ASC value across the 50 realisations for each cell was determined and assigned as the most probable soil type for that cell in the output raster. The ratio of the second most probable soil to the most probable soil was also calculated to generate a model confusion index, an estimate of the structural uncertainty in the Random Forest model.
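For a single raster cell, the modal class and confusion index could be computed along these lines. This is a hedged Python sketch, not the authors' implementation. It assumes the "ratio" is the count of realisations predicting the second most frequent class divided by the count for the most frequent class, and that a cell where all realisations agree gets an index of 0. The function name and the example predictions are hypothetical.

```python
from collections import Counter

def modal_class_and_confusion(predictions):
    """Return (modal class, confusion index) for one cell's predictions.

    Assumed definition: confusion index = count of the second most frequent
    class / count of the most frequent class. Ties are broken by first
    occurrence, which is an arbitrary choice for this sketch.
    """
    ranked = Counter(predictions).most_common()
    top_class, top_count = ranked[0]
    second_count = ranked[1][1] if len(ranked) > 1 else 0
    return top_class, second_count / top_count

# Hypothetical predictions from 50 model realisations for one cell.
cell_predictions = ["Vertosol"] * 30 + ["Chromosol"] * 15 + ["Sodosol"] * 5
modal_class, confusion_index = modal_class_and_confusion(cell_predictions)
print(modal_class, confusion_index)
```

Under this definition the index ranges from 0 (all realisations agree) to 1 (the two leading classes are predicted equally often), so higher values flag cells where the models disagree.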
The Australian Soil Resource Information System (ASRIS) contains a product that is a compilation of all existing polygon mapping conducted by state and federal soil survey agencies across all of Australia. This product is made up of a diverse range of field mapping products at a range of mapping scales. From this product we extracted all polygons that were mapped at a scale of 1:100,000 or finer, as defined in the Guidelines For Surveying Soil And Land Resources (Blue Book). Polygons mapped at this scale are high quality spatial estimates of the distribution of soil attributes. We then rasterised these polygon ASC values and merged them into our final estimates of ASC, i.e., where an ASRIS polygon mapped at 1:100,000 or finer exists, its value replaces the modelled ASC value.
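The merge rule amounts to a cell-by-cell overlay in which the rasterised survey value wins wherever one exists. A minimal Python sketch of that rule follows, assuming both rasters are already aligned on the same grid and that cells with no qualifying polygon hold a no-data marker (None here). The function name and the tiny 2 x 2 grids are hypothetical.

```python
def merge_survey_over_model(modelled, surveyed, nodata=None):
    """Overlay rasterised survey classes on modelled classes.

    Both inputs are lists of rows on the same grid (an assumption of this
    sketch). Where the survey raster holds a value it replaces the model;
    elsewhere the modelled value is kept.
    """
    return [
        [model_value if survey_value == nodata else survey_value
         for model_value, survey_value in zip(model_row, survey_row)]
        for model_row, survey_row in zip(modelled, surveyed)
    ]

# Hypothetical 2 x 2 example: one cell is covered by a fine-scale polygon.
modelled = [["Vertosol", "Chromosol"], ["Sodosol", "Chromosol"]]
surveyed = [[None, "Dermosol"], [None, None]]
merged = merge_survey_over_model(modelled, surveyed)
print(merged)
```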
All processing for the generation of these products was undertaken using the R programming language (R Core Team, 2020).