What is prototyping? Key takeaways from prototyping a machine learning tool to predict deforestation risks

Blog | Mon, 19 Feb, 2024 · 9 min read

To prevent future deforestation, it is crucial for policymakers to know which areas are at the highest risk and to identify the drivers creating these risks. Various tools already exist to estimate future levels of deforestation. However, there is a need for improved accuracy and the ability to predict further ahead.


In 2023, UN-REDD collaborated with the Data Science for Social Good programme in the UK (DSSGx UK) to prototype a predictive model to assess the likelihood of deforestation over time.  

The Amazon biome was chosen as a test case for this prototyping exercise because it plays a crucial role in global biodiversity and climate change mitigation and is under threat from deforestation. Several factors contribute to deforestation in the Amazon, including environmental events and human-related activities.

This prototype, hosted on Google Earth Engine (GEE), demonstrates the importance of prototyping in creating AI models or nature-tech tools. Prototyping, often an overlooked step, is crucial for testing concepts, functionalities, and user experiences early in the development process.

The project yielded several insights and lessons learned during the prototyping phase:

1.    Starting with a clear vision of an end-product and user engagement is essential.

The project was driven by a clear goal: to develop a machine learning tool that identifies areas at risk of deforestation in the Amazon biome up to three years in advance. 

We aim for it to be useful for decision-makers to support the creation of relevant policies and regulations by providing accurate predictions of deforestation risks, thereby helping to prioritize conservation efforts and potentially also manage carbon stocks more effectively. 

During the project initiation phase, the team reached out to potential end users and key stakeholders to understand their needs and the potential usability of the tool.  This step is crucial for developing a user-centric tool that is both practical and beneficial.

2.    Design for accessibility, simplicity and intuitiveness.

To ensure its accessibility, the tool was designed with GEE and is visualized with a color-coded system, with shades ranging from red to orange - where red signifies areas of the highest risk for the selected prediction year. Upon interaction, users can highlight feature importance, correlating to the main factors driving deforestation.

Additionally, the tool offers the capability to overlay additional data, such as carbon density and the presence of Indigenous lands, to provide a comprehensive understanding of the areas under threat.

3.    The choice of methodologies and models is shaped by available data.

Over approximately 12 weeks, the team processed hundreds of gigabytes of various datasets from remote sensing and land cover sources such as MapBiomas and OpenStreetMap - a massive task which required 10TB of storage space.

After processing, the team trained an AI model (a convolutional neural network called U-Net) to learn historical deforestation patterns and to identify areas that may be at risk of deforestation in the future. Other methods, such as simple averaging and random forests, were also tried and tested before settling on U-Net, which yielded the most accurate results.
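The simplest of the approaches mentioned above, averaging historical deforestation per location, can be sketched in a few lines. This is an illustrative toy only: the cell IDs, data structure and values below are hypothetical, and the project's actual pipeline ran over large raster datasets on Google Earth Engine.

```python
# Toy "simple averaging" baseline: predict a grid cell's future
# deforestation risk as the mean of its historical yearly
# deforestation indicators. Illustrative names and values only.

def average_baseline(history):
    """history: dict mapping cell id -> list of 0/1 yearly deforestation flags.
    Returns dict mapping cell id -> predicted risk in [0, 1]."""
    return {cell: sum(flags) / len(flags) for cell, flags in history.items()}

history = {
    "cell_a": [0, 0, 1, 1],  # deforested in 2 of the last 4 years
    "cell_b": [0, 0, 0, 0],  # never deforested
}
risk = average_baseline(history)  # cell_a -> 0.5, cell_b -> 0.0
```

A baseline this simple cannot capture spatial context (roads, neighbouring clearings), which is what motivates a convolutional model such as U-Net that sees each cell together with its surroundings.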

In addition, the team applied sophisticated interpretability techniques to identify the risk factors that mattered most to the predictions, before deploying the model as an app.

This approach illustrates the necessity of principled model selection and the application of advanced interpretability techniques, ensuring that the tool is both accurate and insightful.

4.    Techniques like feature ablation can help better interpret results.

The team employed a method called feature ablation to understand the contribution of each factor. By removing factors one at a time and comparing the results to a baseline, the team could measure each factor's significance.
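The loop above can be sketched as follows. The scoring function, feature names and weights are hypothetical stand-ins (not the project's actual metric or numbers); the point is only the ablation pattern itself: score with all features, re-score with one removed, and attribute the drop to that feature.

```python
# Toy feature-ablation loop. All names and values are illustrative.

def score(features):
    # Stand-in for a model evaluation metric (e.g. validation accuracy).
    weights = {"pasture": 0.30, "distance_to_roads": 0.20, "protected_area": 0.10}
    return sum(weights[f] for f in features)

all_features = ["pasture", "distance_to_roads", "protected_area"]
baseline = score(all_features)  # score with every feature present

importance = {}
for f in all_features:
    ablated = [g for g in all_features if g != f]  # drop one feature
    importance[f] = baseline - score(ablated)      # drop in score = f's contribution

top_factor = max(importance, key=importance.get)   # "pasture" in this toy setup
```

In practice each ablation means re-evaluating (or retraining) the model on real data, so the loop is far more expensive than this sketch suggests.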

The features were categorized into three groups: provocative features (pasture and mining); inhibitive features (Indigenous areas and protected areas); and features borne of human activity (recent deforestation, distance to roads and forest edge density).

Based on the feature ablation technique, the tool identified pasture quality, distance to roads, and distance to recent deforestation as the most important risk factors, which are mostly human-induced. 

This approach helped identify which factors have the most influence on the model's predictions, allowing potential users to prioritize further studies to establish causation and, ultimately, inform conservation strategies.

5.    Actionable insights are crucial for the success of the tool. 

Beyond mere risk prediction, the tool offers insights into the causes of deforestation, like agricultural expansion and illegal logging, and identifies specific areas at risk. It allows for targeted conservation, restoration, carbon finance or REDD+ strategies. 

For example, one key insight is the effectiveness of Indigenous areas in preventing deforestation, highlighting the critical role of Indigenous stewardship in forest conservation.


Next steps

The tool continues to be a work in progress, with the first year's predictions outperforming other models, though accuracy for the second and third years needs improvement, likely due to data limitations.

However, although the tool is not yet final, we are sharing these insights on how prototyping, together with user engagement and actionable insights, can be an essential step in ensuring a tool's usefulness.

To find out more about the DSSGx project, visit:  If you are at UNEA-6, visit UNEP’s Digital Display to test out the tool or contact the author for more information. 

Note: The author wishes to thank and acknowledge the DSSGx project team: Andrew Kitchen; Jack Buckingham; Satyam Suman, Sanya Sinha, Karan Uppal, Dmytro Holdnikov, Juergen Branke and Satyam Bhagwanani.