Laurence Watson, Co-founder and CEO @Treebeard Technologies

Summary: Satellite imagery and machine learning can be used to estimate emissions from coal power plants across the world and to locate solar panels in the UK. Plentiful data, cloud compute, and open source tools mean that even small teams can do powerful things. In this talk we'll go through the steps to set up a machine learning pipeline to analyse overhead imagery, and discuss the results and lessons learnt.

Energy analysis starts with a thorough understanding of how the underlying infrastructure is distributed. Only then can a representative and useful dataset be built. There are many imagery providers and datasets; the strengths and weaknesses of different data sources are discussed, along with techniques to pre-process them. Satellite imagery, aerial imagery, and other geospatial data need some extra steps compared to plain image analysis, but Python has several powerful packages, such as rasterio, to handle them.
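
As a rough illustration of those extra steps, the sketch below maps pixel indices to map coordinates with an affine geotransform and cuts a large raster into fixed-size chips. It uses only the standard library and is not the talk's code. The function names `pixel_to_coords` and `chip_windows`, the sample transform, and the raster size are all assumptions made for the example. The six coefficients follow the (a, b, c, d, e, f) order used by rasterio's `Affine` transforms, which in practice would be read from the file's metadata rather than typed by hand.

```python
from typing import Iterator, Tuple

# Affine geotransform coefficients (a, b, c, d, e, f), where
#   x = a * col + b * row + c
#   y = d * col + e * row + f
Transform = Tuple[float, float, float, float, float, float]


def pixel_to_coords(transform: Transform, row: int, col: int) -> Tuple[float, float]:
    """Return the map coordinates of the centre of pixel (row, col)."""
    a, b, c, d, e, f = transform
    # Adding 0.5 moves from the pixel's top-left corner to its centre.
    x = a * (col + 0.5) + b * (row + 0.5) + c
    y = d * (col + 0.5) + e * (row + 0.5) + f
    return x, y


def chip_windows(height: int, width: int, chip_size: int,
                 stride: int) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (row_off, col_off, chip_height, chip_width) for each full chip."""
    for row_off in range(0, height - chip_size + 1, stride):
        for col_off in range(0, width - chip_size + 1, stride):
            yield row_off, col_off, chip_size, chip_size


# Hypothetical north-up raster: 10 m pixels, top-left corner at (500000, 200000).
example_transform: Transform = (10.0, 0.0, 500000.0, 0.0, -10.0, 200000.0)
example_windows = list(chip_windows(height=100, width=100, chip_size=50, stride=50))
```

Each window can then be read from disk on its own, and the geotransform lets any detection inside a chip be reported as a real-world location rather than a pixel index.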

With convolutional neural networks (CNNs) and transfer learning, training a computer vision model on geospatial raster data in a pipeline is reasonably straightforward. This talk will go through the necessary steps for a generic task, and the code will be available. Open source geospatial learning tools like RasterVision can jump-start this kind of project, but they come with their own learning curves.
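
The core pattern behind transfer learning is to keep a feature extractor frozen and train only a small new head on the task's labels. The sketch below shows that pattern in plain Python. It is a toy, not the talk's pipeline: `frozen_features` is a hand-written stand-in for a pretrained CNN backbone, the chips are synthetic, and every name and hyperparameter here is an assumption made for illustration.

```python
import math
import random


def frozen_features(chip):
    """Stand-in for a pretrained backbone; it is never updated during training."""
    height, width = len(chip), len(chip[0])
    mean = sum(sum(row) for row in chip) / (height * width)
    edges = sum(abs(row[x + 1] - row[x]) for row in chip for x in range(width - 1))
    return [mean, edges / (height * (width - 1))]


def train_head(features, labels, epochs=500, lr=0.5):
    """Fit a logistic-regression head on fixed features with gradient descent."""
    weights, bias = [0.0] * len(features[0]), 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for feats, label in zip(features, labels):
            z = sum(w * v for w, v in zip(weights, feats)) + bias
            error = 1.0 / (1.0 + math.exp(-z)) - label
            grad_w = [g + error * v for g, v in zip(grad_w, feats)]
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(chip, weights, bias):
    """Return 1 if the head scores the chip as the positive class, else 0."""
    z = sum(w * v for w, v in zip(weights, frozen_features(chip))) + bias
    return 1 if z > 0 else 0


def synthetic_chip(rng, brightness, size=8):
    """Make a noisy single-band chip; bright chips play the positive class."""
    return [[brightness + rng.uniform(-0.1, 0.1) for _ in range(size)]
            for _ in range(size)]


rng = random.Random(0)
chips = ([synthetic_chip(rng, 0.8) for _ in range(10)]
         + [synthetic_chip(rng, 0.2) for _ in range(10)])
labels = [1] * 10 + [0] * 10
weights, bias = train_head([frozen_features(c) for c in chips], labels)
```

In a real project the stand-in would be replaced by a network pretrained on a large image dataset, but the division of labour is the same: features are computed once by fixed weights, and only the head sees the new labels.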

Specific lessons learned from two projects, estimating coal power plant output from smoke plumes and identifying solar PV from aerial imagery, will be explained, with particular emphasis on the data science techniques used.
What began as an initial pilot, estimating the output of coal power plants from visual band imagery of smoke plumes, has led to a significant project supported by a Google AI Impact grant, which is currently being carried out by WattTime and the Carbon Tracker Initiative.