Have you ever wondered where the data in your weather app comes from? There is a lot of science behind every number and map in the local forecast on your phone. Let’s start with weather observations. At thousands of locations worldwide, such as airports, instruments measure temperature, humidity, rainfall, wind speed, and many more variables.
You are probably more interested in what the weather will be tomorrow or next week. Forecasts are calculated with weather forecasting models at places like the Met Office, ECMWF, NOAA or the Weather Company. All these models are built around similar physical processes of atmospheric and ocean circulation and interactions with the land surface. They are similar to the general circulation models (GCMs) that scientists use to project the climate over the next century, but they are configured for different space and time scales.
The output of these models is a regular grid for each weather variable. A frequently used data format is NetCDF, about which I wrote a blog earlier this year. The forecast on your phone is extracted from this big data.
But how does the data end up on your phone? Weather apps use HTTP-based REST APIs to handle individual items and collections of items in a clean, data-driven way. These APIs give you easy access for retrieving and filtering data, and well-defined methods of dealing with other CRUD operations.
This tutorial shows how to:
- retrieve weather forecasts and transform the json data into a pandas DataFrame
- create forecast timeseries
- create a weather map
Set up services on Bluemix
Start by setting up a Weather Company Data service on Bluemix:
- Log in to Bluemix (or sign up for a free trial).
- In the menu at the top of the screen, click Catalog.
- Search for Weather Company Data for IBM Bluemix and click on it.
- Scroll down to select the free plan and click Create.
- On the left, click Service Credentials and copy your password, which you will need in a minute.
Now you’re ready to start using the APIs to retrieve weather data. You’ll see how to use a Python notebook to retrieve the weather data, transform it, and make charts and maps. You can do this on your own computer using the Anaconda Python distribution. Or you can run a Python notebook on the IBM Data Science Experience as we do in this tutorial. If you run into errors due to missing packages, you can install them by running the following command in your notebook:
!pip install --user . You have to do so only once.
Follow these steps to create a notebook:
- Log in to the IBM Data Science Experience with your Bluemix account credentials.
- If prompted, create an instance of the Apache Spark service. Data Science Experience may generate a Spark instance for you automatically. If not, you’ll be prompted to instantiate your own. You’ll need it to run your Python code.
- Create a new notebook. On the upper left of the screen, click the hamburger menu to reveal the left menu. Then click New > Notebook.
- Create or select your Spark instance.
Retrieve weather forecast data
The first step is to load data into your notebook with the Weather Company Data API. In Bluemix, you can find a complete list of the available APIs and examples of how to use them. To load a 10-day forecast for London (latitude=51.49999473, longitude=-0.116721844), copy the following code into your notebook, replacing <PASSWORD> with the credentials you saved earlier. It loads the data using the requests package, which is an HTTP library for Python. Click run at the top of the notebook to load the data. To load data for a different location, look up its latitude and longitude and add them to the code.
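A minimal sketch of such a request might look like the following. The host name, the URL path and the units parameter are assumptions on my part, and build_forecast_url and fetch_forecast are illustrative helper names, so check them against the API list in Bluemix before relying on them. The credentials are passed through the auth argument of requests.get instead of being pasted into the URL.

```python
# Placeholders: replace with the service credentials copied from Bluemix.
USERNAME = "<USERNAME>"
PASSWORD = "<PASSWORD>"

# Assumed host and path prefix of the Weather Company Data service.
BASE_URL = "https://twcservice.mybluemix.net/api/weather/v1"


def build_forecast_url(lat, lon, base_url=BASE_URL):
    """Return the (assumed) URL of the 10-day intraday forecast for one point."""
    return "{}/geocode/{}/{}/forecast/intraday/10day.json".format(base_url, lat, lon)


def fetch_forecast(lat, lon, username=USERNAME, password=PASSWORD):
    """Download the forecast and return the parsed json as a dictionary."""
    # Imported here so that the URL helper above works without the package.
    import requests

    response = requests.get(
        build_forecast_url(lat, lon),
        auth=(username, password),  # HTTP basic authentication
        params={"units": "m"},      # assumed parameter: metric units
        timeout=30,
    )
    response.raise_for_status()     # fail loudly on a wrong password or URL
    return response.json()


# London, as in the text above (needs valid credentials and network access):
# weather = fetch_forecast(51.49999473, -0.116721844)
```

Keeping the URL construction in its own function makes it easy to switch to another location or endpoint without touching the download code.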
Transform json data into a pandas DataFrame
The data you loaded into your notebook is json data. The json format is great for storing data, but for analysis and visualisations you need to convert it. Copy and run the following code to convert it into a pandas DataFrame. To learn more about this format, read my last blog on analysing open data sets.
Since json data has the same structure as a dictionary, you can use the pandas DataFrame.from_dict() function to convert the data. But because the data is nested, with a list of forecasts for different time periods, you have to loop over each of them. The following code handles this by appending each forecast to the DataFrame. transpose() makes sure each time period is a new row in the DataFrame. To make the DataFrame even easier to use, I added time as the index. To convert the string fcst_valid_local into a datetime object you can use datetime.strptime(), and then set the result as the index of the DataFrame.
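As a minimal sketch of this conversion, the code below uses a small made-up sample in place of the downloaded data. It assumes the response keeps its forecasts in a list under the key forecasts. Instead of appending in a loop, it passes that list straight to the DataFrame constructor, which gives the same layout of one row per time period. The column name date is my own choice.

```python
from datetime import datetime

import pandas as pd

# Made-up sample with the assumed layout of the API response: a "forecasts"
# list holding one dictionary per time period.
weather = {
    "forecasts": [
        {"fcst_valid_local": "2016-10-18T13:00:00+0100", "temp": 14, "pop": 10},
        {"fcst_valid_local": "2016-10-18T19:00:00+0100", "temp": 11, "pop": 40},
    ]
}

# One row per forecast period, one column per variable.
df = pd.DataFrame(weather["forecasts"])

# Parse the local time string; %z reads the UTC offset. Dropping the offset
# afterwards keeps the local clock time as a plain (naive) timestamp.
df["date"] = [
    datetime.strptime(s, "%Y-%m-%dT%H:%M:%S%z").replace(tzinfo=None)
    for s in df["fcst_valid_local"]
]
df = df.set_index("date")
```

With the time as index, selecting a period is as simple as df.loc["2016-10-18"].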
Create forecast timeseries
The data is now in an easy-to-use DataFrame, and you can create timeseries plots for the different variables. After checking out the data, I noticed that pop (probability of precipitation) looked strange (have a look yourself with print df.pop). There is more than one column here, which results in errors when you try to plot the data. The reason is that pop is also the name of a DataFrame method, so df.pop does not return the column. I fixed this by adding a column rain and then deleting the original column.
The forecast data comes in 6-hour intervals. I added a rolling mean to the plots, which pandas can calculate for you.
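Both steps can be sketched as follows, again with made-up sample values. The window of 4 periods (24 hours) and the column name temp_mean are assumptions for illustration. In current pandas versions the rolling mean is the rolling() method followed by mean().

```python
import pandas as pd

# Made-up sample values at 6-hour intervals.
index = pd.date_range("2016-10-18 01:00", periods=6, freq=pd.Timedelta(hours=6))
df = pd.DataFrame(
    {"temp": [10.0, 12.0, 16.0, 14.0, 11.0, 13.0], "pop": [0, 10, 40, 60, 20, 0]},
    index=index,
)

# df.pop is the DataFrame.pop method, not the column, so use bracket access
# to copy the column to a new name and then drop the original.
df["rain"] = df["pop"]
df = df.drop(columns="pop")

# Rolling mean over 4 periods of 6 hours, which is a 24-hour window.
# The first 3 values are NaN because the window is not yet full.
df["temp_mean"] = df["temp"].rolling(window=4).mean()
```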
You can now plot the timeseries using the matplotlib package. This package gives you full control over your figure. There are several styles available (I chose bmh). The figure consists of 5 rows and 1 column, which you specify with subplots(). The different subplots are then accessed by ax=axes. You can choose any colour you like.
Copy the following code into your Python notebook and run it to create a figure with the weather forecast for the next 10 days.
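A figure like this might be built as in the sketch below. The data is made up, and the five column names (temp, rain, rh, wspd, clds) and the colours are assumptions, so swap in the columns of your own DataFrame.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample values at 6-hour intervals; the column names are assumptions.
index = pd.date_range("2016-10-18 01:00", periods=8, freq=pd.Timedelta(hours=6))
df = pd.DataFrame(
    {
        "temp": [10, 12, 16, 14, 11, 13, 17, 15],
        "rain": [0, 10, 40, 60, 20, 0, 5, 30],
        "rh": [90, 85, 70, 75, 88, 92, 65, 72],
        "wspd": [12, 15, 20, 18, 10, 8, 14, 16],
        "clds": [80, 60, 40, 90, 100, 70, 30, 50],
    },
    index=index,
)

plt.style.use("bmh")

columns = ["temp", "rain", "rh", "wspd", "clds"]
colours = ["tomato", "royalblue", "seagreen", "purple", "grey"]

# 5 rows and 1 column of subplots that share the time axis.
fig, axes = plt.subplots(nrows=5, ncols=1, figsize=(10, 12), sharex=True)
for ax, column, colour in zip(axes, columns, colours):
    # The 6-hourly forecast values.
    df[column].plot(ax=ax, color=colour, title=column)
    # Rolling mean over 4 periods (24 hours) as a dashed line.
    df[column].rolling(window=4).mean().plot(ax=ax, color="black", linestyle="--")
fig.tight_layout()
```

In a notebook the figure is displayed when the cell finishes. In a script you would add plt.show() or fig.savefig().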
Create a weather forecast map
The Weather Company Data service lets you download a set of icons that you can use in a weather map. I adapted code I found on Stack Overflow to make it all work and stored the icons on GitHub, where my notebook can access them.
To start creating a map, you need a list of locations for the area you want to show. I compiled a list of cities, including their latitude and longitude, to make a weather map for the UK. Loop over these cities to load the data into two arrays containing the icons and temperatures for each city. There are many other variables available. Have a go at making maps for different countries or regions.
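The loop over the cities could be sketched like this. Apart from London, the coordinates are approximate values I filled in for illustration, and the field names icon_code and temp, as well as the forecasts key, are assumptions about the response layout. The download function is passed in as an argument, so the sketch runs offline with a stand-in that returns made-up values.

```python
# London is taken from the text above; the other coordinates are approximate.
cities = [
    ("London", 51.49999473, -0.116721844),
    ("Edinburgh", 55.95, -3.19),
    ("Cardiff", 51.48, -3.18),
]


def collect_icons_and_temps(cities, fetch):
    """Return two lists: the first forecast's icon code and temperature per city.

    `fetch` is any function that takes (lat, lon) and returns the parsed
    json forecast, for example a wrapper around requests.get.
    """
    icons, temps = [], []
    for name, lat, lon in cities:
        first = fetch(lat, lon)["forecasts"][0]  # assumed response layout
        icons.append(first["icon_code"])         # assumed field name
        temps.append(first["temp"])              # assumed field name
    return icons, temps


def fake_fetch(lat, lon):
    """Stand-in for the real API call, returning made-up values."""
    return {"forecasts": [{"icon_code": 26, "temp": 12}]}


icons, temps = collect_icons_and_temps(cities, fake_fetch)
```

To use real data, replace fake_fetch with your own function that calls the API for a given latitude and longitude.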
You can use the icon and temperature (temps) arrays to create two maps: one shows a weather icon for each location in the city list and the other shows the temperature at each location. To create maps, you need the Basemap package, which takes care of most of the formatting, the geographical projection, and the background colours.
- Just as we did for our forecast graphs, define the number of subplots using plt.subplots(). This gives you the handles for the figure and its axes.
- Define the area to plot (range of latitudes and longitudes) and the projection with Basemap. My example uses the Miller Cylindrical Projection mill, but there are many more projections to choose from.
- I wanted a nice visual image as the background map of the UK. drawlsmask creates a simple background with land and ocean in any colour you prefer. Many different backgrounds are available.
- To plot the icons for each location, loop over the cities, read the png icon using urllib, and add the icon to the first map.
- The temperature map has a colour scale, which is defined with plt.get_cmap(). Choose your own at matplotlib.org.
- To create the temperature map, loop over all cities again. To generate the coloured boxes, this code normalizes the temperature between 0 and 30 degrees. bbox_props defines the shape and colour of the box around each temperature.
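The colour scale and the coloured boxes from the last two steps can be sketched without Basemap, using plain matplotlib axes. The colour map coolwarm, the box style and the helper name box_props are my own assumptions for illustration. On a real map you would place the text at the projected coordinates of each city.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

cmap = plt.get_cmap("coolwarm")               # assumed choice of colour map
norm = Normalize(vmin=0, vmax=30, clip=True)  # 0 degrees -> 0.0, 30 degrees -> 1.0


def box_props(temp):
    """Return the properties of a rounded box coloured by temperature."""
    colour = cmap(norm(temp))                 # RGBA tuple from the colour map
    return dict(boxstyle="round,pad=0.3", fc=colour, ec="black", lw=1)


# Draw one made-up temperature as a coloured label on plain axes.
fig, ax = plt.subplots()
ax.text(0.5, 0.5, "15", ha="center", va="center", bbox=box_props(15))
```

Because clip=True is set, temperatures below 0 or above 30 degrees get the colour at the end of the scale instead of falling outside it.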
Now you are a weather forecaster! You learned how to load weather data into a Python notebook and use it to create forecast graphs and maps. Explore further by changing the locations to create maps for other regions, then customize with your own layout and colours.
By Margriet Groenendijk