How to get historical weather data (min temp, max temp and precipitation) directly from NOAA (National Oceanic and Atmospheric Administration) using Python – Part 1 (Downloading netCDF files)

NOAA is a US government agency that forecasts weather and monitors oceanic and atmospheric conditions. It is one of the largest weather agencies in the world, and many weather apps and websites build on its data. In this article I will explore how to get historical weather data directly from NOAA. NOAA also provides forecast data, which we will discuss in an upcoming article.

Historical weather data is useful for analyzing patterns, finding correlations with crop production, and building predictive models for commodity futures pricing. This data is highly valuable and is freely available to use.

In this series of articles we will go into the details of how to get this data and how to process the downloaded files.

Data Source

NOAA publishes its weather-related datasets at the following URL:

https://downloads.psl.noaa.gov/Datasets/

There are multiple datasets, but for this article we are only concerned with the global temperature and precipitation data.

Link to Global Temperature data:

https://downloads.psl.noaa.gov/Datasets/cpc_global_temp/

Link to Global Precipitation data:

https://downloads.psl.noaa.gov/Datasets/cpc_global_precip/

Each of the two directories above contains one file per year of data. NOAA updates the file for the current year every day. Going by the timestamps in the directory listing when I last checked, the update happens around 16:30, and the data lags the present by about two days.
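Because of this lag, the year of the file to request can be derived from the date instead of being hard-coded. The following is only a minimal sketch, assuming the two-day lag mentioned above holds; the helper name `latest_available_date` and the `lag_days` parameter are illustrative and not part of any NOAA API.

```python
from datetime import date, timedelta


def latest_available_date(today: date, lag_days: int = 2) -> date:
    """Return the most recent date expected to be present in the files.

    lag_days=2 is an assumption based on the lag described in this
    article; adjust it if NOAA's publishing schedule differs.
    """
    return today - timedelta(days=lag_days)


# Example usage:
# latest = latest_available_date(date.today())
# print(f"Latest expected date: {latest}, file year: {latest.year}")
```

Note that on the first two days of January this points at the previous year's file, since the new year's data may not have been published yet.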

Data Format

The data is provided in netCDF-4 format, with the file extension .nc. We will need to keep this in mind when processing the files.
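The .nc extension alone does not say which netCDF variant a file uses. NetCDF-4 files are stored as HDF5 containers and start with the HDF5 signature, while the older classic variants start with the bytes `CDF`. A small sketch that checks those leading bytes using only the standard library; the function name `detect_netcdf_kind` is hypothetical, and the check only looks at the very start of the file.

```python
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"  # signature of HDF5 files, which netCDF-4 builds on
CLASSIC_MAGIC = b"CDF"             # prefix of classic netCDF variants


def detect_netcdf_kind(path: str) -> str:
    """Guess the netCDF variant of a file from its first bytes.

    Returns "netcdf4", "classic" or "unknown". An HDF5 signature does not
    strictly prove the file is netCDF-4, so treat the result as a hint.
    """
    with open(path, "rb") as fh:
        header = fh.read(8)
    if header.startswith(HDF5_MAGIC):
        return "netcdf4"
    if header.startswith(CLASSIC_MAGIC):
        return "classic"
    return "unknown"


# Example usage, once a file has been downloaded:
# print(detect_netcdf_kind("tmax.2022.nc"))
```

A check like this can also catch a failed download, for example an HTML error page saved under a .nc name, which would come back as "unknown".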

Data Download using Python

Use the code below to download the netCDF files for global max temp, min temp and precipitation to your local machine. It relies on the third-party wget package for Python (pip install wget).

import os

import wget  # third-party package: pip install wget

base_url = 'https://downloads.psl.noaa.gov/Datasets/'
precip_dir = 'cpc_global_precip'
temp_dir = 'cpc_global_temp'
latest_year = 2022

output_dir = './'

for var in ['tmin', 'tmax', 'precip']:
    file_name = f'{var}.{latest_year}.nc'
    output_file_path_name = os.path.join(output_dir, file_name)
    print(f'Downloading: {file_name} to {output_file_path_name}')
    # Precipitation has its own dataset folder; tmin and tmax share one.
    dataset_dir = precip_dir if var == 'precip' else temp_dir
    download_url = f'{base_url}{dataset_dir}/{file_name}'
    print(download_url)
    wget.download(download_url, output_file_path_name)

This script should finish in a few seconds on a fast connection and will download the tmax, tmin and precip files to your local machine. You should end up with tmin.2022.nc, tmax.2022.nc and precip.2022.nc in the output directory.
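The script above fetches a single year. For a longer history, the same URL pattern can be looped over a range of years. The following is a sketch using only the standard library (`urllib.request.urlretrieve` in place of wget), assuming that earlier years follow the same naming pattern as the 2022 files; the function names `build_url` and `download_years`, the example years and the `./noaa_data` folder are all illustrative.

```python
import os
import urllib.request

BASE_URL = "https://downloads.psl.noaa.gov/Datasets/"
# Maps each variable to the dataset folder used in this article.
DATASET_DIRS = {
    "tmin": "cpc_global_temp",
    "tmax": "cpc_global_temp",
    "precip": "cpc_global_precip",
}


def build_url(var: str, year: int) -> str:
    """Build the download URL for one variable and one year."""
    return f"{BASE_URL}{DATASET_DIRS[var]}/{var}.{year}.nc"


def download_years(first_year: int, last_year: int, output_dir: str = ".") -> None:
    """Download all three variables for every year in the inclusive range."""
    os.makedirs(output_dir, exist_ok=True)
    for year in range(first_year, last_year + 1):
        for var in DATASET_DIRS:
            target = os.path.join(output_dir, f"{var}.{year}.nc")
            if os.path.exists(target):
                # Skip files already on disk to avoid repeated downloads.
                print(f"Skipping {target} (already present)")
                continue
            url = build_url(var, year)
            print(f"Downloading {url} -> {target}")
            urllib.request.urlretrieve(url, target)


# Example usage (the years are only an illustration):
# download_years(2021, 2022, output_dir="./noaa_data")
```

Skipping existing files is convenient for completed years, but the current year's file changes daily, so delete the local copy (or remove the skip) when you want fresh data.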

Basic Intro to netCDF files

NetCDF is a format for creating, accessing and sharing array-oriented scientific data. It is commonly used in climatology, meteorology and oceanography, and many GIS applications can read and write it.

To quote the netCDF project's site:

“NetCDF (network Common Data Form) is a set of interfaces for array-oriented data access and a freely-distributed collection of data access libraries for C, Fortran, C++, Java, and other languages. The netCDF libraries support a machine-independent format for representing scientific data. Together, the interfaces, libraries, and format support the creation, access, and sharing of scientific data.”

We will need specialized Python libraries such as xarray, rioxarray, rasterstats and geopandas to process these files. We will also need shapefiles, which contain the geographic coordinates of the area we want to extract weather data for. Shapefiles can be easily generated with tools like ArcGIS and QGIS.

Continue Reading

Please follow along in Part 2 and Part 3 of this series, which cover creating shapefiles and extracting the weather data.
