Tutorial: Fetch and Load
Note: You can download an executable version (Jupyter Notebook) of this tutorial here.
Fetching and Loading Environmental Data
In this tutorial, we will use Kadlu to retrieve environmental data from online sources and load the data into numpy arrays for further processing.
The first step is to import kadlu. The datetime module from the Python standard library is used to specify dates and times:
from datetime import datetime
import kadlu
Quick start guide
With Kadlu, environmental data can be downloaded and stored in one step. Here, we demonstrate how to obtain modeled surface salinity data from HYCOM for the geographic region $47^{\circ}$N to $49^{\circ}$N and $63^{\circ}$W to $61^{\circ}$W for the first week of January 2013.
# fetch and load salinity (g/kg salt in water)
salinity, lat, lon, epoch, depth = kadlu.load(
    source='hycom', var='salinity',
    south=47, west=-63,
    north=49, east=-61,
    bottom=0, top=0,
    start=datetime(2013, 1, 1), end=datetime(2013, 1, 7))
Note how the arguments bottom and top are both set to 0, thereby selecting only data at a depth of 0 m, i.e., at the surface.
The load function produced flattened numpy arrays, the length of which corresponds to the number of data points in the selected geographic region, depth range, and temporal window.
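Because the arrays are flat, index i refers to a single data point: salinity[i] was sampled at lat[i], lon[i], depth[i], and epoch[i]. The following is a minimal sketch of one way to rearrange such values onto a regular latitude-longitude grid with numpy. It uses small synthetic arrays in place of real Kadlu output, the helper name to_grid is hypothetical (not part of Kadlu), and it assumes a single time step and depth with each (lat, lon) pair appearing at most once.

```python
import numpy as np

def to_grid(values, lat, lon):
    """Arrange flat values on a 2D (lat, lon) grid; unfilled cells are NaN."""
    # unique sorted coordinates, plus the index of each point along each axis
    lat_axis, lat_idx = np.unique(np.ravel(lat), return_inverse=True)
    lon_axis, lon_idx = np.unique(np.ravel(lon), return_inverse=True)
    grid = np.full((lat_axis.size, lon_axis.size), np.nan)
    grid[lat_idx, lon_idx] = np.ravel(values)
    return grid, lat_axis, lon_axis

# synthetic stand-ins for the flat arrays returned by a load call
lat = np.array([47.0, 47.0, 48.0, 48.0])
lon = np.array([-63.0, -62.0, -63.0, -62.0])
salinity = np.array([31.0, 31.5, 30.8, 31.2])

grid, lat_axis, lon_axis = to_grid(salinity, lat, lon)
print(grid.shape)   # rows follow lat_axis, columns follow lon_axis
print(grid)
```

For data spanning several time steps or depths, one would first select the points of interest with a boolean mask (for example, epoch == epoch[0]) before gridding.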
# print the first 10 values of each array
print(lat[0:10]) # latitude (degrees north)
print(lon[0:10]) # longitude (degrees east; negative values are west)
print(depth[0:10]) # depth (meters)
print(epoch[0:10]) # time (hours since 00:00:00 on 1 January 2000)
print(salinity[0:10]) # ocean salt content (g/kg)
We can use the epoch_2_dt function to convert the time values into a more human-friendly date-time format:
print(kadlu.epoch_2_dt(epoch[0]))
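To make the time convention concrete, here is a minimal sketch of the same kind of conversion using only the Python standard library. It assumes the epoch values are hours since 00:00:00 on 1 January 2000, as stated in the comment of the printing example; the helper names hours_to_datetime and datetime_to_hours are hypothetical and are not Kadlu's implementation.

```python
from datetime import datetime, timedelta

# Assumed reference time: 00:00:00 on 1 January 2000
REFERENCE = datetime(2000, 1, 1)

def hours_to_datetime(hours):
    """Convert hours since the reference time to a datetime object."""
    return REFERENCE + timedelta(hours=float(hours))

def datetime_to_hours(dt):
    """Convert a datetime object to hours since the reference time."""
    return (dt - REFERENCE).total_seconds() / 3600

# round trip for the start of the query window used in this tutorial
hours = datetime_to_hours(datetime(2013, 1, 1))
print(hours)
print(hours_to_datetime(hours))
```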
Data sources
Kadlu includes functionality to load data from a variety of different data sources. For a high-level overview, print the source_map:
print(kadlu.source_map)
For more information on a specific source, print the class:
print(kadlu.hycom())
Keyword arguments can be passed as a dictionary when using the same load arguments for multiple data types:
kwargs = dict(
    south=47, west=-63,
    north=49, east=-61,
    bottom=0, top=0,
    start=datetime(2013, 1, 1), end=datetime(2013, 1, 7))
bathy1, lat1, lon1 = kadlu.load(source='gebco', var='bathymetry', **kwargs)
waveheight2, lat2, lon2, epoch2 = kadlu.load(source='era5', var='waveheight', **kwargs)
Manual loading from netcdf and geotiff
Kadlu can load arbitrary netcdf- and geotiff-formatted data using the functions load_netcdf_2D and load_geotiff_2D. For netcdf files, the data must contain three variables, two of which are named 'lat' and 'lon'. Kadlu assumes that the X and Y axes are specified in coordinate degrees.
kwargs = dict(south=47, west=-63, north=49, east=-61)
bathy3, lat3, lon3 = kadlu.load_netcdf_2D(filename='/storage/gebco_bathy.nc', **kwargs)
When loading from an arbitrary netcdf file, any data transformation must be done by the user. For example, when loading GEBCO netcdf data directly from the file instead of using the gebco().load_bathymetry function, bathymetric values are returned as elevation (equal to depth * -1):
# returns a 2D array of elevation values
bathy3
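The sign flip described above can be sketched with numpy as follows. This is only an illustration: a small synthetic array stands in for the elevation values loaded from file, and the variable names and the optional land mask are assumptions, not part of Kadlu.

```python
import numpy as np

# synthetic elevation values in meters (negative below sea level)
elevation = np.array([[-120.0, -85.5],
                      [-40.0, 12.0]])

# depth is the negative of elevation, so ocean cells become positive
depth = -elevation

# optionally blank out land cells, where elevation is above sea level
depth_ocean = np.where(elevation < 0, depth, np.nan)

print(depth)
print(depth_ocean)
```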