Reading and writing netCDF-format files
Matlab provides both low- and high-level scripts for reading and
writing netCDF files. One of the high-level scripts is "ncinfo" which
will
read the header of the file, and give you the variable
names,
dimensions, and attributes for the file. Alternatively you can download the
command line utility "ncdump" in the NetCDF
library available
here. The utility "ncdump" will print the header, a single variable, or the entire
file. If you are new to reading netCDF or other self-describing
format files, the ability to find out the variable names, dimensions,
and attributes in a file before you try a script or utility is invaluable.
The Matlab provided high-level scripts are "nccreate", "ncdisp",
"ncinfo", "ncread", "ncreadatt", "ncwrite", "ncwriteatt", and "ncwriteschema".
Type "help command_name" to get information on how to use the script.
I have found the Matlab WWW documentation to be very good.
Example: The following MATLAB commands were used to write a netCDF file for a
4-dimensional variable with MATLAB version 2012b. These lines should give you a good idea of
what MATLAB wants.
I start with an existing netCDF-format file, and copy the header
information to a new file. In the absence of a netCDF file to copy
the header from, writing the header information (variable and attribute
names and dimensions) is accomplished with nccreate and ncwriteatt I
believe.
fnin.nc is the input filename
fnout.nc is the output filename
finfo = ncinfo( fnin.nc );
ncwriteschema( fnout.nc, finfo);
"ncwriteshema" writes the header information, but no data to fnout.nc.
My four-dimensional variable is fdat where fdat = fdat(time,level,lat,lon) from ncdump of fnin.nc.
The following lines write the longitude, latitude, vertical dimension,
and time
ncwrite( fnout.nc, 'lon', longitudes_variable )
ncwrite( fnout.nc, 'lat', latitudes_variable )
ncwrite( fnout.nc, 'level', vertical_variable )
ncwrite( fnout.nc, 'time', time_variable )
A source of error here is that the variable type: double, integer,
short, ... has to match what the fnout.nc file header expects.
nx = number of longitude points
ny = number of latitude points
nz = number of vertical points
nt = number of time points
fdat = ncread( fnin.nc, variablename, [ 1 1 1 icnt ], [ nx ny nz 1 ] );
reads a data volume for the icnt'th time
Note: fdat is dimensioned
* fdat(lon,lat,level,time) inside the MATLAB session
* fdat(time,level,lat,lon) in ncdump output (outside of the matlab session)
ncwrite( fnout.nc, variablename, fdat );
writes all of fdat
ncwrite( fnout.nc, variablename, fdat, [ 1 1 1 icnt ] );
where fdat is dimensioned fdat(lon,lat,level) inside the Matlab session
ncwrite( fnout.nc, variablename, fdat(icnt,:,:,:), [ 1 1 1 icnt ] );
also works, where fdat is dimensioned fdat(time,lon,lat,level).
Example 2: Writing a 4-dimensional variable into a file from which there is no other file to copy header information from.
A script that does much of the following. Save yourself some typing !
nccreate( 'fnout.nc', 'w', 'Dimensions', { 'x', 211, 'y', 121, 'level', ...
50, 'time', inf } );
If the variable has missing values, you define the FillValue in
nccreate: " 'FillValue', value ". Do not use NAN as the
FillValue. The Matlab documentation uses NaN, but netCDF operators (NCOs) may or may not interpret the NAN correctly. Instead
use a number for the missing value that won't be found in your data.
ncwrite( 'fnout.nc', 'w', w );
ncwriteatt( 'fnout.nc', 'w', 'long_name', 'vertical velocity' );
ncwriteatt( 'fnout.nc', 'w', 'units', 'Pa/s' );
nccreate( 'fnout.nc', 'lon', 'Dimensions', { 'x', 211, 'y', 121 }, ...
'DataType', 'single' );
ncwrite( 'fnout.nc', 'lon', xgrid2 ); % "xgrid2" is the 2-dimensional matrix of longitudes
ncwriteatt( 'fnout.nc', 'lon', 'long_name', 'longitude' );
ncwriteatt( 'fnout.nc', 'lon', 'units', 'degrees_east' );
nccreate( 'fnout.nc', 'lat', 'Dimensions', { 'x', 211, 'y', 121 }, ...
'DataType', 'single' );
ncwrite( 'fnout.nc', 'lat', ygrid2 ); % "ygrid2" is the 2-dimensional matrix of latitudes
ncwriteatt( 'fnout.nc', 'lat', 'long_name', 'latitude' );
ncwriteatt( 'fnout.nc', 'lat', 'units', 'degrees'_north );
nccreate( 'fnout.nc', 'time', 'Dimensions', { 'time', inf } );
ncwrite( 'fnout.nc', 'time', time_value ); % a vector of time values relative to a reference time
ncwriteatt( 'fnout.nc', 'time', 'long_name', 'time' );
ncwriteatt( 'fnout.nc', 'time', 'units', ...
'seconds since 1970-01-01 00:00:00.0 0:00' );
Your MATLAB code will need both the MexCDF and MexEPS libraries to run these scripts. I apologize for this inconvenience, but I am a scientist and not a programmer, so I am forced to use everyone elses libraries. Use the following code at your own risk.
The scripts comply with the netCDF standard of the Cooperative Ocean-Atmosphere Research Data Service (COARDS). The metadata information (header) for a simple file written to COARDS specifications is here.
The COARDS standard is for time to be written as "so many units of time since some reference time." Writing and reading time in this way complies with the udunits standard, and keeping track of leap years and historical changes in calendars requires the MexEPS libraries, which are expertly managed by Willa Zhu of NOAA PMEL. The "nc_read.m" script described below will also read files where time is written as "yyyymmddhhmmss" (year, month, day, ...), which is an older in-house convention. The "nc_write.m" script writes the time variable in the COARDS fashion.
Reading a map:
[ data, i4, r4, nam, xgrid, ygrid ] = nc_read( filename, record);
where:
"data" is a vector of the data,
"i4, r4" are vectors containing metadata: number of latitude and longitude points, first latitude point, etc.,
"nam" (optional) variable name
"xgrid, ygrid" (optional) vectors of longitude and latitude points, respectively
"filename" name of file to be read in
"record" number of the record to be read in
The header of the MATLAB script explains these variables in more detail.
nc_read.m -- the MATLAB script.
Writing a map:
nc_write( filename, record, data, nam, i4, r4 );
where
"filename" is the output filename,
"record" is the record number to write,
"data" is a vector containing the data,
"nam" a variable name or units information,
"i4, r4" are vectors containing metadata: number of latitude and longitude points, first latitude point, etc.,
The header of the MATLAB script explains these variables in more detail.
nc_write.m -- the MATLAB script.
Converting 3-dimensional arrays to 2-dimensions
Matlab now handles arrays with greater than 2 dimensions, which is
useful for calculating means along latitude or
longitude circles or spatial differences. The great majority of
calculations, however, treat each grid point as an independent
timeseries, and it is simpler to manipulate the data as a
two-dimensional matrix, A(t,x). The following 2 lines will reduce an input
3-dimensional array to 2-dimensions.
Consider A = A(nt,ny,nx) where nt, ny, nx are the number of
time, latitude, and longitude points, respectively.
threetotwo.m -- the MATLAB script.
For some reason, datasets and time series are still written for partial
years of data. In a world of
gigabytes and terabytes, people are still worried about bytes and
kilobytes. The following script will append missing values ("NaN"s)
to time series or data arrays to make a complete year of data. The
script can handle monthly or other time resolution data (but not Leap
Days for now --- it would be easy to do).
fill_year.m -- the MATLAB script.
It is sometimes useful to save data in ascii format, and this is
especially true for climate indices where you might want to look at
values for selected periods without having to ingest data into
software.
As of version R2012b, and possibly a little before, this is
accomplished in Matlab with
where "variable_name(s)" have to be in double precision. The error
message for an incorrect variable type is cryptic.
Interpolating 3-dimensional data is straightforward,
but having an example to follow can save you a lot of time!
I read in a dataset whose ncdump has a variable to interpolate, A =
A(time,level,y,x), lon = lon(y,x), lat = lat(y,x), and level = level(z). The dataset I
read is not an equal angle dataset.
Read in a 3-dimensional volume for the icnt'th time of the 4-dimensional dataset.
A = permute( A, [ 3 2 1 ] );
"A" will be dimensioned A(nt,nx*ny), and the second index will first
span the first latitude circle and then the second latitude circle,
and so forth.
A = squeeze( reshape( A, nx*ny,nt,1 ) )' ;
Filling partial years of data in arrays and time series
Saving data in ascii formatsave filename.ascii -ascii variable_name(s)
Interpolating 3-dimensional data
fdat = ncread( filename, A, [ 1 1 1 icnt ], [ nx ny nz 1 ] );
whos fdat
% nx * ny * nz
% griddata wants vector input
lon2 = reshape( lon, nx*ny, 1 );
lon2 = reshape( lon2*ones(1,nz), ny*nx*nz, 1 );
lat2 = reshape( lat, nx*ny, 1 );
lat2 = reshape( lat2*ones(1,nz), ny*nx*nz, 1 );
zgrid2 = reshape( ones(nx*ny,1)*level(1:nz)', ny*nx*nz, 1 );
fdat2 = reshape( fdat, nx*ny*nz, 1, 1 );
% Define the output grid. These variables are in 3 dimensions.
[ lon3, lat3, zgrid3 ] = meshgrid( [ -126+1/6/2:1/6:-113-1/6/2 ], ...
[ 43-1/6/2:-1/6:31+1/6/6 ], level(1:nz) );
% ny * nx * nz
fdat3 = griddata( lon2, lat2, zgrid2, fdat2, lon3, lat3, zgrid3 );
whos fdat3
% ny * nx * nz
September 2013
Todd Mitchell ( mitchell@atmos.washington.edu
)
/home/disk/margaret/jisao/data_sets
JISAO data