Reading and writing netCDF-format files

Matlab provides both low- and high-level scripts for reading and writing netCDF files. One of the high-level scripts is "ncinfo" which will read the header of the file, and give you the variable names, dimensions, and attributes for the file. Alternatively you can download the command line utility "ncdump" in the NetCDF library available here. The utility "ncdump" will print the header, a single variable, or the entire file. If you are new to reading netCDF or other self-describing format files, the ability to find out the variable names, dimensions, and attributes in a file before you try a script or utility is invaluable.

fnin.nc is the input filename fnout.nc is the output filename finfo = ncinfo( fnin.nc ); ncwriteschema( fnout.nc, finfo); "ncwriteshema" writes the header information, but no data to fnout.nc. My four-dimensional variable is fdat where fdat = fdat(time,level,lat,lon) from ncdump of fnin.nc. The following lines write the longitude, latitude, vertical dimension, and time ncwrite( fnout.nc, 'lon', longitudes_variable ) ncwrite( fnout.nc, 'lat', latitudes_variable ) ncwrite( fnout.nc, 'level', vertical_variable ) ncwrite( fnout.nc, 'time', time_variable ) A source of error here is that the variable type: double, integer, short, ... has to match what the fnout.nc file header expects. nx = number of longitude points ny = number of latitude points nz = number of vertical points nt = number of time points fdat = ncread( fnin.nc, variablename, [ 1 1 1 icnt ], [ nx ny nz 1 ] ); reads a data volume for the icnt'th time Note: fdat is dimensioned * fdat(lon,lat,level,time) inside the MATLAB session * fdat(time,level,lat,lon) in ncdump output (outside of the matlab session) ncwrite( fnout.nc, variablename, fdat ); writes all of fdat ncwrite( fnout.nc, variablename, fdat, [ 1 1 1 icnt ] ); where fdat is dimensioned fdat(lon,lat,level) inside the Matlab session ncwrite( fnout.nc, variablename, fdat(icnt,:,:,:), [ 1 1 1 icnt ] ); also works, where fdat is dimensioned fdat(time,lon,lat,level).If the variable has missing values, you define the FillValue in nccreate: " 'FillValue', value ". Do not use NAN as the FillValue. The Matlab documentation uses NaN, but netCDF operators (NCOs) may or may not interpret the NAN correctly. Instead use a number for the missing value that won't be found in your data.Example 2: Writing a 4-dimensional variable into a file from which there is no other file to copy header information from. A script that does much of the following. Save yourself some typing ! nccreate( 'fnout.nc', 'w', 'Dimensions', { 'x', 211, 'y', 121, 'level', ... 50, 'time', inf } );

ncwrite( 'fnout.nc', 'w', w ); ncwriteatt( 'fnout.nc', 'w', 'long_name', 'vertical velocity' ); ncwriteatt( 'fnout.nc', 'w', 'units', 'Pa/s' ); nccreate( 'fnout.nc', 'lon', 'Dimensions', { 'x', 211, 'y', 121 }, ... 'DataType', 'single' ); ncwrite( 'fnout.nc', 'lon', xgrid2 ); % "xgrid2" is the 2-dimensional matrix of longitudes ncwriteatt( 'fnout.nc', 'lon', 'long_name', 'longitude' ); ncwriteatt( 'fnout.nc', 'lon', 'units', 'degrees_east' ); nccreate( 'fnout.nc', 'lat', 'Dimensions', { 'x', 211, 'y', 121 }, ... 'DataType', 'single' ); ncwrite( 'fnout.nc', 'lat', ygrid2 ); % "ygrid2" is the 2-dimensional matrix of latitudes ncwriteatt( 'fnout.nc', 'lat', 'long_name', 'latitude' ); ncwriteatt( 'fnout.nc', 'lat', 'units', 'degrees'_north ); nccreate( 'fnout.nc', 'time', 'Dimensions', { 'time', inf } ); ncwrite( 'fnout.nc', 'time', time_value ); % a vector of time values relative to a reference time ncwriteatt( 'fnout.nc', 'time', 'long_name', 'time' ); ncwriteatt( 'fnout.nc', 'time', 'units', ... 'seconds since 1970-01-01 00:00:00.0 0:00' );

Your MATLAB code will need both the MexCDF and MexEPS libraries to run these scripts. I apologize for this inconvenience, but I am a scientist and not a programmer, so I am forced to use everyone elses libraries. Use the following code at your own risk.

The scripts comply with the netCDF standard of the Cooperative Ocean-Atmosphere Research Data Service (COARDS). The metadata information (header) for a simple file written to COARDS specifications is here.

The COARDS standard is for time to be written as "so many units of time since some reference time." Writing and reading time in this way complies with the udunits standard, and keeping track of leap years and historical changes in calendars requires the MexEPS libraries, which are expertly managed by Willa Zhu of NOAA PMEL. The "nc_read.m" script described below will also read files where time is written as "yyyymmddhhmmss" (year, month, day, ...), which is an older in-house convention. The "nc_write.m" script writes the time variable in the COARDS fashion.

**Reading a map:**

[ data, i4, r4, nam, xgrid, ygrid ] = nc_read( filename, record);

where:

"data" is a vector of the data,

"i4, r4" are vectors containing metadata: number of latitude and longitude points, first latitude point, etc.,

"nam" (optional) variable name

"xgrid, ygrid" (optional) vectors of longitude and latitude points, respectively

"filename" name of file to be read in

"record" number of the record to be read in

The header of the MATLAB script explains these variables in more detail.

nc_read.m -- the MATLAB script.

**Writing a map:**

nc_write( filename, record, data, nam, i4, r4 );

where

"filename" is the output filename,

"record" is the record number to write,

"data" is a vector containing the data,

"nam" a variable name or units information,

"i4, r4" are vectors containing metadata: number of latitude and longitude points, first latitude point, etc.,

The header of the MATLAB script explains these variables in more detail.

nc_write.m -- the MATLAB script.

Converting 3-dimensional arrays to 2-dimensions

Matlab now handles arrays with greater than 2 dimensions, which is useful for calculating means along latitude or longitude circles or spatial differences. The great majority of calculations, however, treat each grid point as an independent timeseries, and it is simpler to manipulate the data as a two-dimensional matrix, A(t,x). The following 2 lines will reduce an input 3-dimensional array to 2-dimensions.

Consider A = A(nt,ny,nx) where nt, ny, nx are the number of
time, latitude, and longitude points, respectively.

```
A = permute( A, [ 3 2 1 ] );
```

"A" will be dimensioned A(nt,nx*ny), and the second index will first
span the first latitude circle and then the second latitude circle,
and so forth.

A = squeeze( reshape( A, nx*ny,nt,1 ) )' ;

threetotwo.m -- the MATLAB script.

Filling partial years of data in arrays and time series

For some reason, datasets and time series are still written for partial years of data. In a world of gigabytes and terabytes, people are still worried about bytes and kilobytes. The following script will append missing values ("NaN"s) to time series or data arrays to make a complete year of data. The script can handle monthly or other time resolution data (but not Leap Days for now --- it would be easy to do).

fill_year.m -- the MATLAB script.

Saving data in ascii format

It is sometimes useful to save data in ascii format, and this is especially true for climate indices where you might want to look at values for selected periods without having to ingest data into software.

As of version R2012b, and possibly a little before, this is accomplished in Matlab with

`save filename.ascii -ascii variable_name(s)`

where "variable_name(s)" have to be in double precision. The error message for an incorrect variable type is cryptic.

Interpolating 3-dimensional data

Interpolating 3-dimensional data is straightforward, but having an example to follow can save you a lot of time!

I read in a dataset whose ncdump has a variable to interpolate, A = A(time,level,y,x), lon = lon(y,x), lat = lat(y,x), and level = level(z). The dataset I read is not an equal angle dataset.

Read in a 3-dimensional volume for the icnt'th time of the 4-dimensional dataset.

fdat = ncread( filename, A, [ 1 1 1 icnt ], [ nx ny nz 1 ] ); whos fdat % nx * ny * nz % griddata wants vector input lon2 = reshape( lon, nx*ny, 1 ); lon2 = reshape( lon2*ones(1,nz), ny*nx*nz, 1 ); lat2 = reshape( lat, nx*ny, 1 ); lat2 = reshape( lat2*ones(1,nz), ny*nx*nz, 1 ); zgrid2 = reshape( ones(nx*ny,1)*level(1:nz)', ny*nx*nz, 1 ); fdat2 = reshape( fdat, nx*ny*nz, 1, 1 ); % Define the output grid. These variables are in 3 dimensions. [ lon3, lat3, zgrid3 ] = meshgrid( [ -126+1/6/2:1/6:-113-1/6/2 ], ... [ 43-1/6/2:-1/6:31+1/6/6 ], level(1:nz) ); % ny * nx * nz fdat3 = griddata( lon2, lat2, zgrid2, fdat2, lon3, lat3, zgrid3 ); whos fdat3 % ny * nx * nz

September 2013 Todd Mitchell ( mitchell@atmos.washington.edu )

/home/disk/margaret/jisao/data_sets JISAO data