JISAO data

Select MATLAB scripts


  • Reading and writing netCDF-format files
  • Converting 3-dimensional arrays to 2-dimensions
  • Filling partial years of data in arrays and time series
  • Saving data in ascii format
  • Interpolating 3-dimensional data
  • Strings and structures
    
    

    
    
    Reading and writing netCDF-format files

    Matlab provides both low- and high-level scripts for reading and writing netCDF files. One of the high-level scripts is "ncinfo" which will read the header of the file, and give you the variable names, dimensions, and attributes for the file. Alternatively you can download the command line utility "ncdump" in the NetCDF library available here. The utility "ncdump" will print the header, a single variable, or the entire file. If you are new to reading netCDF or other self-describing format files, the ability to find out the variable names, dimensions, and attributes in a file before you try a script or utility is invaluable.

    The Matlab provided high-level scripts are "nccreate", "ncdisp", "ncinfo", "ncread", "ncreadatt", "ncwrite", "ncwriteatt", and "ncwriteschema". Type "help command_name" to get information on how to use the script. I have found the Matlab WWW documentation to be very good.

    Example: The following MATLAB commands were used to write a netCDF file for a 4-dimensional variable with MATLAB version 2012b. These lines should give you a good idea of what MATLAB wants.

    I start with an existing netCDF-format file, and copy the header information to a new file. In the absence of a netCDF file to copy the header from, writing the header information (variable and attribute names and dimensions) is accomplished with nccreate and ncwriteatt I believe.

    fnin.nc  is the input filename
    fnout.nc is the output filename
    
    finfo = ncinfo( fnin.nc );
    ncwriteschema( fnout.nc, finfo);
    "ncwriteshema" writes the header information, but no data to fnout.nc.
    
    My four-dimensional variable is fdat where fdat = fdat(time,level,lat,lon) from ncdump of fnin.nc.  
    
    The following lines write the longitude, latitude, vertical dimension,
    and time
    ncwrite( fnout.nc, 'lon', longitudes_variable )
    ncwrite( fnout.nc, 'lat', latitudes_variable )
    ncwrite( fnout.nc, 'level', vertical_variable )
    ncwrite( fnout.nc, 'time', time_variable )
    A source of error here is that the variable type: double, integer, 
    short, ... has to match what the fnout.nc file header expects.
    
    nx = number of longitude points
    ny = number of latitude points
    nz = number of vertical points
    nt = number of time points
    
    fdat = ncread( fnin.nc, variablename, [ 1 1 1 icnt ], [ nx ny nz 1 ] );
    reads a data volume for the icnt'th time
    Note: fdat is dimensioned
    * fdat(lon,lat,level,time) inside the MATLAB session
    * fdat(time,level,lat,lon) in ncdump output (outside of the matlab session)
    
    ncwrite( fnout.nc, variablename, fdat );
    writes all of fdat 
    
    ncwrite( fnout.nc, variablename, fdat, [ 1 1 1 icnt ] );
    where fdat is dimensioned fdat(lon,lat,level) inside the Matlab session
    
    ncwrite( fnout.nc, variablename, fdat(icnt,:,:,:), [ 1 1 1 icnt ] );
    also works, where fdat is dimensioned fdat(time,lon,lat,level).
    
    
    Example 2:  Writing a 4-dimensional variable into a file from which there is no other file to copy header information from.
    
    A script that does much of the following.  Save yourself some typing !
    
    
    nccreate(   'fnout.nc', 'w', 'Dimensions', { 'x', 211, 'y', 121, 'level', ...
    50, 'time', inf } );
    
    If the variable has missing values, you define the FillValue in nccreate: " 'FillValue', value ". Do not use NAN as the FillValue. The Matlab documentation uses NaN, but netCDF operators (NCOs) may or may not interpret the NAN correctly. Instead use a number for the missing value that won't be found in your data.
    ncwrite(    'fnout.nc', 'w', w );
    ncwriteatt( 'fnout.nc', 'w', 'long_name', 'vertical velocity' );
    ncwriteatt( 'fnout.nc', 'w', 'units', 'Pa/s' );
    
    nccreate(   'fnout.nc', 'lon', 'Dimensions', { 'x', 211, 'y', 121 }, ...
      'DataType', 'single' );
    ncwrite(    'fnout.nc', 'lon', xgrid2 );    % "xgrid2" is the 2-dimensional matrix of longitudes
    ncwriteatt( 'fnout.nc', 'lon', 'long_name', 'longitude' );
    ncwriteatt( 'fnout.nc', 'lon', 'units', 'degrees_east' );
    
    nccreate(   'fnout.nc', 'lat', 'Dimensions', { 'x', 211, 'y', 121 }, ...
      'DataType', 'single' );
    ncwrite(    'fnout.nc', 'lat', ygrid2 );    % "ygrid2" is the 2-dimensional matrix of latitudes 
    ncwriteatt( 'fnout.nc', 'lat', 'long_name', 'latitude' );
    ncwriteatt( 'fnout.nc', 'lat', 'units', 'degrees'_north );
    
    nccreate(   'fnout.nc', 'time', 'Dimensions', { 'time', inf } );
    ncwrite(    'fnout.nc', 'time', time_value );   % a vector of time values relative to a reference time 
    ncwriteatt( 'fnout.nc', 'time', 'long_name', 'time' );
    ncwriteatt( 'fnout.nc', 'time', 'units', ...
      'seconds since 1970-01-01 00:00:00.0 0:00' );
    
    

    
    
    Sophisticated

    The netCDF files written written to the COARDS standard include a time variable (year, month, day, hour, ...), which is nice to have and time consuming to set up. The rest of this page describes libraries that enable time and the other variables to be read more easily.

    Your MATLAB code will need both the MexCDF and MexEPS libraries to run these scripts. I apologize for this inconvenience, but I am a scientist and not a programmer, so I am forced to use everyone elses libraries. Use the following code at your own risk.

    The scripts comply with the netCDF standard of the Cooperative Ocean-Atmosphere Research Data Service (COARDS). The metadata information (header) for a simple file written to COARDS specifications is here.

    The COARDS standard is for time to be written as "so many units of time since some reference time." Writing and reading time in this way complies with the udunits standard, and keeping track of leap years and historical changes in calendars requires the MexEPS libraries, which are expertly managed by Willa Zhu of NOAA PMEL. The "nc_read.m" script described below will also read files where time is written as "yyyymmddhhmmss" (year, month, day, ...), which is an older in-house convention. The "nc_write.m" script writes the time variable in the COARDS fashion.

    Reading a map:
    [ data, i4, r4, nam, xgrid, ygrid ] = nc_read( filename, record);

    where:
    "data" is a vector of the data,
    "i4, r4" are vectors containing metadata: number of latitude and longitude points, first latitude point, etc.,
    "nam" (optional) variable name
    "xgrid, ygrid" (optional) vectors of longitude and latitude points, respectively
    "filename" name of file to be read in
    "record" number of the record to be read in

    The header of the MATLAB script explains these variables in more detail.

    nc_read.m -- the MATLAB script.

    Writing a map:
    nc_write( filename, record, data, nam, i4, r4 );

    where
    "filename" is the output filename,
    "record" is the record number to write,
    "data" is a vector containing the data,
    "nam" a variable name or units information,
    "i4, r4" are vectors containing metadata: number of latitude and longitude points, first latitude point, etc.,

    The header of the MATLAB script explains these variables in more detail.

    nc_write.m -- the MATLAB script.

    -->


    
    
    Converting 3-dimensional arrays to 2-dimensions

    Matlab now handles arrays with greater than 2 dimensions, which is useful for calculating means along latitude or longitude circles or spatial differences. The great majority of calculations, however, treat each grid point as an independent timeseries, and it is simpler to manipulate the data as a two-dimensional matrix, A(t,x). The following 2 lines will reduce an input 3-dimensional array to 2-dimensions.

    Consider A = A(nt,ny,nx) where nt, ny, nx are the number of time, latitude, and longitude points, respectively.
    A = permute( A, [ 3 2 1 ] );
    A = squeeze( reshape( A, nx*ny,nt,1 ) )' ;
    "A" will be dimensioned A(nt,nx*ny), and the second index will first span the first latitude circle and then the second latitude circle, and so forth.

    threetotwo.m -- the MATLAB script.


    
    
    Filling partial years of data in arrays and time series

    For some reason, datasets and time series are still written for partial years of data. In a world of gigabytes and terabytes, people are still worried about bytes and kilobytes. The following script will append missing values ("NaN"s) to time series or data arrays to make a complete year of data. The script can handle monthly or other time resolution data (but not Leap Days for now --- it would be easy to do).

    fill_year.m -- the MATLAB script.


    
    
    Saving data in ascii format

    It is sometimes useful to save data in ascii format, and this is especially true for climate indices where you might want to look at values for selected periods without having to ingest data into software.

    As of version R2012b, and possibly a little before, this is accomplished in Matlab with

    save filename.ascii -ascii variable_name(s)

    where "variable_name(s)" have to be in double precision. The error message for an incorrect variable type is cryptic.


    
    
    Interpolating 3-dimensional data

    Interpolating 3-dimensional data is straightforward, but having an example to follow can save you a lot of time!

    I read in a dataset whose ncdump has a variable to interpolate, A = A(time,level,y,x), lon = lon(y,x), lat = lat(y,x), and level = level(z). The dataset I read is not an equal angle dataset.

    Read in a 3-dimensional volume for the icnt'th time of the 4-dimensional dataset.

    fdat = ncread( filename, A, [ 1 1 1 icnt ], [ nx ny nz 1 ] );
    whos fdat
    % nx * ny * nz
    
    % griddata wants vector input
    lon2 = reshape( lon, nx*ny, 1 );
    lon2 = reshape( lon2*ones(1,nz), ny*nx*nz, 1 );
    lat2 = reshape( lat, nx*ny, 1 );
    lat2 = reshape( lat2*ones(1,nz), ny*nx*nz, 1 );
    zgrid2 = reshape( ones(nx*ny,1)*level(1:nz)', ny*nx*nz, 1 );
    fdat2  = reshape( fdat, nx*ny*nz, 1, 1 );
    
    % Define the output grid.  These variables are in 3 dimensions.
    [ lon3, lat3, zgrid3 ] = meshgrid( [ -126+1/6/2:1/6:-113-1/6/2 ], ...
      [ 43-1/6/2:-1/6:31+1/6/6 ], level(1:nz) );
    % ny * nx * nz
    
    fdat3 = griddata( lon2, lat2, zgrid2, fdat2, lon3, lat3, zgrid3 );
    whos fdat3
    % ny * nx * nz
    
    
    
    

    September 2013
    Todd Mitchell ( mitchell@atmos.washington.edu )
    /home/disk/margaret/jisao/data_sets
    JISAO data