JISAO data

MATLAB commands for reading Hierarchical Data Format (HDF) files


HDF is a self-describing data format that is similar to netCDF. The National Aeronautics and Space Administration (NASA) uses HDF for many of its data sets. HDF files can be written with a number of different "data models"; for example, HDF is apparently a popular way to store images. I am describing the "sd" (scientific data set) data model. The most basic information on how MATLAB reads HDF can be obtained by typing help hdf in a MATLAB session.

Open a buffer for the input file.
sd_id = hdfsd( 'start', 'filename.hdf', 'rdonly' )

Obtain information on the number of data sets and attributes in the file.
[ndatasets,nglobal_attr,status] = hdfsd('fileinfo',sd_id)
"ndatasets" is the number of data sets. ndatasets=1 for the NASA SeaWiFS chlorophyll data.

You next need to read the "attributes" of the file. The attributes can include information on the units of the data, how to unpack the data, the spatial domain of the data (the northernmost, southernmost, westernmost, and easternmost limits of the data), and so forth. The number of longitude and latitude grid points is not among the attributes; it is returned by the 'getinfo' call made when you read the data. An example of how data is packed is sea-level pressure, where a value of 1020.5 mb might be stored as 2050, and you will need to divide by 100 and add 1000 to recover the data value.
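The unpacking arithmetic in the sea-level pressure example can be sketched as follows. The packed values, the divisor of 100, and the offset of 1000 are illustrative assumptions taken from the example above; a real file states its own packing rule in its attributes.

```matlab
% Hypothetical packed values, as they might be stored in a file (16-bit integers).
packed = int16([2050 1875 2310]);

% Assumed packing rule from the example above: divide by 100, then add 1000.
scale_divisor = 100;    % assumption, for illustration only
add_offset    = 1000;   % assumption, for illustration only

% Convert to double first; dividing an int16 by 100 would round to an integer.
slp_mb = double(packed) / scale_divisor + add_offset;

disp(slp_mb)
```

Converting to double before dividing matters here, because MATLAB integer arithmetic rounds the result back to an integer.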

The HDF files, or maybe it's just the MATLAB routines to read these files, employ indices that begin with 0. This convention is associated with the C programming language. You will see that convention in the following use of "icnt" and also in "ds_start". This practice is in contrast with FORTRAN-based counting, and with MATLAB's own array indexing, where indices begin with 1.
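As a small illustration of the offset between the two conventions, the sketch below maps 1-based MATLAB positions onto the 0-based indices that the hdfsd calls expect. The count of 4 is a made-up value, used only so that the example runs without a file.

```matlab
% Hypothetical number of attributes, for illustration only.
nglobal_attr = 4;

% MATLAB counts positions from 1 ...
matlab_positions = 1:nglobal_attr;

% ... while the hdfsd calls expect indices counted from 0.
hdf_indices = matlab_positions - 1;

disp([matlab_positions; hdf_indices])
```

This is why the loops on this page run from 0 to the count minus 1.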

The following code will read all of the attribute names and their values.


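for icnt = 0: nglobal_attr-1
% 'attrinfo' returns the name, data type, and length of attribute number icnt.
[attribute_name, attribute_type, attribute_count, status] = hdfsd('attrinfo', sd_id, icnt)
% 'readattr' returns the value of the same attribute.
[attribute_value, status] = hdfsd('readattr', sd_id, icnt)
end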

For the data I was reading (SeaWiFS chlorophyll), the attributes were written as single precision numbers, and I needed to convert them to double precision in order to do math on them. If you find that MATLAB complains when you try even the simplest mathematical manipulation, this is most likely the problem.
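A minimal sketch of that conversion is given below. The variable name and its value are hypothetical stand-ins for an attribute returned by 'readattr'.

```matlab
% Hypothetical attribute value, stored as single precision.
attribute_value = single(0.015);

% Check the class before doing arithmetic.
disp(class(attribute_value))

% Convert to double precision only if it is not double already.
if ~isa(attribute_value, 'double')
    attribute_value = double(attribute_value);
end

disp(class(attribute_value))
```

The same check works for integer-valued attributes, because double() accepts any numeric class.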

Read the data set(s):


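for icnt = 0: ndatasets-1
sds_id = hdfsd( 'select', sd_id, icnt )
[ds_name, ds_ndims, ds_dims, ds_type, ds_atts, stat] = hdfsd('getinfo',sds_id);
ds_start = zeros(1,ds_ndims); % start at index 0 along every dimension
ds_stride = [];
ds_edges = ds_dims; % read the full length of every dimension
[ds_data, status] = hdfsd('readdata',sds_id,ds_start,ds_stride,ds_edges); % "ds_data" is the data.
hdfsd('endaccess',sds_id); % release this data set before selecting the next one
end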
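Once the grid dimensions are known, they can be combined with the domain limits from the attributes to build longitude and latitude vectors. The sketch below is only an illustration: the limits, the grid size, the assumption that the limits are the outer edges of equally spaced grid cells, and the north-to-south ordering are all hypothetical and should be checked against the documentation of the data set.

```matlab
% Hypothetical domain limits, as they might be read from the file attributes.
west = -180;  east = 180;    % degrees longitude (assumed)
south = -90;  north = 90;    % degrees latitude (assumed)

% Hypothetical grid size, standing in for the values returned in ds_dims.
nlon = 8;
nlat = 4;

% Assume equally spaced cells whose outer edges coincide with the limits.
dlon = (east - west) / nlon;
dlat = (north - south) / nlat;

% Cell-center coordinates; latitude is assumed to run from north to south.
lon = west  + dlon/2 + dlon*(0:nlon-1);
lat = north - dlat/2 - dlat*(0:nlat-1);

disp(lon)
disp(lat)
```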

Close the file buffer when you are finished.
hdfsd('end',sd_id);



November 2010
Todd Mitchell ( mitchell@atmos.washington.edu )