HDF is a self-describing data format that is similar to netCDF. The National Aeronautics and Space Administration (NASA) uses HDF for many of its data sets. HDF files can be written with a number of different "data models"; for example, HDF is apparently a popular way to store images. Here I describe the "sd" (scientific data set) data model. The most basic information on how MATLAB reads HDF can be obtained by typing
help hdf
in a MATLAB session.
Open a buffer for the input file.
sd_id = hdfsd( 'start', 'filename.hdf', 'rdonly' )
Obtain information on the number of data sets and attributes in the file.
[ndatasets,nglobal_attr,status] = hdfsd('fileinfo',sd_id)
"ndatasets" is the number of data sets and "nglobal_attr" is the number of global (file-level) attributes. For the NASA SeaWiFS chlorophyll data, ndatasets=1.
You next need to read the "attributes" of the file. The attributes can include information on the units of the data, how to unpack the data, the spatial domain of the data (the northernmost, southernmost, westernmost, and easternmost limits of the data), and so forth. The numbers of longitude and latitude grid points are not attributes; they are returned when you query each data set (see "ds_dims" below). As an example of how data is packed, consider sea-level pressure: a value of 1020.5 mb might be stored as the integer 2050, in which case you divide by 100 and add 1000 to recover the data value.
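The unpacking arithmetic in the sea-level pressure example can be sketched as follows. The variable names and the packing parameters (a divisor of 100 and an offset of 1000) are illustrative assumptions taken from that example; a real file documents its own packing in its attributes.

```matlab
% Hypothetical packed value, as in the sea-level pressure example.
packed_value = 2050;

% Assumed packing parameters; a real file stores these in its attributes.
scale_divisor = 100;
offset_mb = 1000;

% Unpack: divide by the scale, then add the offset.
pressure_mb = double(packed_value) / scale_divisor + offset_mb;

fprintf('Unpacked pressure: %.1f mb\n', pressure_mb);
```

The conversion to double matters when the packed values are read as an integer type, because MATLAB integer division rounds the result.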
The HDF files, or maybe just the MATLAB routines that read them, use indices that begin with 0. This convention comes from the C programming language. You will see it in the following use of "icnt" and also in "ds_start". It contrasts with FORTRAN-style counting, and with MATLAB's own array indexing, where indices begin with 1.
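The difference between the two counting conventions can be sketched as below. The names and the count of 4 are hypothetical and used only for illustration; no HDF file is needed to run it.

```matlab
% Hypothetical count of items (for example, data sets or attributes).
n_items = 4;

% MATLAB positions run from 1 to n_items.
matlab_positions = 1:n_items;

% The matching 0-based indices expected by the HDF routines.
hdf_indices = matlab_positions - 1;

% Converting back: add 1 to a 0-based index to get the MATLAB position.
round_trip = hdf_indices + 1;

disp(hdf_indices);
```

This is why the loops over attributes and data sets run from 0 to the count minus 1.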
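The following code reads all of the global attribute names and their values.
for icnt = 0:nglobal_attr-1
    % Name, data type, and element count of the attribute at index icnt.
    [attr_name, attr_type, attr_count, status] = hdfsd('attrinfo', sd_id, icnt);
    % Value of the same attribute.
    [attr_value, status] = hdfsd('readattr', sd_id, icnt);
    disp(attr_name); disp(attr_value);
end
To read a single attribute by name, look up its index first (replace 'attribute_name' with the actual name):
attr_value = hdfsd('readattr', sd_id, hdfsd('findattr', sd_id, 'attribute_name'))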
Read the data set(s):
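for icnt = 0:ndatasets-1
    % Select the data set at 0-based index icnt.
    sds_id = hdfsd('select', sd_id, icnt);
    [ds_name, ds_ndims, ds_dims, ds_type, ds_atts, stat] = hdfsd('getinfo', sds_id);
    ds_start = zeros(1, ds_ndims);  % Start at the first element of every dimension.
    ds_stride = [];                 % An empty stride reads every element.
    ds_edges = ds_dims;             % Read the full extent of every dimension.
    [ds_data, status] = hdfsd('readdata', sds_id, ds_start, ds_stride, ds_edges);  % "ds_data" is the data.
    hdfsd('endaccess', sds_id);     % Release the data set identifier.
end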
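The read above takes every element, but the same start, stride, and edge vectors can also describe a subsampled read. The sketch below only works out the arithmetic; the dimension sizes and stride are hypothetical, and it assumes that the edge vector gives the number of elements read along each dimension. The actual read is left as a comment because it needs an open data set.

```matlab
% Hypothetical dimension sizes, standing in for ds_dims from 'getinfo'.
example_dims = [180 360];

% Begin at the first element (0-based) and take every 2nd and 4th element.
example_start = [0 0];
example_stride = [2 4];

% Number of elements that fit along each dimension for this start and stride.
example_edges = floor((example_dims - example_start - 1) ./ example_stride) + 1;

disp(example_edges);

% With an open data set, the read would then be (not run here):
% [ds_data, status] = hdfsd('readdata', sds_id, example_start, example_stride, example_edges);
```

For these values the edge vector works out to 90 elements along each dimension.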
Close the file buffer when you are finished.
hdfsd('end',sd_id);