Documentation of Jozef_help


Global Index (short | long) | Local contents | Local Index (short | long)


Help text

  For large data sets - we'll assume that the data is in a matrix called
  'sst', with 'lat', 'lon', and 'time' vectors describing the data.  In
  MATLAB 5, this might be a three dimensional matrix, with time, lat, and
  lon being the order of the dimensions.  Let's also assume that the
  annual cycle has been removed from the data, or something like that.
  If you don't remove the annual cycle, or the mean, then the first 
  EOF should be just the annual cycle, or the mean, and is probably 
  meaningless when thinking about variability.

Listing of script Jozef_help



[ntim, nlat, nlon] = size(sst);

%  For most applications, the variance at each point should be area
%  weighted by the area at each point, or by the cosine of
%  latititude.  An easier way to do this is to weight the data (sst) by
%  square root of the cosine of latitude, which will work out
%  when the covariance matrix is formed:

%  Shift the order of dimensions, and reshape the data:
sst = shiftdim(sst, 1);
sst = reshape(sst, nlat, nlon*ntim);

%  Make a column vector out of lat, if it isn't already   
lat = lat(:);

%  Weight sst
sst = sst .* (sqrt(cos((pi/180)*lat))*ones(1, nlon*ntim));

%  Reshape SST again
sst = reshape(sst, nlat, nlon, ntim);
sst = shiftdim(sst, 2);

%  Now, decide what method you want to use - they all produce the
%  same results, but some are quicker and easier.

%  1.  Singular Value Decomposition (not the same as MCA!) - this 
%      is usually used for small matrices (depending on how fast a
%      machine you have)

%  First, reshape data, again, time vs. space (2D)
sst = reshape(sst, ntim, nlat*nlon);

%  Now, decompose the sst matrix into its principal components (time),
%  and spatial loadings (space, also called EOFs) 
[pcs1, sqrtlam1, lds1] = svd(sst);

%  In MATLAB, pcs and lds are scaled to have unit length.  The 
%  square root of the eigenvalues are contained in the diagonal of
%  the matrix sqrtlam.  So, to get the actual eigenvalues, or even
%  better, the percent variance explained by each mode, do this:
lam1 = diag(sqrtlam).^2;
per1 = 100*lam1 ./ sum(lam1);

%  Note that, for the above method, the following produces identical
%  results:
[lds1a, sqrtlam1a, pcs1a] = svd(sst');

%  I usually don't use 'lds' to look at the data.  Instead, reconstruct
%  your original sst field (this time without weighting by cosine) and
%  regress the unweighted field onto 'pcs', where each principal 
%  component has unit standard deviation.  That way you can look at the
%  standard deviation at each point associated with a particular mode.

%  2.  Eigenvector analysis - for very large data sets.

%  Suppose the matrix sst has ten thousand years, 15 latitude points,
%  and 30 longitude points.  Then it is a 10000X450 element matrix.
%  This will take a long time using the 'svd' command above.  Instead,
%  form the covariance matrix

c = sst'*sst ./ ntim;  %  Note that the variance is now weighted
                       %  by the cosine of latitude

%  Perform eigenvector analysis:
[lds2, lam2] = eig(c);
lam2 = diag(lam2);

%  Sort in decending order of importance
[lam2, k] = sort(lam2'); lds2 = lds2(:,k);
lam2 = fliplr(lam2); lds2 = fliplr(lds2);
per2 = 100*lam2 ./ sum(lam2);
pcs2 = sst*lds2;

%  I usually scale the pcs and lds as stated above:
wgt = std(pcs2);
for i = 1:(ntim);
  pcs2(:,i) = pcs2(:,i)./wgt(i);
  lds2(:,i) = lds2(:,i).*wgt(i);
end


%  This second method should produce identical results to the first
%  method.  Also, if you have a 450X10000 matrix, or something,
%  it doesn't matter how you form the covariance matrix:
c = sst*sst'./(nlat*nlon);
[pcs2a, lam2a] = eig(c);
lam2a = diag(lam2a);
[lam2a, k] = sort(lam2a'); pcs2a = pcs2a(:,k);
lam2a = fliplr(lam2a); pcs2a = fliplr(pcs2a);
per2a = 100*lam2a ./ sum(lam2a);
lds2a = pcs2a'*sst;

%  Again, scale the results appropriately.


%  MCA analysis uses the SVD routine, but is used for a completely 
%  different reason.  MCA used when you want to find patterns of 
%  covariance between two fields, say, sst and hgt500, which both
%  have the same number of time points, and have been weighted
%  appropriately.

%  Form the covariance matrix, which should have as many rows as 
%  hgt has spatial points, and as many columns as sst has spatial
%  points.

c = hgt'*sst./ntim;

[hgtu, s, sstv] = svd(c);
hgtx = hgt*hgtu;
ssty = sst*sstv;

%  Now, hgtu and sstv are the homogeneous spatial patterns, and
%  hgtx and ssty are the time series accompanying these spatial 
%  patterns.  More interesting are the heterogeneous spatial
%  patterns, which result from regressing hgt onto ssty, and
%  sst onto hgtx.

%  A great analysis is found in Dennis Hartmann's webpage, listed 
%  above.