Data handling

Downloading and processing MODIS with fast and free tools

Here’s a simple way of processing MODIS data: downloading, extracting the bands from HDF as GeoTIFF, merging the tiles and reprojecting. Area extraction, NDVI calculation, cloud filtering and more can then be done in GRASS, for example. The tools are Linux based, but can easily be used on Windows as well. My way might not be the most elegant, but it’s fast and works fine.

The first step is to select the tiles, the time frame and the product at http://reverb.echo.nasa.gov/reverb/. I take MOD09Q1 (250 m, 8-day, collection 6 reflectance) as an example. Once ordered, we get a TXT file containing all file names, and we download all files using wget. After downloading, I mass-rename all files to remove the creation date (using e.g. Total Commander, or the small shell loop below), so the files are named like: MOD09Q1.A2000065.h21v05.006.
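On Linux the renaming can also be done with a small shell loop; a sketch, assuming the production timestamp is the 13-digit block before the .hdf extension:

# MOD09Q1.A2000065.h21v05.006.2015136035955.hdf -> MOD09Q1.A2000065.h21v05.006.hdf
for f in MOD*.hdf; do mv "$f" "$(echo "$f" | sed 's/\.[0-9]\{13\}\.hdf$/.hdf/')"; done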

Then I extract band 1, band 2 and the quality flag as GeoTIFF. “aus” is the folder where the GeoTIFFs will end up. You can find the subdataset names of the bands using gdalinfo (see the example after the loops).

# download all files listed in the TXT file from the order
wget -i modis.txt
# output folder for the extracted GeoTIFFs
aus="/media/data/MODIS/tiffs/"
# extract band 1, band 2 and the QA layer from each HDF
for i in *.hdf; do gdal_translate -of GTiff 'HDF4_EOS:EOS_GRID:"'$i'":MOD_Grid_250m_Surface_Reflectance:sur_refl_b01' $aus`basename $i .hdf`_REF_band1.tif; done
for i in *.hdf; do gdal_translate -of GTiff 'HDF4_EOS:EOS_GRID:"'$i'":MOD_Grid_250m_Surface_Reflectance:sur_refl_b02' $aus`basename $i .hdf`_REF_band2.tif; done
for i in *.hdf; do gdal_translate -of GTiff 'HDF4_EOS:EOS_GRID:"'$i'":MOD_Grid_250m_Surface_Reflectance:sur_refl_qc_250m' $aus`basename $i .hdf`_QA.tif; done

I did not find a clean way of merging, so I suggest my “dirty” way: it has to be repeated for each year and band. This is five minutes of copying, but you can reuse it year after year. Here’s an example for band 1, year 2000 (a loop that generates the same commands is sketched after the list). If you use a 16-day product, every second line will give an error, which you can ignore.

# 2000
gdal_merge.py -o 2000001_band1.tif *2000001*band1*
gdal_merge.py -o 2000009_band1.tif *2000009*band1*
gdal_merge.py -o 2000017_band1.tif *2000017*band1*
gdal_merge.py -o 2000025_band1.tif *2000025*band1*
gdal_merge.py -o 2000033_band1.tif *2000033*band1*
gdal_merge.py -o 2000041_band1.tif *2000041*band1*
gdal_merge.py -o 2000049_band1.tif *2000049*band1*
gdal_merge.py -o 2000057_band1.tif *2000057*band1*
gdal_merge.py -o 2000065_band1.tif *2000065*band1*
gdal_merge.py -o 2000073_band1.tif *2000073*band1*
gdal_merge.py -o 2000081_band1.tif *2000081*band1*
gdal_merge.py -o 2000089_band1.tif *2000089*band1*
gdal_merge.py -o 2000097_band1.tif *2000097*band1*
gdal_merge.py -o 2000105_band1.tif *2000105*band1*
gdal_merge.py -o 2000113_band1.tif *2000113*band1*
gdal_merge.py -o 2000121_band1.tif *2000121*band1*
gdal_merge.py -o 2000129_band1.tif *2000129*band1*
gdal_merge.py -o 2000137_band1.tif *2000137*band1*
gdal_merge.py -o 2000145_band1.tif *2000145*band1*
gdal_merge.py -o 2000153_band1.tif *2000153*band1*
gdal_merge.py -o 2000161_band1.tif *2000161*band1*
gdal_merge.py -o 2000169_band1.tif *2000169*band1*
gdal_merge.py -o 2000177_band1.tif *2000177*band1*
gdal_merge.py -o 2000185_band1.tif *2000185*band1*
gdal_merge.py -o 2000193_band1.tif *2000193*band1*
gdal_merge.py -o 2000201_band1.tif *2000201*band1*
gdal_merge.py -o 2000209_band1.tif *2000209*band1*
gdal_merge.py -o 2000217_band1.tif *2000217*band1*
gdal_merge.py -o 2000225_band1.tif *2000225*band1*
gdal_merge.py -o 2000233_band1.tif *2000233*band1*
gdal_merge.py -o 2000241_band1.tif *2000241*band1*
gdal_merge.py -o 2000249_band1.tif *2000249*band1*
gdal_merge.py -o 2000257_band1.tif *2000257*band1*
gdal_merge.py -o 2000265_band1.tif *2000265*band1*
gdal_merge.py -o 2000273_band1.tif *2000273*band1*
gdal_merge.py -o 2000281_band1.tif *2000281*band1*
gdal_merge.py -o 2000289_band1.tif *2000289*band1*
gdal_merge.py -o 2000297_band1.tif *2000297*band1*
gdal_merge.py -o 2000305_band1.tif *2000305*band1*
gdal_merge.py -o 2000313_band1.tif *2000313*band1*
gdal_merge.py -o 2000321_band1.tif *2000321*band1*
gdal_merge.py -o 2000329_band1.tif *2000329*band1*
gdal_merge.py -o 2000337_band1.tif *2000337*band1*
gdal_merge.py -o 2000345_band1.tif *2000345*band1*
gdal_merge.py -o 2000353_band1.tif *2000353*band1*
gdal_merge.py -o 2000361_band1.tif *2000361*band1*
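Instead of copying lines, a loop can generate the same merges; a sketch for the 8-day product (day of year 1, 9, …, 361; adapt band and year as needed):

for doy in $(seq 1 8 361); do
d=$(printf '%03d' $doy)
gdal_merge.py -o 2000${d}_band1.tif *2000${d}*band1*
done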

# We can remove the other files now 
rm MOD*
# Then reproject everything to a geographic system, maybe into a new folder:
mkdir lat
aus="/media/data/MODIS/tiffs/lat/"

for i in *_band1.tif; do gdalwarp -t_srs EPSG:4326 $i $aus`basename $i .tif`.tif; done
for i in *_band2.tif; do gdalwarp -t_srs EPSG:4326 $i $aus`basename $i .tif`.tif; done
for i in ???????_QA.tif; do gdalwarp -t_srs EPSG:4326 $i $aus`basename $i .tif`.tif; done
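Note that gdalwarp picks the output resolution itself here; if you want to keep it close to the native 250 m, you can set it explicitly. A sketch (the -tr value is an assumption, roughly 250 m in degrees at the equator):

gdalwarp -t_srs EPSG:4326 -tr 0.00225 0.00225 -r near input.tif output.tif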

What we have now can already be used, but you can continue in GRASS to calculate NDVI and filter the data.

 

Environmental change in the Sahel, Research results

Recent woody vegetation trends in the Sahel

Our new paper looks at recent dynamics of woody vegetation in the Sahel and finds some interesting patterns, which are mainly controlled by human population density.

Martin Brandt, Pierre Hiernaux, Kjeld Rasmussen, Cheikh Mbow, Laurent Kergoat, Torbern Tagesson, Yahaya Ibrahim, Abdoulaye Wele, Compton J. Tucker, Rasmus Fensholt. Assessing woody vegetation trends in Sahelian drylands using MODIS based seasonal metrics. Remote Sensing of Environment, 2016, 183, 215-225.

  • Woody cover trends are estimated for Sahel based on MODIS dry season metrics.
  • Interannual fluctuations in foliage density are attenuated to monitor woody plant trends.
  • Increases (decreases) are seen in areas of low (high) human population.
  • Recent decreases only partially offset a general post-drought increase in Sahelian woody cover.

Woody plants play a major role for the resilience of drylands and in peoples’ livelihoods. However, due to their scattered distribution, quantifying and monitoring woody cover over space and time is challenging. We develop a phenology driven model and train/validate MODIS (MCD43A4, 500 m) derived metrics with 178 ground observations from Niger, Senegal and Mali to estimate woody cover trends from 2000 to 2014 over the entire Sahel at 500 m scale.

Over the 15 year period we observed an average increase in woody cover of 1.7% (± 5.0) with large spatial differences: No clear change can be observed in densely populated areas (0.2 ± 4.2), whereas a positive change is seen in sparsely populated areas (2.1 ± 5.2). Woody cover is generally stable in cropland areas (0.9 ± 4.6), reflecting the protective management of parkland trees by the farmers. Positive changes are observed in savannas (2.5 ± 5.4) and woodland areas (3.9 ± 7.3).

The major pattern of woody cover change reveals strong increases in the sparsely populated Sahel zones of eastern Senegal, western Mali and central Chad, but a decreasing trend is observed in the densely populated western parts of Senegal, northern Nigeria, Sudan and southwestern Niger. This decrease is often local and limited to woodlands, being an indication of ongoing expansion of cultivated areas and selective logging.

We show that an overall positive trend is found in areas of low anthropogenic pressure demonstrating the potential of these ecosystems to provide services such as carbon storage, if not over-utilized. Taken together, our results provide an unprecedented synthesis of woody cover dynamics in the Sahel, and point to land use and human population density as important drivers, however only partially and locally offsetting a general post-drought increase.

Graphical abstract

Data handling

Smoothing/Filtering an NDVI time series using a Savitzky-Golay filter and R

Coarse scaled NDVI (or FAPAR, EVI, …) images have a high temporal frequency and are delivered as Maximum Value Composites (MVC) of several days: within each window the highest value is kept, assuming that clouds and other contaminated pixels have low values. However, especially in areas with a rainy season, composites over 10-16 days still contain clouds and other disturbances. The figure below shows a raw 16-day MODIS NDVI MVC: several pixels are clearly contaminated, and analyses will be affected by the noisy data.

unfiltered MODIS image

Datasets like GIMMS or GEOV1 already provide filtered data, but e.g. MODIS, SPOT VGT, PROBA-V and AVHRR data are raw. The solution is to smooth the time series with a filter, which computes a smooth curve and interpolates the bad values from the previous and following images. Here is the same image, smoothed with a Savitzky-Golay filter.

filtered and smoothed MODIS image

The data mostly come with quality rasters rating the quality of each pixel. These can be used either to filter the raster and set bad-quality pixels to NA, or to weight the pixels: when the new time series is calculated, low-weighted (i.e. contaminated) pixels contribute less. One option is the software TIMESAT (Eklundh & Jönsson, 2011), which offers different filter techniques. Here is an example of how TIMESAT smooths the time series with a Savitzky-Golay filter, omitting “bad” pixels and creating new rasters.

A Savitzky-Golay filter applied on MODIS using TIMESAT

Filtering is also possible in R, and it’s very simple. First you have to decide whether to work with the quality files or simply use the raw data; both are possible. GIS software like GRASS has modules which make it easy to use the quality files:

http://grass.osgeo.org/grass64/manuals/i.in.spotvgt.html

http://grass.osgeo.org/grass70/manuals/i.modis.qc.html
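If you prefer to stay in R, a quality mask can also be applied before filtering. A minimal sketch, assuming a stack QA with the same dimensions as the data, simplified so that 0 marks good pixels (real MODIS QA layers are bit-encoded and need decoding first, e.g. with i.modis.qc):

library(raster)
# set every pixel whose (simplified) QA value is not 0 to NA
MODIS.masked <- mask(MODIS, QA, maskvalue=0, inverse=TRUE)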

However, filtering without quality flags also gives reasonable results. Now assume we have a MODIS time series from 2005-2010 with 23 images per year, loaded in R as a raster stack or brick called MODIS, with bad values masked as NA. We load the libraries and create a function which applies the Savitzky-Golay filter from the signal package. The parameters of the function (p, n, ts) need to be adapted (http://www.inside-r.org/packages/cran/signal/docs/sgolayfilt), as does the time frame.

library(raster)
library(rgdal)
library(signal)
library(zoo)
library(timeSeries) # provides substituteNA()

fun <- function(x) {
v = as.vector(x)
# fill NA gaps with the series mean so sgolayfilt() gets a complete input
z = substituteNA(v, type="mean")
MODIS.ts2 = ts(z, start=c(2005,1), end=c(2010,23), frequency=23)
# Savitzky-Golay filter: p = polynomial order, n = window length (odd)
x = sgolayfilt(MODIS.ts2, p=1, n=3, ts=30)
}

MODIS.filtered <- calc(MODIS, fun)

MODIS.filtered is a new brick containing the smoothed time series. Compare the raw with the filtered time series:

l=cellStats(MODIS, stat=mean)
MODIS.raw.ts = ts(l, start=c(2005,1), end=c(2010,23), frequency=23)
plot(MODIS.raw.ts)
l=cellStats(MODIS.filtered, stat=mean)
MODIS.filt.ts = ts(l, start=c(2005,1), end=c(2010,23), frequency=23)
plot(MODIS.filt.ts)

You can find well-fitting parameters by looking at the mean of the whole area and playing with the parameters:

l=cellStats(MODIS, stat='mean')
MODIS.ts = ts(l, start=2005, end=c(2010,23), frequency=23)
sg=sgolayfilt(MODIS.ts, p=3, n=9, ts=20)
sg.ts = ts(sg, start=2005, end=c(2010,23), frequency=23)

plot(sg.ts)

Eklundh, L.; Jönsson, P. TIMESAT 3.1 Software Manual. 2011.
Data handling

Pixel-wise regression between two raster time series (e.g. NDVI and rainfall)

Doing a pixel-wise regression between two raster time series can be useful for several reasons, for example:

  • find the relation between vegetation and rainfall for each pixel, e.g. a low correlation could be a sign of degradation
  • derive regression coefficients to model the dependent variable from the independent variable (e.g. model NDVI with rainfall data)
  • check the consistency between different sensors, e.g. MODIS and SPOT VGT
  • adjust the data recorded by different sensors to each other

It’s easy and fast in R. Let’s say we have two raster stacks or bricks of 120 rasters each, one with NDVI and one with rainfall (named “NDVI” and “rainfall”). We put everything into one single stack and derive rasters showing the slope, intercept and R² of the regression as pixel values:

library(raster)

# one stack: layers 1-120 = NDVI, layers 121-240 = rainfall
s <- stack(NDVI, rainfall)
# slope
fun=function(x) { if (is.na(x[1])){ NA } else { lm(x[1:120] ~ x[121:240])$coefficients[2] }}
slope <- calc(s, fun)
# intercept
fun=function(x) { if (is.na(x[1])){ NA } else { lm(x[1:120] ~ x[121:240])$coefficients[1] }}
intercept <- calc(s, fun)
# R²
fun=function(x) { if (is.na(x[1])){ NA } else { m <- lm(x[1:120] ~ x[121:240]); summary(m)$r.squared }}
r.squared <- calc(s, fun)
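Since all three rasters come from the same regression, the fit can also be done once per pixel, returning the three values together as a three-layer brick; a sketch of this variant:

# one pass: slope, intercept and R² from a single fit per pixel
fun = function(x) { if (is.na(x[1])) { c(NA, NA, NA) } else {
m <- lm(x[1:120] ~ x[121:240])
c(m$coefficients[2], m$coefficients[1], summary(m)$r.squared) }}
lm.maps <- calc(s, fun)
names(lm.maps) <- c('slope', 'intercept', 'r.squared')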

This only works if the two stacks have the same dimensions and NA values; if not, you can try something like this:

z = mask(NDVI, rainfall)
s <- stack(z, rainfall)

and do the analysis as described above.

This example shows R² between annually summed NDVI (SPOT VGT) and TRMM rainfall from 1998-2010 for the Linguere district in Senegal. Major differences exist, which need to be examined further.

R² annual sums SPOT VGT NDVI and TRMM rainfall, 1998-2010 for Linguere, Senegal
Environmental change in the Sahel, Research results

Environmental change in time series – An interdisciplinary study in the Sahel of Mali and Senegal

Our next article is now online in the Journal of Arid Environments. It is available here: http://www.sciencedirect.com/science/article/pii/S0140196314000536

Abstract: Climatic changes and human activities have caused major environmental change in the Sahel. Remote sensing studies detect various vegetation trends; however, explanations are rarely studied in detail. We present a methodology using time series, high-resolution imagery and fieldwork to validate trend analyses for two regions in the Sahel of Mali and Senegal. Both study areas show significant greening trends from 1982 to 2010. Reasons can be very site-specific, but several factors are valid for both research areas: (1) farmer-managed agro-forestry, (2) planting programs and protection laws, (3) widespread dispersion of robust species, which replace the former diverse woody vegetation and simulate a greening which conceals a shift in biodiversity and (4) an increase of annual rainfall. However, the situation is still far from the pre-drought conditions, which are reconstructed by Corona imagery (1965) and interviews with the local population. Rather a transformation is observed: a decrease in natural vegetation, tree density and diversity. Reasons are climatic and anthropogenic: (1) drought events, less rain and higher temperatures, (2) increased demand for cropping areas and wood, especially in times of droughts. Our example validates that climatic factors are important drivers of change, but much of today’s environment and vegetation composition is controlled by humans.

Data handling

Renaming, converting, clipping: script based raster time series processing

When working with remotely sensed time series data (e.g. MODIS, AVHRR, GIMMS, etc.), bulk processing can save a lot of time. Using a terminal in a Linux environment, simple scripts can process thousands of files in a short time. Here are some basic hints on how to start; GDAL has to be installed. For Windows users, or users who prefer an interface, QGIS provides a graphical front end for most of the GDAL modules, and renaming can be done in a more user-friendly way there.

Renaming:

Original data names can be very long and annoying. So first we rename all files to numbers from 1 to XX (like 1, 2, 3, …). To do so, navigate to the folder with the files; these lines rename all files originally starting with MOD (like MOD13Q1203837383922837284.hdf):

x=1
for fname in MOD*
do
mv "$fname" "$x"
x=$(($x+1))
done

The following lines create a script that renames the files numbered 1 to 60. The final names include year and month, so we end up with MODIS_2001_01 to MODIS_2005_12. This can be adapted to any name and numbering. The script is then executed.

for ((i=1;$i<=60;i++)) do echo "mv $i MODIS_$(( ($i-1) / 12 + 2001 ))_$(( ($i-1) % 12 + 1 )).tif" ; done >rename.sh
# zero-pad single-digit months; anchoring the pattern to ".tif" keeps the year untouched
sed -i 's/_\([1-9]\)\.tif/_0\1.tif/' rename.sh
chmod 0755 rename.sh
./rename.sh
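The sed step can be avoided entirely by letting printf zero-pad the month while the script is generated; an equivalent sketch:

for ((i=1;i<=60;i++)); do printf 'mv %d MODIS_%d_%02d.tif\n' $i $(( ($i-1) / 12 + 2001 )) $(( ($i-1) % 12 + 1 )); done >rename.sh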

Convert:

This works well for converting all kinds of formats; it is also very useful for changing HDF files (like MODIS, VGT, …) to GeoTIFF. This example converts GeoTIFFs to the ENVI format.

# define in and out folder and output format

in="/home/martin/SPOT_VGT/NDV/"
out="/home/martin/SPOT_VGT/NDV/converted/"
format="ENVI"

# convert all Tiffs to ENVI format, use float32 as output here

cd $in
for i in *.tif; do gdal_translate -of $format -ot Float32 $i $out`basename $i`; done

Clip:

# define in and out folder

in="/home/martin/SPOT_VGT/NDV/"
out="/home/martin/SPOT_VGT/NDV/clipped/"

# define the window which should be clipped

range="-17.6 16.25 -2.72 13.16"

# do the clipping for all Tiffs

cd $in 
for i in *.tif; do gdal_translate -projwin $range $i $out`basename $i`; done
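If the area of interest is a polygon rather than a rectangle, gdalwarp can clip along a vector boundary instead; a sketch, with boundary.shp as a placeholder for your own shapefile:

for i in *.tif; do gdalwarp -cutline boundary.shp -crop_to_cutline $i $out`basename $i`; done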

For more information, see http://www.gdal.org/

Data handling

Pixel-wise time series trend analysis with NDVI (GIMMS) and R

The GIMMS dataset is currently offline and the new GIMMS3g will soon be released, but it does not really matter which data are used for this purpose: it could also be SPOT VGT, MODIS or anything else, as long as the temporal resolution is high and the time frame is long enough to detect significant trends. The aim is to do a pixel-wise trend analysis and extract only the significant trends over a certain period of time for a selected region. Everything is done using free and open R software. Input data are continuous NDVI images, in this case GIMMS with bi-monthly images, i.e. 24 per year for the period 1982-2006.

So let’s load all GeoTIFFs in the GIMMS directory (there should be 600 in this case: 25 years × 24 images):

library(raster)
setwd("~/GIMMS")
# load all GeoTIFFs in the folder into one stack
sg = stack(list.files(pattern='.tif$'))
gimms = brick(sg)
rm(sg)

First, let’s create annual sums, as autocorrelation might be present in the bi-monthly values. To keep NDVI as the unit, the results are divided by 24:

fun <- function(x) {
 gimms.ts = ts(x, start=c(1982,1), end=c(2006,24), frequency=24)
 # aggregate() sums the 24 values of each year by default
 x <- aggregate(gimms.ts)
 }
gimms.sum <- calc(gimms, fun)
gimms.sum = gimms.sum/24
plot(gimms.sum)

Then the slope is calculated to get the direction and magnitude of the trends, and multiplied by the number of years to get the change in NDVI units:

time <- 1:nlayers(gimms.sum)
# coefficients[2] is the slope of NDVI against time
fun=function(x) { if (is.na(x[1])){ NA } else { m = lm(x ~ time); summary(m)$coefficients[2] }}
gimms.slope=calc(gimms.sum, fun)
# multiply the yearly slope by 25 years to get the total NDVI change
gimms.slope=gimms.slope*25
plot(gimms.slope)

Now we need to see which trends are significant, so we first extract the p-value:

# coefficients[8] is the p-value of the slope (the coefficient matrix is indexed column-major)
fun=function(x) { if (is.na(x[1])){ NA } else { m = lm(x ~ time); summary(m)$coefficients[8] }}
p <- calc(gimms.sum, fun=fun)
plot(p, main="p-Value")
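As an aside, the slope and the p-value come from the same fit, so both calc passes can be combined into one; a sketch of this variant:

fun = function(x) { if (is.na(x[1])) { c(NA, NA) } else {
m = lm(x ~ time)
# [2] = slope estimate, [8] = p-value of the slope
c(summary(m)$coefficients[2], summary(m)$coefficients[8]) }}
slope.p <- calc(gimms.sum, fun)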

then mask all values >0.05 to get a confidence level of 95%:

# reclassify: p in [0, 0.05] -> 1, p in (0.05, 1] -> 0
m = c(0, 0.05, 1, 0.05, 1, 0)
rclmat = matrix(m, ncol=3, byrow=TRUE)
p.mask = reclassify(p, rclmat)
# set the insignificant pixels (0) to NA
fun = function(x) { x[x<1] <- NA; return(x) }
p.mask.NA = calc(p.mask, fun)

and finally mask all insignificant values in the trend map, so we only get the NDVI change significant at the 95% level:

trend.sig = mask(gimms.slope, p.mask.NA)
plot(trend.sig, main="significant NDVI change")

The result could look like this:

significant NDVI change (1982-2006) using integrated GIMMS