I'd like to take raster (satellite imagery) data, and build a Dataset or DataArray, to speed up my image processing (I have to work on multi-band, multi-date satellite imagery a lot).

The data comes as individual bands for each image date, and I understand how to convert each band-date to an xarray-DataArray. I assume it'd make most sense to have one variable for each band, and within each band have the spatial (x, y) and time dimensions.

However, I can't figure out how to do that.

I've been working with some dummy bands to try to figure this out, so I'll include them to clarify what my data looks like and what I'm trying to do.

import numpy as np
import xarray as xr

# Set up dummy 3 x 3 array
dA = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Create 4 dummy images; 2 bands for each of 2 dates (using bands 4 and 5,
# because they're useful for vegetation measures)
d1_b4 = xr.DataArray((dA + 140),
    coords={'x': ['1', '2', '3'], 'y': ['a', 'b', 'c']}, dims=('x', 'y'))
d1_b5 = xr.DataArray((dA + 150),
    coords={'x': ['1', '2', '3'], 'y': ['a', 'b', 'c']}, dims=('x', 'y'))
d2_b4 = xr.DataArray((dA + 240),
    coords={'x': ['1', '2', '3'], 'y': ['a', 'b', 'c']}, dims=('x', 'y'))
d2_b5 = xr.DataArray((dA + 250),
    coords={'x': ['1', '2', '3'], 'y': ['a', 'b', 'c']}, dims=('x', 'y'))
# Dummy values are chosen so I can keep track of which array is going
# where while I learn this

Then I want to combine these into one Dataset, with two variables (Band4 and Band5), each containing the two image dates... but I don't know how to proceed.

Do I need to add more coordinates, or dimensions when I create/import the arrays, and then concat along those dimensions?

KenC
  • Have you looked at the docs for loading rasters into xarray from rasterio: http://xarray.pydata.org/en/stable/io.html#rasterio? It's not clear to me why you want to construct the multi-band array from scratch like this. – jhamman Jan 11 '18 at 01:12

1 Answer

As jhamman mentioned, how best to combine the data depends a lot on where your data comes from. This is one way of combining the data you've posted, but there are other approaches.

Combining this data takes several steps. First, give each DataArray the name of the variable you want it to end up in.

d1_b4.name = 'band4'
d1_b5.name = 'band5'
d2_b4.name = 'band4'
d2_b5.name = 'band5'

Then use xr.merge to put them into xarray.Datasets, one per date. A Dataset holds multiple xarray.DataArrays, which can share some or all of their dimensions.

d1 = xr.merge([d1_b4, d1_b5])
d2 = xr.merge([d2_b4, d2_b5])
d2  # display the Dataset for the second date

<xarray.Dataset>
Dimensions:  (x: 3, y: 3)
Coordinates:
  * x        (x) <U1 '1' '2' '3'
  * y        (y) <U1 'a' 'b' 'c'
Data variables:
    band4    (x, y) int64 241 242 243 244 245 246 247 248 249
    band5    (x, y) int64 251 252 253 254 255 256 257 258 259
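As an alternative sketch of the same step, a Dataset can also be built directly from a dict that maps variable names to DataArrays, which skips setting .name first. The helper make_band below is hypothetical and only exists to keep the example short; it recreates the dummy bands from the question.

```python
import numpy as np
import xarray as xr

dA = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

def make_band(offset):
    # Hypothetical helper: builds one dummy band like those in the question
    return xr.DataArray(
        dA + offset,
        coords={'x': ['1', '2', '3'], 'y': ['a', 'b', 'c']},
        dims=('x', 'y'),
    )

# The dict keys become the variable names, so .name does not need to be set
d1_alt = xr.Dataset({'band4': make_band(140), 'band5': make_band(150)})

# Compare against the name-then-merge route used above
named_b4 = make_band(140)
named_b4.name = 'band4'
named_b5 = make_band(150)
named_b5.name = 'band5'
same = d1_alt.equals(xr.merge([named_b4, named_b5]))
```

Either route should give the same Dataset; the dict form is mostly a convenience when the arrays are not already named.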

Finally, combine the data from the different dates. This needs a new dimension, time, with a coordinate value for each date. xr.concat does both in a single step.

import pandas as pd
xr.concat([d1, d2], dim=pd.Index([1990, 1991], name='time'))

<xarray.Dataset>
Dimensions:  (time: 2, x: 3, y: 3)
Coordinates:
  * x        (x) <U1 '1' '2' '3'
  * y        (y) <U1 'a' 'b' 'c'
  * time     (time) int64 1990 1991
Data variables:
    band4    (time, x, y) int64 141 142 143 144 145 146 147 148 149 241 242 ...
    band5    (time, x, y) int64 151 152 153 154 155 156 157 158 159 251 252 ...
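Putting the steps together, here is a minimal end-to-end sketch that also shows how the combined Dataset can be queried afterwards. It assumes real timestamps instead of the integer years above; the acquisition dates are invented for illustration, and dummy_dataset is a hypothetical helper, not part of xarray.

```python
import numpy as np
import pandas as pd
import xarray as xr

dA = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
coords = {'x': ['1', '2', '3'], 'y': ['a', 'b', 'c']}

def dummy_dataset(b4_offset, b5_offset):
    # Hypothetical helper: one Dataset per image date, two bands each
    return xr.Dataset({
        'band4': xr.DataArray(dA + b4_offset, coords=coords, dims=('x', 'y')),
        'band5': xr.DataArray(dA + b5_offset, coords=coords, dims=('x', 'y')),
    })

# Assumed acquisition dates, invented for illustration
dates = pd.DatetimeIndex(['1990-06-01', '1991-06-01'], name='time')

# The named index supplies both the new dimension and its coordinate values
ds = xr.concat([dummy_dataset(140, 150), dummy_dataset(240, 250)], dim=dates)

# Select one date by label; the time dimension is dropped from the result
first = ds.sel(time=pd.Timestamp('1990-06-01'))

# Select a single pixel of one band on the second date
pixel = ds['band4'].sel(time=pd.Timestamp('1991-06-01'), x='2', y='b').item()

# Per-pixel mean over time, computed for every band at once
mean_over_time = ds.mean(dim='time')
```

Keeping each band as its own variable means operations such as ds.mean(dim='time') apply to all bands in one call, while label-based selection with .sel avoids tracking positional indices by hand.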
Damien Ayers
  • Thanks Damien! Wonderful answer. With an explanation like this, I wish you were writing the xarray documentation. – KenC Jan 11 '18 at 02:45