How to flatten an xarray dataset into a 1D numpy array?

Question

Is there a simple way of flattening an xarray dataset into a single 1D numpy array?

For example, flattening the following test dataset:

xr.Dataset({
    'a' : xr.DataArray(
                   data=[10,11,12,13,14],
                   coords={'x':[0,1,2,3,4]},
                   dims=['x']
          ),
    'b' : xr.DataArray(data=1,coords={'y':0}),
    'c' : xr.DataArray(data=2,coords={'y':0}),
    'd' : xr.DataArray(data=3,coords={'y':0})
})

to

[10,11,12,13,14,1,2,3]

?

You could try converting your Dataset to a dict with the to_dict() method, then reading the 'data' value under each data_vars key as you would with any dictionary, but I'm not sure that's the fastest way to do it. — BoboDarph, Oct 31 '17 at 12:52
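The to_dict() route suggested in this comment can be sketched roughly as follows. This is only an illustration: it rebuilds the question's dataset (with dims written as 'x'), and the names as_dict, pieces and flat_from_dict are made up for the example. np.ravel is used because a scalar variable's 'data' entry is a plain number rather than a list.

```python
import numpy as np
import xarray as xr

# Dataset from the question, rebuilt so the snippet runs on its own.
ds = xr.Dataset({
    'a': xr.DataArray([10, 11, 12, 13, 14], dims='x',
                      coords={'x': [0, 1, 2, 3, 4]}),
    'b': xr.DataArray(1, coords={'y': 0}),
    'c': xr.DataArray(2, coords={'y': 0}),
    'd': xr.DataArray(3, coords={'y': 0}),
})

# to_dict() nests each variable's values under ['data_vars'][name]['data'].
as_dict = ds.to_dict()

# np.ravel turns both nested lists and bare scalars into 1D arrays.
pieces = [np.ravel(var['data']) for var in as_dict['data_vars'].values()]
flat_from_dict = np.concatenate(pieces)
print(flat_from_dict)
```

Going through plain Python lists this way is likely slower than working on the .values arrays directly, which matches the commenter's caveat.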

score 6 · Answer 1 · answered Oct 31 '17 at 18:14

If you're OK with repeated values, you can use .to_array() and then flatten the values in NumPy, e.g.,

>>> ds.to_array().values.ravel()
array([10, 11, 12, 13, 14,  1,  1,  1,  1,  1,  2,  2,  2,  2,  2,  3,  3,
        3,  3,  3])

If you don't want repeated values, then you'll need to write something yourself, e.g.,

>>> np.concatenate([v.values.ravel() for v in ds.data_vars.values()])
array([10, 11, 12, 13, 14,  1,  2,  3])
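To see where the repeated values come from, here is a small self-contained sketch comparing the two snippets above. It assumes the dataset from the question (with dims written as 'x'); the names stacked, broadcast_flat and compact_flat are chosen only for this illustration.

```python
import numpy as np
import xarray as xr

# Dataset from the question, rebuilt so the snippet runs on its own.
ds = xr.Dataset({
    'a': xr.DataArray([10, 11, 12, 13, 14], dims='x',
                      coords={'x': [0, 1, 2, 3, 4]}),
    'b': xr.DataArray(1, coords={'y': 0}),
    'c': xr.DataArray(2, coords={'y': 0}),
    'd': xr.DataArray(3, coords={'y': 0}),
})

# to_array() broadcasts every variable against the others, so the scalar
# variables are repeated along 'x' and the result is 2D.
stacked = ds.to_array()
print(stacked.dims, stacked.shape)
broadcast_flat = stacked.values.ravel()

# Concatenating each variable's own values avoids the broadcasting step.
compact_flat = np.concatenate([v.values.ravel() for v in ds.data_vars.values()])
print(compact_flat)
```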

More generally, this sounds somewhat similar to a proposed interface for "stacking" data variables in 2D for machine learning applications: https://github.com/pydata/xarray/issues/1317

Thanks for the link! This is exactly what I'm trying to do. — user7821537, Nov 01 '17 at 19:25

score 5 · Accepted Answer · answered Jul 11 '19 at 10:27

As of July 2019, xarray provides the methods Dataset.to_stacked_array and DataArray.to_unstacked_dataset, which perform this kind of stacking and its inverse.
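A minimal sketch of how these two methods fit together is shown below. Note the assumption: to_stacked_array expects every variable to share the dimensions listed in sample_dims, so this uses a small illustrative dataset with a shared 'x' dimension rather than the one from the question. The names data, stacked, flat and roundtrip are chosen only for the example.

```python
import numpy as np
import xarray as xr

# Illustrative dataset: both variables share the sample dimension 'x'.
data = xr.Dataset(
    data_vars={
        'a': (('x', 'y'), [[0, 1, 2], [3, 4, 5]]),
        'b': ('x', [6, 7]),
    },
    coords={'y': ['u', 'v', 'w']},
)

# Stack everything except 'x' into a single new dimension 'z'.
stacked = data.to_stacked_array('z', sample_dims=['x'])
print(stacked.dims, stacked.shape)

# The result is a single DataArray, so a 1D NumPy array is one ravel away.
flat = stacked.values.ravel()

# The stacked array can be turned back into a Dataset.
roundtrip = stacked.to_unstacked_dataset(dim='z')
print(list(roundtrip.data_vars))
```

Unlike a bare np.concatenate, the stacked array keeps track of which variable each entry came from, which is what makes the round trip possible.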

user7821537

tsherwen · Answer 3 · 2017-11-13T12:12:34.670

Get the Dataset from the question:

import numpy as np
import xarray as xr

ds = xr.Dataset({
    'a' : xr.DataArray(
                   data=[10,11,12,13,14],
                   coords={'x':[0,1,2,3,4]},
                   dims=['x']
          ),
    'b' : xr.DataArray(data=1,coords={'y':0}),
    'c' : xr.DataArray(data=2,coords={'y':0}),
    'd' : xr.DataArray(data=3,coords={'y':0})
})

Get the list of data variables:

variables = ds.data_vars

Use the ndarray.flatten() method to reduce each variable's values to a 1D array:

arrays = [ds[i].values.flatten() for i in variables]

Then flatten the list of 1D arrays into a single list of values (as detailed in this answer):

arrays = [i for j in arrays for i in j]

Now convert this to an array, as requested in the question (it is currently a list):

array = np.array(arrays)
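For reference, the steps above can be put together in one runnable sketch. It assumes the dataset from the question (with dims written as 'x'). The last two lines are an addition, not part of the original steps: they swap the nested list comprehension and np.array call for a single np.concatenate, which should give the same result; the names per_variable and array_alt are made up for the example.

```python
import numpy as np
import xarray as xr

# Dataset from the question, rebuilt so the snippet runs on its own.
ds = xr.Dataset({
    'a': xr.DataArray([10, 11, 12, 13, 14], dims='x',
                      coords={'x': [0, 1, 2, 3, 4]}),
    'b': xr.DataArray(1, coords={'y': 0}),
    'c': xr.DataArray(2, coords={'y': 0}),
    'd': xr.DataArray(3, coords={'y': 0}),
})

# One 1D array per data variable (iterating data_vars yields the names).
per_variable = [ds[name].values.flatten() for name in ds.data_vars]

# Steps from the answer: flatten the list of arrays, then build one array.
values = [i for j in per_variable for i in j]
array = np.array(values)

# Alternative for the last two steps: join the 1D arrays directly.
array_alt = np.concatenate(per_variable)
print(array, array_alt)
```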
