How to append numpy array to numpy array of different size?

Question

I have 2 arrays to concatenate:

X_train's shape is (3072, 50000)
y_train's shape is (50000,)

I'd like to concatenate them so I can shuffle the indices all in one go. I have tried the following, but neither works:

np.concatenate([X_train, np.transpose(y_train)])
np.column_stack([X_train, np.transpose(y_train)])

How can I concatenate them?

Concatenate to what? You got input-dimensions, what output-dimension do you want? (from a ML-perspective i don't see this making sense) — sascha, Feb 05 '18 at 16:37
@DavidG Yes, thanks! Btw, why do I get (50000,) in the first place? Is that a numpy array? Seems like it's some kind of vector or list, idk. I'm new to numpy — Nathan, Feb 05 '18 at 16:41
[This post](https://stackoverflow.com/questions/22053050/difference-between-numpy-array-shape-r-1-and-r) might help with the difference between the two — DavidG, Feb 05 '18 at 16:45
In `numpy` 1-d arrays are just as useful as 2-d (or higher). — hpaulj, Feb 05 '18 at 17:06
@DavidG If I could upvote that [this post](https://stackoverflow.com/questions/22053050/difference-between-numpy-array-shape-r-1-and-r) comment 10 times, I would. I don't know if I would have known how to find that post without your help. Perhaps I should add some tags to it to make it easier to dig up — Nathan, Feb 05 '18 at 18:59

score 1 · Accepted Answer · answered Feb 05 '18 at 16:46

A recommendation aimed at the task rather than at the concatenation problem itself: don't do this!

Assuming X are your samples / observations, y are your targets:

Just generate a random permutation and index both arrays with it. Indexing with an integer array returns shuffled copies and leaves the originals unmodified, e.g. (untested):

import numpy as np

X = np.random.random(size=(50000, 3072))
y = np.random.random(size=50000)

perm = np.random.permutation(X.shape[0])  # assuming X.shape[0] == y.shape[0]
X_perm = X[perm]  # integer-array indexing returns a copy, not a view
y_perm = y[perm]

Reminder: your starting shapes are not compatible with most Python-based ML tools, as the usual interpretation is:

first-dim / rows: samples
second-dim / cols: features

Since the number of samples must equal the number of target values in y, my example follows this convention, while yours needs a transpose on X.
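A minimal sketch of that transpose-then-shuffle idea, using small stand-in arrays rather than the real data. The sizes (4 features, 6 samples) and the seeded `np.random.default_rng(0)` generator are assumptions made for illustration; `np.random.permutation` as used above would work the same way.

```python
import numpy as np

# Small stand-ins for the real data (assumed sizes): 4 features, 6 samples.
n_features, n_samples = 4, 6
X_train = np.arange(n_features * n_samples).reshape(n_features, n_samples)
y_train = np.arange(n_samples) * 10  # label i belongs to column i of X_train

# Put samples on the first axis: (n_features, n_samples) -> (n_samples, n_features).
X = X_train.T

# One permutation, applied to both arrays, keeps each row paired with its label.
rng = np.random.default_rng(0)
perm = rng.permutation(X.shape[0])
X_perm = X[perm]
y_perm = y_train[perm]

print(X_perm.shape, y_perm.shape)
```

Because the same `perm` indexes both arrays, no concatenation is needed to keep samples and targets aligned.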

score 0 · Answer 2 · answered Feb 05 '18 at 16:39

As DavidG said, the answer is that y_train has shape (50000,), so I needed to reshape it before concatenating:

np.concatenate([X_train, np.reshape(y_train, (1, 50000))])

Still, this evaluated very slowly in Jupyter. If there's a faster approach, I'd be grateful to have it.
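For reference, a small sketch of why the transpose had no effect and what the reshape does, again on assumed toy sizes (4 features, 6 samples) instead of the real arrays. Note that `np.concatenate` allocates a new array and copies both inputs into it, which may be part of the cost on large data; this sketch does not measure that.

```python
import numpy as np

# Toy sizes (assumed) in the question's layout: features on rows, samples on columns.
n_features, n_samples = 4, 6
X_train = np.arange(n_features * n_samples, dtype=float).reshape(n_features, n_samples)
y_train = np.arange(n_samples, dtype=float)

# Transposing a 1-D array is a no-op, so the shape stays (n_samples,).
assert np.transpose(y_train).shape == (n_samples,)

# Reshape the labels into a single row, then stack that row under X_train.
y_row = y_train.reshape(1, -1)                # shape (1, n_samples)
stacked = np.concatenate([X_train, y_row])    # shape (n_features + 1, n_samples)

# Shuffle the columns so each sample keeps its label in the last row.
rng = np.random.default_rng(0)
shuffled = stacked[:, rng.permutation(n_samples)]
X_shuffled, y_shuffled = shuffled[:-1], shuffled[-1]

print(stacked.shape, X_shuffled.shape, y_shuffled.shape)
```

The last two lines split the shuffled array back into features and labels; indexing both original arrays with one shared permutation reaches the same result without building the stacked array.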
