I am new to TensorFlow and Keras. I have loaded a dataset from a CSV file and created a train_dataset as follows:
column_names = ['a', 'date', 'c', 'd', 'e', 'f']
label_name = column_names[0]
feature_names = column_names[1:]
class_names = ['good', 'bad']
train_dataset = tf.data.experimental.make_csv_dataset(
    train_dataset_fp,
    batch_size,
    column_names=column_names,
    label_name=label_name,
    num_epochs=1)
features, labels = next(iter(train_dataset))
print(features)
My features are an OrderedDict and print as:
OrderedDict([('b', <tf.Tensor: shape=(32,), dtype=int32, numpy= array([1, 1, 0, 0, 0, 1, 0, 1, 0, 2, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1], dtype=int32)>), ('date', <tf.Tensor: shape=(32,), dtype=int64, numpy= array([-9223372036855, 1262478794000, 1262426153000, 1262431717000, 1262425334000, 1262588520000, 1262425515000, 1262418072000, 1262420797000, 1262428601000, 1262590037000, 1262421322000, 1262433023000, 1262390762000, 1262590200000, 1262432769000, 1262427397000, -9223372036855, 1262425996000, 1262430050000, 1262431867000, 1262424427000, 1262420906000, 1262391208000, 1262590114000, -9223372036855, 1262589645000, 1262424306000, 1262428178000, 1262421300000, 1262423456000, 1262515569000])>), ('d', <tf.Tensor: shape=(32,), dtype=int32, numpy= array([357, 313, 557, 691, 292, 557, 605, 605, 48, 295, 81, 656, 321, 734, 584, 652, 575, 465, 71, 453, 196, 48, 689, 591, 676, 271, 67, 229, 740, 713, 230, 664], dtype=int32)>), ('e', <tf.Tensor: shape=(32,), dtype=int32, numpy= array([519, 537, 610, 178, 552, 610, 240, 240, 343, 643, 481, 340, 362, 143, 511, 167, 5, 685, 436, 105, 659, 343, 427, 242, 30, 717, 531, 492, 433, 452, 645, 303], dtype=int32)>), ('f', <tf.Tensor: shape=(32,), dtype=int32, numpy= array([ 345, 545, 1663, 1426, 2065, 1017, 1655, 47, 2070, -1, 1191, 191, 1569, 547, 1295, 1776, 1620, 680, 1990, 1642, 1930, 1465, 1887, 2128, 999, 447, 844, 1851, 1586, 1742, 2079, 729], dtype=int32)>)])
As you can see, one of them has dtype=int64. I then use the following function to pack the features into an array:
def pack_features_vector(features, labels):
    features = tf.stack(list(features.values()), axis=1)
    return features, labels
However, when I run it:
train_dataset = train_dataset.map(pack_features_vector)
I get the following error:
"TypeError: Tensors in list passed to 'values' of 'Pack' Op have types [int32, int64, int32, int32, int32] that don't all match."
I understand that the issue is in the tf.stack call. My second feature is a date in epoch format, which was read in as int64. I think the easiest fix may be to convert all tensors to the same dtype, but I am not sure how. I can see that the features collection is an OrderedDict of NumPy arrays, but I do not know how to change the dtype of the items. I tried the following; it did not yield a traceback, but when I printed my features again, all dtypes were unchanged:
for k, v in train_dataset:
    tf.dtypes.cast(v, tf.int64)
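For reference, here is a minimal sketch of the "convert everything to one dtype" idea. It relies on the fact that tf.cast returns a new tensor rather than modifying its input, which would explain why the loop above appears to do nothing: the result is never stored. The sketch casts inside the mapping function instead. The toy feature values and the choice of int64 as the common dtype are assumptions for illustration, not taken from the real CSV.

```python
import collections

import tensorflow as tf


def pack_features_vector(features, labels):
    # tf.cast returns a new tensor, so the results must be kept;
    # here they are collected into a list with a single dtype.
    casted = [tf.cast(v, tf.int64) for v in features.values()]
    # All tensors now share one dtype, so tf.stack accepts them.
    return tf.stack(casted, axis=1), labels


# Toy stand-in for the CSV data (made-up values, two rows only).
features = collections.OrderedDict([
    ("date", tf.constant([1262478794000, 1262426153000], dtype=tf.int64)),
    ("d", tf.constant([357, 313], dtype=tf.int32)),
])
labels = tf.constant([1, 0], dtype=tf.int32)

dataset = tf.data.Dataset.from_tensor_slices((features, labels)).batch(2)
dataset = dataset.map(pack_features_vector)

packed, packed_labels = next(iter(dataset))
print(packed.dtype, packed.shape)
```

int64 is used as the target here because epoch timestamps in milliseconds do not fit in int32, and casting them to float32 would lose precision. If the model needs floating-point inputs, the date column would probably have to be rescaled before casting.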
I would greatly appreciate any help. Thank you.