Questions tagged [tf.data.dataset]

80 questions
4
votes
1 answer

How exactly does tf.data.Dataset.interleave() differ from map() and flat_map()?

My current understanding is: Different map_func: Both interleave and flat_map expect "A function mapping a dataset element to a dataset". In contrast, map expects "A function mapping a dataset element to another dataset element". Arguments: Both…
gebbissimo
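
The distinction the excerpt draws can be sketched in plain Python. This is only a model of the ordering semantics using lists, not TensorFlow code: the names `map_elements`, `flat_map`, `interleave`, and `pair` are hypothetical stand-ins, and the sketch ignores parallelism and other options of the real API. It assumes the documented behavior that `interleave` keeps `cycle_length` inner sequences open and takes `block_length` items from each in turn.

```python
from itertools import islice


def map_elements(elements, fn):
    """Model of map(): fn turns one element into one element."""
    return [fn(x) for x in elements]


def flat_map(elements, fn):
    """Model of flat_map(): fn turns one element into a sequence;
    the sequences are concatenated in input order."""
    out = []
    for x in elements:
        out.extend(fn(x))
    return out


def interleave(elements, fn, cycle_length, block_length=1):
    """Model of interleave(): like flat_map, but up to cycle_length
    inner sequences are open at once and are read round-robin,
    block_length items at a time."""
    if cycle_length < 1 or block_length < 1:
        raise ValueError("cycle_length and block_length must be >= 1")
    source = iter(elements)
    slots = [None] * cycle_length  # currently open inner iterators
    source_done = False
    out = []
    while True:
        for i in range(cycle_length):
            # Refill an empty slot from the next input element, if any.
            if slots[i] is None and not source_done:
                try:
                    slots[i] = iter(fn(next(source)))
                except StopIteration:
                    source_done = True
            if slots[i] is None:
                continue
            block = list(islice(slots[i], block_length))
            out.extend(block)
            # A short block means this inner sequence is exhausted.
            if len(block) < block_length:
                slots[i] = None
        if source_done and all(s is None for s in slots):
            return out


def pair(x):
    """Hypothetical element-to-sequence function used for the demo."""
    return [x, x * 10]


print(map_elements([1, 2, 3], lambda x: x * 10))  # [10, 20, 30]
print(flat_map([1, 2, 3], pair))                  # [1, 10, 2, 20, 3, 30]
print(interleave([1, 2, 3], pair, cycle_length=2))  # [1, 2, 10, 20, 3, 30]
```

In this model, `interleave` with `cycle_length=1` produces the same order as `flat_map`, which matches the relationship described for the real methods.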
3
votes
0 answers

How to access tf.data.Dataset within a Keras custom callback?

I have written a custom Keras callback to check the augmented data from a generator. (See this answer for the full code.) However, when I tried to use the same callback for a tf.data.Dataset, it gave me an error: File…
craq
3
votes
0 answers

Tensorflow 2 - AttributeError: '_NestedVariant' object has no attribute 'batch'

In Chapter 17 of the book "Hands on machine learning with scikit-learn and tensorflow 2.0", they split a sequential dataset into multiple windows by using tf.data.Dataset and the window() method: n_steps = 100 window_length = n_steps + 1 # target =…
ebeninki
2
votes
1 answer

TensorFlow Dataset shuffle affects validation and training accuracy, with ambiguous behavior

I am struggling to train a neural network that uses tf.data.Dataset as input. I find that if I call .shuffle() before splitting the entire dataset into train, val, and test sets, the accuracy on val (during training) and test (in evaluate) is 91%, but…
2
votes
1 answer

How to efficiently feed data into TensorFlow 2.x?

I am looking at a data preprocessing task on a large amount of text data and want to load the preprocessed data into TensorFlow 2.x. The preprocessed data contains arrays of integer values since the preprocessing step generates: a one-hot encoded…
user8276908
2
votes
2 answers

Difference between tf.data.Dataset.repeat(EPOCHS) vs model.fit epochs=EPOCHS

While training, I set epochs to the number of times to iterate over the data. What is the use of tf.data.Dataset.repeat(EPOCHS) when I can already do the same thing with model.fit(train_dataset, epochs=EPOCHS)?
2
votes
1 answer

How to use tf.data.Dataset with Kedro?

I am using tf.data.Dataset to prepare a streaming dataset which is used to train a tf.keras model. With Kedro, is there a way to create a node and return the created tf.data.Dataset to use it in the next training node? The MemoryDataset will…
evolved
2
votes
1 answer

Does Kedro support TFRecord?

To train TensorFlow Keras models on AI Platform using Docker containers, we convert our raw images stored on GCS to a TFRecord dataset using tf.data.Dataset. Thereby the data is never stored locally. Instead the raw images are transformed directly…
2
votes
1 answer

Extracting NumPy value from TensorFlow object during transformation

I am trying to get word embeddings using TensorFlow, and I have created adjacent word lists from my corpus. My vocabulary has 8000 unique words, and there are around 1.6 million adjacent word lists. Since the data…
1
vote
0 answers

Specifying class or sample weights in Keras for one-hot encoded labels in a TF Dataset

I am trying to train an image classifier on an unbalanced training set. In order to cope with the class imbalance, I want either to weight the classes or the individual samples. Weighting the classes does not seem to work. And somehow for my setup I…
1
vote
1 answer

How to create a tf.data pipeline with multiple .npy files

I have looked into other issues on this problem but could not find the exact answer, so I am trying from scratch. The problem: I have multiple .npy files (X_train files), each an array of shape (n, 99, 2); only the first dimension differs, while the…
1
vote
1 answer

TensorFlow 2.3 pipeline loads all the data into RAM

I created a pipeline using the tf.data API to read a dataset of images. The dataset is large and the images are high resolution. However, each time I try to read the whole dataset, the computer crashes because the code uses all the RAM. I tested the code with…
1
vote
0 answers

Using tensorflow.data to generate dataset of images and multiple labels

I am trying to train a neural network to draw a bounding box around an object. I have generated the data myself: 256x256 RGB images and five labels per image (two corners of the bounding box + a rotational component). In order to not run out of memory…
1
vote
1 answer

How can I remove or omit data using map method for tf.data.Dataset objects?

I am using TensorFlow 2.3.0. I have a Python data generator: import tensorflow as tf import numpy as np vocab = [1,2,3,4,5] def create_generator(): 'generates a random number from 0 to len(vocab)-1' count = 0 while count < 4: x…
1
vote
1 answer

Sequential model with TensorFlow dataset

I tried to understand how to use TensorFlow's Datasets for a simple regression model, instead of feeding it with a separate np.array for training input and output. Here is a simple standalone example: import tensorflow as tf import numpy as np # create…
Max86