How to limit the number of images in a ParallelMapDataset in TensorFlow?


Issue

dataset, info = tfds.load('oxford_iiit_pet:3.*.*', with_info=True)

train_images = dataset['train']

test_images = dataset['test']

train_batches = ( 
    train_images
    .cache()
    .shuffle(BUFFER_SIZE)
    .batch(BATCH_SIZE)
    .prefetch(buffer_size=tf.data.AUTOTUNE))

test_batches = test_images.batch(BATCH_SIZE)

Now I would like to reduce test_images to 100 images.
I was expecting something like:

test_images = test_images[100]

But this would give an error:

'ParallelMapDataset' object is not subscriptable

Solution

With the take() method you can take a given number of batches or elements from the dataset.

If the dataset is batched:

test_images.take((100 // BATCH_SIZE) + 1)

When you batch a dataset, each element that take() returns is a whole batch, not a single item.

So if you batch your data with a batch size of 32, test_images.take(1) yields one batch of 32 elements, test_images.take(2) yields two batches (64 elements), and so on. That is why the snippet above divides 100 by BATCH_SIZE and adds one: it takes enough batches to cover at least 100 elements (it may include slightly more than 100).
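The batch-count arithmetic can be checked in plain Python, independent of TensorFlow. A small sketch, assuming BATCH_SIZE is 32 as in the example above:

```python
import math

BATCH_SIZE = 32  # assumed value for illustration

# Minimum number of batches needed to cover at least 100 elements:
num_batches = math.ceil(100 / BATCH_SIZE)   # ceil(3.125) -> 4

# The answer's integer formula gives the same result here
# (it overshoots by one batch only when 100 divides evenly):
assert (100 // BATCH_SIZE) + 1 == num_batches

# 4 batches of 32 cover 128 elements, i.e. at least the 100 we want:
assert num_batches * BATCH_SIZE >= 100
```

Using math.ceil makes the intent (round up to whole batches) explicit, whereas `(n // BATCH_SIZE) + 1` always adds a batch even when n is an exact multiple of BATCH_SIZE.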


If the dataset is not batched:

test_images.take(100)

Unlike a batched dataset, an unbatched dataset returns exactly the number of individual elements passed to take() — here, the first 100 images.
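On an unbatched dataset, take(n) behaves much like itertools.islice on an ordinary Python iterable: it stops after the first n elements. A minimal stand-in sketch, using a plain range in place of the real dataset:

```python
from itertools import islice

data = range(1000)                 # stands in for an unbatched dataset
first_100 = list(islice(data, 100))  # analogous to dataset.take(100)

assert len(first_100) == 100
assert first_100[0] == 0 and first_100[-1] == 99
```

Like take(), islice also handles the case where fewer than n elements exist: it simply yields everything available without raising an error.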

Answered By – Frightera

This answer, collected from Stack Overflow, is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
