Keras: Deep learning for humans.

Tutorial on using Keras flow_from_directory and generators

The directory structure for a binary classification problem
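Since the layout is easiest to understand on disk, here is a minimal sketch that builds an empty copy of it in a temporary folder and lists the class folders. The class names cats and dogs and both helper functions are hypothetical. infer_class_indices is a simplified stand-in for the way flow_from_directory treats each subfolder as one class and numbers the classes in alphanumeric order; it is not the Keras code itself.

```python
import tempfile
from pathlib import Path


def build_example_tree(root):
    """Create an empty example layout; the class names are hypothetical."""
    for sub in ("train/cats", "train/dogs",
                "valid/cats", "valid/dogs",
                "test/test_images"):
        (Path(root) / sub).mkdir(parents=True, exist_ok=True)


def infer_class_indices(directory):
    """Map each subfolder name to an index, in sorted order.

    Simplified stand-in for the mapping Keras builds, not its actual code.
    """
    names = sorted(p.name for p in Path(directory).iterdir() if p.is_dir())
    return {name: index for index, name in enumerate(names)}


with tempfile.TemporaryDirectory() as root:
    build_example_tree(root)
    train_classes = infer_class_indices(Path(root) / "train")
    test_classes = infer_class_indices(Path(root) / "test")

print(train_classes)
print(test_classes)
```

Note that the test folder also needs one subfolder, even though the test images have no labels.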
# train_datagen is assumed to be an ImageDataGenerator instance created beforehand.
train_generator = train_datagen.flow_from_directory(
    directory=r"./train/",
    target_size=(224, 224),
    color_mode="rgb",
    batch_size=32,
    class_mode="categorical",
    shuffle=True,
    seed=42
)
  • directory: the path to the folder that contains one subfolder per class (‘n’ subfolders for ‘n’ classes).
  • target_size: the size of your input images; every image is resized to this size.
  • color_mode: set “grayscale” if the images are black and white or grayscale, or “rgb” if they have three color channels.
  • batch_size: the number of images the generator yields per batch.
  • class_mode: set “binary” if you have only two classes to predict, otherwise set “categorical”. If you are building an autoencoder, where the input and the output are usually the same image, set “input”.
  • shuffle: set True to shuffle the order in which the images are yielded, otherwise set False.
  • seed: the random seed used for random image augmentation and for shuffling the order of the images.
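To make the class_mode options more concrete, the sketch below shows the label that each mode produces for a single image. It is plain Python written for illustration, and encode_label is a hypothetical helper, not part of Keras. The “input” mode is left out because its label is simply the image itself.

```python
def encode_label(class_index, num_classes, class_mode):
    """Return the label for one image under the given class_mode.

    Illustration only: Keras builds these labels as NumPy arrays, per batch.
    """
    if class_mode == "categorical":
        # One-hot vector with a 1.0 at the position of the class.
        return [1.0 if i == class_index else 0.0 for i in range(num_classes)]
    if class_mode == "binary":
        # A single 0/1 value; only meaningful with two classes.
        return float(class_index)
    if class_mode is None:
        # No label at all: the generator yields images only.
        return None
    raise ValueError("class_mode not covered by this sketch: %r" % (class_mode,))


categorical_label = encode_label(1, 2, "categorical")
binary_label = encode_label(1, 2, "binary")
no_label = encode_label(1, 2, None)
print(categorical_label, binary_label, no_label)
```

With “categorical” the model typically needs one output unit per class, which is what an argmax over the predictions assumes; with “binary” it typically needs a single output.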
valid_generator = valid_datagen.flow_from_directory(
    directory=r"./valid/",
    target_size=(224, 224),
    color_mode="rgb",
    batch_size=32,
    class_mode="categorical",
    shuffle=True,
    seed=42
)
  • Same as train generator settings except for obvious changes like directory path.
test_generator = test_datagen.flow_from_directory(
    directory=r"./test/",
    target_size=(224, 224),
    color_mode="rgb",
    batch_size=1,
    class_mode=None,
    shuffle=False,
    seed=42
)
  • directory: the path to a folder that contains a single subfolder holding all the test images. For example, in this case the images are in /test/test_images/
  • batch_size: set this to a number that divides the total number of images in your test set exactly.
    Why only for test_generator?
    Strictly speaking, the batch_size of the train and valid generators should also divide the number of images in the train and valid sets respectively. It matters less there: if batch_size does not divide the number of samples and a few images are left out of one epoch, they can still be sampled in the next epoch you train.
    For the test set, however, each image must be sampled exactly once, no fewer and no more. If this is confusing, just set batch_size to 1 (prediction may be a little slower).
  • class_mode: set this to None so that the generator returns only the images.
  • shuffle: set this to False, because the images must be yielded in order so that the predictions can be matched with their unique ids or filenames.
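If batch_size=1 turns out to be too slow, one option is to pick the largest batch size that still divides the test set exactly. The helper below is a minimal sketch of that idea; largest_exact_batch_size and the figure of 1000 test images are hypothetical.

```python
def largest_exact_batch_size(num_images, max_batch_size):
    """Largest batch size <= max_batch_size that divides num_images exactly."""
    if num_images < 1 or max_batch_size < 1:
        raise ValueError("both arguments must be positive")
    # Count down from the upper limit; 1 always divides, so this always returns.
    for candidate in range(min(num_images, max_batch_size), 0, -1):
        if num_images % candidate == 0:
            return candidate


# Hypothetical test set of 1000 images, with at most 32 images per batch.
batch_size = largest_exact_batch_size(1000, 32)
steps = 1000 // batch_size
print(batch_size, steps)
```

For a test set whose size is a prime number the helper falls back to 1, which matches the advice above.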

Let’s train, evaluate and predict!

Fitting/Training the model

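# Integer division: a leftover partial batch is skipped in each epoch.
STEP_SIZE_TRAIN = train_generator.n // train_generator.batch_size
STEP_SIZE_VALID = valid_generator.n // valid_generator.batch_size
# Note: in TensorFlow 2.1 and later, fit_generator is deprecated and
# model.fit accepts generators directly.
model.fit_generator(
    generator=train_generator,
    steps_per_epoch=STEP_SIZE_TRAIN,
    validation_data=valid_generator,
    validation_steps=STEP_SIZE_VALID,
    epochs=10
)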
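A quick arithmetic sketch of what the floor division in the step sizes does. The numbers (1000 training images, batch_size of 32) and the helper epoch_coverage are hypothetical.

```python
import math


def epoch_coverage(num_images, batch_size):
    """Return (full steps, images left over, steps needed to cover every image)."""
    full_steps = num_images // batch_size
    leftover = num_images - full_steps * batch_size
    steps_to_cover_all = math.ceil(num_images / batch_size)
    return full_steps, leftover, steps_to_cover_all


# Hypothetical training set of 1000 images with batch_size=32.
full_steps, leftover, steps_to_cover_all = epoch_coverage(1000, 32)
print(full_steps, leftover, steps_to_cover_all)
```

Here 31 full steps cover 992 images and 8 are left over in that epoch. Rounding up to 32 steps would include the partial batch instead.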

Evaluate the model

model.evaluate_generator(generator=valid_generator,
                         steps=STEP_SIZE_VALID)

Predict the output

STEP_SIZE_TEST = test_generator.n // test_generator.batch_size
test_generator.reset()
pred = model.predict_generator(test_generator,
                               steps=STEP_SIZE_TEST,
                               verbose=1)
  • You need to reset the test_generator every time before you call predict_generator. This is important: if you forget to reset it, the predictions will come out in an unexpected order.
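Why a missing reset scrambles the order can be seen with a toy iterator. ToyBatchIterator below is a hypothetical, heavily simplified stand-in for a non-shuffling generator and not the Keras class; the file names are made up.

```python
class ToyBatchIterator:
    """Toy stand-in for a non-shuffling generator; not the Keras class."""

    def __init__(self, filenames, batch_size):
        self.filenames = filenames
        self.batch_size = batch_size
        self.batch_index = 0

    def reset(self):
        # Go back to the first batch.
        self.batch_index = 0

    def next_batch(self):
        start = self.batch_index * self.batch_size
        if start >= len(self.filenames):
            # Wrap around to the beginning, as an endless generator would.
            self.batch_index = 0
            start = 0
        self.batch_index += 1
        return self.filenames[start:start + self.batch_size]


files = ["a.jpg", "b.jpg", "c.jpg", "d.jpg"]
iterator = ToyBatchIterator(files, batch_size=1)

iterator.next_batch()  # one batch was already consumed somewhere earlier
without_reset = [iterator.next_batch()[0] for _ in range(len(files))]

iterator.reset()
with_reset = [iterator.next_batch()[0] for _ in range(len(files))]

print(without_reset)
print(with_reset)
```

Without the reset the batches start at the second file, so the predictions would no longer line up with the list of file names.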
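import numpy as np
import pandas as pd

# Index of the highest-scoring class for each test image.
predicted_class_indices = np.argmax(pred, axis=1)
# class_indices maps class name -> index; invert it to map index -> class name.
labels = train_generator.class_indices
labels = dict((v, k) for k, v in labels.items())
predictions = [labels[k] for k in predicted_class_indices]
# Pair each prediction with its file name and save the result as CSV.
filenames = test_generator.filenames
results = pd.DataFrame({"Filename": filenames,
                        "Predictions": predictions})
results.to_csv("results.csv", index=False)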
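The mapping from predicted probabilities to class names can be traced by hand with a few made-up values. The sketch below uses plain Python lists instead of NumPy and pandas; the class names, probabilities and file names are hypothetical stand-ins for train_generator.class_indices, pred and test_generator.filenames.

```python
def argmax(row):
    """Index of the largest value in a list (stand-in for np.argmax on one row)."""
    return max(range(len(row)), key=row.__getitem__)


# Hypothetical values standing in for the real generator and model outputs.
class_indices = {"cats": 0, "dogs": 1}
pred = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]]
filenames = ["test_images/1.jpg", "test_images/2.jpg", "test_images/3.jpg"]

# Invert name -> index into index -> name, then look up each winning index.
index_to_label = {index: name for name, index in class_indices.items()}
predictions = [index_to_label[argmax(row)] for row in pred]
rows = list(zip(filenames, predictions))
print(rows)
```

The pairing in the last step is only valid because the test generator was created with shuffle=False, so predictions and file names are in the same order.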


Machine Learning Enthusiast

Vijayabhaskar J
