Food Image Classification¶


Context:¶


Image classification has become more tractable with deep learning and the availability of larger datasets and more computational resources. The convolutional neural network (CNN) is currently the most popular and widely used technique for image classification.

Clicks is a stock photography company and is an online source of images available for people and companies to download. Photographers from all over the world upload food-related images to the stock photography agency every day. Since the volume of the images that get uploaded daily will be high, it will be difficult for anyone to label the images manually.


Objective:¶


Clicks has decided to use only three categories of food (Bread, Soup, and Vegetable-Fruit) for now. As a data scientist at Clicks, you need to build a classification model, using a dataset of food images, that labels each image with one of these categories.


Dataset:¶


The dataset folder contains different food images. The images are already split into Training and Testing folders. Each folder has three subfolders named Bread, Soup, and Vegetable-Fruit. These subfolders contain the images of the respective classes.

Mount the Drive¶

In [2]:
from google.colab import drive
drive.mount('/content/drive')
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).

Importing the Libraries¶

In [ ]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
import zipfile

# For Data Visualization
import cv2
import seaborn as sns

# For Model Building
import tensorflow as tf
import keras
from tensorflow.keras.models import Sequential, Model                                                                       # Sequential API for sequential model
from tensorflow.keras.layers import Dense, Dropout, Flatten                                                                 # Importing different layers
from tensorflow.keras.layers import Conv2D, MaxPooling2D, BatchNormalization, Activation, Input, LeakyReLU
from tensorflow.keras import backend                                                                                        # This helps to clear memory to prevent overflow
from tensorflow.keras.utils import to_categorical                                                                           # To perform one-hot encoding
from tensorflow.keras.optimizers import RMSprop, Adam, SGD                                                                  # Optimizers for optimizing the model
from tensorflow.keras.callbacks import EarlyStopping                                                                        # Regularization method to limit overfitting - stops training if the monitored metric doesn't improve after a defined number of epochs
from tensorflow.keras.callbacks import ModelCheckpoint                                                                      # Save the best model
from tensorflow.keras import losses, optimizers
from tensorflow.keras.preprocessing.image import load_img
from google.colab.patches import cv2_imshow

import random

Importing the Dataset¶

Instructions to access the data through Google Colab:

Follow the below steps:

  1. Download the zip file from Olympus.

  2. Upload the file into your drive and unzip the folder using the code provided in the notebook. Do not unzip it manually.

  3. Please check that the name of the file in your Google Drive is 'Food_Data.zip'. If it's not, then you may rename it on your drive or change it in the following cell.

  4. Now, you can run the following cell. If all the earlier steps were done correctly, the dataset will be imported without any errors.

In [ ]:
# Unzip the data - We only need to do this once, comment this out if it has already been done

# Storing the path of the data file from the Google drive
path = '/content/drive/MyDrive/MIT - Data Sciences/Colab Notebooks/Week_Six_-_Deep_Learning/Guided_Project_Food_Image_Classification/Food_Data.zip'

# The data is provided as a zip file so we need to extract the files from the zip file
with zipfile.ZipFile(path, 'r') as zip_ref:
    zip_ref.extractall()                      # Places data in the Content folder

Preparing the Data¶

The dataset has two folders, 'Training' and 'Testing'. Each of these folders has three sub-folders: 'Bread', 'Soup', and 'Vegetable-Fruit'. We will store the Training path in a variable named 'DATADIR' and the Testing path in a variable named 'DATADIR_test'. The names of the sub-folders, which are the classes for our classification task, will be stored in a list called 'CATEGORIES'.
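
Before loading any images, it can help to confirm that the folder layout described above is what is actually on disk. The following is a minimal sketch, not part of the original notebook: the helper name `count_images_per_class` is hypothetical, and the demo runs on a throwaway folder tree with made-up file names rather than on the real dataset.

```python
import os
import tempfile

CATEGORIES = ["Bread", "Soup", "Vegetable-Fruit"]

def count_images_per_class(data_dir, categories):
    """Return {category: number of files} for each class sub-folder of data_dir."""
    counts = {}
    for category in categories:
        class_dir = os.path.join(data_dir, category)
        # Count only regular files; nested folders are ignored
        counts[category] = sum(
            1 for name in os.listdir(class_dir)
            if os.path.isfile(os.path.join(class_dir, name))
        )
    return counts

# Demo on a temporary folder tree (hypothetical file names, not the real data)
with tempfile.TemporaryDirectory() as root:
    for category, n_files in zip(CATEGORIES, [3, 2, 1]):
        os.makedirs(os.path.join(root, category))
        for i in range(n_files):
            open(os.path.join(root, category, f"img_{i}.jpg"), "w").close()
    demo_counts = count_images_per_class(root, CATEGORIES)

print(demo_counts)
```

On the real data, the same helper could be called with the Training or Testing path; a missing sub-folder would raise a FileNotFoundError, which surfaces naming mismatches early.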

Training Data¶

We will read each image as an array, resize it, and store it in a list called 'training_data' along with its class name.

In [ ]:
# Storing the training path in a variable named DATADIR, and storing the unique categories/labels in a list

DATADIR = "/content/Food_Data/Training"                                        # Path of training data after unzipping
CATEGORIES = ["Bread", "Soup", "Vegetable-Fruit"]                              # Storing all the categories in 'CATEGORIES' variable
IMG_SIZE = 150                                                                 # Defining the size of the image to 150
In [ ]:
# Here we will be using a user defined function create_training_data() to extract the images from the directory
training_data = []

# Storing all the training images
def create_training_data():
    for category in CATEGORIES:                                                # Looping over each category from the CATEGORIES list
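        path = os.path.join(DATADIR, category)                                 # Path of the sub-folder for this category
        class_num = category                                                   # The label is the category name; it is encoded as a number later

        for img in os.listdir(path):
            img_array = cv2.imread(os.path.join(path, img))                    # Reading the image; cv2.imread returns None if the file cannot be read

            if img_array is None:                                              # Skipping unreadable or non-image files instead of crashing in cv2.resize
                continue

            new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))            # Resizing the images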

            training_data.append([new_array, class_num])                       # Appending both the images and labels

create_training_data()

Testing Data¶

We will do the same with our Testing data: read each image as an array and append it to a list named 'testing_data' along with its class name.

In [ ]:
DATADIR_test = "/content/Food_Data/Testing"                                    # Path of testing data after unzipping
CATEGORIES =  ["Bread", "Soup", "Vegetable-Fruit"]                             # Storing all the categories in categories variable
IMG_SIZE = 150                                                                 # Defining the size of the image to 150
In [ ]:
# Here we will be using a user defined function create_testing_data() to extract the images from the directory
testing_data = []

# Storing all the testing images
def create_testing_data():
    for category in CATEGORIES:                                                # Looping over each category from the CATEGORIES list
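        path = os.path.join(DATADIR_test, category)                            # Path of the sub-folder for this category
        class_num = category                                                   # The label is the category name; it is encoded as a number later

        for img in os.listdir(path):
            img_array = cv2.imread(os.path.join(path, img))                    # Reading the image; cv2.imread returns None if the file cannot be read

            if img_array is None:                                              # Skipping unreadable or non-image files instead of crashing in cv2.resize
                continue

            new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))            # Resizing the images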

            testing_data.append([new_array, class_num])                        # Appending both the images and labels

create_testing_data()

Visualizing images randomly from each class¶


Bread¶


In [ ]:
bread_imgs = [fn for fn in os.listdir(f'{DATADIR}/{CATEGORIES[0]}') ]
select_bread = np.random.choice(bread_imgs, 9, replace = False)               # replace = True would allow the same image to be part of the random pool again

fig = plt.figure(figsize = (10, 10))

for i in range(9):
    ax = fig.add_subplot(4, 3, i + 1)

    fp = f'{DATADIR}/{CATEGORIES[0]}/{select_bread[i]}'

    fn = load_img(fp, target_size = (150, 150))

    plt.imshow(fn, cmap = 'Greys_r')

    plt.axis('off')

plt.show()
[Figure: grid of nine randomly selected Bread images from the training set]

Observations:

  • Most bread items have a round, oval or elliptical shape, except for sandwiches.

  • Almost all bread items have a grilled or charred portion, which can be an easily recognizable feature to our Neural Network.


Soup¶


In [ ]:
soup_imgs = [fn for fn in os.listdir(f'{DATADIR}/{CATEGORIES[1]}') ]
select_soup = np.random.choice(soup_imgs, 9, replace = False)

fig = plt.figure(figsize = (10, 10))

for i in range(9):
    ax = fig.add_subplot(4, 3, i + 1)

    fp = f'{DATADIR}/{CATEGORIES[1]}/{select_soup[i]}'

    fn = load_img(fp, target_size = (150, 150))

    plt.imshow(fn, cmap = 'Greys_r')

    plt.axis('off')

plt.show()
[Figure: grid of nine randomly selected Soup images from the training set]

Observations:

  • All Soup images are defined by a liquid taking on the shape of the container or utensil it is kept in.

  • There is a distinct glare from the reflection of light on most of the images.

  • Also, almost all of these images include a utensil, which could confuse the model between bread and soup, since images from both classes mostly contain a dish or a bowl in which the food is placed.


Vegetable-Fruit¶


In [ ]:
vegetable_fruit_imgs = [fn for fn in os.listdir(f'{DATADIR}/{CATEGORIES[2]}') ]
select_vegetable_fruit = np.random.choice(vegetable_fruit_imgs, 9, replace = False)

fig = plt.figure(figsize = (10, 10))

for i in range(9):
    ax = fig.add_subplot(4, 3, i + 1)

    fp = f'{DATADIR}/{CATEGORIES[2]}/{select_vegetable_fruit[i]}'

    fn = load_img(fp, target_size = (150, 150))

    plt.imshow(fn, cmap = 'Greys_r')

    plt.axis('off')

plt.show()
[Figure: grid of nine randomly selected Vegetable-Fruit images from the training set]

Observation:

  • Most of the images in this class have vibrant colors and a repeating shape throughout the image.

Data Preprocessing¶

The lists training_data and testing_data store each image as an array together with its class name. In essence, they are lists of [image, label] pairs.

In the following cells, we will unpack these pairs. We will shuffle training_data and testing_data, and store the images in X_train and X_test, and the labels in y_train and y_test, respectively.
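
The key property of this step is that each image stays attached to its label while the order is shuffled. A minimal sketch of the idea, using hypothetical string stand-ins instead of real image arrays and `zip` instead of the explicit loop used in the cells that follow:

```python
import random

# Hypothetical stand-ins for [image_array, label] pairs
pairs = [
    ["image_0", "Bread"],
    ["image_1", "Soup"],
    ["image_2", "Vegetable-Fruit"],
    ["image_3", "Soup"],
]

# Remember which label belongs to which image so alignment can be checked later
original_label = {image: label for image, label in pairs}

rng = random.Random(42)          # Local, seeded generator for a repeatable order
rng.shuffle(pairs)               # Whole pairs are shuffled, so image and label move together

# Unpack the shuffled pairs into two parallel lists
features, labels = (list(column) for column in zip(*pairs))

print(features)
print(labels)
```

Shuffling the pairs before splitting them is what keeps `features[i]` and `labels[i]` aligned; shuffling the two lists separately would break that link.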

In [ ]:
# Creating two different lists to store the Numpy arrays and the corresponding labels
X_train = []
y_train = []

np.random.shuffle(training_data)                                               # Shuffling so that the classes are mixed; the images were loaded one class at a time, and validation_split later takes the last 10% of the samples
for features, label in training_data:                                          # Iterating over the training data which is generated from the create_training_data() function
    X_train.append(features)                                                   # Appending images into X_train
    y_train.append(label)                                                      # Appending labels into y_train
In [ ]:
# Creating two different lists to store the Numpy arrays and the corresponding labels
X_test = []
y_test = []

np.random.shuffle(testing_data)                                                # Shuffling the test data (this does not affect the evaluation metrics)
for features, label in testing_data:                                           # Iterating over the testing data which is generated from the create_testing_data() function
    X_test.append(features)                                                    # Appending images into X_test
    y_test.append(label)                                                       # Appending labels into y_test
In [ ]:
# Converting the pixel values into Numpy array
X_train = np.array(X_train)
X_test = np.array(X_test)
X_train.shape
Out[ ]:
(3203, 150, 150, 3)

Note: Images are digitally represented in the form of NumPy arrays which can be observed from the X_train values generated above, so it is possible to perform all the preprocessing operations and build our CNN model using NumPy arrays directly. So, even if the data is provided in the form of NumPy arrays rather than images, we can use this to work on our model.

In [ ]:
# Converting the lists into DataFrames
y_train = pd.DataFrame(y_train, columns = ["Label"], dtype = object)
y_test = pd.DataFrame(y_test, columns = ["Label"], dtype = object)

Since the given data is stored in variables X_train, X_test, y_train, and y_test, there is no need to split the data further.

Checking Distribution of Classes¶

In [ ]:
# Printing the value counts of target variable
count = y_train.Label.value_counts()
print(count)

print('*'*10)

count = y_train.Label.value_counts(normalize = True)
print(count)
Label
Soup               1500
Bread               994
Vegetable-Fruit     709
Name: count, dtype: int64
**********
Label
Soup               0.468311
Bread              0.310334
Vegetable-Fruit    0.221355
Name: proportion, dtype: float64
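
The proportions above follow directly from the counts, and they also give a useful reference point: a model that always predicts the most frequent class. The sketch below reproduces that arithmetic in plain Python; the variable names are illustrative and the baseline is an added reference, not a step from the original notebook.

```python
# Class counts copied from the value_counts() output above
train_counts = {"Soup": 1500, "Bread": 994, "Vegetable-Fruit": 709}

total = sum(train_counts.values())

# Share of each class in the training set
proportions = {name: count / total for name, count in train_counts.items()}

# Accuracy of always predicting the most frequent class,
# on data with the same class mix as the training set
majority_class = max(train_counts, key=train_counts.get)
baseline_accuracy = proportions[majority_class]

print(total)
print({name: round(p, 4) for name, p in proportions.items()})
print(majority_class, round(baseline_accuracy, 4))
```

Any trained model should be judged against this baseline of roughly 47%, since the classes are moderately imbalanced.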

Normalizing the data¶

In neural networks, it is generally recommended to normalize the input features. Normalization has the following benefits when training a neural network:

  1. Normalization makes training faster and reduces the chances of getting stuck at local optima.
  2. In deep neural networks, normalization helps to avoid the exploding gradient problem, which occurs when large error gradients accumulate and result in very large updates to the model weights during training. This makes the model unstable and unable to learn from the training data.

Since image pixel values range from 0 to 255, we simply divide all the pixel values by 255 to rescale every image to values between 0 and 1.

In [ ]:
# Normalizing the image data
X_train = X_train/255.0

X_test = X_test/255.0
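
The effect of the division can be checked on a small array. This is a sketch on a hypothetical random batch, not on the real images; the float32 variant is an optional alternative that the notebook itself does not use.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical batch of 2 RGB images of 4x4 pixels, stored as 8-bit integers
images_uint8 = rng.integers(0, 256, size=(2, 4, 4, 3), dtype=np.uint8)

# Dividing by a Python float promotes uint8 to float64
images_scaled = images_uint8 / 255.0

# Optional: cast first to keep the result in float32 (half the memory of float64)
images_scaled32 = images_uint8.astype(np.float32) / 255.0

print(images_scaled.dtype, images_scaled.min(), images_scaled.max())
print(images_scaled32.dtype)

# Rough memory estimate for an array shaped like X_train: (3203, 150, 150, 3)
n_values = 3203 * 150 * 150 * 3
print(n_values * 8 / 1e9, "GB as float64")
print(n_values * 4 / 1e9, "GB as float32")
```

The dtype matters mainly for memory: the scaled values are the same up to floating-point precision, but a float64 copy of the training set is twice as large as a float32 one.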

Encoding Target Variable¶

For any ML or DL techniques, the labels must be encoded into numbers or arrays, so that we can compute the cost between the predicted and the real labels.

In this case, we have 3 classes "Bread", "Soup", and "Vegetable-Fruit". We want the corresponding labels to look like:

  • [1, 0, 0] --------- Bread
  • [0, 1, 0] --------- Soup
  • [0, 0, 1] --------- Vegetable-Fruit

Each class will be represented in the form of an array.
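
One compact way to produce these vectors, shown here as a sketch rather than as the notebook's own method, is to map each class name to its index with a dictionary and then select rows of an identity matrix. The helper name `one_hot_encode` is hypothetical.

```python
import numpy as np

CATEGORIES = ["Bread", "Soup", "Vegetable-Fruit"]

# Class name -> integer index, in the same order as CATEGORIES
label_to_index = {name: index for index, name in enumerate(CATEGORIES)}

def one_hot_encode(labels, label_to_index):
    """Return a (len(labels), n_classes) float array with a single 1 per row."""
    # An unknown label raises KeyError instead of being skipped silently
    indices = np.array([label_to_index[label] for label in labels])
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(len(label_to_index), dtype=np.float32)[indices]

sample_labels = ["Bread", "Vegetable-Fruit", "Soup"]
encoded = one_hot_encode(sample_labels, label_to_index)
print(encoded)
```

Failing loudly on an unexpected label is a deliberate choice: if a label were skipped, the encoded array would end up shorter than the image array and the two would no longer line up.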

In [ ]:
y_train_encoded = [ ]

for label_name in y_train["Label"]:
    if(label_name == 'Bread'):
        y_train_encoded.append(0)

    if(label_name == 'Soup'):
        y_train_encoded.append(1)

    if(label_name == 'Vegetable-Fruit'):
        y_train_encoded.append(2)

y_train_encoded = to_categorical(y_train_encoded, 3)                           # Convert to one hot encoded
y_train_encoded
Out[ ]:
array([[1., 0., 0.],
       [0., 0., 1.],
       [0., 1., 0.],
       ...,
       [0., 1., 0.],
       [0., 1., 0.],
       [0., 1., 0.]])
In [ ]:
y_test_encoded = [ ]

for label_name in y_test["Label"]:
    if(label_name == 'Bread'):
        y_test_encoded.append(0)

    if(label_name == 'Soup'):
        y_test_encoded.append(1)

    if(label_name == 'Vegetable-Fruit'):
        y_test_encoded.append(2)

y_test_encoded = to_categorical(y_test_encoded, 3)
y_test_encoded
Out[ ]:
array([[1., 0., 0.],
       [1., 0., 0.],
       [0., 0., 1.],
       ...,
       [0., 1., 0.],
       [1., 0., 0.],
       [1., 0., 0.]])

Model Building¶

Now that we have done data preprocessing, let's build the first Convolutional Neural Network (CNN) model.

Model 1 Architecture:¶

  • The first CNN Model will have three convolutional blocks.
  • Each convolutional block will have a Conv2D layer and a MaxPooling2D Layer.
  • Add first Conv2D layer with 64 filters and a kernel size of 3x3. Use the 'same' padding and provide the input shape = (150, 150, 3). Use 'relu' activation.
  • Add MaxPooling2D layer with kernel size 2x2 and use padding = 'same'.
  • Add a second Conv2D layer with 32 filters and a kernel size of 3x3. Use the 'same' padding and 'relu' activation.
  • Follow it up with another MaxPooling2D layer with kernel size 2x2 and use padding = 'same'.
  • Add a third Conv2D layer with 32 filters and a kernel size of 3x3. Use the 'same' padding and 'relu' activation. Once again, follow it up with another MaxPooling2D layer with kernel size 2x2 and padding = 'same'.
  • Once the convolutional blocks are added, add the Flatten layer.
  • Finally, add dense layers.
  • Add first Dense layer with 100 neurons and 'relu' activation
  • The last dense layer must have as many neurons as the number of classes, which in this case is 3 and use 'softmax' activation.
  • Initialize SGD optimizer with learning rate = 0.01 and momentum = 0.9
  • Compile your model using the optimizer you initialized and use categorical_crossentropy as the loss function and 'accuracy' as the metric
  • Print the model summary and write down your observations/insights about the model.

Note: We need to clear the previous model's history from the Keras backend. Also, we must fix the seed for random number generators after clearing the backend to make sure we receive the same output every time we run the code.

In [ ]:
backend.clear_session()                                                        # Clear the session backend for memory conservation

# Fixing the seed for random number generators so that we can ensure we receive the same output every time
seed = 42

np.random.seed(seed)
random.seed(seed)
tf.random.set_seed(seed)
In [ ]:
# Initializing a sequential model
model = Sequential()

# Adding first conv layer with 64 filters and kernel size 3x3, padding 'same' provides the output size same as the input size
# The input_shape denotes input image dimension
model.add(
          Conv2D(                                                              # Adds a Convolutional 2D (Conv2D) layer to the model.
                  64,                                                          # Specifies the number of filters (or kernels) in the Conv2D layer
                  (3, 3),                                                      # Defines the kernel size or filter size
                  activation = 'relu',                                         # Applies the ReLU (Rectified Linear Unit) activation function to the output of the Conv2D layer
                  padding = "same",                                            # Specifies the padding strategy. "Same" padding means that the input image is padded with zeros such that the output image has the same dimensions as the input (150x150)
                  input_shape = (150, 150, 3)                                  # Defines the input shape to the Conv2D layer
                  )
          )

# Adding max pooling to reduce the size of output of first conv layer
model.add(
          MaxPooling2D(                                                        # MaxPooling reduces the spatial dimensions (height and width) of the feature maps, which lowers computational cost and can help limit overfitting
                      (2, 2),                                                  # Specifies the pool size; with the default stride (equal to the pool size), a 2x2 window moves over the input feature map
                      padding = 'same'                                         # Pads the input where needed so border pixels are kept; the output size is ceil(input size / stride), e.g. 75 -> 38
                      )
          )

# Adding second conv layer with 32 filters and kernel size 3x3, padding 'same' followed by a Maxpooling2D layer
model.add(Conv2D(32, (3, 3), activation = 'relu', padding = "same"))
model.add(MaxPooling2D((2, 2), padding = 'same'))

# Add third conv layer with 32 filters and kernel size 3x3, padding 'same' followed by a Maxpooling2D layer
model.add(Conv2D(32, (3, 3), activation='relu', padding="same"))
model.add(MaxPooling2D((2, 2), padding = 'same'))

# Flattening the output of the conv layer after max pooling to make it ready for creating dense connections
model.add(Flatten())

# Adding a fully connected dense layer with 100 neurons
model.add(Dense(100, activation = 'relu'))

# Adding the output layer with 3 neurons and activation functions as softmax since this is a multi-class classification problem
model.add(Dense(3, activation = 'softmax'))

# Using SGD Optimizer
opt = SGD(learning_rate = 0.01, momentum = 0.9)                                # Instantiate an optimizer

# Compiling the model
model.compile(optimizer = opt, loss = 'categorical_crossentropy', metrics = ['accuracy'])

# Generating the summary of the model
model.summary()
/usr/local/lib/python3.10/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ conv2d (Conv2D)                      │ (None, 150, 150, 64)        │           1,792 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d (MaxPooling2D)         │ (None, 75, 75, 64)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_1 (Conv2D)                    │ (None, 75, 75, 32)          │          18,464 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_1 (MaxPooling2D)       │ (None, 38, 38, 32)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_2 (Conv2D)                    │ (None, 38, 38, 32)          │           9,248 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_2 (MaxPooling2D)       │ (None, 19, 19, 32)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ flatten (Flatten)                    │ (None, 11552)               │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense (Dense)                        │ (None, 100)                 │       1,155,300 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 3)                   │             303 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 1,185,107 (4.52 MB)
 Trainable params: 1,185,107 (4.52 MB)
 Non-trainable params: 0 (0.00 B)

Observations:

  • As we can see from the above summary, this CNN model will train and learn 1,185,107 parameters (weights and biases).
  • There are no non-trainable parameters in the model.
  • The model is fairly large and we might expect overfitting.
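
The numbers in the summary can be reproduced by hand, which is a useful check on how the layers fit together. The sketch below uses the standard formulas: a Conv2D layer has (kernel height × kernel width × input channels + 1) × filters parameters, a Dense layer has (inputs + 1) × units, and a 2x2 pooling layer with 'same' padding maps a side of length n to ceil(n / 2). The helper names are illustrative.

```python
import math

def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return (in_units + 1) * out_units

def pool_same(size, stride=2):
    """Output side length of pooling with padding='same'."""
    return math.ceil(size / stride)

side, channels = 150, 3          # Input images are 150 x 150 x 3
param_counts = []

# Three blocks: Conv2D (3x3, 'same' padding keeps the side length) + MaxPooling2D
for filters in (64, 32, 32):
    param_counts.append(conv2d_params(3, 3, channels, filters))
    channels = filters
    side = pool_same(side)       # 150 -> 75 -> 38 -> 19

flat_units = side * side * channels
param_counts.append(dense_params(flat_units, 100))
param_counts.append(dense_params(100, 3))
total_params = sum(param_counts)

print(side, flat_units)          # 19 11552
print(param_counts)              # [1792, 18464, 9248, 1155300, 303]
print(total_params)              # 1185107
```

Almost all of the parameters (1,155,300 of 1,185,107) sit in the first Dense layer, because it is connected to every one of the 11,552 flattened values.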

Training the Model¶

Let's now train the model using the training data.

In [ ]:
# EarlyStopping stops training once the monitored metric (here val_loss) has not improved for 'patience' consecutive epochs
es = EarlyStopping(
                  monitor = 'val_loss',                                        # Specifies what metric to monitor during training
                  mode = 'min',                                                # Defines whether the EarlyStopping should stop when the monitored metric is minimized or maximized
                  verbose = 1,                                                 # Controls the level of output in the training logs
                  patience = 5                                                 # Defines how many epochs to wait after the last improvement before stopping
                  )
# The original checkpoint path 'best_model.h5' caused an error, so the model is saved in the '.keras' format instead
mc = ModelCheckpoint(                                                          # Saves the model to disk whenever val_accuracy improves
                    'best_model.keras',
                    monitor = 'val_accuracy',
                    mode = 'max',
                    verbose = 1,
                    save_best_only = True)

# Fitting the model for up to 60 epochs with validation_split as 10%
history = model.fit(
                    X_train,
                    y_train_encoded,
                    epochs = 60,
                    batch_size= 32,
                    validation_split = 0.10,
                    callbacks = [es, mc]
                    )
Epoch 1/60
91/91 ━━━━━━━━━━━━━━━━━━━━ 0s 39ms/step - accuracy: 0.4704 - loss: 1.0586
Epoch 1: val_accuracy improved from -inf to 0.51090, saving model to best_model.keras
91/91 ━━━━━━━━━━━━━━━━━━━━ 11s 61ms/step - accuracy: 0.4706 - loss: 1.0585 - val_accuracy: 0.5109 - val_loss: 1.0930
Epoch 2/60
89/91 ━━━━━━━━━━━━━━━━━━━━ 0s 24ms/step - accuracy: 0.4852 - loss: 1.0538
Epoch 2: val_accuracy improved from 0.51090 to 0.62617, saving model to best_model.keras
91/91 ━━━━━━━━━━━━━━━━━━━━ 2s 26ms/step - accuracy: 0.4857 - loss: 1.0525 - val_accuracy: 0.6262 - val_loss: 0.8103
Epoch 3/60
89/91 ━━━━━━━━━━━━━━━━━━━━ 0s 25ms/step - accuracy: 0.5458 - loss: 0.9392
Epoch 3: val_accuracy did not improve from 0.62617
91/91 ━━━━━━━━━━━━━━━━━━━━ 3s 26ms/step - accuracy: 0.5459 - loss: 0.9387 - val_accuracy: 0.5234 - val_loss: 1.0298
Epoch 4/60
91/91 ━━━━━━━━━━━━━━━━━━━━ 0s 25ms/step - accuracy: 0.5582 - loss: 0.9195
Epoch 4: val_accuracy did not improve from 0.62617
91/91 ━━━━━━━━━━━━━━━━━━━━ 3s 27ms/step - accuracy: 0.5582 - loss: 0.9193 - val_accuracy: 0.5826 - val_loss: 0.8939
Epoch 5/60
91/91 ━━━━━━━━━━━━━━━━━━━━ 0s 26ms/step - accuracy: 0.5939 - loss: 0.8284
Epoch 5: val_accuracy did not improve from 0.62617
91/91 ━━━━━━━━━━━━━━━━━━━━ 3s 28ms/step - accuracy: 0.5939 - loss: 0.8283 - val_accuracy: 0.5888 - val_loss: 0.8565
Epoch 6/60
91/91 ━━━━━━━━━━━━━━━━━━━━ 0s 24ms/step - accuracy: 0.6086 - loss: 0.7836
Epoch 6: val_accuracy improved from 0.62617 to 0.63240, saving model to best_model.keras
91/91 ━━━━━━━━━━━━━━━━━━━━ 2s 27ms/step - accuracy: 0.6087 - loss: 0.7834 - val_accuracy: 0.6324 - val_loss: 0.7492
Epoch 7/60
91/91 ━━━━━━━━━━━━━━━━━━━━ 0s 24ms/step - accuracy: 0.6366 - loss: 0.7300
Epoch 7: val_accuracy did not improve from 0.63240
91/91 ━━━━━━━━━━━━━━━━━━━━ 2s 26ms/step - accuracy: 0.6368 - loss: 0.7298 - val_accuracy: 0.6168 - val_loss: 0.7717
Epoch 8/60
89/91 ━━━━━━━━━━━━━━━━━━━━ 0s 25ms/step - accuracy: 0.6642 - loss: 0.6997
Epoch 8: val_accuracy improved from 0.63240 to 0.63863, saving model to best_model.keras
91/91 ━━━━━━━━━━━━━━━━━━━━ 3s 27ms/step - accuracy: 0.6647 - loss: 0.6988 - val_accuracy: 0.6386 - val_loss: 0.7280
Epoch 9/60
89/91 ━━━━━━━━━━━━━━━━━━━━ 0s 24ms/step - accuracy: 0.6887 - loss: 0.6560
Epoch 9: val_accuracy improved from 0.63863 to 0.65732, saving model to best_model.keras
91/91 ━━━━━━━━━━━━━━━━━━━━ 2s 26ms/step - accuracy: 0.6894 - loss: 0.6550 - val_accuracy: 0.6573 - val_loss: 0.7174
Epoch 10/60
89/91 ━━━━━━━━━━━━━━━━━━━━ 0s 26ms/step - accuracy: 0.7225 - loss: 0.5960
Epoch 10: val_accuracy did not improve from 0.65732
91/91 ━━━━━━━━━━━━━━━━━━━━ 3s 28ms/step - accuracy: 0.7227 - loss: 0.5956 - val_accuracy: 0.6542 - val_loss: 0.7569
Epoch 11/60
88/91 ━━━━━━━━━━━━━━━━━━━━ 0s 25ms/step - accuracy: 0.7229 - loss: 0.5710
Epoch 11: val_accuracy improved from 0.65732 to 0.68536, saving model to best_model.keras
91/91 ━━━━━━━━━━━━━━━━━━━━ 5s 26ms/step - accuracy: 0.7239 - loss: 0.5698 - val_accuracy: 0.6854 - val_loss: 0.9102
Epoch 12/60
90/91 ━━━━━━━━━━━━━━━━━━━━ 0s 24ms/step - accuracy: 0.7845 - loss: 0.4861
Epoch 12: val_accuracy did not improve from 0.68536
91/91 ━━━━━━━━━━━━━━━━━━━━ 2s 25ms/step - accuracy: 0.7848 - loss: 0.4859 - val_accuracy: 0.6449 - val_loss: 1.0035
Epoch 13/60
89/91 ━━━━━━━━━━━━━━━━━━━━ 0s 24ms/step - accuracy: 0.8136 - loss: 0.4587
Epoch 13: val_accuracy did not improve from 0.68536
91/91 ━━━━━━━━━━━━━━━━━━━━ 2s 25ms/step - accuracy: 0.8143 - loss: 0.4573 - val_accuracy: 0.6698 - val_loss: 0.8364
Epoch 14/60
89/91 ━━━━━━━━━━━━━━━━━━━━ 0s 26ms/step - accuracy: 0.8688 - loss: 0.3310
Epoch 14: val_accuracy did not improve from 0.68536
91/91 ━━━━━━━━━━━━━━━━━━━━ 3s 28ms/step - accuracy: 0.8697 - loss: 0.3291 - val_accuracy: 0.6604 - val_loss: 0.9961
Epoch 14: early stopping
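
The stopping point in this log can be traced with a simplified model of the patience rule: keep the best validation loss seen so far, count epochs without improvement, and stop once that count reaches the patience. The sketch below is a plain-Python approximation of that rule, not the Keras implementation; the loss values are copied from the log above and the function name is hypothetical.

```python
# val_loss per epoch, copied from the training log above
val_losses = [1.0930, 0.8103, 1.0298, 0.8939, 0.8565, 0.7492, 0.7717,
              0.7280, 0.7174, 0.7569, 0.9102, 1.0035, 0.8364, 0.9961]

def early_stopping_epoch(losses, patience):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    stop_epoch is None if the patience is never exhausted.
    """
    best_loss = float("inf")
    best_epoch = None
    wait = 0                               # Epochs since the last improvement
    for epoch, loss in enumerate(losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch

stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=5)
print(stop_epoch, best_epoch)              # 14 9
```

The lowest validation loss occurs at epoch 9, and five epochs without improvement lead to the stop at epoch 14. Note that the EarlyStopping callback here is created without `restore_best_weights = True`, so the in-memory model keeps the weights of the final epoch; the weights with the best val_accuracy are the ones saved by ModelCheckpoint.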

Plotting the Training and Validation Accuracies¶

In [ ]:
# Plotting the training and validation accuracies for each epoch

plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()
[Figure: line plot of training and validation accuracy per epoch]

Checking Test Accuracy¶

In [ ]:
model.evaluate(X_test, (y_test_encoded))
35/35 ━━━━━━━━━━━━━━━━━━━━ 1s 27ms/step - accuracy: 0.6448 - loss: 1.1261
Out[ ]:
[1.085956335067749, 0.660877525806427]

Observations:

  • The training did not run for all 60 epochs. Early stopping ended it at epoch 14 because the validation loss had not improved for 5 consecutive epochs.
  • From the above plot, we observe that the training accuracy improves continuously, whereas the validation accuracy fluctuates between roughly 0.62 and 0.69 from epoch 6 onward.
  • Together, these observations suggest that the model is overfitting the training data.
  • However, the model performs consistently on the validation and test data (about 0.66 accuracy on both).
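
The overfitting pattern can also be read off numerically from the training log. The sketch below copies the end-of-epoch accuracies from the log (in a live session they would come from `history.history`) and computes the gap between training and validation accuracy per epoch.

```python
# End-of-epoch values copied from the training log above
train_acc = [0.4706, 0.4857, 0.5459, 0.5582, 0.5939, 0.6087, 0.6368,
             0.6647, 0.6894, 0.7227, 0.7239, 0.7848, 0.8143, 0.8697]
val_acc = [0.5109, 0.6262, 0.5234, 0.5826, 0.5888, 0.6324, 0.6168,
           0.6386, 0.6573, 0.6542, 0.6854, 0.6449, 0.6698, 0.6604]

# Positive gap: the model does better on data it was trained on
gaps = [train - val for train, val in zip(train_acc, val_acc)]

# Epoch (1-indexed) with the highest validation accuracy
best_val_epoch = max(range(len(val_acc)), key=val_acc.__getitem__) + 1

for epoch, gap in enumerate(gaps, start=1):
    print(f"epoch {epoch:2d}: train - val = {gap:+.4f}")
print("best validation accuracy at epoch", best_val_epoch)
```

The gap starts slightly negative and grows to about 0.21 by epoch 14, while the best validation accuracy is reached at epoch 11. A widening gap of this kind is the usual numerical signature of overfitting.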

Plotting Confusion Matrix¶

In [ ]:
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix

pred = model.predict(X_test)
pred = np.argmax(pred, axis = 1)
y_true = np.argmax(y_test_encoded, axis = 1)

# Printing the classification report
print(classification_report(y_true, pred))

# Plotting the heatmap using confusion matrix
cm = confusion_matrix(y_true, pred)
plt.figure(figsize = (8, 5))
sns.heatmap(cm, annot = True,  fmt = '.0f', xticklabels = ['Bread', 'Soup', 'Vegetable-Fruit'], yticklabels=['Bread', 'Soup', 'Vegetable-Fruit'])
plt.ylabel('Actual')
plt.xlabel('Predicted')
plt.show()
35/35 ━━━━━━━━━━━━━━━━━━━━ 1s 15ms/step
              precision    recall  f1-score   support

           0       0.64      0.43      0.51       362
           1       0.73      0.75      0.74       500
           2       0.57      0.83      0.67       232

    accuracy                           0.66      1094
   macro avg       0.65      0.67      0.64      1094
weighted avg       0.67      0.66      0.65      1094

[Figure: confusion-matrix heatmap of actual vs. predicted classes (Bread, Soup, Vegetable-Fruit)]

Observations:

  • The model is giving about 66% accuracy on the test data.
  • There have been many misclassifications between all classes.
  • A large number of images of 'Bread' were predicted to be 'Soup'. We had earlier predicted this because both these classes show the presence of a dish or a utensil in the images.
  • There have been misclassifications between 'Bread' and 'Vegetable-Fruit' as well. We can attribute this to the presence of yellowish pixels in both. Hence, the model might have taken one for the other.
  • The misclassifications between 'Vegetable-Fruit' and 'Soup' have been the least, as we can see that there is minimal visual overlap among these classes.
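
The averaged rows of the classification report can be recomputed from the per-class rows, which clarifies what "macro avg" and "weighted avg" mean. The sketch below uses the rounded values printed in the report, so results agree with the report only to two decimals; the mapping of indices 0, 1, 2 to class names follows the encoding used earlier.

```python
# Per-class rows copied from the classification report above
report = {
    "Bread (0)":           {"precision": 0.64, "recall": 0.43, "support": 362},
    "Soup (1)":            {"precision": 0.73, "recall": 0.75, "support": 500},
    "Vegetable-Fruit (2)": {"precision": 0.57, "recall": 0.83, "support": 232},
}

total_support = sum(row["support"] for row in report.values())

def macro_average(metric):
    """Unweighted mean over classes: every class counts equally."""
    return sum(row[metric] for row in report.values()) / len(report)

def weighted_average(metric):
    """Mean weighted by the number of true samples in each class."""
    return sum(row[metric] * row["support"] for row in report.values()) / total_support

macro_recall = macro_average("recall")
weighted_recall = weighted_average("recall")
macro_precision = macro_average("precision")
weighted_precision = weighted_average("precision")

print(total_support)
print(round(macro_precision, 2), round(macro_recall, 2))
print(round(weighted_precision, 2), round(weighted_recall, 2))
```

The support-weighted recall equals the overall accuracy (0.66), because both count the fraction of all test images that were classified correctly. The low recall for Bread (0.43) is what pulls the accuracy down despite the higher recall on the other two classes.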

Let's try to build another model with a different architecture and see if we can improve the model performance. Since the first model was overfitting, we will add Dropout layers at the end of each convolutional block.
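
To make the role of Dropout concrete, here is a minimal NumPy sketch of "inverted" dropout, the variant in which the surviving activations are scaled up by 1 / (1 - rate) during training so that their expected value is unchanged. It is an illustration of the idea on hypothetical data, not the Keras layer itself, and the function name is made up for this example.

```python
import numpy as np

def inverted_dropout(activations, rate, rng):
    """Zero each value with probability `rate` and rescale the survivors."""
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob   # True where the value is kept
    return activations * mask / keep_prob

rng = np.random.default_rng(42)

# Hypothetical activations: all ones, so the effect of the mask is easy to see
activations = np.ones((1000, 64))
dropped = inverted_dropout(activations, rate=0.25, rng=rng)

zero_fraction = float(np.mean(dropped == 0))
print("fraction zeroed:", round(zero_fraction, 3))
print("distinct values:", np.unique(dropped))
print("mean after dropout:", round(float(dropped.mean()), 3))
```

About a quarter of the values are set to zero and the rest are scaled to 1 / 0.75, so the mean stays close to 1. Because a different random subset is dropped at every training step, the network cannot rely on any single activation, which is why dropout tends to reduce overfitting. At inference time, dropout is switched off and the activations pass through unchanged.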

Model 2 Architecture:¶

  • We plan on having 4 convolutional blocks in this Architecture, each having a Conv2D, MaxPooling2D, and a Dropout layer.
  • Add first Conv2D layer with 256 filters and a kernel size of 5x5. Use the 'same' padding and provide the input shape = (150, 150, 3). Use 'relu' activation.
  • Add a MaxPooling2D layer with a pool size of 2x2 and a stride of 2x2.
  • Add a Dropout layer with a dropout ratio of 0.25.
  • Add a second Conv2D layer with 128 filters and a kernel size of 5x5. Use the 'same' padding and 'relu' activation.
  • Follow this up with a MaxPooling2D layer like the one above and a Dropout layer with a dropout ratio of 0.25.
  • Add a third Conv2D layer with 64 filters and a kernel size of 3x3. Use the 'same' padding and 'relu' activation.
  • Follow this up with a similar MaxPooling2D layer and a Dropout layer with a dropout ratio of 0.25.
  • Add a fourth Conv2D layer with 32 filters and a kernel size of 3x3. Use the 'same' padding and 'relu' activation.
  • Follow this up with a similar MaxPooling2D layer and a Dropout layer with a dropout ratio of 0.25.
  • Once the convolutional blocks are added, add the Flatten layer.
  • Add first fully connected dense layer with 64 neurons and use 'relu' activation.
  • Add a second fully connected dense layer with 32 neurons and use 'relu' activation.
  • Add your final dense layer with 3 neurons and use 'softmax' activation function.
  • Initialize an Adam optimizer with a learning rate of 0.001.
  • Compile your model with the optimizer you initialized and use categorical_crossentropy as the loss function and the 'accuracy' as the metric.
  • Print your model summary and write down your observations.
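Before building the model, it can help to trace how the spatial size of the feature maps changes through the four blocks described above. The sketch below is plain Python arithmetic, not Keras code; the helper names `conv_same` and `max_pool` are illustrative, and it assumes the pooling layers use the default 'valid' padding.

```python
def conv_same(size):
    # With 'same' padding and stride 1, a convolution keeps the spatial size unchanged
    return size

def max_pool(size, pool=2, stride=2):
    # 'valid' pooling: floor((size - pool) / stride) + 1
    return (size - pool) // stride + 1

size = 150                              # input images are 150 x 150 x 3
filters_per_block = [256, 128, 64, 32]  # filters in the four Conv2D layers

shapes = []
for filters in filters_per_block:
    size = max_pool(conv_same(size))    # Conv2D, then MaxPooling2D (Dropout keeps the shape)
    shapes.append((size, size, filters))

# The Flatten layer turns the last feature map into a single vector
height, width, channels = shapes[-1]
flattened = height * width * channels

for block, shape in enumerate(shapes, start=1):
    print(f"Block {block} output: {shape}")
print(f"Flattened length: {flattened}")
```

The odd sizes (75 and 37) are rounded down by the pooling step, which is why the sequence is 150, 75, 37, 18, 9 rather than exact halves.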
In [ ]:
from tensorflow.keras import backend
backend.clear_session()

# Fixing the seeds of the random number generators so that we get the same output every time
np.random.seed(42)
import random
random.seed(42)
tf.random.set_seed(42)
In [ ]:
# Initializing a sequential model
model_2 = Sequential()

# Adding first conv layer with 256 filters, kernel size 5x5, and ReLU activation
# Padding 'same' keeps the output height and width equal to the input; input_shape is the input image dimension
model_2.add(Conv2D(filters = 256, kernel_size = (5, 5), padding = 'same', activation = 'relu', input_shape = (150, 150, 3)))

# Adding max pooling to downsample the output of the first conv layer
model_2.add(MaxPooling2D(pool_size = (2, 2), strides = (2, 2)))

# Adding dropout to randomly zero 25% of the activations during training to reduce overfitting
model_2.add(Dropout(0.25))

# Adding second conv layer with 128 filters, kernel size 5x5, and ReLU activation
model_2.add(Conv2D(filters = 128, kernel_size = (5, 5), padding = 'same', activation = 'relu'))

# Adding max pooling to downsample the output of the second conv layer
model_2.add(MaxPooling2D(pool_size = (2, 2), strides = (2, 2)))

# Adding dropout to randomly zero 25% of the activations during training to reduce overfitting
model_2.add(Dropout(0.25))

# Adding third conv layer with 64 filters, kernel size 3x3, and ReLU activation
model_2.add(Conv2D(filters = 64, kernel_size = (3, 3), padding = 'same', activation = 'relu'))

# Adding max pooling to downsample the output of the third conv layer
model_2.add(MaxPooling2D(pool_size = (2, 2), strides = (2, 2)))

# Adding dropout to randomly zero 25% of the activations during training to reduce overfitting
model_2.add(Dropout(0.25))

# Adding fourth conv layer with 32 filters, kernel size 3x3, and ReLU activation
model_2.add(Conv2D(filters = 32, kernel_size = (3, 3), padding = 'same', activation = 'relu'))

# Adding max pooling to downsample the output of the fourth conv layer
model_2.add(MaxPooling2D(pool_size = (2, 2), strides = (2, 2)))

# Adding dropout to randomly zero 25% of the activations during training to reduce overfitting
model_2.add(Dropout(0.25))

# Flattening the 3-d output of the conv layer after max pooling to make it ready for creating dense connections
model_2.add(Flatten())

# Adding first fully connected dense layer with 64 neurons
model_2.add(Dense(64, activation = "relu"))

# Adding second fully connected dense layer with 32 neurons
model_2.add(Dense(32, activation = "relu"))

# Adding the output layer with 3 neurons and activation functions as softmax since this is a multi-class classification problem
model_2.add(Dense(3, activation = "softmax"))

# Using Adam Optimizer
optimizer = Adam(learning_rate = 0.001)

# Compile the model
model_2.compile(optimizer = optimizer , loss = "categorical_crossentropy", metrics = ["accuracy"])
/usr/local/lib/python3.10/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
In [ ]:
model_2.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ conv2d (Conv2D)                      │ (None, 150, 150, 256)       │          19,456 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d (MaxPooling2D)         │ (None, 75, 75, 256)         │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout (Dropout)                    │ (None, 75, 75, 256)         │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_1 (Conv2D)                    │ (None, 75, 75, 128)         │         819,328 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_1 (MaxPooling2D)       │ (None, 37, 37, 128)         │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout_1 (Dropout)                  │ (None, 37, 37, 128)         │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_2 (Conv2D)                    │ (None, 37, 37, 64)          │          73,792 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_2 (MaxPooling2D)       │ (None, 18, 18, 64)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout_2 (Dropout)                  │ (None, 18, 18, 64)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_3 (Conv2D)                    │ (None, 18, 18, 32)          │          18,464 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_3 (MaxPooling2D)       │ (None, 9, 9, 32)            │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout_3 (Dropout)                  │ (None, 9, 9, 32)            │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ flatten (Flatten)                    │ (None, 2592)                │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense (Dense)                        │ (None, 64)                  │         165,952 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 32)                  │           2,080 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 3)                   │              99 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 1,099,171 (4.19 MB)
 Trainable params: 1,099,171 (4.19 MB)
 Non-trainable params: 0 (0.00 B)

Observations:

  • We can observe from the above summary that this CNN model will train and learn 1,099,171 parameters (weights and biases). The Dropout layers add no parameters and do not reduce this count; instead, they randomly zero 25% of the activations at each training step, which acts as regularization.
  • This model has more convolutional blocks, so we can expect it to extract features from the images more effectively.
  • We are using a different optimizer, i.e., Adam. Let's see if we receive any improvement in performance.
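The total of 1,099,171 parameters in the summary can be reproduced by hand. The sketch below applies the usual counting rules (one bias per filter or unit) to the layer sizes listed in the summary; the helper names `conv2d_params` and `dense_params` are illustrative, not Keras functions.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # Each filter has kernel_h * kernel_w * in_channels weights plus one bias
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(inputs, units):
    # Each unit has one weight per input plus one bias
    return (inputs + 1) * units

layer_params = {
    "conv2d":   conv2d_params(5, 5, 3, 256),
    "conv2d_1": conv2d_params(5, 5, 256, 128),
    "conv2d_2": conv2d_params(3, 3, 128, 64),
    "conv2d_3": conv2d_params(3, 3, 64, 32),
    "dense":    dense_params(9 * 9 * 32, 64),   # Flatten output has 2592 values
    "dense_1":  dense_params(64, 32),
    "dense_2":  dense_params(32, 3),
}

# MaxPooling2D, Dropout, and Flatten layers have no trainable parameters
total_params = sum(layer_params.values())

for name, count in layer_params.items():
    print(f"{name:<9} {count:>9,}")
print(f"{'total':<9} {total_params:>9,}")
```

Most of the parameters sit in the second convolutional layer, because it combines a 5x5 kernel with 256 input channels.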

Training the Model¶

Let's now train the model using the training data.

In [ ]:
# Stop training once validation loss has not improved for 5 consecutive epochs
es = EarlyStopping(monitor = 'val_loss', mode = 'min', verbose = 1, patience = 5)

# Save the model to disk each time validation accuracy reaches a new best
mc = ModelCheckpoint('best_model.keras', monitor = 'val_accuracy', mode = 'max', verbose = 1, save_best_only = True)

# Note: an earlier version of this call also passed use_multiprocessing = True, which raised an error.
# fit() in Keras 3 no longer accepts that argument, so it is omitted below.

history=model_2.fit(
                    X_train,
                    y_train_encoded,
                    epochs = 60,
                    batch_size = 32,
                    validation_split = 0.10,
                    callbacks = [es, mc]
                    )
Epoch 1/60
91/91 ━━━━━━━━━━━━━━━━━━━━ 0s 245ms/step - accuracy: 0.4347 - loss: 1.0847
Epoch 1: val_accuracy improved from -inf to 0.51090, saving model to best_model.keras
91/91 ━━━━━━━━━━━━━━━━━━━━ 49s 323ms/step - accuracy: 0.4349 - loss: 1.0845 - val_accuracy: 0.5109 - val_loss: 1.0238
Epoch 2/60
90/91 ━━━━━━━━━━━━━━━━━━━━ 0s 156ms/step - accuracy: 0.4736 - loss: 1.0476
Epoch 2: val_accuracy improved from 0.51090 to 0.60436, saving model to best_model.keras
91/91 ━━━━━━━━━━━━━━━━━━━━ 49s 160ms/step - accuracy: 0.4735 - loss: 1.0474 - val_accuracy: 0.6044 - val_loss: 0.9109
Epoch 3/60
90/91 ━━━━━━━━━━━━━━━━━━━━ 0s 158ms/step - accuracy: 0.5630 - loss: 0.9060
Epoch 3: val_accuracy did not improve from 0.60436
91/91 ━━━━━━━━━━━━━━━━━━━━ 21s 160ms/step - accuracy: 0.5630 - loss: 0.9056 - val_accuracy: 0.6012 - val_loss: 0.8297
Epoch 4/60
90/91 ━━━━━━━━━━━━━━━━━━━━ 0s 156ms/step - accuracy: 0.5976 - loss: 0.8418
Epoch 4: val_accuracy improved from 0.60436 to 0.62305, saving model to best_model.keras
91/91 ━━━━━━━━━━━━━━━━━━━━ 20s 160ms/step - accuracy: 0.5974 - loss: 0.8417 - val_accuracy: 0.6231 - val_loss: 0.7941
Epoch 5/60
90/91 ━━━━━━━━━━━━━━━━━━━━ 0s 155ms/step - accuracy: 0.6128 - loss: 0.7874
Epoch 5: val_accuracy improved from 0.62305 to 0.64798, saving model to best_model.keras
91/91 ━━━━━━━━━━━━━━━━━━━━ 20s 158ms/step - accuracy: 0.6128 - loss: 0.7870 - val_accuracy: 0.6480 - val_loss: 0.7319
Epoch 6/60
90/91 ━━━━━━━━━━━━━━━━━━━━ 0s 156ms/step - accuracy: 0.6155 - loss: 0.7641
Epoch 6: val_accuracy improved from 0.64798 to 0.65421, saving model to best_model.keras
91/91 ━━━━━━━━━━━━━━━━━━━━ 21s 162ms/step - accuracy: 0.6155 - loss: 0.7639 - val_accuracy: 0.6542 - val_loss: 0.7065
Epoch 7/60
90/91 ━━━━━━━━━━━━━━━━━━━━ 0s 156ms/step - accuracy: 0.6332 - loss: 0.7349
Epoch 7: val_accuracy improved from 0.65421 to 0.67290, saving model to best_model.keras
91/91 ━━━━━━━━━━━━━━━━━━━━ 15s 164ms/step - accuracy: 0.6330 - loss: 0.7349 - val_accuracy: 0.6729 - val_loss: 0.6936
Epoch 8/60
90/91 ━━━━━━━━━━━━━━━━━━━━ 0s 154ms/step - accuracy: 0.6401 - loss: 0.7189
Epoch 8: val_accuracy did not improve from 0.67290
91/91 ━━━━━━━━━━━━━━━━━━━━ 20s 157ms/step - accuracy: 0.6403 - loss: 0.7185 - val_accuracy: 0.6604 - val_loss: 0.6934
Epoch 9/60
90/91 ━━━━━━━━━━━━━━━━━━━━ 0s 158ms/step - accuracy: 0.6984 - loss: 0.6560
Epoch 9: val_accuracy improved from 0.67290 to 0.68224, saving model to best_model.keras
91/91 ━━━━━━━━━━━━━━━━━━━━ 15s 161ms/step - accuracy: 0.6983 - loss: 0.6558 - val_accuracy: 0.6822 - val_loss: 0.6562
Epoch 10/60
90/91 ━━━━━━━━━━━━━━━━━━━━ 0s 155ms/step - accuracy: 0.7311 - loss: 0.5999
Epoch 10: val_accuracy improved from 0.68224 to 0.68536, saving model to best_model.keras
91/91 ━━━━━━━━━━━━━━━━━━━━ 14s 158ms/step - accuracy: 0.7312 - loss: 0.5997 - val_accuracy: 0.6854 - val_loss: 0.6506
Epoch 11/60
90/91 ━━━━━━━━━━━━━━━━━━━━ 0s 154ms/step - accuracy: 0.7646 - loss: 0.5418
Epoch 11: val_accuracy improved from 0.68536 to 0.71340, saving model to best_model.keras
91/91 ━━━━━━━━━━━━━━━━━━━━ 21s 158ms/step - accuracy: 0.7647 - loss: 0.5414 - val_accuracy: 0.7134 - val_loss: 0.6155
Epoch 12/60
90/91 ━━━━━━━━━━━━━━━━━━━━ 0s 156ms/step - accuracy: 0.7906 - loss: 0.4950
Epoch 12: val_accuracy did not improve from 0.71340
91/91 ━━━━━━━━━━━━━━━━━━━━ 20s 158ms/step - accuracy: 0.7907 - loss: 0.4949 - val_accuracy: 0.7040 - val_loss: 0.6468
Epoch 13/60
90/91 ━━━━━━━━━━━━━━━━━━━━ 0s 156ms/step - accuracy: 0.8059 - loss: 0.4513
Epoch 13: val_accuracy improved from 0.71340 to 0.72897, saving model to best_model.keras
91/91 ━━━━━━━━━━━━━━━━━━━━ 21s 164ms/step - accuracy: 0.8061 - loss: 0.4510 - val_accuracy: 0.7290 - val_loss: 0.6031
Epoch 14/60
90/91 ━━━━━━━━━━━━━━━━━━━━ 0s 156ms/step - accuracy: 0.8268 - loss: 0.4075
Epoch 14: val_accuracy did not improve from 0.72897
91/91 ━━━━━━━━━━━━━━━━━━━━ 20s 161ms/step - accuracy: 0.8264 - loss: 0.4082 - val_accuracy: 0.7072 - val_loss: 0.6395
Epoch 15/60
90/91 ━━━━━━━━━━━━━━━━━━━━ 0s 155ms/step - accuracy: 0.8339 - loss: 0.3977
Epoch 15: val_accuracy did not improve from 0.72897
91/91 ━━━━━━━━━━━━━━━━━━━━ 20s 158ms/step - accuracy: 0.8339 - loss: 0.3978 - val_accuracy: 0.6573 - val_loss: 0.8049
Epoch 16/60
90/91 ━━━━━━━━━━━━━━━━━━━━ 0s 155ms/step - accuracy: 0.8262 - loss: 0.4119
Epoch 16: val_accuracy did not improve from 0.72897
91/91 ━━━━━━━━━━━━━━━━━━━━ 20s 158ms/step - accuracy: 0.8263 - loss: 0.4116 - val_accuracy: 0.7165 - val_loss: 0.7693
Epoch 17/60
90/91 ━━━━━━━━━━━━━━━━━━━━ 0s 155ms/step - accuracy: 0.8396 - loss: 0.3647
Epoch 17: val_accuracy did not improve from 0.72897
91/91 ━━━━━━━━━━━━━━━━━━━━ 21s 161ms/step - accuracy: 0.8398 - loss: 0.3644 - val_accuracy: 0.6916 - val_loss: 0.8216
Epoch 18/60
90/91 ━━━━━━━━━━━━━━━━━━━━ 0s 155ms/step - accuracy: 0.8436 - loss: 0.3614
Epoch 18: val_accuracy improved from 0.72897 to 0.75078, saving model to best_model.keras
91/91 ━━━━━━━━━━━━━━━━━━━━ 20s 159ms/step - accuracy: 0.8440 - loss: 0.3609 - val_accuracy: 0.7508 - val_loss: 0.7519
Epoch 18: early stopping
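
The log above stops at epoch 18 because of the `patience = 5` setting on `val_loss`. The sketch below replays the logged validation losses through a simplified version of the patience rule (an illustrative re-implementation, not the Keras source; `early_stopping_epoch` is a hypothetical helper) to show where the counter starts and where it runs out.

```python
# Validation loss per epoch, copied from the training log above
val_loss = [1.0238, 0.9109, 0.8297, 0.7941, 0.7319, 0.7065, 0.6936, 0.6934, 0.6562,
            0.6506, 0.6155, 0.6468, 0.6031, 0.6395, 0.8049, 0.7693, 0.8216, 0.7519]

def early_stopping_epoch(losses, patience):
    """Return (stopped_epoch, best_epoch) under a simple patience rule."""
    best = float("inf")
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            # New best value: remember it and reset the counter
            best, best_epoch, wait = loss, epoch, 0
        else:
            # No improvement: count the epoch and stop once patience is used up
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch

stopped_epoch, best_epoch = early_stopping_epoch(val_loss, patience=5)
print(f"Lowest val_loss at epoch {best_epoch}, training stopped at epoch {stopped_epoch}")
```

Note that the two callbacks monitor different quantities: validation loss was lowest at epoch 13, while the checkpoint, which tracks validation accuracy, was last saved at epoch 18. Since `EarlyStopping` is created without `restore_best_weights = True`, the in-memory `model_2` keeps the weights from the final epoch.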

Plotting the Training and Validation Accuracies¶

In [ ]:
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc = 'upper left')
plt.show()
[Figure: training and validation accuracy per epoch for the second model]

Checking Test Accuracy¶

In [ ]:
model_2.evaluate(X_test, y_test_encoded)
35/35 ━━━━━━━━━━━━━━━━━━━━ 3s 98ms/step - accuracy: 0.7329 - loss: 0.7416
Out[ ]:
[0.6977789402008057, 0.7376599907875061]

Observations:

  • Comparing the training and validation accuracy, the model does not seem to overfit as much, so adding Dropout layers appears to have helped. Some overfitting remains: after epoch 13 the validation loss rises while the training loss keeps falling.
  • Training also ran for more epochs, and the training accuracy kept improving rather than plateauing.
  • The validation loss reached its minimum at epoch 13 and the validation accuracy fluctuated afterwards. Even so, the test accuracy improved from about 66% to about 74%.

Plotting Confusion Matrix¶

In [ ]:
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix

pred = model_2.predict(X_test)
pred = np.argmax(pred, axis = 1)
y_true = np.argmax(y_test_encoded, axis = 1)

#Printing the classification report
print(classification_report(y_true, pred))

#Plotting the heatmap using confusion matrix
cm = confusion_matrix(y_true, pred)
plt.figure(figsize = (8, 5))
sns.heatmap(cm, annot = True,  fmt = '.0f', xticklabels = ['Bread', 'Soup', 'Vegetable-Fruit'], yticklabels = ['Bread', 'Soup', 'Vegetable-Fruit'])
plt.ylabel('Actual')
plt.xlabel('Predicted')
plt.show()
35/35 ━━━━━━━━━━━━━━━━━━━━ 2s 40ms/step
              precision    recall  f1-score   support

           0       0.69      0.55      0.61       362
           1       0.71      0.89      0.79       500
           2       0.93      0.70      0.80       232

    accuracy                           0.74      1094
   macro avg       0.78      0.71      0.73      1094
weighted avg       0.75      0.74      0.73      1094

[Figure: confusion matrix heatmap for the second model, Actual (rows) vs. Predicted (columns)]

Observations:

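  • There are fewer misclassifications than with the previous model.
  • Bread remains the hardest class, with a recall of 0.55, and Soup's precision of 0.71 shows that images from other classes are still being predicted as Soup. However, the confusion is not as severe as in the previous model.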
  • We can still try to add more layers to see if we can bring down the misclassification further.
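
The f1-score column of the report above is the harmonic mean of precision and recall, and the macro averages are plain means over the three classes. The sketch below recomputes them from the rounded values in the report; `report` and `f1_from` are illustrative names, and small differences in the last digit are possible because the inputs are already rounded.

```python
# Per-class (precision, recall) copied from the classification report above
report = {
    "Bread": (0.69, 0.55),
    "Soup": (0.71, 0.89),
    "Vegetable-Fruit": (0.93, 0.70),
}

def f1_from(precision, recall):
    # Harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)

f1 = {name: f1_from(p, r) for name, (p, r) in report.items()}

# Macro averages weight every class equally, regardless of support
macro_precision = sum(p for p, _ in report.values()) / len(report)
macro_recall = sum(r for _, r in report.values()) / len(report)

for name, score in f1.items():
    print(f"{name:<16} f1 = {score:.2f}")
print(f"macro precision = {macro_precision:.2f}, macro recall = {macro_recall:.2f}")
```

The harmonic mean is pulled toward the lower of the two values, which is why Bread, with a recall of only 0.55, ends up with the lowest f1-score despite a precision close to Soup's.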

Prediction¶

Let us predict using the best model, i.e., model 2, by displaying one image from the X_test data and checking whether the model classifies it correctly.

In [ ]:
# Plotting the test image
cv2_imshow(X_test[1] * 255)  # Multiplying by 255 because X_test was previously normalized

# y_test_encoded is one-hot encoded, so argmax recovers the true class index
# (calling argmax on a single scalar label would always return 0)
class_names = ['Bread', 'Soup', 'Vegetable-Fruit']
i = np.argmax(y_test_encoded[1])
print("Actual label:", class_names[i])
[Image: the selected test image, X_test[1]]
In [ ]:
# Predicting the test image with the best model and storing the prediction value in res variable
res = model_2.predict(X_test[1].reshape(1, 150, 150, 3))
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 420ms/step
In [ ]:
# Applying argmax on the prediction to get the index of the highest probability
class_names = ['Bread', 'Soup', 'Vegetable-Fruit']
i = np.argmax(res)
print(class_names[i])
Bread

Observation:

  • The model correctly classifies the image we selected from the test data.

Conclusion and Recommendations¶

  1. As we have seen, the second CNN model predicted the sample test image correctly and reached a test accuracy of about 74%.

  2. There is still scope for improvement in the test accuracy of the CNN model chosen here. Different architectures and optimizers can be used to build a better food classifier.

  3. Transfer learning can be applied to the dataset to improve accuracy. You can choose among multiple pre-trained models available in the Keras framework.

  4. Once the desired performance is achieved from the model, the company can use it to classify different images being uploaded to the website.

  5. We can further try to improve the performance of the CNN model by using some of the techniques below:

    • We can try hyperparameter tuning for some of the hyperparameters like the number of convolutional blocks, the number of filters in each Conv2D layer, filter size, activation function, adding/removing dropout layers, changing the dropout ratio, etc.
    • Data Augmentation might help to make the model more robust and invariant toward different orientations.
In [ ]:
# Convert notebook to html
!jupyter nbconvert --to html "/content/drive/MyDrive/MIT - Data Sciences/Colab Notebooks/Week_Six_-_Deep_Learning/Guided_Project_Food_Image_Classification/Food_Image_Classification_Mine.ipynb"