Aarzoo Goel's other Models Reports


Dandelions Image Detection

Model Overview

Dandelions (Binary Image Classification)


Problem Statement
These are images of dandelions and not-dandelions (grass or other). The goal of this project is a very simple binary image classification model for me to do some "real-world learning".
Initial thoughts and findings: many more images are needed. While training reaches decent accuracy, the validation loss is substantial. My initial 1,200+ images (50% dandelion / 50% not) seem woefully few.


Dataset: https://www.kaggle.com/coloradokb/dandelionimages
Dandelions: 635
Other: 627


Model:
 MobileNetV2


 Accuracy
 Validation Accuracy: 84%. We take this as the approximate accuracy; the model may become more accurate in further studies with more data.


Load the data and label: 
import os
import pandas as pd

# Root folder containing one sub-folder per class
sdir=r'/content/gdrive/My Drive/Images/'
slist=os.listdir(sdir)
classes=[]
filepaths=[]
labels=[]
# Every sub-directory name is treated as a class label
for d in slist:
    dpath=os.path.join(sdir, d)
    if os.path.isdir(dpath):
        classes.append(d)
class_count=len(classes)
# Collect each file path together with the class folder it came from
for klass in classes:
    classpath=os.path.join(sdir,klass)
    filelist=os.listdir(classpath)
    for f in filelist:
        fpath=os.path.join(classpath, f)
        filepaths.append(fpath)
        labels.append(klass)
print ('number of files: ', len(filepaths), ' number of labels: ', len(labels))
# Build a two-column DataFrame: file path and label
file_series=pd.Series(filepaths, name='filepaths')
label_series=pd.Series(labels, name='labels')
df=pd.concat([file_series, label_series], axis=1)
print (df.head())
# Check the class balance
balance=df['labels'].value_counts()
print (balance)
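
The generator code further down reads from X_train and y_train, which the listing above does not define. A minimal sketch of one way they could be produced, assuming they are simply a training portion and a held-out portion of df. The 80/20 fraction, the seed, and the toy file names are assumptions for illustration, not values taken from the original project.

```python
import pandas as pd

# Toy stand-in for the df built above (hypothetical file names).
df = pd.DataFrame({
    'filepaths': [f'img_{i}.jpg' for i in range(10)],
    'labels': ['dandelion', 'other'] * 5,
})

# Assumed split: 80% of the rows for training, the rest held out for testing.
X_train = df.sample(frac=0.8, random_state=42)
y_train = df.drop(X_train.index)  # rows not drawn into the training portion

print(len(X_train), len(y_train))
```

The unusual names are kept only so the sketch lines up with the later calls; both objects are DataFrames with the same two columns as df.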

Image Generator:
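ImageDataGenerator lets you augment images in real time while the model is training, applying random transformations to each training image as it is passed to the model. In the code below it applies the MobileNetV2 preprocessing function and reserves 20% of the training data for validation.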


Code:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Training generator: MobileNetV2 preprocessing, with 20% held back for validation
train_generator = ImageDataGenerator(
    preprocessing_function=tf.keras.applications.mobilenet_v2.preprocess_input,
    validation_split=0.2)
# Test generator: same preprocessing, no validation split
test_generator = ImageDataGenerator(
    preprocessing_function=tf.keras.applications.mobilenet_v2.preprocess_input)
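
The generators above set only a preprocessing function and a validation split, so no random transformation is actually requested; that would take extra ImageDataGenerator arguments such as horizontal_flip=True or rotation_range. To make the idea concrete, here is a small NumPy sketch of a single random transformation, a horizontal flip. The function name and the toy image are illustrative assumptions, not part of Keras.

```python
import numpy as np

def random_horizontal_flip(image, rng, p=0.5):
    """Return the (H, W, C) image flipped left-to-right with probability p."""
    if rng.random() < p:
        return image[:, ::-1, :]
    return image

rng = np.random.default_rng(0)
image = np.arange(2 * 3 * 1).reshape(2, 3, 1)   # tiny 2x3 single-channel "image"
flipped = image[:, ::-1, :]                      # what a flip would look like
augmented = random_horizontal_flip(image, rng)   # either the original or the flip
```

Because the choice is random per image, the model sees a slightly different version of the training set in each epoch.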


 


Flow From Dataframe:
Takes the dataframe and, optionally, the path to a directory, and generates batches of preprocessed (and, if configured, augmented) data.



  • dataframe: DataFrame containing the file paths of the images (relative to the directory, if one is given) in a string column. It includes other columns depending on class_mode; if class_mode is "categorical", it must include the y_col column with the class of each image.

  • x_col: Column in the dataframe that contains the filenames (or absolute paths if directory is None).

  • y_col: Column in the dataframe that holds the target data.

  • target_size: The dimensions to which all images will be resized. Set to (224, 224).

  • color_mode: One of "grayscale", "rgb", "rgba". Default is "rgb". Determines whether the images are converted to 1, 3, or 4 color channels.

  • class_mode: Set to "categorical", which returns the labels as a 2D NumPy array of one-hot encoded classes.

  • batch_size: The size of the batches of data (default: 32). Set to 16 here.
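
To illustrate what class_mode "categorical" means for the labels, the sketch below builds the 2D one-hot array for a small batch with NumPy. The class names and the index mapping are assumptions for illustration; the mapping a real iterator uses can be read from its class_indices attribute.

```python
import numpy as np

# Assumed mapping from class name to column index (hypothetical class names).
class_indices = {'dandelion': 0, 'other': 1}

batch_labels = ['other', 'dandelion', 'dandelion']
idx = np.array([class_indices[name] for name in batch_labels])

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(len(class_indices), dtype='float32')[idx]
print(one_hot)
```

Each row has a single 1 in the column of its class, which is the label format a two-unit softmax output is trained against.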




Code: 


# X_train and y_train are not defined in the code shown; they are assumed to be
# the training and held-out test DataFrames obtained by splitting df.
train_images = train_generator.flow_from_dataframe(
    dataframe=X_train,
    x_col='filepaths',
    y_col='labels',
    target_size=(224, 224),
    color_mode='rgb',
    batch_size=16,
    class_mode='categorical',
    shuffle=True,
    subset='training'      # the 80% kept for training by validation_split
)
val_images = train_generator.flow_from_dataframe(
    dataframe=X_train,
    x_col='filepaths',
    y_col='labels',
    target_size=(224, 224),
    color_mode='rgb',
    batch_size=16,
    class_mode='categorical',
    shuffle=True,
    subset='validation'    # the remaining 20%
)
# Test batches keep their order so predictions can be matched to files
test_images = test_generator.flow_from_dataframe(
    dataframe=y_train,
    x_col='filepaths',
    y_col='labels',
    target_size=(224, 224),
    color_mode='rgb',
    batch_size=16,
    class_mode='categorical',
    shuffle=False
)


Modeling: 


MobileNet V2: Use tf.keras.applications.MobileNetV2 as the base model. This model expects pixel values in [-1, 1], but the pixel values in the images are in [0, 255]. To rescale them, use the preprocessing method included with the model (preprocess_input).
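
The rescaling from [0, 255] to [-1, 1] amounts to dividing by 127.5 and subtracting 1. A NumPy sketch of that arithmetic follows; the helper name is hypothetical and is meant only to show the mapping, not to replace preprocess_input.

```python
import numpy as np

def scale_to_unit_range(pixels):
    """Map pixel values from [0, 255] to [-1, 1]."""
    return pixels.astype('float32') / 127.5 - 1.0

pixels = np.array([0, 127.5, 255])
scaled = scale_to_unit_range(pixels)
print(scaled)  # [-1.  0.  1.]
```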
Create the base model from the MobileNet V2 model developed at Google, which is pre-trained on the ImageNet dataset. First, pick the layer of MobileNet V2 to use for feature extraction. The very last classification layer is not very useful. Instead, follow the common practice of depending on the last layer before the flatten operation. This layer is called the 'bottleneck layer'. Bottleneck-layer features retain more generality than those of the final/top layer.
Next, instantiate a MobileNet V2 model pre-loaded with weights trained on ImageNet. By specifying include_top=False, you load a network that doesn't include the classification layers at the top, which is ideal for feature extraction.
Feature Extraction: You do not need to train the entire model again. The base convolutional network already contains features that are generically useful for classifying images. However, the classification part of the pretrained model is specific to the original classification task, and therefore to the set of classes on which the model was trained. For that reason the base is frozen and only a new classifier on top of it is trained.
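
For a 224x224 input, MobileNetV2 without its top produces a 7x7x1280 feature map per image. Before a classifier can use it, that map is usually reduced to a single 1280-value vector by averaging over the spatial positions, which is what a global average pooling layer does. A NumPy sketch with random placeholder features (the values are not real network outputs):

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder batch of 16 feature maps with shape (batch, height, width, channels).
features = rng.random((16, 7, 7, 1280)).astype('float32')

# Average over the two spatial axes, keeping batch and channel axes.
pooled = features.mean(axis=(1, 2))
print(pooled.shape)  # (16, 1280)
```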


Code:


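import tensorflow as tf

# Pre-trained MobileNetV2 without its ImageNet classification head
feature_extractor = tf.keras.applications.MobileNetV2(input_shape=(224, 224, 3), weights='imagenet', include_top=False)
# Freeze the base so only the new classification head is trained
feature_extractor.trainable = False


inputs = feature_extractor.input
x = feature_extractor.output
# Average each feature map over its spatial dimensions
x = tf.keras.layers.GlobalAveragePooling2D()(x)
# Two-unit softmax head: one probability per class
outputs = tf.keras.layers.Dense(2, activation='softmax')(x)
model = tf.keras.Model(inputs=inputs, outputs=outputs)
# One-hot labels with a softmax output pair with categorical cross-entropy
model.compile(optimizer='adam', loss='categorical_crossentropy',
    metrics=['accuracy'])
history = model.fit(train_images, validation_data=val_images, epochs=12)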


Dense: A dense layer is a fully connected neural network layer, and the most commonly used one. It applies the following operation to its input and returns the result: output = activation(dot(input, kernel) + bias).
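
The dense-layer formula can be written out directly in NumPy. This is a sketch with random placeholder weights, assuming a 1280-value input vector and two output units with softmax activation; the function names are illustrative.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def dense(inputs, kernel, bias, activation):
    """output = activation(dot(input, kernel) + bias)"""
    return activation(inputs @ kernel + bias)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 1280))                   # batch of 4 feature vectors
kernel = rng.normal(scale=0.01, size=(1280, 2))  # placeholder weights
bias = np.zeros(2)

probs = dense(x, kernel, bias, softmax)
print(probs.sum(axis=1))  # each row sums to 1
```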
Categorical Crossentropy: Categorical cross-entropy is a loss function used in multi-class classification. It applies when an example can belong to only one of several possible categories and the model must decide which one.
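
For one-hot labels, categorical cross-entropy reduces to the negative log of the probability the model assigns to the true class. A NumPy sketch with made-up predictions; the clipping constant is an assumption used only to avoid log(0).

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-example loss: -sum over classes of y_true * log(y_pred)."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

y_true = np.array([[1.0, 0.0], [0.0, 1.0]])   # one-hot labels
y_pred = np.array([[0.9, 0.1], [0.2, 0.8]])   # made-up softmax outputs
losses = categorical_crossentropy(y_true, y_pred)
print(losses)
```

A confident correct prediction (0.9) gives a smaller loss than a less confident one (0.8).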
Adam Optimizer: Adam (Adaptive Moment Estimation) is an optimization algorithm that can be used instead of the classical stochastic gradient descent procedure to update network weights iteratively based on training data. Adam works with first- and second-order moments of the gradient. The intuition is that we don't want to move so fast that we jump over the minimum, so the velocity is reduced a little for a more careful search.
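
A single Adam update can be sketched in a few lines. The version below follows the standard update rule with commonly used default hyperparameters and minimises the toy function f(w) = w^2; the function name, the starting point, and the learning rate of 0.05 are assumptions chosen for illustration.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """One Adam update; returns the new parameter and moment estimates."""
    m = beta1 * m + (1 - beta1) * grad          # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2     # second moment (mean of squared gradients)
    m_hat = m / (1 - beta1 ** t)                # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

w, m, v = 5.0, 0.0, 0.0
w_after_first = None
for t in range(1, 501):
    grad = 2 * w                                # gradient of f(w) = w^2
    w, m, v = adam_step(w, grad, m, v, t, lr=0.05)
    if t == 1:
        w_after_first = w                       # first step moves by about lr
print(w_after_first, w)
```

The first step has a size close to the learning rate regardless of the gradient's magnitude, because the update divides by the square root of the second moment.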


