Cocoa Health Condition Prediction

Hashwanth Gogineni


Model Overview

Cacao

Cacao (Theobroma cacao), popularly known as cocoa, is a tropical evergreen tree in the Malvaceae family cultivated for its edible seeds. The genus name Theobroma comes from Greek and means "food of the gods".

Cacao is native to the lowland rainforests of the Amazon and Orinoco river basins. It is grown commercially throughout the New World tropics as well as in western Africa and tropical Asia.

Its seeds, known as cocoa beans, are used to make cocoa powder, cocoa butter, and chocolate.

Cacao grows to a height of 6–12 meters (20–40 feet) in the forest understory, generally at the lower end of this range.

Its oblong leathery leaves can grow 30 cm (12 inches) long and are lost and replaced regularly by new leaves that are bright red when young.

Its flowers are either foul-smelling or odourless. They can be found at any time of year but appear in large numbers twice a year.

The flowers are roughly 1 cm (0.4 inches) tall and wide and grow in clusters directly from the trunk and limbs.

Depending on the variety, they may be white, rose, pink, yellow, or bright red. In many regions they are pollinated by tiny flies called midges.

Black Pod rot

Infection first appears as a chocolate-brown patch on the pod's surface that spreads quickly until it covers the whole pod.

As the disease progresses, a white fungal growth bearing sporangia appears on the affected surface.

Infected pods eventually turn brown to black.

The infection also discolours the interior tissues and the beans.

Cocoa pod borer

The cocoa pod borer leaves visible external damage: entry and exit holes in the husk made by tunnelling larvae, and premature or uneven ripening (yellowing) of pods caused by the larvae feeding inside.

When an infested pod is cut open, the beans typically cling together because of the tunnels and scarring left by feeding.

In severe infestations, the harvested beans clump together and may be hard to separate from the damaged pod.

Why Cocoa Health Condition Prediction?

This project can help agricultural organizations and farmers detect infected cocoa and remove it.

Dataset

The dataset consists of 2,092 cocoa images of size 1080x1080. It is split evenly between two classes, 'Healthy' and 'Unhealthy', with 1,046 images each.

Convolutional Neural Network (CNN)

A 'Convolutional Neural Network' is a deep learning model that takes an input image, assigns importance (learnable weights and biases) to various aspects/objects in the image, and learns to distinguish between them.

Compared to other classification methods, a 'ConvNet' requires significantly less pre-processing.

Where more basic approaches rely on hand-engineered filters, ConvNets can learn these filters/characteristics from the data given enough training.

The architecture of a ConvNet is inspired by the organization of the visual cortex and is akin to the connectivity pattern of neurons in the human brain. Individual neurons respond to stimuli only in a small area of the visual field called the receptive field. A collection of such fields overlaps to cover the full visual field.
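To make the idea of a filter and its receptive field concrete, here is a minimal pure-Python sketch of a single convolution pass. The 4x4 "image" and the vertical-edge kernel are made-up illustration values; in a ConvNet the kernel values would be learned during training rather than written by hand.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Each output value depends only on a kh x kw patch: its receptive field.
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Toy image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-written vertical-edge filter (illustrative only).
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, kernel)
for row in feature_map:
    print(row)
```

The feature map is non-zero only in the middle column, where the patch straddles the dark-to-bright edge. That is the sense in which a filter "responds" to a pattern inside its receptive field.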

Understanding Code

First, let us import the required libraries for our project.

import os
from pathlib import Path

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf

from sklearn.metrics import classification_report

# Use the public tensorflow.keras API throughout instead of mixing it with
# standalone keras and the private tensorflow.python namespace
from tensorflow.keras.models import Sequential
from tensorflow.keras.preprocessing.image import ImageDataGenerator

Now, let us load the data into our system and convert the data into a dataframe.

image_dir = Path('/content/sample_data/data')

# Get filepaths and labels (the label is the name of each image's parent folder)
filepaths = list(image_dir.glob(r'**/*.jpg'))
labels = list(map(lambda x: os.path.split(os.path.split(x)[0])[1], filepaths))

filepaths = pd.Series(filepaths, name='Filepaths').astype(str)
labels = pd.Series(labels, name='Labels')

# Concatenate filepaths and labels
image_df = pd.concat([filepaths, labels], axis=1)

# Shuffle the DataFrame and reset index
image_df = image_df.sample(frac=1).reset_index(drop = True)

# Show the result
image_df.head()

As you can see, we collected every '.jpg' path under the data directory, took each image's parent folder name as its label, and concatenated 'filepaths' and 'labels' into a dataframe.
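The nested 'os.path.split' call is a compact way of saying "take the name of the folder that contains the file". A small sketch of the same idea using 'pathlib' is shown here; the two file names are hypothetical and only illustrate the assumed one-folder-per-class layout.

```python
from pathlib import PurePosixPath


def label_from_path(filepath):
    """Return the name of the folder that directly contains the file."""
    return PurePosixPath(filepath).parent.name


# Hypothetical file names, assuming one folder per class.
example_paths = [
    "/content/sample_data/data/Healthy/img_001.jpg",
    "/content/sample_data/data/Unhealthy/img_002.jpg",
]

example_labels = [label_from_path(p) for p in example_paths]
print(example_labels)
```

This only works when the images are stored one level below a folder named after their class, which is what the loading code above assumes.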

Let us also split the dataframe for testing and training purposes.

# Separating train and test data
from sklearn.model_selection import train_test_split

train_df, test_df = train_test_split(image_df, train_size=0.85, shuffle=True, random_state=1)

As you can see, I used the "train_test_split" function to keep 85% of the rows for training and hold out the remaining 15% for testing. 'random_state=1' fixes the random seed of the split.

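# Rescale pixel values to [0, 1]; reserve 15% of the training data for validation
train_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.15)

# The test generator only rescales
test_datagen = ImageDataGenerator(rescale=1./255)

'ImageDataGenerator' can also apply data augmentation, but here I only used it to rescale pixel values from [0, 255] to [0, 1] and, for the training generator, to set aside 15% of the training dataframe for validation.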

train_images = train_datagen.flow_from_dataframe(
    dataframe=train_df,
    x_col='Filepaths',
    y_col='Labels',
    target_size=(250, 250),
    color_mode='rgb',
    class_mode='categorical',
    batch_size=64,
    shuffle=True,
    seed=42,
    subset='training'
)

val_images = train_datagen.flow_from_dataframe(
    dataframe=train_df,
    x_col='Filepaths',
    y_col='Labels',
    target_size=(250, 250),
    color_mode='rgb',
    class_mode='categorical',
    batch_size=64,
    shuffle=True,
    seed=42,
    subset='validation'
)

test_images = test_datagen.flow_from_dataframe(
    dataframe=test_df,
    x_col='Filepaths',
    y_col='Labels',
    target_size=(250, 250),
    color_mode='rgb',
    class_mode='categorical',
    batch_size=32,
    shuffle=False
)

Also, I used the 'flow_from_dataframe' method to build three batch iterators: training and validation (the two subsets of 'train_df') and test. Images are resized to 250x250 as they are loaded, and 'shuffle=False' on the test iterator keeps the order of predictions aligned with 'test_images.classes'.
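As a rough check on how many images end up in each iterator, the arithmetic below works through the two splits. It is a sketch under assumptions: that all 2,092 images from the Dataset section are found by the '*.jpg' glob, that 'train_test_split' rounds the training size down, and that Keras truncates the validation boundary the way 'int()' does. The counts printed by 'flow_from_dataframe' are the authoritative ones.

```python
import math

total_images = 2092       # dataset size stated in the Dataset section
train_size = 0.85         # train_test_split(train_size=0.85)
validation_split = 0.15   # ImageDataGenerator(validation_split=0.15)
batch_size = 64           # batch size of the training iterator

# First split: train dataframe vs. test dataframe (assumed to round down).
n_train_df = math.floor(train_size * total_images)
n_test = total_images - n_train_df

# Second split: validation subset carved out of the train dataframe
# (assumed to truncate like int()).
n_val = int(validation_split * n_train_df)
n_train = n_train_df - n_val

# Number of batches the training iterator yields per epoch.
batches_per_epoch = math.ceil(n_train / batch_size)

print(n_train, n_val, n_test, batches_per_epoch)
```

Under these assumptions roughly 72% of the images are used for training, 13% for validation, and 15% for testing.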

Next, let us get into the modelling part of the project.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, Dropout, Flatten, MaxPool2D

model = Sequential()

# Block 1: two 32-filter convolutions, then 3x3 max pooling
model.add(Conv2D(32, (3, 3), input_shape = (250, 250, 3), padding="same", activation = 'relu', data_format = 'channels_last'))
model.add(Conv2D(32, (3, 3), activation='relu', padding="same"))
model.add(MaxPool2D(pool_size=(3, 3)))

# Block 2: two 64-filter convolutions, then 3x3 max pooling
model.add(Conv2D(64, (3, 3), activation='relu', padding="same"))
model.add(Conv2D(64, (3, 3), activation='relu', padding="same"))
model.add(MaxPool2D(pool_size=(3, 3)))

# Block 3: two 128-filter convolutions, then 3x3 max pooling
model.add(Conv2D(128, (3, 3), activation='relu', padding="same"))
model.add(Conv2D(128, (3, 3), activation='relu', padding="same"))
model.add(MaxPool2D(pool_size=(3, 3)))

# Classifier head: one softmax output per class
model.add(Flatten())
model.add(Dropout(0.25))
model.add(Dense(256, activation='relu'))
model.add(Dense(2, activation='softmax'))

model.summary()

Here I used 6 'Conv2D' layers arranged in three blocks, each followed by a 'MaxPool2D' layer, then 1 'Flatten' layer, 1 'Dropout' layer, and 2 'Dense' layers. The last 'Dense' layer has two units with softmax activation, one per class.
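The feature-map sizes and parameter counts of this architecture can be worked out by hand. The sketch below does so in plain Python, assuming the Keras defaults for 'MaxPool2D' (stride equal to the pool size, no padding) and "same" padding on every convolution. The helper names are made up for this example. If the assumptions hold, the totals should agree with what 'model.summary()' prints.

```python
def pooled_size(size, pool=3):
    # MaxPool2D with stride = pool size and no padding drops the remainder.
    return (size - pool) // pool + 1


def conv_params(kernel, in_channels, filters):
    # One kernel x kernel x in_channels weight tensor per filter, plus one bias each.
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    # Fully connected weights plus one bias per unit.
    return inputs * units + units


size, channels = 250, 3
total_params = 0

for filters in (32, 64, 128):
    # Two "same"-padded 3x3 convolutions keep the spatial size unchanged.
    total_params += conv_params(3, channels, filters)
    total_params += conv_params(3, filters, filters)
    channels = filters
    # 3x3 max pooling shrinks height and width.
    size = pooled_size(size)
    print(f"after block with {filters} filters: {size}x{size}x{channels}")

flattened = size * size * channels
total_params += dense_params(flattened, 256) + dense_params(256, 2)

print("flattened features:", flattened)
print("total parameters:", total_params)
```

Most of the parameters sit in the first 'Dense' layer, because it connects every one of the flattened features to 256 units.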

Now, let us compile our model and fit the data.

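model.compile(optimizer='adam',loss='categorical_crossentropy',metrics=['accuracy'])

# Stop when training accuracy has not improved for 2 consecutive epochs.
# Note: 'accuracy' is the training accuracy; monitor 'val_accuracy' or
# 'val_loss' instead to stop based on validation performance.
callback = tf.keras.callbacks.EarlyStopping(monitor='accuracy', patience=2)

# fit() accepts generators directly; fit_generator() is deprecated
history = model.fit(train_images, validation_data=val_images, epochs=50, callbacks=[callback])

As you can see, I used 'categorical_crossentropy' as the loss function and 'accuracy' as the evaluation metric. Training runs for at most 50 epochs and stops early once the monitored accuracy stops improving.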
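For intuition about the loss, here is a minimal sketch of categorical cross-entropy for a single sample, written in plain Python. The probability values are made up, and the clipping constant 'eps' is an assumption chosen for this example, not a value taken from Keras.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot label and predicted class probabilities."""
    loss = 0.0
    for target, prob in zip(y_true, y_pred):
        # Clip to avoid log(0) when the model outputs exactly 0 or 1.
        prob = min(max(prob, eps), 1 - eps)
        loss -= target * math.log(prob)
    return loss


healthy = [1, 0]  # one-hot label for class index 0

confident = categorical_crossentropy(healthy, [0.9, 0.1])
unsure = categorical_crossentropy(healthy, [0.5, 0.5])
wrong = categorical_crossentropy(healthy, [0.1, 0.9])

print(round(confident, 4), round(unsure, 4), round(wrong, 4))
```

The loss grows as the probability assigned to the true class shrinks, so confident wrong predictions are penalized far more heavily than uncertain ones.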

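get_acc = history.history['accuracy']
value_acc = history.history['val_accuracy']
get_loss = history.history['loss']
validation_loss = history.history['val_loss']

epochs = range(len(get_acc))

# Accuracy curves
plt.figure()
plt.plot(epochs, get_acc, 'r', label='Accuracy of Training data')
plt.plot(epochs, value_acc, 'b', label='Accuracy of Validation data')
plt.title('Training vs validation accuracy')
plt.legend(loc=0)

# Loss curves (collected above but previously never plotted)
plt.figure()
plt.plot(epochs, get_loss, 'r', label='Loss of Training data')
plt.plot(epochs, validation_loss, 'b', label='Loss of Validation data')
plt.title('Training vs validation loss')
plt.legend(loc=0)

plt.show()

Finally, I used 'matplotlib.pyplot' to plot the model's training and validation accuracy and loss for each epoch.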

Also, let us have a look at our model's classification report.

# Classification Report
from sklearn.metrics import classification_report

test_labels=test_images.classes
# predict() accepts generators directly; predict_generator() is deprecated
predictions=model.predict(test_images, verbose=1)
y_pred = np.argmax(predictions, axis=-1)
print(classification_report(test_labels, y_pred))

Here in the report '0' represents 'Healthy' and '1' represents 'Unhealthy'. The mapping can be confirmed with 'test_images.class_indices'.
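The step from softmax outputs to the numbers in the report can be sketched by hand. The example below uses made-up probabilities and true labels, assumes the class mapping described above ('Healthy' is 0, 'Unhealthy' is 1), and computes precision and recall with a hypothetical helper rather than scikit-learn.

```python
# Assumed mapping; check test_images.class_indices for the real one.
class_indices = {"Healthy": 0, "Unhealthy": 1}
index_to_label = {index: label for label, index in class_indices.items()}

# Made-up softmax outputs for four test images and their true class indices.
probabilities = [
    [0.92, 0.08],
    [0.30, 0.70],
    [0.55, 0.45],
    [0.10, 0.90],
]
true_indices = [0, 1, 1, 1]

# argmax: pick the class with the highest probability for each image.
predicted_indices = [row.index(max(row)) for row in probabilities]
predicted_labels = [index_to_label[i] for i in predicted_indices]


def precision_recall(true, pred, positive):
    """Precision and recall for one class, treating it as the positive class."""
    tp = sum(1 for t, p in zip(true, pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(true, pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(true, pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


print(predicted_labels)
print("Unhealthy:", precision_recall(true_indices, predicted_indices, positive=1))
```

In this toy example one unhealthy image is predicted as healthy, which lowers recall for 'Unhealthy' while leaving its precision at 1.0. For a disease detector, that kind of miss is usually the more costly error to watch for in the report.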

Thank you for your time
