Create Your Own Image Dataset Using OpenCV for Machine Learning

Hemant Sharma 26 Aug, 2021 • 5 min read

Hello Geeks! In this article, we are going to prepare our own image dataset using OpenCV, which can then be used in any kind of machine learning project. As an example, we will prepare data for a Rock, Paper, Scissors game. So let's get started. 😉

Introduction

Machine learning and images go well together: image classification has been one of the main applications of machine learning over the years. During the COVID-19 pandemic, it was used to detect people who were not following rules such as wearing masks and maintaining distance.

Pre-requisites

Every program has some prerequisites, and setting them up first avoids environment-related problems. We are building a dataset for a machine learning project, so the minimal requirement is a machine with Python 3 and the OpenCV module installed.

Python 3
OpenCV

I am using Jupyter Notebook on my system. If you want the same setup, install Anaconda on your machine and then install OpenCV.

Install OpenCV

To install OpenCV, open the command prompt if you are not using Anaconda. Otherwise, open Anaconda Prompt from Windows search. Then type the command given below.

pip install opencv-python==3.4.2.17

Now you are all set to code and prepare your dataset.
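Before moving on, it can help to confirm that the interpreter you are going to use can actually import OpenCV. The following is a minimal sketch and not part of the original program; the helper name opencv_version is made up for illustration.

```python
def opencv_version():
    """Return the installed OpenCV version string, or None if cv2 cannot be imported."""
    try:
        import cv2
    except ImportError:
        # cv2 is missing (or failed to load) in this environment
        return None
    return cv2.__version__


version = opencv_version()
if version is None:
    print("OpenCV is not available; re-run the pip command in this environment")
else:
    print("OpenCV version:", version)
```

If this prints a version number in the same notebook or prompt where you will run the program, the installation is usable.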

Steps Involved

Here we are going to cover all the steps involved in creating this program.

Step 1: Import Modules

First, we have to import all the required modules into the program. We only need two modules: “cv2” (OpenCV) and “os”. OpenCV is used to capture and display images using the laptop camera, and the os module is used to create directories.

import cv2 as cv
import os

Step 2: Create Camera Object

Since we are creating our own image dataset, we need a camera. OpenCV lets us create a camera object that we can use later for various actions.

#argument 0 is given to use the default camera of the laptop
camera = cv.VideoCapture(0)
#Now check if the camera object is created successfully
if not camera.isOpened():
    print("The Camera is not Opened....Exiting")
    exit()

Step 3: Create Label Folders

Now, we need to create a folder for each label to keep the classes apart. Use the code given below to create these folders; you can add as many labels as you want. We have named our labels after the game rock, paper, scissors (the rock class is called "Stone" in the code). We are preparing a dataset for a classifier that decides whether an image shows a rock, paper, scissors, or just the background.

#creating a list of labels; you can add as many as you want
Labels = ["Background","Stone","Paper","Scissors"]
#Now create folders for each label to store images
for label in Labels:
    if not os.path.exists(label):
        os.mkdir(label)
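The same step can also be written as a small reusable function. This is only a sketch of an alternative, not the article's code: the function name create_label_folders and the root parameter are hypothetical. It uses os.makedirs with exist_ok=True, which skips folders that already exist, so the explicit os.path.exists check is not needed.

```python
import os


def create_label_folders(labels, root="."):
    """Create one folder per label under root and return the folder paths."""
    paths = []
    for label in labels:
        path = os.path.join(root, label)
        # exist_ok=True means no error is raised if the folder is already there
        os.makedirs(path, exist_ok=True)
        paths.append(path)
    return paths
```

Calling create_label_folders(["Background", "Stone", "Paper", "Scissors"]) would create the four folders in the current working directory, just like the loop above.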

Step 4: Final step to capture images

This is the final and most crucial step of the program. Here we capture the images and store them in the folder for their label. Read the code thoroughly; the inline comments explain each step.

for folder in Labels:
    #using count variable to name the images in the dataset
    count = 0
    #taking input to start the capturing
    print("Press 's' to start data collection for " + folder)
    userinput = input()
    if userinput != 's':
        print("Wrong Input..........")
        exit()
    #capturing 200 images per label; change this number as you like
    while count < 200:
        #read returns two values: a success flag and the frame
        status, frame = camera.read()
        #check if we got the frame or not
        if not status:
            print("Frame has not been captured..Exiting...")
            break
        #convert the image to grayscale for faster calculation
        gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
        #display window with the gray image
        cv.imshow("Video Window", gray)
        #resize the image to 28x28 before storing it
        gray = cv.resize(gray, (28,28))
        #store the image in its label folder (relative path, so it matches the folders created in Step 3)
        cv.imwrite(os.path.join(folder, 'img'+str(count)+'.png'), gray)
        count = count+1
        #to quit capturing for the current label, press 'q'
        if cv.waitKey(1) == ord('q'):
            break
# When everything is done, release the capture
camera.release()
cv.destroyAllWindows()
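One thing to keep in mind with the loop above: count starts at 0 for every label on every run, so running the program a second time overwrites the images saved earlier. If you would rather add to an existing dataset, a possible approach is to start counting from the next free number. The sketch below assumes the img<number>.png naming used above; the helper name next_image_index is hypothetical.

```python
import os
import re


def next_image_index(folder):
    """Return one more than the highest N among files named imgN.png in folder."""
    pattern = re.compile(r"^img(\d+)\.png$")
    highest = -1
    for name in os.listdir(folder):
        match = pattern.match(name)
        if match:
            highest = max(highest, int(match.group(1)))
    # an empty folder gives 0, so the first image is still img0.png
    return highest + 1
```

To use it, you could replace count = 0 with count = next_image_index(folder) and change the loop condition so that it still stops after 200 new images.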

Practical Implementation

Now, run the program to create the dataset. We will first capture the background, then stone, paper, and scissors. Before running it, you should be clear about what you have coded and how the output will serve your use case. So, let's do it.

Run the program all at once

We are using Jupyter Notebook to run this program; you can use any Python interpreter. First, go to the Cell menu and click on “Run All”. This runs all the cells in one go.

Now an input prompt will appear. Type ‘s’ and press Enter to start saving images for the background.

After pressing ‘s’, it is going to capture 200 images of the background. The display window will appear and start capturing the images, so get out of the frame and allow the camera to capture the background.

Next, the program will ask for ‘s’ again and capture the “stone” images. So, close your fist and show it to the camera in several positions.

Note: Keep your fist closed and move your hand around; do not hold it in one position. This variety gives you a better dataset.

Now, repeat the same process for the paper and scissors images. Do not forget to enter ‘s’ when asked. Otherwise, the display window will look stuck, but the program is only waiting for your input.

The program will close automatically once all the labels are done. You can now browse to the folders to check whether the dataset has been created.
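Instead of browsing by hand, the check can also be done from Python. The following is a rough sketch under the assumption that the images are .png files inside one folder per label; the function name count_images and the root parameter are made up for illustration.

```python
import os


def count_images(labels, root="."):
    """Return a dict mapping each label to the number of .png files in its folder."""
    counts = {}
    for label in labels:
        folder = os.path.join(root, label)
        if os.path.isdir(folder):
            counts[label] = sum(
                1 for name in os.listdir(folder) if name.lower().endswith(".png")
            )
        else:
            # a missing folder is reported as zero images
            counts[label] = 0
    return counts
```

For example, count_images(["Background", "Stone", "Paper", "Scissors"]) should report 200 for each label if every capture ran to completion.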

Note: The label folders are created in the directory the Python program is run from (for a notebook, this is normally the notebook's own directory). Four directories are created, one for each label.

Yes, the folders have been created successfully. Now check whether the images have been captured and saved. The saved images will not be the same size as what you saw during capture: we reduced them to 28x28 pixels so that training a model on them takes fewer resources and less time.
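If you want to confirm the reduced size without opening the images, the width and height can be read straight from the PNG file header. This is a sketch that relies only on the standard PNG layout (an 8-byte signature followed by the IHDR chunk); the function name png_size is hypothetical and not part of the original program.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(path):
    """Return (width, height) read from the IHDR chunk of a PNG file."""
    with open(path, "rb") as f:
        header = f.read(24)
    # bytes 0-7: signature, 8-11: chunk length, 12-15: chunk type, 16-23: width and height
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG file: " + str(path))
    width, height = struct.unpack(">II", header[16:24])
    return width, height
```

For an image saved by the capture loop, for example png_size("Stone/img0.png"), the result should match the size passed to cv.resize, which is (28, 28) here.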