How to train your own YOLOv3 detector from scratch

Anton Muehlemann · Published in Insight · Oct 4, 2019 · 4 min read

This comprehensive and easy three-step tutorial lets you train your own custom object detector using YOLOv3. The only requirement is basic familiarity with Python.

Our input dataset consists of images of cats (without annotations).

As an example, we will learn how to detect cat faces in cat pictures. Given the omnipresence of cat images on the internet, this is clearly a long-awaited and extremely important feature! But even if you don’t care about cats, by following these exact same steps you will be able to build a YOLOv3 object detector for your own use case.

Training Data

If you already have an image dataset, you are good to go and can proceed to the next step! If you need to create an image dataset first, consider using a Chrome extension such as Fatkun Batch Downloader, which lets you build your own dataset easily. For instance, if you’d like to detect fidget spinners, do a Google Image search for “fidget spinner” and save the resulting images.
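
If you prefer to script the download instead of using a browser extension, a minimal sketch along these lines also works; the URLs and the output folder below are placeholders for your own:

# Minimal sketch (optional): script the image download yourself.
# The URLs and the output folder are placeholders -- swap in your own.
import os
import requests

image_urls = [
    "https://example.com/cat_01.jpg",
    "https://example.com/cat_02.jpg",
]
output_dir = "my_training_images"
os.makedirs(output_dir, exist_ok=True)

for index, url in enumerate(image_urls):
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    with open(os.path.join(output_dir, f"image_{index:03d}.jpg"), "wb") as handle:
        handle.write(response.content)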

If you just want to learn more about training YOLOv3, you can also use the cat images already contained in the accompanying GitHub repo.

Clone the Repo

Before getting started, clone the repo at github.com/AntonMu/TrainYourOwnYOLO to your local machine. Make sure to set up a virtual environment and install the requirements.
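
On Mac or Linux, the setup typically looks something like the following (the environment name is arbitrary, and the requirements.txt path assumes the repo’s standard layout; on Windows, activate the environment with env\Scripts\activate instead):

git clone https://github.com/AntonMu/TrainYourOwnYOLO
cd TrainYourOwnYOLO
python -m venv env
source env/bin/activate
pip install -r requirements.txt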

This YOLO tutorial is designed to work on Windows, Mac, and Linux. To make things run smoothly, it is highly recommended to keep the original folder structure of the cloned GitHub repo. The repo also contains further details on each of the steps below, as well as lots of cat images to play with.

Step 1: Annotate Images

In order for our detector to learn to detect objects in images, such as cat faces in pictures, it needs to be fed with labeled training data. In our cat example, this means manually labeling cat faces in the pictures located in TrainYourOwnYOLO/Data/Source_Images/Training_Images. If you are using your own image dataset, replace the cat images in that folder with your own images. For decent results, label at least 100 objects; the more, the better!

To label images, I recommend using Microsoft’s Visual Object Tagging Tool (VoTT), which has release packages for Windows, Mac, and Linux available at:

https://github.com/Microsoft/VoTT/releases

For Mac, download and install vott-2.x.x-darwin.dmg; for Windows, download and install vott-2.x.x-win32.exe; and for Linux, download and install vott-2.x.x-linux.snap.

Once installed, create a new project and call it Annotations. Then choose TrainYourOwnYOLO/Data/Source_Images/Training_Images as both the Source and the Target Connection. Under Export Settings, choose Comma Separated Values (CSV) as the Provider and hit Save Export Settings. Now start labeling by drawing bounding boxes around the cat faces.

Labeling cat faces with Microsoft’s VoTT.

Once you have labeled enough cat faces, press CTRL+E to export the project. Inside the TrainYourOwnYOLO/Data/Source_Images/Training_Images folder, you should now see a new subfolder called vott-csv-export containing Annotations-export.csv.

Next, navigate to TrainYourOwnYOLO/1_Image_Annotation and run the conversion script to convert the annotations into YOLO format:

python Convert_to_YOLO_format.py
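
Under the hood, the conversion boils the VoTT CSV down to a plain-text annotation file in which each line lists an image followed by its bounding boxes and class ids. The snippet below is only an illustrative sketch of that idea, not the repo’s actual script; the CSV column names and the data_train.txt output name are assumptions:

# Illustrative sketch only -- Convert_to_YOLO_format.py does this for you.
# Assumed input columns: image, xmin, ymin, xmax, ymax, label
# Assumed output format: one line per image, e.g.
#   path/to/image.jpg xmin,ymin,xmax,ymax,class_id xmin,ymin,xmax,ymax,class_id
import csv
from collections import defaultdict

boxes_per_image = defaultdict(list)
class_ids = {}  # label name -> integer class id, in order of first appearance

with open("Annotations-export.csv", newline="") as csv_file:
    for row in csv.DictReader(csv_file):
        class_id = class_ids.setdefault(row["label"], len(class_ids))
        coords = [int(float(row[key])) for key in ("xmin", "ymin", "xmax", "ymax")]
        boxes_per_image[row["image"]].append(",".join(map(str, coords + [class_id])))

with open("data_train.txt", "w") as out_file:
    for image_name, boxes in boxes_per_image.items():
        out_file.write(image_name + " " + " ".join(boxes) + "\n")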

That’s all for image annotation. You are now ready to train your YOLOv3 model.

Step 2: Train your YOLOv3 Model

Before getting started, you need to download the pre-trained Darknet weights and convert them to YOLO format. To do this, navigate to TrainYourOwnYOLO/2_Training and run:

python Download_and_Convert_YOLO_weights.py

Once finished, train the detector by running:

python Train_YOLO.py

Depending on your setup, this process can take anywhere from a few minutes to a few hours. I recommend using a GPU to speed up training. The final weights are saved in TrainYourOwnYOLO/Data/Model_Weights. This concludes the training step, and you are now ready to detect objects in new images!
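
If you want to confirm that training actually produced weight files, a quick optional check is to list that folder; the .h5 extension below is an assumption based on the Keras-style weights the repo saves:

# Optional sanity check: list the trained weight files and their sizes.
# The folder path comes from the tutorial; the .h5 extension is an assumption.
from pathlib import Path

weights_dir = Path("TrainYourOwnYOLO/Data/Model_Weights")
for weights_file in sorted(weights_dir.glob("*.h5")):
    size_mb = weights_file.stat().st_size / 1e6
    print(f"{weights_file.name}: {size_mb:.1f} MB")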

Step 3: Try your Detector

To test your object detector, navigate to TrainYourOwnYOLO/3_Inference and run:

python Detector.py

This will apply your freshly trained YOLOv3 object detector to the test images located in TrainYourOwnYOLO/Data/Source_Images/Test_Images. In our example, we detect cat faces in new images of cats. To test the detector on your own images, place them in that folder.
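
If you would rather script that step, a small helper along these lines copies your images into place; the source folder name is a placeholder:

# Hypothetical helper: copy your own images into the test folder before running Detector.py.
# "my_new_cat_pictures" is a placeholder; the destination path comes from the tutorial.
import shutil
from pathlib import Path

source_dir = Path("my_new_cat_pictures")
test_dir = Path("TrainYourOwnYOLO/Data/Source_Images/Test_Images")

for image_path in source_dir.glob("*.jpg"):
    shutil.copy(image_path, test_dir / image_path.name)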

Detected cat faces on test images.

If you made it this far — congratulations! You have successfully trained your own YOLOv3 computer vision model!

To explore more options and customize your code, head over to github.com/AntonMu/TrainYourOwnYOLO. All Python scripts above have optional command-line arguments that help you adapt the detector to your use case and tweak performance. To list the command-line options, run:

python <script_name.py> -h

Trying your own project? Leave a comment describing how this story made a difference for you and what you have achieved. If you encounter any pesky bugs, write your question below and I will do my best to answer it.

Are you interested in transitioning to a career in data? Sign up to learn more about the Insight Fellows programs and start your application today.
