Using Mask R-CNN with a Custom COCO-like Dataset
One of the coolest recent breakthroughs in AI image recognition is object segmentation (more precisely, instance segmentation), where a neural network picks out which pixels belong to each object in a picture. In this tutorial, you'll learn how to train the Matterport implementation of Mask R-CNN on a new dataset I created, so that it can spot cigarette butts.
Not a beginner tutorial...
This is not intended to be a complete beginner tutorial.
- You should know how to clone a git repository from GitHub
- You should have a decent understanding of Python programming
- You should be familiar with Jupyter notebooks
- You should understand the basics of training deep neural networks. If you don't know what "hyperparameters" or "epochs" are, I wouldn't recommend starting here. Find a beginner deep learning tutorial/course and start there. There's a lot of great free stuff out there if you search for it.
The Original Project
You can get the tutorial code on GitHub: https://github.com/akTwelve/tutorials/blob/master/mask_rcnn/MaskRCNN_TrainAndInference.ipynb
You will also need the Mask R-CNN code. The original Matterport implementation lives at https://github.com/matterport/Mask_RCNN, but I've forked the repo to fix a bug and to make sure these tutorials don't break when the original is updated.
Get the Mask R-CNN code here: https://github.com/akTwelve/Mask_RCNN
Cigarette Butt Dataset
I'm sharing a dataset I created from scratch. It is COCO-like, meaning its annotations are stored in the same JSON format that the COCO dataset uses. Read more at http://cocodataset.org/#home.
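To make "COCO-like" a bit more concrete, here is a minimal sketch of what such an annotation file looks like once it's loaded in Python. The three top-level lists (images, categories, annotations) and their fields follow the COCO format, but every value below is made up for illustration: the file name, the category name `cig_butt`, the supercategory, and the coordinates are assumptions, not entries from the actual dataset. The `summarize` helper is also hypothetical; it just checks that every annotation points at a real image and category.

```python
import json
from collections import defaultdict

# Hypothetical, hand-written example. The values are illustrative only and
# are NOT taken from the cigarette butt dataset.
coco_like = {
    "images": [
        {"id": 1, "file_name": "example_0001.jpg", "width": 512, "height": 512},
    ],
    "categories": [
        {"id": 1, "name": "cig_butt", "supercategory": "litter"},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,      # which image this object appears in
            "category_id": 1,   # which category the object belongs to
            # One polygon as a flat list: [x1, y1, x2, y2, ...]
            "segmentation": [[100.0, 100.0, 140.0, 100.0,
                              140.0, 120.0, 100.0, 120.0]],
            # [x, y, width, height] of the enclosing box
            "bbox": [100.0, 100.0, 40.0, 20.0],
            "area": 800.0,
            "iscrowd": 0,
        },
    ],
}


def summarize(dataset):
    """Return {file_name: [category names]}; raise on dangling ids."""
    images = {img["id"]: img["file_name"] for img in dataset["images"]}
    categories = {cat["id"]: cat["name"] for cat in dataset["categories"]}
    per_image = defaultdict(list)
    for ann in dataset["annotations"]:
        if ann["image_id"] not in images:
            raise ValueError(f"annotation {ann['id']} references an unknown image")
        if ann["category_id"] not in categories:
            raise ValueError(f"annotation {ann['id']} references an unknown category")
        per_image[images[ann["image_id"]]].append(categories[ann["category_id"]])
    return dict(per_image)


# Round-trip through JSON to mimic loading an annotation file from disk.
summary = summarize(json.loads(json.dumps(coco_like)))
print(summary)
```

With a real annotation file you would replace the in-memory dictionary with `json.load` on the file, and the same check should tell you quickly whether the images and annotations line up.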
I'm also sharing some trained weights that I created using the code from this tutorial.
Download the dataset and trained weights from here: https://www.immersivelimit.com/datasets/cigarette-butts
Grab the code resources and dataset mentioned above and start with the MaskRCNN_TrainAndInference.ipynb Jupyter Notebook. That will walk you through the rest of the tutorial. This is my first time sharing a dataset, so whether you have trouble or not, feedback is greatly appreciated!
Results to Expect
Your results may vary, depending on where you take your pictures. If the ground where you live looks different from the ground in my training photos, you may get more false positives. You'll notice that my model sometimes mistakes leaves for cigarette butts, and it is easily confused by other cylindrical objects. This project is more of a proof of concept than a perfect solution, so I encourage you to do more experimenting on your own!
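One simple experiment for cutting down on false positives is to keep only the detections the model is most sure about. The sketch below assumes you have a list of per-detection confidence scores, such as the `scores` entry in the results returned by the Matterport `model.detect` call. The `keep_indices` helper, the example scores, and the 0.9 threshold are all made up for illustration, so treat it as a starting point rather than a recommended setting.

```python
def keep_indices(scores, min_score=0.9):
    """Return the positions of detections whose score is at least min_score."""
    return [i for i, score in enumerate(scores) if score >= min_score]


# Hypothetical scores for four detections in one image.
scores = [0.98, 0.72, 0.91, 0.55]

keep = keep_indices(scores, min_score=0.9)
print(keep)  # [0, 2]
```

You would then use those positions to pick the matching boxes, class ids, and masks for the same image. The Matterport config also has a `DETECTION_MIN_CONFIDENCE` setting that filters low-scoring detections at inference time, which may be the easier knob to turn. Raising the threshold trades false positives for missed cigarette butts, so it's worth trying a few values on your own photos.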
If you want to learn how to create your own COCO-like dataset, check out other tutorials on Immersive Limit.