Using Mask R-CNN with a Custom COCO-like Dataset


One of the coolest recent breakthroughs in AI image recognition is object segmentation, where a neural network picks out which pixels belong to each specific object in a picture. In this tutorial, you'll learn how to use the Matterport implementation of Mask R-CNN, trained on a new dataset I created, to spot cigarette butts.

Not a beginner tutorial...

This is not intended to be a complete beginner tutorial.

  • You should know how to clone a git repository from GitHub
  • You should have a decent understanding of Python programming
  • You should be familiar with Jupyter notebooks
  • You should understand the basics of training deep neural networks. If you don't know what "hyperparameters" or "epochs" are, I wouldn't recommend starting here. Find a beginner deep learning tutorial/course and start there. There's a lot of great free stuff out there if you search for it.

The Original Project

Training an AI to Recognize Cigarette Butts

I originally posted about this project in this blog post. In it, you can read about how I created the dataset and the kind of results I got.

Code Needed

Tutorial Code

You can get the tutorial code on GitHub:

Mask R-CNN

You will also need the Mask R-CNN code. I linked to the original Matterport implementation above, but I've forked the repo to fix a bug and to make sure this tutorial doesn't break when the original repo is updated.

Get the Mask R-CNN code here:


Cigarette Butt Dataset

I'm sharing a dataset I created from scratch. It is COCO-like, meaning it is annotated the same way that the COCO dataset is. Read more at

I'm also sharing some trained weights that I created using the code from this tutorial.
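If "COCO-like" is new to you, here is a minimal sketch of the annotation layout that COCO-style datasets share: top-level `images`, `categories`, and `annotations` lists tied together by ids. Everything specific in it (the file name, the `cig_butt` category name, the coordinates, and the two helper functions) is made up for illustration and is not taken from the actual dataset files.

```python
import json
from collections import defaultdict

# A tiny, hypothetical COCO-style dataset: one image, one category, one annotation.
# The file name, category name, sizes, and coordinates are illustrative only.
coco_like = {
    "images": [
        {"id": 1, "file_name": "example_0001.jpg", "width": 512, "height": 512},
    ],
    "categories": [
        {"id": 1, "name": "cig_butt", "supercategory": "litter"},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,      # which image this object is in
            "category_id": 1,   # which category this object belongs to
            "iscrowd": 0,
            # One or more polygons, each a flat [x1, y1, x2, y2, ...] list
            "segmentation": [[100.0, 120.0, 160.0, 120.0, 160.0, 140.0, 100.0, 140.0]],
            # Bounding box as [x, y, width, height]
            "bbox": [100.0, 120.0, 60.0, 20.0],
            "area": 1200.0,
        },
    ],
}


def annotations_by_image(dataset):
    """Group annotations by the id of the image they belong to."""
    grouped = defaultdict(list)
    for ann in dataset["annotations"]:
        grouped[ann["image_id"]].append(ann)
    return dict(grouped)


def polygon_points(flat):
    """Turn a flat [x1, y1, x2, y2, ...] list into [(x1, y1), (x2, y2), ...]."""
    return list(zip(flat[0::2], flat[1::2]))


# Round-trip through JSON, the way the annotations would be stored on disk.
loaded = json.loads(json.dumps(coco_like))
grouped = annotations_by_image(loaded)
points = polygon_points(grouped[1][0]["segmentation"][0])
print(len(grouped[1]), "annotation(s) for image 1; polygon has", len(points), "points")
```

Grouping annotations by `image_id` is usually the first step when loading a dataset like this, since each image needs all of its object masks at once.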

Download the dataset and trained weights from here:


Get Started

Grab the code resources and dataset mentioned above and start with the MaskRCNN_TrainAndInference.ipynb Jupyter Notebook. That will walk you through the rest of the tutorial. This is my first time sharing a dataset, so whether you have trouble or not, feedback is greatly appreciated!

Results to Expect

Your results may vary, depending on where you take your pictures. If the ground where you live looks different from the ground in my training photos, you may get more false positives. You'll notice that my model sometimes mistakes leaves for cigarette butts, and it is also easily confused by other cylindrical objects. This project is more of a proof of concept than a perfect solution, so I encourage you to do more experimenting on your own!
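One simple experiment for cutting down false positives is to keep only the detections the model is most confident about. The sketch below assumes a result shaped like the dictionary the Matterport `detect` call returns for each image, with `rois`, `class_ids`, and `scores` entries. To keep it self-contained, it uses plain lists instead of NumPy arrays and leaves out the `masks` entry; `filter_by_score` and the example numbers are hypothetical.

```python
def filter_by_score(result, min_score):
    """Keep only the detections whose score is at least min_score."""
    keep = [i for i, score in enumerate(result["scores"]) if score >= min_score]
    return {
        "rois": [result["rois"][i] for i in keep],
        "class_ids": [result["class_ids"][i] for i in keep],
        "scores": [result["scores"][i] for i in keep],
    }


# Made-up detections for one image: three boxes with different confidences.
example = {
    "rois": [[10, 10, 50, 40], [200, 80, 230, 150], [300, 300, 320, 360]],
    "class_ids": [1, 1, 1],
    "scores": [0.98, 0.72, 0.91],
}

confident = filter_by_score(example, 0.9)
print(len(confident["scores"]), "of", len(example["scores"]), "detections kept")
```

A higher threshold trades missed cigarette butts for fewer mislabeled leaves, so it is worth trying a few values on your own photos. The Matterport config also has a `DETECTION_MIN_CONFIDENCE` setting that applies a similar cutoff inside the model.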



Related Content

If you want to learn how to create your own COCO-like dataset, check out other tutorials on Immersive Limit.

Create COCO Annotations from Scratch

This tutorial will teach you how to create a simple COCO-like dataset from scratch. It gives example code and example JSON annotations.


Composing Images with Python for Synthetic Datasets

Learn how to compose images with Python for synthetic datasets. Full code on GitHub.

Thanks for reading! If you liked the post and want to see more like it, please follow Immersive Limit on Facebook and @ImmersiveLimit on Twitter.