Unity Image Segmentation

unity_segmentation_field.jpg

Object Segmentation Masks

In this tutorial, we’re going to create synthetic object segmentation images with the Unity game engine. Object segmentation means each object gets its own unique color and all pixels with that color are part of that particular object in the original image. Typically there is an original real image as well as another showing which pixels belong to each object of interest. There will also be some sort of annotation that describes which color belongs to which object.

The original RGB image we’ll be making in this tutorial.

The segmentation image we’ll be making in this tutorial. (Each color is unique in this image)

Another segmentation image where white cubes represent the default Layer and pink cubes are on Layer 3. The code provided supports both versions (just not at the same time).

Rather than reinvent the wheel, I’ve modified code from Unity’s ML-ImageSynthesis project to perform object segmentation.

Get the GitHub Repo

A sample Unity project repository can be found here: https://github.com/immersive-limit/Unity-ComputerVisionSim

Open the Unity Project

The Sample Scene contains only three objects: a Main Camera with an ImageSynthesis script attached, a Directional Light, and a CubeField with a CubeField script attached.

Press Play

Things get more interesting! You should see what I decided to call a “cube field” animating along the Y-axis via sine waves on X and Z. This simply gives us something interesting to visualize.
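The CubeField script itself isn’t the focus of this tutorial, but the animation amounts to something like the following sketch. This is a hypothetical reconstruction (class name, amplitude, and frequency are mine), not the repo’s exact script:

```csharp
using UnityEngine;

// Hypothetical sketch of the cube animation: each cube's Y position
// follows sine waves driven by its X and Z coordinates plus time.
public class CubeWave : MonoBehaviour
{
    public float amplitude = 1f;
    public float frequency = 1f;

    void Update()
    {
        Vector3 p = transform.position;
        p.y = amplitude * (Mathf.Sin(frequency * (p.x + Time.time))
                         + Mathf.Sin(frequency * (p.z + Time.time)));
        transform.position = p;
    }
}
```

Attach something like this to every cube (or drive all the cubes from one manager script, as the repo’s CubeField does) and you get the rolling wave effect.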

A sine wave animated field of cubes that will appear when you press Play.

Change to Display 2

If you change the display in the Game window to “Display 2” it will show your object segmentation image.

Note: Colors change each time the play button is pressed. This is because colors are generated based on their object id, which is new every play.

Simulated object segmentation image in Unity. Each object id has its own unique color.

Display 3 will show objects segmented by layer, rather than object id.

Check Out the Camera

Clipping Planes

The only thing I’ve changed on this camera is the Clipping Planes. If your clipping planes are set to the default values of 0.3 and 1000, your depth map will be black at 0.3 meters and white at 1000 meters and beyond. Imagine stretching a gradient over 1 kilometer and you’ll see why this isn’t ideal for us. A real depth sensor has a range of about 10 meters, and it doesn’t work very well up close. You can choose to simulate the actual ranges for your project, but for this example, a Near plane of 5 m and a Far plane of 20 m fits the cube field nicely.
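If you’d rather set the clipping planes from code than in the Inspector, it’s two properties on the Camera. The values below are the ones used in this tutorial:

```csharp
using UnityEngine;

public class CameraSetup : MonoBehaviour
{
    void Start()
    {
        // Same values as set in the Inspector for this tutorial:
        // depth now maps over [5 m, 20 m] instead of [0.3 m, 1000 m].
        var cam = GetComponent<Camera>();
        cam.nearClipPlane = 5f;
        cam.farClipPlane = 20f;
    }
}
```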

clipping_planes.jpg

Image Synthesis - Inspector Setup

I’ve made a few additions to the original Image Synthesis script.

  1. A list of checkboxes that allow you to choose which image captures you want to save. (Optical Flow doesn’t work yet)

  2. Filepath - If you leave this empty, the captures will save to your Unity project directory (next to the Assets folder).

  3. Filename - The code will take your filename and insert the capture pass name for each pass (e.g. test_img.png, test_depth.png)
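The pass-name insertion is ordinary path manipulation, roughly like this sketch. FilenameForPass is a hypothetical helper name (not necessarily what the repo calls it), but the repo’s save logic builds filenames the same way:

```csharp
using System.IO;

// Hypothetical helper showing the idea behind the filename scheme.
public static class CaptureNaming
{
    // e.g. "Captures/test.png" + "_id"  ->  "Captures/test_id.png"
    public static string FilenameForPass(string filename, string passName)
    {
        string dir  = Path.GetDirectoryName(filename);
        string name = Path.GetFileNameWithoutExtension(filename);
        string ext  = Path.GetExtension(filename);
        return Path.Combine(dir, name + passName + ext);
    }
}
```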

image_synthesis_inspector.JPG

When you press play, a “Save Captures” button will appear. This will save your selected captures to your filepath.

save_captures_button.jpg

When you press “Save Captures”, it will save all checked captures to the filepath you specified (relative to the Unity project folder). You will notice some other captures in here, but we’re only interested in test_id.png and test_layer.png for now. Check out the Depth Camera Simulation tutorial for more about that.

All five captures saved in the Captures directory.

Image Synthesis

ImageSynthesis.cs contains the logic for setting up all camera capture passes. Here’s a high-level overview of how it works:

  1. It creates a list of CapturePass structs containing basic info about each pass (e.g. “_img”, “_id”)

  2. In Start(), it assigns the camera to capturePasses[0], then creates a hidden camera for the remaining capturePasses

  3. In OnCameraChange(), it copies settings from the main camera and assigns a target display for each hidden camera

  4. It calls SetupCameraWithReplacementShader() for each hidden camera to tell the shader what type of _OutputMode to use

  5. In OnSceneChange(), it sets the _ObjectColor and _CategoryColor via MaterialPropertyBlock. These are used for segmentation.

  6. In Save(), it does the work to save each of the capture passes to the specified directory
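Steps 4 and 5 can be sketched like this. The property names _OutputMode, _ObjectColor, and _CategoryColor come from the shader; everything else here is illustrative. In particular, ColorFromInt is a hypothetical hash-to-color helper (the repo encodes IDs differently), and the repo sets _OutputMode per camera via a CommandBuffer rather than a global, which I’ve simplified for brevity:

```csharp
using UnityEngine;

public static class SegmentationSetup
{
    // Step 4 (sketch): render everything with the replacement shader and
    // tell it which output to produce (0 = object id, 1 = layer/category).
    public static void SetupCamera(Camera cam, Shader replacement, int outputMode)
    {
        cam.SetReplacementShader(replacement, "");
        cam.backgroundColor = Color.black;
        cam.clearFlags = CameraClearFlags.SolidColor;
        Shader.SetGlobalInt("_OutputMode", outputMode); // simplified; repo uses a CommandBuffer
    }

    // Step 5 (sketch): give each renderer its segmentation colors without
    // instantiating new materials, via a MaterialPropertyBlock.
    public static void ColorObject(Renderer r)
    {
        var mpb = new MaterialPropertyBlock();
        mpb.SetColor("_ObjectColor", ColorFromInt(r.gameObject.GetInstanceID()));
        mpb.SetColor("_CategoryColor", ColorFromInt(r.gameObject.layer));
        r.SetPropertyBlock(mpb);
    }

    // Illustrative hash-to-color helper; not the repo's exact encoding.
    static Color ColorFromInt(int i)
    {
        Random.InitState(i);
        return new Color(Random.value, Random.value, Random.value, 1f);
    }
}
```

Since GetInstanceID() returns a fresh id every play session, this also explains why the segmentation colors change each time you press Play.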

Turn Off HDR and Anti-Aliasing

For segmentation, it is especially important to disable HDR (High Dynamic Range) and MSAA (Multisample Anti-Aliasing). To accomplish this, we can add two lines when we set up our camera.

        cam.allowHDR = false;
        cam.allowMSAA = false;
disableHDRandMSAA.JPG
HDR and MSAA enabled, notice the smoothing happening at the edges. This is bad for segmentation.

HDR and MSAA disabled, notice the hard edges. This is good for segmentation.

Uber Replacement Shader

UberReplacement.shader is where the rendering magic happens. The shader is pretty long, but we really only need to understand part of the first SubShader.

  1. The vert() function computes the depth and passes it to frag()

  2. The frag() function passes that depth info to Output() and returns the value

  3. Look at the Output() function’s ObjectId section (_OutputMode == 0) and CategoryId section (_OutputMode == 1), where you will see that it simply returns the _ObjectColor or _CategoryColor that was passed in. So if the object color is a shade of blue, it will set every pixel of that object to that exact shade of blue.

  4. This calculation is performed for every pixel of the object, since the frag() function runs once per fragment (not once per vertex)
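Condensed, the branching inside Output() looks roughly like this. This is a sketch of the idea, not the full UberReplacement.shader source, and the exact signature and mode numbers may differ from the repo:

```hlsl
// Sketch: pick the output based on the global _OutputMode
float4 Output(float depth01, float3 normal)
{
    if (_OutputMode == 0)        // ObjectId pass
        return _ObjectColor;     // flat, unique color per object
    else if (_OutputMode == 1)   // CategoryId (layer) pass
        return _CategoryColor;
    else if (_OutputMode == 2)   // Depth pass
        return float4(depth01, depth01, depth01, 1);
    // ... remaining modes (normals, flow) omitted
    return float4(0, 0, 0, 1);
}
```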

If shaders aren’t your strong suit, here are a couple excellent resources to bring you up to speed:

Shaders 101 - Intro to Shaders: https://www.youtube.com/watch?v=T-HXmQAMhG0

Shaders 102 - Basics of Image Effects: https://www.youtube.com/watch?v=kpBnIAPtsj8

Shaders 103 - Using Replacement Shaders: https://www.youtube.com/watch?v=Tjl8jP5Nuvc

Catlike Coding - Shader Fundamentals: https://catlikecoding.com/unity/tutorials/rendering/part-2/

Image Saver

This is a simple script that adds an image capture button to the inspector window for ImageSynthesis when in play mode. It’s more intended as a guide than as something I’d expect you to use in your own project. You’d probably want to automate saving images.
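For reference, a play-mode-only Inspector button can be added with a small custom editor like this sketch. It assumes ImageSynthesis exposes a Save method taking a filename, width, and height, as in the repo; the exact signature may differ:

```csharp
using UnityEngine;
using UnityEditor;

// Sketch: draws the normal ImageSynthesis inspector, plus a
// "Save Captures" button that only appears while the scene is running.
[CustomEditor(typeof(ImageSynthesis))]
public class ImageSaverEditor : Editor
{
    public override void OnInspectorGUI()
    {
        DrawDefaultInspector();

        if (Application.isPlaying && GUILayout.Button("Save Captures"))
        {
            var synth = (ImageSynthesis)target;
            // Assumed signature, modeled on the repo's Save method.
            synth.Save("test.png", Screen.width, Screen.height);
        }
    }
}
```

In a real pipeline you’d more likely call Save from a script on a timer or per-frame loop to capture a whole dataset automatically.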

Conclusion

Here’s our output image from the object id segmentation pass. Hopefully this tutorial gave you a good idea of how to get object segmentation working in Unity. In a future tutorial, I plan to use this segmentation data to train a neural network on segmented images.

test_id.jpg