Matthew Johnson

Semantic Texton Forests

The system works by processing an image through multiple trees and combining the result to obtain a segmentation.

My thesis work on Semantic Texton Forests started as a CVPR 2008 paper with Jamie Shotton. That paper was selected for an oral presentation, and our demo won the best demo prize. Sadly for you, that code is highly optimized and proprietary, and it is not the program available for download here. So that the technique can be better understood, and to encourage its use by students and researchers, I have endeavored to make available a more general-purpose, open-source reference implementation. The link to download the source code can be found below.

This C# code includes the routines for training and testing the semantic texton forest image segmentation system from our paper. You can see exactly how it works, and change the parameters to see how they affect performance. The code is currently configured to work with the MSRC21 dataset, but will work with any image dataset with pixel-level ground truth labels in the same basic format. There are a good number of comments throughout, and I have tried to make the code as readable as possible. Please feel free to contact me with any questions or requests for clarification.
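As a rough illustration of the "multiple trees, combined result" idea, the Python sketch below averages the class distributions returned by several trees for each pixel and labels the pixel with the most probable class. It is not the C# code from the download and it leaves out how the trees are trained and the later stages of the system. The names `forest_posterior` and `segment`, the toy trees, and the representation of an image as rows of per-pixel feature tuples are all assumptions made for the example.

```python
# Hypothetical sketch: a "tree" is any function that maps a pixel's feature
# vector to a class distribution (one probability per class). Real trees are
# learned from data; the ones below are hand-written stand-ins.

def forest_posterior(trees, features):
    """Average the per-tree class distributions for a single pixel."""
    dists = [tree(features) for tree in trees]
    n_classes = len(dists[0])
    return [sum(d[c] for d in dists) / len(dists) for c in range(n_classes)]

def segment(trees, image):
    """Label every pixel with the class of highest averaged probability."""
    labels = []
    for row in image:
        row_labels = []
        for features in row:
            post = forest_posterior(trees, features)
            row_labels.append(max(range(len(post)), key=post.__getitem__))
        labels.append(row_labels)
    return labels

# Two toy trees over two classes, each looking at a different feature.
toy_trees = [
    lambda f: [0.9, 0.1] if f[0] < 0.5 else [0.2, 0.8],
    lambda f: [0.6, 0.4] if f[1] < 0.5 else [0.3, 0.7],
]
toy_image = [
    [(0.1, 0.1), (0.9, 0.9)],
    [(0.1, 0.9), (0.9, 0.1)],
]
toy_labels = segment(toy_trees, toy_image)
```

Averaging means a single tree's mistake can be outvoted by the others, which is the usual motivation for using a forest rather than one tree.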

I suggest that you download the MSRC21 files and get the system working with them as a first step. You will need to edit the constants in Program.cs to reflect your own local setup. If you run the system with its default settings, you should achieve an overall accuracy of about 0.65 and an average accuracy of about 0.55 on the dataset. Once you have it configured correctly and working with the MSRC21 data, it should be straightforward to incorporate other datasets, though you will first have to convert them to follow the same basic format (raw images and pixel-level ground truth labels).
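The two accuracy figures can differ quite a bit, so a small worked example may help. The Python sketch below assumes the usual convention for this kind of benchmark: overall accuracy is the fraction of all labelled pixels that are classified correctly, and average accuracy is the mean of the per-class accuracies. That reading is an assumption, as is the function name `accuracies`; check the evaluation code in the download for the exact definitions it uses.

```python
def accuracies(confusion):
    """Return (overall, average) accuracy from a confusion matrix.

    confusion[i][j] is the number of pixels whose true class is i and whose
    predicted class is j. Classes with no ground truth pixels are skipped
    when averaging (an assumption of this sketch).
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    overall = correct / total
    per_class = [row[i] / sum(row) for i, row in enumerate(confusion) if sum(row) > 0]
    average = sum(per_class) / len(per_class)
    return overall, average

# Toy two-class example: a common class that is mostly right and a rare
# class that is mostly wrong.
overall, average = accuracies([[90, 10],
                               [30, 20]])
```

In this toy case the overall accuracy is 110/150 while the per-class accuracies are 0.9 and 0.4, so the average is lower. Large, easy classes dominate the overall figure, which is one reason the two numbers are reported side by side.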

A sample segmentation from the system.

It is straightforward to change this code (currently configured as an executable) into a library for inclusion in your own research, though it depends very heavily on the Vision.NET and SVM.NET libraries, available for download on this website. I humbly ask that anyone who uses this code in their research cite our paper or my thesis in any resulting publications. I am making it available under the GNU General Public License, which can be viewed either in the included "license.txt" file, or on the web.

It should be stressed that the purpose of this implementation is not to show off the algorithm's speed and efficiency (though it is quite quick) but rather to demonstrate the principles by which it works. Wherever possible, I have sacrificed optimization in favor of clarity, and I apologize in advance to those who desire an incredibly efficient implementation of the technique. It would not be too difficult to parallelize this implementation and thus better utilize modern multi-core computers, but I leave that as an exercise for the reader.
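For readers who take up that exercise, one simple starting point is to treat each test image as an independent unit of work. The Python sketch below shows only that pattern and is not taken from the C# code: `segment_all` and `segment_one` are hypothetical names, and the per-image function is a placeholder for whatever routine labels a single image.

```python
from concurrent.futures import ThreadPoolExecutor

def segment_all(segment_one, images, workers=4):
    """Apply segment_one to every image using a pool of worker threads.

    pool.map returns results in the same order as the inputs, so the
    output list lines up with the input list.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(segment_one, images))

# Placeholder per-image routine: doubles every pixel value.
def double_pixels(image):
    return [[2 * p for p in row] for row in image]

results = segment_all(double_pixels, [[[1]], [[2]], [[3]]])
```

In CPython, threads only give a speed-up when the per-image work releases the interpreter lock, as much native numerical code does; otherwise `ProcessPoolExecutor` is the usual substitute. In C#, `Parallel.ForEach` over the image list is the closest analogue to this pattern.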