Laparoscopic Image to Image Translation

[Figure: sample translations]

Description

In this project, we generate synthetic images in a 3D environment, roughly resembling laparoscopic liver surgery scenes. We then train a group of Generative Adversarial Networks (GANs) to translate these images to look like real laparoscopic images. After training, the translated images and their labels can be used as training data for a given target task. The data sets as well as the code to generate them are made publicly available (see below).
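To illustrate the idea behind the translation step: a MUNIT-style translator separates an image into a content representation and a style vector, so a single simulated image can be rendered in many realistic styles. The toy PyTorch sketch below mimics this structure; all layer sizes and names are purely illustrative and do not reflect the architecture of our actual networks.

```python
import torch
import torch.nn as nn

STYLE_DIM = 8  # illustrative style-vector size

# Toy content encoder: maps an image to a style-invariant feature map.
content_encoder = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
)

class Decoder(nn.Module):
    """Toy decoder that combines content features with a style vector."""
    def __init__(self):
        super().__init__()
        self.style_proj = nn.Linear(STYLE_DIM, 16)  # inject style as a channel-wise bias
        self.out = nn.Conv2d(16, 3, kernel_size=3, padding=1)

    def forward(self, content, style):
        bias = self.style_proj(style)[:, :, None, None]
        return torch.tanh(self.out(content + bias))

decoder = Decoder()
simulated = torch.rand(1, 3, 64, 64)    # stand-in for a simulated input image
for i in range(5):                      # each input image was translated five times
    style = torch.randn(1, STYLE_DIM)   # a new randomly sampled style per pass
    translated = decoder(content_encoder(simulated), style)
    print(i, translated.shape)          # -> torch.Size([1, 3, 64, 64])
```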

Publications

If you use our data or code, please cite the following paper:

"Generating large labeled data sets for laparoscopic image processing tasks using unpaired image-to-image translation". Micha Pfeiffer, Isabel Funke, Maria R. Robu, Sebastian Bodenstedt, Leon Strenger, Sandy Engelhardt, Tobias Roß, Matthew J. Clarkson, Kurinchi Gurusamy, Brian R. Davidson, Lena Maier-Hein, Carina Riediger, Thilo Welsch, Jürgen Weitz, and Stefanie Speidel. MICCAI 2019.

Supplementary material

Supplementary material giving more details on the network architecture, as well as various examples of translation results, can be found here.

Data

This data set is a result of the method described in our paper. We have split it into separate archives so that everyone can download only the labels they need. In each archive, you will find one subfolder per patient. Folder names are the patient pseudonyms, taken from the original IRCAD 3D CT liver data set that was used to generate the 3D scenes.

In our experiments, pre-training networks on this synthetic data improved both the final performance of the networks and their ability to generalize. Please note: some of the labels have been successfully used to (pre-)train networks, others haven't been tested yet. If you use the data, we would be happy to receive feedback about how you used it and whether it helped improve your model (see contact below)!
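The per-patient layout makes it straightforward to pair translated images with their labels. Below is a minimal sketch; the archive folder names, the file extension, and the assumption that image and label files share the same names are all hypothetical and may need adjusting to the actual archives (for instance, translated file names may encode a style index, since each input was translated five times).

```python
from pathlib import Path

# Hypothetical archive folder names; adjust to wherever you extracted the data.
TRANSLATIONS_DIR = Path("translations_random_style")
SEGMENTATION_DIR = Path("labels_segmentation")

def collect_pairs(image_root: Path, label_root: Path, ext: str = "*.png"):
    """Pair each translated image with its label by matching the patient
    pseudonym (subfolder name) and the file name."""
    pairs = []
    for patient_dir in sorted(p for p in image_root.iterdir() if p.is_dir()):
        label_dir = label_root / patient_dir.name  # same pseudonym in both archives
        for image_path in sorted(patient_dir.glob(ext)):
            label_path = label_dir / image_path.name
            if label_path.exists():
                pairs.append((image_path, label_path))
    return pairs

pairs = collect_pairs(TRANSLATIONS_DIR, SEGMENTATION_DIR)
print(f"Found {len(pairs)} image/label pairs")
```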

Input Images (A+, 20 000 images):
These are the input images in the simulated domain that we gave to the translator. Usually not required, unless you want to train your own translator.

Translations, random style (B_syn, random style vectors, 100 000 images):
Translations of the Input Images. Each input image was translated five times, each time with a newly sampled random style. These images were used to obtain the results in our paper.

Translations, Cholec80 style (style vectors extracted from Cholec80, 100 000 images):
An alternative translation of the Input Images. Each input image was translated five times, each time with a style extracted from a random Cholec80 image. At this point, we cannot say whether this data set or the random style data set is better for training networks. You may want to try both, or even mix the two to increase the diversity in style.

Labels: Segmentation (full segmentation masks):
A segmentation mask for every image in A+. Classes are: Liver, Fat, Abdominal Wall, Tool Shaft, Tool Tip, Gallbladder.

Labels: Depth (depth maps):
A depth map for every image in A+.

Labels: Ridges (ridge lines):
Liver ridge lines for every image. The ridges were extracted semi-automatically from the 3D liver meshes and then rendered into the camera view to obtain these images. As the ridges follow surface edges, they may not be smooth everywhere. Please apply post-processing such as thickening and smoothing where necessary (see the sketch after this list).

Labels: Normals (surface normals):
Surface normals in world space. Please use the camera information to transform them to camera space if needed (see the sketch after this list).

Labels: Camera Information:
Coming soon.
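For the ridge labels, a minimal post-processing sketch using OpenCV is shown below. It assumes the ridge labels are single-channel images with ridge pixels brighter than the background; the file path and kernel sizes are hypothetical and should be tuned to your application.

```python
import cv2
import numpy as np

# Hypothetical path; adjust to your extracted archive layout.
ridge = cv2.imread("labels_ridges/patient_xyz/0001.png", cv2.IMREAD_GRAYSCALE)

# Thicken the ridge lines with a morphological dilation ...
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
thick = cv2.dilate((ridge > 0).astype(np.uint8), kernel, iterations=1)

# ... then smooth with a Gaussian blur and re-threshold to soften jagged edges.
smooth = cv2.GaussianBlur(thick.astype(np.float32), (7, 7), 2.0)
ridge_mask = (smooth > 0.5).astype(np.uint8)
```

For the normal labels, the following sketch converts world-space normals to camera space. Since the camera information is not released yet, we assume here that it will provide a world-to-camera rotation matrix (called R_wc below); the name and encoding are assumptions. Normals are direction vectors, so only the rotation applies, not the camera translation.

```python
import numpy as np

def normals_world_to_camera(normals_world: np.ndarray, R_wc: np.ndarray) -> np.ndarray:
    """Rotate (H, W, 3) world-space unit normals into camera space.

    R_wc: assumed (3, 3) world-to-camera rotation matrix; the camera
    translation does not apply to direction vectors.
    """
    n_cam = normals_world @ R_wc.T
    # Re-normalize to guard against numerical drift.
    return n_cam / np.linalg.norm(n_cam, axis=-1, keepdims=True)
```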
Please note that there are many combinations of hyper-parameters which we have not tested yet (such as various weights for the MS-SSIM loss) and which may result in more realistic translations. Please feel free to experiment with our code and contribute back if you get any new results!
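If you want to experiment with the MS-SSIM weight mentioned above, a reconstruction term along the following lines could be used. This is a sketch, not the loss from our repository: it assumes the third-party pytorch-msssim package, images scaled to [0, 1], and a hypothetical weight lambda_msssim.

```python
import torch
import torch.nn.functional as F
from pytorch_msssim import ms_ssim  # third-party package: pip install pytorch-msssim

def reconstruction_loss(x_rec: torch.Tensor, x: torch.Tensor,
                        lambda_msssim: float = 1.0) -> torch.Tensor:
    """L1 reconstruction loss plus a weighted MS-SSIM term (sketch)."""
    l1 = F.l1_loss(x_rec, x)
    # MS-SSIM is a similarity in [0, 1], so 1 - ms_ssim acts as a loss.
    # Inputs should be reasonably large (e.g. 256x256) for the default scales.
    msssim = 1.0 - ms_ssim(x_rec, x, data_range=1.0)
    return l1 + lambda_msssim * msssim
```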

Code

The code for this project can be found on GitLab.

It is based on the Multimodal UNsupervised Image-to-image Translation (MUNIT) framework: Code, Paper

Contact

For questions and comments, please contact us at micha.pfeiffer [at] nct-dresden.de.

Links

OpenCAS: Open collection of datasets for computer-assisted surgery systems

NCT Dresden
University Clinic Dresden
German Cancer Research Center