A Two-Stage U-Net Model for 3D Multi-class Segmentation on Full-Resolution Cardiac Data
Wang C., MacGillivray T., Macnaught G., Yang G., Newby D.
Deep convolutional neural networks (CNNs) have achieved state-of-the-art performance for multi-class segmentation of medical images. However, a common problem when dealing with large, high-resolution 3D data is that the volumes input into deep CNNs have to be either cropped or downsampled due to the limited memory capacity of computing devices. These operations can cause loss of resolution and class imbalance in the input data batches, thus degrading the performance of segmentation algorithms. Inspired by the architecture of the image super-resolution CNN (SRCNN), we propose a two-stage modified U-Net framework that simultaneously learns to detect an ROI within the full volume and to classify voxels without losing the original resolution. Experiments on a variety of multi-modal 3D cardiac images have demonstrated that this framework achieves better segmentation performance than state-of-the-art deep CNNs trained with the same similarity metrics.
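To illustrate the idea behind the two-stage design, the sketch below shows how a coarse, memory-cheap first stage can localize an ROI on a downsampled volume, after which the second stage operates on a full-resolution crop. This is a minimal, hypothetical NumPy sketch, not the authors' implementation: the thresholding stands in for a trained detection U-Net, and all function names (`downsample`, `roi_from_coarse_mask`) are illustrative.

```python
import numpy as np

def downsample(vol, factor=4):
    """Coarse view for stage 1: simple strided subsampling."""
    return vol[::factor, ::factor, ::factor]

def roi_from_coarse_mask(coarse_mask, factor=4):
    """Map a coarse foreground mask back to full-resolution bounds."""
    idx = np.argwhere(coarse_mask)
    lo = idx.min(axis=0) * factor          # scale coarse voxel coords up
    hi = (idx.max(axis=0) + 1) * factor
    return tuple(slice(int(l), int(h)) for l, h in zip(lo, hi))

# Toy full-resolution volume with a bright cuboid standing in for the heart.
vol = np.zeros((64, 64, 64), dtype=np.float32)
vol[16:48, 20:44, 8:40] = 1.0

# Stage 1: threshold the coarse volume in place of a trained detection U-Net.
coarse_mask = downsample(vol) > 0.5
roi = roi_from_coarse_mask(coarse_mask)

# Stage 2 would run the segmentation U-Net on this full-resolution crop,
# so no resolution is lost inside the ROI.
crop = vol[roi]
print(crop.shape)  # → (32, 24, 32)
```

The memory saving comes from the fact that only the crop, rather than the whole volume, is fed to the second-stage network at full resolution.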