Dataset and preprocessing
For this project, the dataset from the Multimodal Brain Tumor Segmentation Challenge 2015 (BraTS 2015) is used. It contains 3D magnetic resonance imaging (MRI) scans from 276 patients with brain tumors. For each patient there are four different types of 3D scans, or modalities: FLAIR, T1, T1c, and T2. Each modality is treated as a separate channel, analogous to the color channels of an RGB image. For each patient there is also a ground-truth segmentation mask that distinguishes four tumor tissue types.
For this project, a few 2D slices in the transverse plane around the segmented region are used, rather than the full 3D scans. Instead of treating the slices as a 3D volume, they are stacked together as channels, giving 4N channels in total, where 4 is the number of modalities and N the number of slices used. Scaling down to 2.5D in this way saves a great deal of GPU memory, which allows training with a larger batch size. An alternative approach would be to use 3D convolutions, as done by Kamnitsas et al., which may capture more structure in the z-dimension.
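The slice-stacking step can be sketched as follows. This is a minimal illustration, not the project's actual code; the function name `make_25d_input`, the modality ordering, and the `(Z, H, W)` axis convention are assumptions.

```python
import numpy as np

def make_25d_input(volumes, center, n_slices):
    """Stack N transverse slices from each modality into one 4*N-channel image.

    volumes:  list of four 3D arrays (one per modality), each of shape (Z, H, W).
    center:   index of the central transverse slice.
    n_slices: number N of slices taken around the center.
    Returns an array of shape (4 * n_slices, H, W).
    """
    half = n_slices // 2
    channels = []
    for vol in volumes:  # e.g. FLAIR, T1, T1c, T2 (ordering is an assumption)
        for z in range(center - half, center - half + n_slices):
            channels.append(vol[z])
    return np.stack(channels, axis=0)
```

The resulting array can be fed to an ordinary 2D convolutional network, which then sees the neighboring slices the same way it sees extra input channels.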
For preprocessing, bias field correction and histogram equalization are applied to each volume. The bias field is a low-frequency artifact that tends to be present in MRI images due to imperfections in the scanner coils or interference between scan slices, among other things. The N4 bias field correction algorithm is used to remove it. Another characteristic of MRI images is that there is no fixed intensity scale: scanning the same patient on different machines gives different absolute values even though the semantic information is the same. To mitigate this, histogram equalization is performed.
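The intensity-normalization step can be illustrated with a simple histogram equalization over a whole volume. This is a sketch under the assumption of a plain CDF-based equalization mapping intensities into [0, 1]; the project may use a different variant, and the function name `equalize_histogram` is hypothetical. (N4 itself is more involved; a ready-made implementation exists in SimpleITK as `N4BiasFieldCorrection`.)

```python
import numpy as np

def equalize_histogram(volume, n_bins=256):
    """Remap intensities so their cumulative histogram becomes roughly
    uniform, putting volumes from different scanners on a comparable scale."""
    flat = volume.ravel()
    hist, bin_edges = np.histogram(flat, bins=n_bins)
    cdf = hist.cumsum().astype(np.float64)
    cdf /= cdf[-1]                                   # normalize CDF to [0, 1]
    centers = (bin_edges[:-1] + bin_edges[1:]) / 2   # bin midpoints
    # Each voxel is mapped to the (interpolated) CDF value of its intensity.
    return np.interp(flat, centers, cdf).reshape(volume.shape)
```

Because the mapping is monotonic, the relative ordering of intensities within a volume is preserved while the absolute scale differences between scanners are removed.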
During training, data augmentation is performed: rotation between -10 and 10 degrees, zoom between 90% and 110%, shearing up to 5 degrees, and horizontal flips. In the U-Net paper, elastic deformation is also used, which would likely benefit this case as well.
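Two of the augmentations above, small rotations and horizontal flips, can be sketched with plain numpy. This is a minimal nearest-neighbour version for illustration (zoom, shear, and elastic deformation are omitted, and in practice the identical transform must also be applied to the segmentation mask); the function name `augment` is an assumption.

```python
import numpy as np

def augment(image, angle_deg, flip):
    """Rotate a (C, H, W) image about its center by a small angle using
    nearest-neighbour resampling, then optionally flip it horizontally."""
    c, h, w = image.shape
    theta = np.deg2rad(angle_deg)
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    ys, xs = np.mgrid[0:h, 0:w]
    yc, xc = (h - 1) / 2.0, (w - 1) / 2.0
    # Inverse mapping: for each output pixel, find the source pixel it came from.
    src_y = cos_t * (ys - yc) + sin_t * (xs - xc) + yc
    src_x = -sin_t * (ys - yc) + cos_t * (xs - xc) + xc
    src_y = np.clip(np.rint(src_y).astype(int), 0, h - 1)
    src_x = np.clip(np.rint(src_x).astype(int), 0, w - 1)
    out = image[:, src_y, src_x]       # same lookup applied to every channel
    if flip:
        out = out[:, :, ::-1]
    return out
```

A per-sample angle would typically be drawn uniformly from [-10, 10] degrees and the flip applied with probability 0.5.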