Radiologists use CT (Computed Tomography) and MRI (Magnetic Resonance Imaging) scans to diagnose and stage cancers. With recent advances in machine learning research in oncology, these images are also used to train models that automatically detect cancerous tissue. The scans are typically acquired at a voxel spacing of 0.5 mm to 1.5 mm, which results in volumes ranging from 128x128x128 to 512x512x512 voxels.
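For concreteness, here is a minimal sketch of how one might inspect the voxel spacing and grid size of such a scan using nibabel; the file path and dataset layout are hypothetical and stand in for any NIfTI-format CT volume.

```python
# Minimal sketch: inspect the voxel spacing and shape of a CT volume.
# The path below is hypothetical; nibabel reads standard NIfTI files.
import nibabel as nib

img = nib.load("case_00000/imaging.nii.gz")   # hypothetical path to one scan
spacing = img.header.get_zooms()              # voxel spacing in mm, e.g. (0.5, 0.78, 0.78)
shape = img.shape                             # grid size in voxels, e.g. (512, 512, 512)

print(f"voxel spacing (mm): {spacing}")
print(f"volume shape (voxels): {shape}")
print(f"physical extent (mm): {tuple(s * d for s, d in zip(spacing, shape))}")
```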
A common task on these datasets is segmenting regions of interest, such as tumors or cysts, on specific organs. If these masses are small, working at a lower resolution can cause a model to miss them entirely. Many researchers work on models that can accurately segment tumors in these scans, and the biggest constraint for such models on high-resolution datasets is memory.
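To illustrate why small masses are at risk, the sketch below builds a synthetic label mask containing a small lesion and downsamples it by 4x with nearest-neighbour interpolation. All sizes are illustrative and not taken from any particular case.

```python
# Illustrative sketch (synthetic data): downsampling a segmentation mask can
# shrink a small lesion to a handful of voxels, or remove it entirely.
import numpy as np
from scipy.ndimage import zoom

# Full-resolution label mask with one small lesion (hypothetical sizes).
mask = np.zeros((256, 256, 256), dtype=np.uint8)
mask[120:128, 120:128, 120:128] = 1       # an 8-voxel-wide lesion, a few mm across

# Nearest-neighbour downsampling by 4x, as is often done to fit memory budgets.
low_res = zoom(mask, 0.25, order=0)

print(mask.sum())      # 512 foreground voxels at full resolution
print(low_res.sum())   # only a handful remain; a smaller lesion can vanish entirely
```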
Most segmentation architectures run out of memory on full-resolution inputs because of the large intermediate activation maps in the network, and data or model parallelism alone does not solve this problem. The most common workarounds are therefore resampling the volumes to a coarser voxel spacing or training on smaller crops of the volume. Both can discard contextual cues and lead to suboptimal performance on the dataset.
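A quick back-of-the-envelope calculation shows where the memory goes. The layer width below is an assumed, typical value for a 3D U-Net-style encoder, not a measurement of any specific model.

```python
# Rough estimate of the memory consumed by a single full-resolution
# float32 activation map (assumed, typical 3D U-Net numbers).
def activation_gib(depth: int, height: int, width: int, channels: int,
                   bytes_per_value: int = 4) -> float:
    return depth * height * width * channels * bytes_per_value / 2**30

# Hypothetical first encoder stage with 32 feature channels on a 512^3 volume:
print(f"{activation_gib(512, 512, 512, 32):.0f} GiB")  # 16 GiB for one map, one sample
# A deep network keeps many such maps for the backward pass, so the total is far larger.
```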
We hypothesize that training models at the native resolution of the dataset will yield better accuracy.
In our experiments, we use an open-source kidney tumor segmentation dataset.
To demonstrate the effect of increasing patch size on model accuracy, here are a few visualizations: