Zhenyu Zhang, Shouwei Gao and Zheng Huang*
Background: Gliomas vary widely in shape and size, which makes automatic segmentation challenging. To improve glioma segmentation performance, this paper proposes a multilevel attention pyramid scene parsing network (MLAPSPNet) that aggregates multiscale context and multilevel features.
Methods: First, the T1 pre-contrast, T2-weighted fluid-attenuated inversion recovery (FLAIR) and T1 post-contrast sequences of each slice are combined to form the input. Image normalization is then applied to accelerate training, and image augmentation is applied to avoid overfitting. Next, the proposed MLAPSPNet, which introduces multilevel pyramid pooling modules (PPMs) and attention gates, is constructed. Finally, the proposed network is compared with several existing networks.
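The input construction described above can be sketched as follows. This is a minimal illustration only: the abstract does not state which normalization or augmentation is used, so per-channel z-score normalization and a random horizontal flip are assumptions, and the names `build_input` and `random_flip` are hypothetical.

```python
import numpy as np


def build_input(t1_pre, flair, t1_post, eps=1e-8):
    """Stack three co-registered 2-D slices into an (H, W, 3) array.

    Assumption: each channel is z-score normalized independently;
    the abstract does not specify the normalization scheme.
    """
    x = np.stack([t1_pre, flair, t1_post], axis=-1).astype(np.float32)
    mean = x.mean(axis=(0, 1), keepdims=True)
    std = x.std(axis=(0, 1), keepdims=True)
    return (x - mean) / (std + eps)


def random_flip(image, mask, rng):
    """Apply the same random horizontal flip to an image and its mask.

    Assumption: flipping stands in for the unspecified augmentation.
    """
    if rng.random() < 0.5:
        image = image[:, ::-1, :]
        mask = mask[:, ::-1]
    return image, mask
```

The image and the mask are flipped together so that the segmentation label stays aligned with the pixels it describes.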
Results: The proposed network reaches a Dice similarity coefficient (DSC) of 0.885, a sensitivity of 0.933 and a Jaccard score of 0.8. Introducing the multilevel pyramid pooling modules and the attention gates improves the DSC by 0.029 and 0.022, respectively. Compared with Res-UNet, Dense-UNet, residual channel attention UNet (RCA-UNet), DeepLab V3+ and UNet++, the proposed network improves the DSC by 0.032, 0.026, 0.014, 0.041 and 0.011, respectively.
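For reference, the three reported metrics can be computed from a predicted and a ground-truth binary mask using their standard definitions. The sketch below is illustrative and is not the authors' evaluation code; the function name `segmentation_scores` and the convention of returning 1.0 when a denominator is zero are assumptions.

```python
def segmentation_scores(pred, truth):
    """Return (DSC, sensitivity, Jaccard) for two flat binary masks.

    pred and truth are equal-length sequences of 0/1 (or bool) values.
    Assumption: a metric whose denominator is zero is reported as 1.0.
    """
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p and t)       # true positives
    fp = sum(1 for p, t in pairs if p and not t)   # false positives
    fn = sum(1 for p, t in pairs if not p and t)   # false negatives

    dsc = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 1.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 1.0
    jaccard = tp / (tp + fp + fn) if (tp + fp + fn) else 1.0
    return dsc, sensitivity, jaccard


# Toy example: 2 true positives, 1 false positive, 1 false negative.
dsc, sens, jac = segmentation_scores([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])
```

For a single mask pair the two overlap scores are related by Jaccard = DSC / (2 - DSC), so a higher DSC always implies a higher Jaccard score on that pair.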
Conclusion: The proposed multilevel attention pyramid scene parsing network achieves state-of-the-art performance, and both the multilevel pyramid pooling modules and the attention gates improve glioma segmentation performance.
Keywords: Gliomas, segmentation, magnetic resonance imaging, MLAPSPNet, attention gates, feature fusion, context.
School of Mechatronic Engineering and Automation, Shanghai University, Shanghai; School of Mechatronic Engineering and Automation, Shanghai University, Shanghai; State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang