Assessing Auto-Segmentation for EBRT Planning Structures Using a Deep Learning-Based Cervical Cancer Workflow



The workflow of this study is shown in Fig. 1. Briefly, the assessment was divided into three sections. In section 1, the accuracy of DL-based auto-segmentation was assessed using geometric metrics. In section 2, a dosimetric comparison was performed between the standard manual contours and the auto-segmented contours on the original EBRT plans. In section 3, a correlation analysis was performed between the geometric and dosimetric measurements.

Figure 1

Flowchart of the experiment evaluating manual and DL-based automatic segmentation. The original EBRT plans were designed and optimized on the standard manual contours, and the auto-segmented structures were transferred to the original EBRT plans for dosimetric evaluation.

Clinical Datasets

The independent cohort of this study consisted of 75 patients with cervical cancer who received EBRT in our department between August 2021 and December 2021. All patients were diagnosed with FIGO stage IA2-IVB disease and G1-G3 histology and were treated with a prescription dose of 45-50.4 Gy (1.8 Gy/fraction). The mean ± standard deviation age of these patients was 55.60 ± 13.35 years. Each patient received an intravenous contrast agent before computed tomography (CT). The CT images covered the region from the lower lumbar spine through the entire pelvic cavity and were reconstructed with a matrix size of 512 × 512 and a slice thickness of 5 mm on a Philips Brilliance Big Bore CT system (Philips Healthcare, Best, The Netherlands).

The CTVs of the 75 patients were delineated manually by junior radiation oncologists and included the entire cervix, uterus, bilateral parametria, upper half of the vagina and lymph nodes, according to the Radiation Therapy Oncology Group (RTOG) protocol guidelines [18]. The OARs included in the EBRT plans were the spinal cord, left kidney (L kidney), right kidney (R kidney), bladder, left femoral head (L femoral head), right femoral head (R femoral head), pelvic bone, rectum and small intestine. EBRT planning was performed on the Pinnacle Treatment Planning System (Pinnacle, V9.16.2, Philips Corp, Fitchburg, WI, USA). All manual contours were reviewed and approved by experienced radiation oncologists specializing in cervical cancer to generate the standard delineation.

Automatic segmentation based on Deep Learning

We used a deep learning model based on a CNN [19] to segment CTVs and OARs for cervical cancer patients. As shown in Fig. 2, the network consists of three encoders and three decoders. The InProj block extracts features from the input image, and the OutProj block performs per-pixel classification. Each encoder performs downsampling and each decoder performs upsampling. All 2D convolution filters (Conv2d) have a kernel size of 3 × 3 and a stride of 1. Batch normalization (BN) normalizes the distribution of the feature outputs. Each Conv2d is followed by a rectified linear unit (ReLU) as the activation function. Max pooling reduces the number of parameters and computations in the network. ConvTranspose2d performs the reverse of Conv2d, increasing the feature-map size with a 3 × 3 filter. Skip connections concatenate the encoder and decoder features of the same level to facilitate the fusion of multi-level features. We used common data augmentation methods (cropping and flipping) to improve the model. The model is an end-to-end segmentation architecture that predicts a class label for each pixel of a CT image.
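As a rough illustration of the downsampling and upsampling path described above, the following sketch traces the feature-map side length of a 512 × 512 slice through three encoder and three decoder levels. The padding of 1 for the 3 × 3 convolutions, the 2 × 2 max pooling with stride 2, and the transposed-convolution stride, padding and output padding are assumptions made for this example; the text does not state them, and the actual network may differ.

```python
# Feature-map size through a three-level encoder/decoder (illustrative only).
# Assumptions not stated in the text: padding 1 for the 3x3 convolutions,
# 2x2 max pooling with stride 2, and transposed convolutions with stride 2,
# padding 1 and output padding 1.

def conv2d_out(size, kernel=3, stride=1, padding=1):
    """Output side length of a convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1

def maxpool_out(size, kernel=2, stride=2):
    """Output side length of max pooling without padding."""
    return (size - kernel) // stride + 1

def conv_transpose2d_out(size, kernel=3, stride=2, padding=1, output_padding=1):
    """Output side length of a transposed convolution (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding

def trace_sizes(size=512, levels=3):
    """Return (stage name, side length) after each encoder and decoder."""
    sizes = [("input", size)]
    for level in range(1, levels + 1):
        # Encoder: 3x3 convolution keeps the size, pooling halves it.
        size = maxpool_out(conv2d_out(size))
        sizes.append((f"encoder{level}", size))
    for level in range(levels, 0, -1):
        # Decoder: transposed convolution doubles the size, convolution keeps it.
        size = conv2d_out(conv_transpose2d_out(size))
        sizes.append((f"decoder{level}", size))
    return sizes

if __name__ == "__main__":
    for name, side in trace_sizes():
        print(f"{name}: {side} x {side}")
```

Under these assumptions the last decoder returns to the input resolution, which is what allows the output layer to assign one label per input pixel.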

Figure 2

DL-based automatic segmentation network architecture.

A total of 300 retrospective clinical CT scans from patients diagnosed with cervical cancer who received radiation therapy were used for training and validation of this model, and the datasets were sourced from multiple cancer centers to verify the robustness of the CNN model. Cross-entropy was selected as the loss function, and all training calculations were performed using an Intel Core i7 processor with a graphics card.
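For readers unfamiliar with the loss, a minimal sketch of per-pixel cross-entropy is given below. It assumes the standard softmax formulation averaged over pixels; the function names are hypothetical, and class weighting or other details of the actual training setup are not described in the text.

```python
import math

def pixel_cross_entropy(logits, target):
    """Cross-entropy for one pixel: -log(softmax(logits)[target])."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum_exp - logits[target]

def mean_cross_entropy(logit_map, label_map):
    """Average the per-pixel loss over a flat list of pixels."""
    losses = [pixel_cross_entropy(l, t) for l, t in zip(logit_map, label_map)]
    return sum(losses) / len(losses)

if __name__ == "__main__":
    # Two pixels, three classes; logits and labels are made-up values.
    logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
    labels = [0, 2]
    print(mean_cross_entropy(logits, labels))
```

The loss is small when the logit of the true class dominates and grows as probability mass shifts to other classes.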

Geometric Metrics

The geometric accuracy of the contours was compared using the Dice similarity coefficient (DSC), 95% Hausdorff distance (HD) and Jaccard coefficient (JC). DSC and JC describe the relative overlap between segmentations A and B. HD quantifies the 3D distance between two segmentation surfaces. The 95% HD is the greatest surface-to-surface separation among the closest 95% of surface points. The definitions are as follows:

$$\begin{aligned} & DSC = 2\left| {A \cap B} \right|/(\left| A \right| + \left| B \right|) \\ & HD = \max (h(A,B),h(B,A)),\;h(A,B) = \mathop {\max }\limits_{b \in B} (\mathop {\min }\limits_{a \in A} \left\| {a - b} \right\|) \\ & JC = \left| {A \cap B} \right|/\left| {A \cup B} \right| \end{aligned}$$

For complete overlap, HD is 0 and DSC and JC are 1. For poor overlap, HD is large and DSC and JC approach 0. To test how well the DL-based model recognizes segmentation boundaries, the upper and lower boundaries of the contours were not cropped in this study, particularly for the spinal cord, femoral heads and pelvic bone.
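The three metrics can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's implementation: segmentations are represented as non-empty sets of voxel index tuples, the Hausdorff functions expect surface points that the caller has already extracted, distances are in voxel units, and the 95th percentile uses the nearest-rank convention (other conventions give slightly different values).

```python
import math

def dice(a, b):
    """DSC = 2|A n B| / (|A| + |B|) for two sets of voxel indices."""
    return 2 * len(a & b) / (len(a) + len(b))

def jaccard(a, b):
    """JC = |A n B| / |A u B|."""
    return len(a & b) / len(a | b)

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def directed_distances(src, dst):
    """Distance from every point in src to its nearest point in dst."""
    return [min(euclidean(p, q) for q in dst) for p in src]

def percentile(values, q):
    """Nearest-rank percentile (one of several common conventions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]

def hausdorff(surface_a, surface_b, q=100):
    """Symmetric Hausdorff distance; q=95 gives the 95% HD."""
    return max(percentile(directed_distances(surface_a, surface_b), q),
               percentile(directed_distances(surface_b, surface_a), q))

if __name__ == "__main__":
    # Toy example: two 3-voxel segments shifted by one voxel along x.
    a = {(0, 0, 0), (1, 0, 0), (2, 0, 0)}
    b = {(1, 0, 0), (2, 0, 0), (3, 0, 0)}
    print(dice(a, b), jaccard(a, b), hausdorff(a, b), hausdorff(a, b, q=95))
```

The brute-force nearest-point search is quadratic in the number of surface points, so a spatial index or distance transform would be the usual choice for full-resolution contours.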

Dosimetric metrics

EBRT plans were calculated and optimized on the standard manual contours using the Pinnacle treatment planning system. Table 1 presents the dosimetric constraints and metrics. For the CTV, we focused mainly on Dmean and V100%. For serial organs and parallel organs, we focused mainly on Dmaximum and Dmean, respectively. Dmean and Dmaximum are defined as the mean dose and the maximum dose received by a structure. V100% is defined as the volume of the CTV receiving 100% of the prescribed dose.
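The definitions of these metrics can be illustrated with a short sketch that works on a list of per-voxel doses for one structure. It assumes equally sized voxels and reports V100% as a percentage of the structure volume; both are assumptions for this example, and the function names are hypothetical rather than part of any planning-system API.

```python
def d_mean(doses):
    """Mean dose over the voxels of a structure (Gy)."""
    return sum(doses) / len(doses)

def d_maximum(doses):
    """Maximum voxel dose within a structure (Gy)."""
    return max(doses)

def v_percent(doses, prescription, level=100.0):
    """Percentage of the structure receiving at least level% of the prescription."""
    threshold = prescription * level / 100.0
    covered = sum(1 for d in doses if d >= threshold)
    return 100.0 * covered / len(doses)

if __name__ == "__main__":
    # Made-up voxel doses (Gy) for a tiny structure, 45 Gy prescription.
    ctv_doses = [44.0, 45.0, 46.0, 50.0]
    print(d_mean(ctv_doses), d_maximum(ctv_doses), v_percent(ctv_doses, 45.0))
```

In a comparison of the kind described here, the same dose grid would be sampled once inside the manual contour and once inside the auto-segmented contour, and the resulting metrics compared per structure.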

Table 1 Dosimetric constraints and metrics for EBRT planning.

Statistical analysis

IBM SPSS Statistics (version 19.0, IBM Inc., Armonk, NY, USA) and Python (version 3.6.5, Anaconda Inc.) were used for statistical analysis, and results are summarized as mean ± standard deviation (SD). To test concordance between the manual and DL-based methods, Bland-Altman analysis was used to calculate the limits of agreement for each EBRT planning structure. P > 0.05 indicated agreement between the two segmentation methods. To test for differences, the Wilcoxon paired signed-rank test was used to compare variables. P

