Multivariate analysis in pediatric brain tumor

doi:10.15406/ijrrt.2017.02.00045

Brain tumors in children are life-threatening and deserve more research to improve patient care. In recent years, multivariate analysis has been increasingly used in tumor classification (segmentation) and survival (outcome) assessment in childhood brain tumors. This paper reviewed the studies that applied multivariate analysis to these two tasks in pediatric brain tumors. Large variations in classification results were observed among the tumor classification studies, even in similar patient populations. A moderate error rate was also observed in the multivariate survival analysis model, which could lead to inaccurate survival estimates and misidentification of prognostic factors. To address these problems, this paper analyzed the data processing chains in these multivariate analyses in detail. Optimizing and standardizing these data processing chains may improve tumor classification and survival analysis, and reduce variations and errors in classification results and survival estimates. As multivariate analytic approaches, data processing technologies and imaging techniques advance in the Big Data era of the 21st century, it is anticipated that the challenges of complex imaging data processing in tumor classification will be overcome. This would make accurate automatic tumor classification/segmentation (for each tumor type and grade) possible, supporting early detection and treatment of tumors, treatment planning, and monitoring of tumor progression and treatment effects. Together with more accurate survival assessment to guide rescue and recovery planning, it would improve patient care and truly benefit children with brain tumors.

Keywords: multivariate analysis, pediatric brain tumor, tumor classification, survival assessment

Abbreviations: DWI, diffusion-weighted imaging; MRS, MR spectroscopy; PWI, perfusion-weighted imaging; SVM, support vector machine; LDA, linear discriminant analysis; kNN, k-nearest neighbour; ANN, artificial neural networks; KLD, Kullback-Leibler divergence; mBm, multi-fractional Brownian motion; EM, expectation maximization; PNET, primitive neuroectodermal tumor; PTPSA, piecewise-triangular-prism-surface-area; MD, mean diffusivity; FA, fractional anisotropy; PCA, principal component analysis; mRMR, maximum-relevance minimum-redundancy; DNET, dysembryoplastic neuroepithelial tumor; DNT, dysembryoplastic neuroepithelial tumor; ADC, apparent diffusion coefficient; rADC, ADC tumor-to-normal-brain ratios; ROC, receiver operating characteristic; AUC, area under the ROC curve; ATCT, apparent transient coefficient in tumor; OS, overall survival; PFS, progression-free survival; GTR, gross total resection; RT, radiation therapy

Brain and central nervous system tumors are the second most common tumors in children.¹ Compared with brain tumors in adults, childhood brain tumors show a larger variety of histological types (e.g., medulloblastomas, pilocytic astrocytomas and ependymomas), which increases the difficulty of tumor diagnosis and differentiation. Since brain tumors in children are life-threatening, they deserve more research to improve patient care.

MRI (including T1, T2, FLAIR and contrast-enhanced MRI) is the most important imaging technique for the visualization and assessment of pediatric brain tumors. However, conventional MRI is often inadequate for grading tumors or identifying the aggressive region of a tumor. Therefore, advanced MRI techniques (modalities) such as DWI (diffusion-weighted imaging), MRS (MR spectroscopy) and PWI (perfusion-weighted imaging) are needed to help with tumor assessment, and are increasingly used in brain tumor diagnosis. DWI is based on water diffusion, and the apparent diffusion coefficient (ADC) map that DWI generates provides information on the condition of different brain tissues.² MRS provides information on the concentrations of specific molecules in brain tissue, which reflect the biochemical characteristics of that brain region. PWI provides information on vascular perfusion (blood flow in capillaries and larger vessels) of the brain. These imaging techniques are useful in classifying (typing) and grading brain tumors.³

Multivariate analysis is a statistical approach that assesses multiple variables simultaneously. It can be more advantageous than univariate analysis in characterizing the associations between data variables (e.g., variables associated with outcomes), classifying data into different categories (e.g., tumor types and grades) and generating new diagnostic tests. Multivariate analysis includes a number of analytic methods such as multivariate regression, principal component analysis (PCA), independent component analysis (ICA) and cluster analysis. Machine learning (statistical learning) is a group of multivariate analytic methods often used in data classification, pattern recognition and data mining.² Machine learning can be either supervised or unsupervised. In supervised learning, the classifications of the data samples in the training set (used to train the classification model, or classifier) are known, whereas in unsupervised learning they are unknown. Examples of supervised learning methods include linear discriminant analysis (LDA), support vector machine (SVM), artificial neural networks (ANN) and random forests. Examples of unsupervised learning include cluster analysis and Hebbian learning neural networks.

Tumor classification and patient survival assessment often involve many more data variables than samples, and are therefore called high-dimensional problems. Multivariate analysis is suitable for such problems. Imaging analytic studies of brain tumors in adults have been developed for decades, but imaging research on brain tumors in children is relatively young.² In recent years, multivariate analysis has been increasingly used in tumor classification (or segmentation) and survival (or outcome) assessment in childhood brain tumors. This paper reviewed the studies that applied multivariate analysis to tumor classification (segmentation) and survival (outcome) assessment in pediatric brain tumors.

A PubMed search was performed with the keywords “multivariate MRI pediatric brain tumor”, “machine learning imaging brain tumor childhood” or “multivariate pediatric brain tumor”. The search yielded 41 articles. Articles were excluded if their subjects were not pediatric, if the statistical methods used were not multivariate, or if the article was published before 2000.

Twenty-seven articles were selected for this review. Among them, 13 articles reported multivariate analysis methods applied to pediatric brain tumor classification (or segmentation);³‒¹⁵ Table 1 provides a summary of these papers. The remaining 14 articles reported multivariate analysis (using the Cox model) applied to survival (or outcome) evaluation in pediatric brain tumors,¹⁶‒²⁹ and are summarized in Table 2.

Paper	N	Methods	Main Findings	Other Findings
Schneider et al, 2007 [4]	17 pts with posterior fossa tumors (7 medulloblastoma, 4 infiltrating glioma, 2 ependymoma, 4 pilocytic astrocytoma)	MRI, DWI, and MRS acquired and imaging parameters computed; linear discriminant analysis used as classifier	Combined ADC and metabolite ratio features (using water as an internal standard) could discriminate between the four tumor groups (positive predictive rate = 1), likelihood below 1 × 10⁻⁹	Metabolite ratio or ADC features alone could not discriminate the tumor groups
Reynolds et al, 2007 [5]	46 pts (16 astrocytoma grade I and II, 13 medulloblastoma, 3 ependymoma, 3 germinoma, 3 PNET, 2 astrocytoma grade III and IV, and 6 rare tumors)	1333 cases in WMRCTR database were used to produce probabilities of brain tumor class and construct classifiers (Bayesian belief networks); incorporate MRS information into the classifier	Overall misclassification rate: 32-37%; 4.8-25.6% for individual tumor class	Using the network to generate prior probabilities improves classification accuracy when compared with class prevalence-based prior probabilities
Wels et al, 2008 [6]	6 pts with tumors	Graph cut top-down segmentation method (with max-flow/min-cut optimization) for tumor segmentation; probabilistic boosting trees as classifier	Jaccard coefficient = 0.78±0.17	Automatic tumor segmentation; results comparable to those in adult patients; takes less time than manual segmentation
Davies et al, 2008 [7]	35 pts with cerebellar tumors (18 medulloblastomas, 12 pilocytic astrocytomas and 5 ependymomas)	Short-TE MRS acquired and metabolite profiles computed and verified; linear discriminant analysis used for variable selection and classification	Misclassification rate: 5.3% for glial-cell (astrocytoma + ependymoma) vs. non-glial-cell (medulloblastoma) tumors; 6.9% for astrocytoma vs. medulloblastoma; 7.1% for astrocytoma vs. medulloblastoma vs. ependymoma	Medulloblastomas characterised by high taurine, phosphocholine and glutamate and low glutamine; Astrocytomas distinguished by low creatine and high NAA; Ependymomas differentiated by high myo-inositol and glycerophosphocholine
Ahmed et al, 2011 [8]	10 pts with posterior fossa tumors (5 medulloblastomas, 5 astrocytomas)	Shape, intensity and (fractal dimension and multi-fractional Brownian motion-based) texture features extracted from T1, T2, and FLAIR images; Feature selection (PCA, boosting, KLD, entropy); Feature fusion (EM)	mBm-based texture features produced the best tumor segmentation accuracy for T1 and FLAIR images and for T1-T2-FLAIR fused images (100%), and intensity features best for T2 images; KLD measure for feature ranking and selection, and EM algorithm for feature fusion and tumor segmentation generated the best results; average Jaccard Index for tumor segmentation: ~0.6	Integrated KLD–EM framework is the best for tumor segmentation compared with other approaches (such as bottom up top down and graph cut)
Iftekharuddin et al, 2011 [9]	10 pts with posterior fossa tumors (5 medulloblastomas, 5 astrocytomas)	Shape, intensity and (fractal dimension and multi-fractional Brownian motion-based) texture features extracted from T1, T2, and FLAIR images; Improved AdaBoost classifier	True positive rate = 100%; False positive rate = 25%	Combined features improved classification accuracy
Weizman et al, 2011 [10]	Training set: 7 pts (28 scans) with optic pathway gliomas (OPG); Test set: 5 pts with OPG (25 scans)	Tumor segmentation based on prior tumor location tissue characteristics, and intensity information; classification of tumor into internal components; automatic volume measurements for tumor follow-up evaluation	A mean surface distance error of 0.73 mm and a mean volume overlap difference of 30.6%, with 25 min less segmentation time, compared with manual segmentation by radiologists	Automatic tumor segmentation; Can monitor tumor growth; Can be used to segment and classify other tumors such as intraventricular and posterior fossa tumors
Vicente et al, 2013 [11]	78 pts (29 medulloblastomas, 11 ependymomas; 38 pilocytic astrocytomas)	MRS acquired and metabolite concentrations computed; linear discriminant analysis and resampling used as classification	Discriminate the three tumor types, Balanced Accuracy Rate (BAR) = 0.98	For other tumor types (glial or primitive neuroectodermal): BAR = 1.00
Rodriguez Gutierrez et al, 2014 [12]	40 pts with posterior fossa tumors (17 medulloblastomas, 16 pilocytic astrocytomas, and 7 ependymomas)	Shape, histogram, and textural features extracted from T1, T2 and ADC images; single-feature SVM classifiers combined into multi-feature classifiers	ADC histogram features generated the best classification accuracy (95.8% of medulloblastomas, 96.9% of pilocytic astrocytomas, and 94.3% of ependymomas); ADC textural features produced the best tumor-subtype classification accuracy (89% of medulloblastoma subtypes); combined features: 91.4% of the joint posterior fossa tumors	Shape and textural features did not improve classification accuracy of ADC histogram feature-based classification performance
Tantisatirapong et al, 2014 [13]	74 pts (25 medulloblastomas, 34 pilocytic astrocytomas, and 15 ependymomas)	Data from CCLG database; Texture features extracted from T1, T2, FLAIR, ADC, MD and FA images; PCA, mRMR and feedforward methods used for feature selection; SVM classifier	Classification accuracy = 69%	Texture features extracted from diffusion MR images are stronger for tumor classification than those from MRI
Fetit et al, 2015 [14]	48 pts (21 medulloblastoma, 20 pilocytic astrocytoma, 7 ependymoma)	2D and 3D texture features (grey-level co-occurrence matrix, etc.) extracted from MRI (T1, T2); Entropy-MDL discretization for feature selection; 6 classification methods used: Naïve Bayes, k-nearest neighbour (kNN), Classification tree, SVM, ANN, Logistic regression	Overall classification accuracy: 92% for SVM and ANN; 90% for logistic regression; 88% for naïve Bayes; 83% for classification tree and kNN (all with 3D texture features)	Compared with 2D texture features, 3D texture features improved diagnostic classification
Sabin et al, 2016 [15]	38 pts with histopathologically proven posterior fossa ependymoma	Cluster analysis of tumor location and morphological variables was performed to detect tumor multivariate patterns	Cluster analysis showed 2 tumor groups were distinguished based on tumor centroid location.	Such tumor classification was associated with prognostic and treatment factors
Koob et al, 2016 [3]	76 pts (17 pilocytic astrocytomas; 16 embryonal tumors; 11 DNET; 10 ependymomas; other)	Multimodal MRI (T1, T2, FLAIR, DWI, PWI, MRS) acquired and parameters (e.g., rADC) computed; Multivariate statistical analysis (using random forest classifier) performed to evaluate the diagnostic accuracy of MR modalities	The highest diagnostic accuracy for tumor grading: with DWI+PWI (73.24%); and for tumor typing: with DWI+PWI+MRS (55.76%).	ADC and rADC were the best parameters for tumor grading and typing; Multimodal MRI can be accurate in determining pediatric tumor grades (I and IV) and types (pilocytic astrocytomas and embryonal tumors)

Table 1 Summary of multivariate analysis approaches applied to pediatric brain tumor classification/segmentation

Pts: Patients; DWI: Diffusion-Weighted Imaging; MRS: MR Spectroscopy; PWI: Perfusion-Weighted Imaging; SVM: Support Vector Machine; ADC: Apparent Diffusion Coefficient; LDA: Linear Discriminant Analysis; KLD: Kullback-Leibler Divergence; mBm: Multi-Fractional Brownian Motion; EM: Expectation Maximization; WMRCTR: West Midlands Regional Childhood Tumour Registry; PNET: Primitive Neuroectodermal Tumor; PTPSA: Piecewise-Triangular-Prism-Surface-Area; CCLG: Children’s Cancer and Leukemia Group; MD: Mean Diffusivity; FA: Fractional Anisotropy; PCA: Principal Component Analysis; mRMR: Maximum-Relevance Minimum-Redundancy; DNET: Dysembryoplastic Neuroepithelial Tumor; kNN: k-Nearest Neighbour; ANN: Artificial Neural Networks; rADC: ADC Tumor-to-Normal-Brain Ratios.

Paper	N	Methods	Main Findings	Other Findings
Fernandez et al, 2003 [16]	80 pts with pilocytic astrocytomas (33 cerebellar, 18 optochiasmatic, 16 brainstem, 7 spinal cord, 3 thalamic, 2 optic nerve, 1 hemispheric)	Kaplan-Meier method used to estimate PFS and OS; Cox proportional hazards model used for survival analysis	5-year PFS rate was 75%, and the 5-year OS rates were 100 and 92% after total and partial removal.	Factors such as partial resection associated with bad prognosis.
Bucci et al, 2004 [17]	39 pts with nonbrainstem, malignant gliomas	Kaplan-Meier method used to estimate PFS and OS; Cox proportional hazards model used for survival analysis	The median PFS was 12.2 months, and the median OS was 21.3 months; 5-year OS and PFS rates of 35% and 26%	The extent of surgery and histologic grade were the strong predictors for outcomes
Dorward et al, 2010 [18]	40 pts with pilocytic astrocytoma who underwent gross-total resection (GTR)	Kaplan–Meier method used to estimate survival time; Cox proportional hazards regression model used for recurrence-free survival analysis	27.5% (11/40) of patients developed tumor recurrence after resection, with a mean time to recurrence of 16 months and a median of 6.4 months; Nodular enhancement on MR imaging at 3-6 months was significantly associated with recurrence	Postoperative surveillance MR imaging at 3-6 months after resection predicts tumor recurrence following GTR.
Grech-Sollars et al, 2012 [19]	58 pts with embryonal tumor (40 with medulloblastoma, 9 with atypical teratoid/rhabdoid tumors, and 9 with supratentorial primitive neuroectodermal tumors)	ATCT (a measure of the gradient change of ADC from the peri-tumoral edema into the tumor core) extracted from DWI images; Multivariate survival analysis used Cox proportional hazard regression	More negative ATCT values are significantly associated with a poorer survival (regardless of tumor type, extent of resection, age <3 years at diagnosis, and metastasis at presentation)	There is a significant difference for survival data regarding the change in ADC from edema into the tumor volume.
Youland et al, 2013 [20]	351 pts with low-grade gliomas (168 pilocytic, 88 astrocytoma; 47 oligodendroglioma; 24 subependymal giant cell; 20 mixed oligoastrocytoma; 4 other/unknown)	Kaplan-Meier method used to estimate PFS and OS; Cox proportional hazards model used for survival analysis	10-year PFS was 62% and OS was 90%; Improved PFS was associated with GTR and postoperative RT; higher OS was associated with GTR and pilocytic histology	GTR was associated with improved OS and PFS; RT associated with improved PFS
Sun et al, 2013 [21]	33 pts with brainstem gliomas (8 astrocytoma, 3 oligodendroglioma, 2 oligoastrocytoma, 1 glioblastoma, other)	Kaplan–Meier method used to estimate survival time; Cox proportional hazards regression model used for survival analysis	Overall median survival 11 months with 1-year survival rate 43.6%; Multivariate analysis showed that diagnostic latency (<2 months) and tumor focality were associated with longer survival	Diffuse pattern of tumor was markedly associated with a shorter survival
Sun et al, 2014 [22]	102 pts with choroid plexus carcinoma	Kaplan-Meier and multivariate Cox regression survival analyses performed to determine the effect of GTR	Multivariate analysis showed that GTR increased OS	GTR improved PFS on Kaplan-Meier analysis, but not significant in multivariate analysis
Felix et al, 2014 [23]	19 pts with diffuse intrinsic pontine gliomas	Kaplan–Meier method used to estimate survival time; Cox proportional hazards regression model used for survival analysis	Median OS of patients in treated vs. non-treated group: 13.4 vs. 7.8 months. Median event-free survival of patients in the treated vs. non-treated group: 9.5 vs. 6.5 months	Significant longer survival in the treated group than non-treated group
Jansen et al, 2015 [24]	316 pts with diffuse intrinsic pontine glioma	Cox proportional hazards with backward regression was used to select prognostic variables for survival prediction; bootstrapping was used to validate the model	Median overall survival was 10 months; AUC=0.68	Positive predictors: Age ≤3 years, longer symptom duration at diagnosis, and use of oral and intravenous chemotherapy; Negative predictor: ring enhancement on MRI
Felicetti et al, 2015 [25]	90 pts with tumor (44 non-Hodgkin lymphoma; 19 medullo-blastoma; 7 ependymoma; 6 astrocytoma; 4 germinoma; 10 others) who underwent radiotherapy	Cox multivariable analysis was performed to identify potential risk factors	The occurrence of meningioma was associated with the development of other second neoplasms	Age, sex, or CRT dose had no influence on the occurrence of meningioma
Morana et al, 2015 [26]	21 pts with glioma (9 diffuse astrocytoma; 5 anaplastic astrocytoma; 5 glioblastoma; 1 glioneuronal tumor; other)	(18)F-DOPA PET and MRS data were compared and correlated; Cox multivariable analysis was performed for survival analysis	Diagnostic accuracy: (18)F-DOPA PET: 78%; MRS: 93%; (18)F-DOPA uptake correlated with PFS and OS	Significant differences of (18)F-DOPA uptake and MRS ratios were found between low-grade and high-grade gliomas
Gunther et al, 2015 [27]	72 pts with ependymoma	Multivariate analysis was used to determine RT effect; Cox multivariable analysis was performed for survival analysis	Proton beam RT (compared to intensity modulated RT) was associated with more frequent imaging changes	Postradiation MRI changes are more common with proton beam RT and in patients less than 3 years of age at diagnosis and treatment
Li et al, 2016 [28]	79 pts with meningiomas	Univariate and multivariate analyses used to examine the association between high-grade meningiomas and imaging features	An unclear tumor-brain interface, lateral location, and narrow base were predictive factors for high-grade meningiomas	Pediatric meningiomas have specific imaging features on MRI indicating their malignancy.
Steinbok et al, 2016 [29]	72 pts with thalamic tumors	Kaplan–Meier method used to estimate survival time; Cox proportional hazards regression model used for survival analysis	5-year overall survival was 61 ± 13% for unithalamic tumors compared to 37 ± 32% for bithalamic tumors; High-grade tumors had a much lower 5-year OS (7 ± 13%) than low-grade tumors (84 ± 17%)	Unilateral tumors were mainly low grade vs. bithalamic tumors (high-grade). Multivariate analysis indicated tumor grade as the only significant prognostic factor for unithalamic tumors

Table 2 Summary of multivariate analysis approaches applied to survival/outcome assessment in pediatric brain tumors

Pts: Patients; AUC: Area Under the ROC (Receiver Operating Characteristic) Curve; ATCT: Apparent Transient Coefficient in Tumor; OS: Overall Survival; PFS: Progression-Free Survival; GTR: Gross Total Resection; RT: Radiation Therapy; DNT: Dysembryoplastic Neuroepithelial Tumor.

Multivariate analysis in pediatric brain tumor classification or segmentation

In the early years, multivariate methods such as LDA and belief networks were used in tumor classification (Table 1). Schneider et al. computed imaging parameters (ADC and metabolite ratio features from MRI, DWI and MRS), and used LDA as the classifier to discriminate between four tumor groups (medulloblastoma, infiltrating glioma, ependymoma, and pilocytic astrocytoma).⁴ They found that combined ADC and metabolite ratio features (using water as an internal standard) could discriminate between these tumor groups.⁴ Reynolds et al. used tumor location information obtained from MRI to construct Bayesian belief networks, incorporated MRS information into the classifiers, and found that inclusion of a priori knowledge improved classification accuracy.⁵ Later, Davies et al. extracted multiple metabolite profiles from (short-TE PRESS) MRS, classified tumors (e.g., glial-cell vs. non-glial-cell tumors) with LDA and achieved high classification accuracies (>90%).⁷ Further, in a multi-national study (n=78), Vicente et al. computed metabolite concentrations from (short-TE PRESS) MRS data, classified tumors (e.g., medulloblastomas vs. ependymomas vs. pilocytic astrocytomas) with LDA, and achieved high classification accuracies (98-100%).¹¹

Tumor segmentation allows quantitative analysis (e.g., of tumor volume). Automatic tumor segmentation has been achieved by applying a graph cut method to tumor images and using probabilistic boosting trees as the classifier.⁶ In addition, Ahmed et al.⁸ and Iftekharuddin et al.⁹ used shape, intensity and texture features extracted from MRI images for tumor segmentation and classification (sensitivity 100%, specificity 75%).⁹ For special brain tumors such as optic pathway gliomas, automatic tumor segmentation was achieved by using prior tumor location, tissue characteristics and intensity information, and by classifying the tumor into internal components.¹⁰
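The overlap and accuracy measures quoted for these segmentation studies (Jaccard coefficient, sensitivity, specificity) can be illustrated with a minimal sketch. It assumes that the automatic and manual segmentations are available as flattened binary masks (1 = tumor voxel); the function names and the toy masks are hypothetical and are not taken from any of the reviewed studies.

```python
from typing import Sequence, Tuple


def confusion_counts(pred: Sequence[int], truth: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN) for two flattened binary masks (1 = tumor voxel)."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    return tp, fp, fn, tn


def jaccard(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Jaccard coefficient: |intersection| / |union| of the two tumor masks."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    union = tp + fp + fn
    return tp / union if union else 1.0  # two empty masks agree perfectly


def sensitivity(pred: Sequence[int], truth: Sequence[int]) -> float:
    """True positive rate: fraction of true tumor voxels that were detected."""
    tp, _, fn, _ = confusion_counts(pred, truth)
    return tp / (tp + fn) if (tp + fn) else float("nan")


def specificity(pred: Sequence[int], truth: Sequence[int]) -> float:
    """True negative rate, i.e. 1 minus the false positive rate."""
    _, fp, _, tn = confusion_counts(pred, truth)
    return tn / (tn + fp) if (tn + fp) else float("nan")


# Toy example (made-up masks): 4 tumor voxels, 4 background voxels.
manual = [1, 1, 1, 1, 0, 0, 0, 0]
automatic = [1, 1, 1, 1, 1, 0, 0, 0]
print(jaccard(automatic, manual), sensitivity(automatic, manual), specificity(automatic, manual))
```

For the toy masks, all four tumor voxels are found and one background voxel is mislabeled, so the Jaccard coefficient is 4/5 = 0.8, the sensitivity is 1.0 and the specificity is 3/4 = 0.75.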

In recent years, multivariate analysis methods such as SVM, cluster analysis and random forest have been applied to tumor classification (typing) and tumor grading. Using an SVM classifier, Rodriguez Gutierrez et al. demonstrated that ADC histogram features generated high classification accuracy (95.8% of medulloblastomas, 96.9% of pilocytic astrocytomas, and 94.3% of ependymomas), and that ADC textural features produced high tumor-subtype classification accuracy (89% of medulloblastoma subtypes).¹² With a larger sample (n=74), Tantisatirapong et al. investigated two feature selection approaches (principal component analysis, PCA; and the combination of max-relevance and min-redundancy (mRMR) with feedforward selection), and found that tumor classification with an SVM classifier based on texture features (from MRI, DWI and DTI) yielded varied classification performance.¹³ In addition, Fetit et al. examined the classification performance of 6 classifiers with 2D and 3D texture features extracted from MRI (T1, T2) and found that SVM and ANN had the highest overall classification accuracy (92%), and that 3D texture features improved classification performance.¹⁴ In patients with pediatric posterior fossa ependymoma, tumor classification with cluster analysis revealed that tumor subgroups could be distinguished based on tumor centroid location, which was associated with prognostic and treatment factors.¹⁵ Further, in a recent study (n=76), Koob et al. classified tumors with a random forest classifier, and found a diagnostic accuracy of 73.24% with DWI+PWI for tumor grading, and 55.76% with DWI+PWI+MRS for tumor typing.³ They also found that the best parameters for tumor grading and typing were ADC and rADC (ADC tumor-to-normal-brain ratios), and that multimodal MRI (MRI, DWI, PWI, MRS) could determine pediatric tumor grades (I, IV) and types (pilocytic astrocytomas and embryonal tumors).³

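However, classification results vary widely among the studies, even in similar patient samples (Table 1). For example, classification accuracies above 90% were reported in one study¹² vs. 69% in another;¹³ and a tumor typing accuracy of 55.76%³ contrasts with a balanced accuracy rate of 0.98¹¹ and with accuracies above 90%.¹²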

Multivariate analysis in survival/outcome assessment in pediatric brain tumors

The majority of the studies listed in Table 2 used the Kaplan-Meier method to estimate progression-free survival (PFS) and overall survival (OS), and applied a multivariate Cox proportional hazards model to identify prognostic factors. In patients with pilocytic astrocytomas (n=80), the 5-year PFS rate was 75%, and the 5-year OS rates were 92-100%.¹⁶ In patients with low-grade gliomas (n=351), 10-year PFS was 62% and OS was 90%.²⁰ However, in patients with malignant non-brainstem gliomas (n=39), the 5-year OS and PFS rates were 35% and 26% respectively, and the median PFS and OS were 12.2 months and 21.3 months respectively.¹⁷ In addition, in patients with brainstem gliomas (n=33), overall median survival was 11 months with a 1-year survival rate of 43.6%.²¹ Diffuse intrinsic pontine glioma has the worst prognosis: the median overall survival was 10 months in patients with diffuse intrinsic pontine glioma (n=316), and in this large patient sample the area under the ROC (receiver operating characteristic) curve (AUC) for the final multivariate survival model was 0.68.²⁴
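The Kaplan-Meier estimates behind survival rates and median survival times such as those above can be illustrated with a minimal sketch. It assumes that each patient contributes a follow-up time and an event indicator (1 = event observed, 0 = censored), and that patients censored at a given time are still counted as at risk at that time. The function names and the toy data are hypothetical and do not reproduce any reviewed study.

```python
from typing import List, Optional, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct time at which an event was observed.

    times  : follow-up time per patient (e.g., months)
    events : 1 if the event (death or progression) was observed, 0 if censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        n_events = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        n_leaving = sum(1 for ti in times if ti == t)  # events + censored at t
        if n_events > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


def median_survival(curve: List[Tuple[float, float]]) -> Optional[float]:
    """First time at which S(t) falls to 0.5 or below; None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Toy follow-up data in months (made up for illustration only).
months = [2, 4, 4, 6, 8, 10]
observed = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(months, observed)
print(km_curve, median_survival(km_curve))
```

In this toy data set the estimated survival falls to 5/6 at 2 months, 2/3 at 4 months and 4/9 at 6 months, so the estimated median survival is 6 months. A Cox proportional hazards model, which the reviewed studies used to relate covariates to the hazard, is not shown here.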

A number of prognostic factors such as tumor characteristics (type, grade, location, etc.), imaging features and treatment are associated with patients’ survival.¹⁷,¹⁹,²⁰,²³‒²⁶ The AUC (0.68) of the final multivariate survival model for a large sample of patients with diffuse intrinsic pontine glioma (n>300)²⁴ indicates that the survival estimates and prognostic factor identification made by the multivariate survival model are far from perfect (an AUC of 0.5 corresponds to chance and 1.0 to perfect discrimination), and that errors may exist in the survival results.
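To make the meaning of an AUC value concrete, the following minimal sketch computes the AUC as the fraction of (event, non-event) patient pairs in which the patient with the event received the higher risk score, with ties counted as one half. This pairwise definition is a general one; how the AUC was computed in the cited study is not specified here, and the function name, scores and labels below are hypothetical.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    scores : model risk scores (higher = higher predicted risk)
    labels : 1 if the patient had the event, 0 otherwise
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present to compute the AUC")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # pair ranked correctly
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Toy risk scores (made up): three patients with the event, three without.
risk = [0.9, 0.8, 0.4, 0.6, 0.3, 0.7]
event = [1, 1, 1, 0, 0, 0]
print(roc_auc(risk, event))
```

In the toy example 7 of the 9 pairs are ranked correctly, giving an AUC of about 0.78. Under this reading, an AUC of 0.68 means that roughly one pair in three is ranked incorrectly.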

Brain tumors in children are heterogeneous, and have imaging and histological features that are quite different from adults. Posterior fossa tumors are the most studied pediatric brain tumors. Multivariate approaches are powerful analytic tools suitable for high dimensional problems, and have been increasingly used in pediatric brain tumor classification (or segmentation) and patient survival (outcome) assessment.

Pediatric brain tumor classification/segmentation with multivariate analysis

The large variations in the classification results among the imaging studies (even in similar samples) may be explained by viewing the process of brain tumor classification as an image processing chain, and the final classification result (classification accuracy) as the imaging diagnostic accuracy. A number of factors along the tumor classification processing chain have impacts on the final result.

First, image selection for analysis: The most commonly used imaging data for tumor classification is MRI (T1, T2, FLAIR); DWI and MRS are often used; and PWI is sometimes used. The diagnostic values of advanced imaging modalities such as PWI and the optimal MR imaging protocol for tumor classification are still to be determined. Nevertheless, studies have shown that metabolite ratios (from MRS) or ADC features (from DWI) alone could not discriminate the tumor groups, but combined ADC and metabolite ratio features could;⁴ further, among the modality combinations tested, DWI, MRS and PWI together gave the highest accuracy for tumor typing (55.76%), and DWI and PWI together gave the highest accuracy for tumor grading (73.24%).³ More studies are needed to examine the diagnostic values of advanced MRI modalities for each tumor type and grade, and to explore the optimal MR imaging protocol for classification of pediatric brain tumors.

Second, feature extraction: Some imaging features such as shape, histogram, tumor location, ADC, metabolite concentration ratio and CBV (cerebral blood volume) can be obtained by straightforward standard computation. However, other imaging features such as texture features are neither straightforward nor standardized. They can be 2D or 3D, and based on the grey-level co-occurrence matrix or grey-level run-length matrix,¹⁴ or on fractal dimension or multi-fractional Brownian motion.⁸,⁹ One study has shown that, compared with 2D texture features, 3D texture features improved tumor classification performance.¹⁴ In addition, it has been shown that multi-fractional Brownian motion-based texture features produced high tumor segmentation accuracy for T1 and FLAIR images and for T1-T2-FLAIR fused images (100%), whereas intensity features were best for T2 images.⁸ Therefore, classification performance can be improved by standardizing (e.g., on 3D texture features) and optimizing feature extraction.
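As an illustration of why texture features depend on implementation choices, the following minimal sketch computes a 2D grey-level co-occurrence matrix (GLCM) and three commonly derived features. The pixel offset, the symmetric counting, the particular homogeneity formula and the toy images are all assumptions made here for illustration; they are not the settings used in the reviewed studies. A 3D variant would add an offset along the slice direction.

```python
from collections import Counter
from typing import Dict, List, Tuple

Glcm = Dict[Tuple[int, int], float]


def glcm(image: List[List[int]], dx: int = 1, dy: int = 0, symmetric: bool = True) -> Glcm:
    """Normalized co-occurrence probabilities of grey-level pairs at offset (dx, dy)."""
    rows, cols = len(image), len(image[0])
    counts: Counter = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[(a, b)] += 1
                if symmetric:
                    counts[(b, a)] += 1  # count each pair in both directions
    total = sum(counts.values())
    if total == 0:
        raise ValueError("offset leaves no valid pixel pairs in this image")
    return {pair: n / total for pair, n in counts.items()}


def contrast(p: Glcm) -> float:
    """Large when neighbouring pixels differ strongly in grey level."""
    return sum(prob * (i - j) ** 2 for (i, j), prob in p.items())


def energy(p: Glcm) -> float:
    """Large when a few grey-level pairs dominate (uniform texture)."""
    return sum(prob ** 2 for prob in p.values())


def homogeneity(p: Glcm) -> float:
    """Large when co-occurring grey levels are similar (one common variant)."""
    return sum(prob / (1 + abs(i - j)) for (i, j), prob in p.items())


# Toy 2 x 3 images with two grey levels (made up for illustration).
flat = [[1, 1, 1], [1, 1, 1]]
checker = [[0, 1, 0], [1, 0, 1]]
for name, img in (("flat", flat), ("checker", checker)):
    p = glcm(img)
    print(name, contrast(p), energy(p), homogeneity(p))
```

The uniform image has zero contrast and maximal energy, while the checkerboard has high contrast. Changing the offset, the number of grey levels or the symmetric setting changes the feature values, which is one reason texture features from different studies are hard to compare.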

Third, feature selection: Studies have investigated various feature selection approaches, including PCA, boosting and KLD (Kullback-Leibler divergence);⁸ entropy (entropy-MDL discretization);¹⁴ mRMR (maximum-relevance minimum-redundancy) and feedforward methods;¹³ and expectation maximization for feature fusion.⁸ Since feature selection methods determine which features are used for classification analysis, the quality of these methods directly affects the classification result. Further research is needed to optimize and standardize feature selection to improve tumor classification.
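The general idea of ranking features by a divergence measure can be sketched as follows. Each feature is scored by the symmetric Kullback-Leibler divergence between its histogram in one class and its histogram in the other, so that features whose distributions differ most between classes rank first. This is a simplified illustration only, not the KLD procedure of the cited study; the bin count, the smoothing constant, the function names and the toy values are assumptions.

```python
import math
from typing import Dict, List, Tuple


def histogram(values: List[float], bins: int, lo: float, hi: float, eps: float = 1e-6) -> List[float]:
    """Smoothed, normalized histogram of values on [lo, hi] (eps avoids log(0))."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # put v == hi in the last bin
        counts[idx] += 1
    total = len(values) + eps * bins
    return [(c + eps) / total for c in counts]


def kl_divergence(p: List[float], q: List[float]) -> float:
    """Kullback-Leibler divergence KL(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def rank_features_by_kld(
    class_a: Dict[str, List[float]], class_b: Dict[str, List[float]], bins: int = 4
) -> List[Tuple[str, float]]:
    """Sort features by symmetric KL divergence between the two classes (largest first)."""
    scores = {}
    for name in class_a:
        a, b = class_a[name], class_b[name]
        lo, hi = min(a + b), max(a + b)
        if hi == lo:
            scores[name] = 0.0  # constant feature carries no information
            continue
        p, q = histogram(a, bins, lo, hi), histogram(b, bins, lo, hi)
        scores[name] = kl_divergence(p, q) + kl_divergence(q, p)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy feature values for two classes (made up for illustration).
tumor = {"texture": [0.10, 0.20, 0.15, 0.12], "intensity": [0.41, 0.52, 0.58, 0.47]}
normal = {"texture": [0.80, 0.90, 0.85, 0.95], "intensity": [0.43, 0.55, 0.49, 0.60]}
print(rank_features_by_kld(tumor, normal))
```

In the toy data the hypothetical "texture" feature separates the two classes completely and ranks first, while the overlapping "intensity" feature scores zero. The ranking depends on the binning and smoothing choices, which illustrates why the feature selection step needs to be standardized before results can be compared across studies.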

Fourth, classification: A variety of multivariate classifiers such as LDA, SVM, ANN, cluster analysis and random forest have been applied to tumor classification. The classification performance of 6 classifiers (naïve Bayes, k-nearest neighbor (kNN), classification tree, SVM, ANN, logistic regression) was examined in a recent study, which reported the following overall classification accuracies (using 3D texture features): 92% for SVM and ANN; 90% for logistic regression; 88% for naïve Bayes; 83% for classification tree and kNN.¹⁴ The large differences in classification performance (e.g., 83% vs. 92%) indicate the importance of selecting the best classifier for the data (for patients with specific tumor types and grades), standardizing the classifier for each tumor type and grade, and optimizing the classifier for the best classification performance.
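A fair comparison of classifiers requires that all of them be evaluated on the same data with the same validation scheme. The minimal sketch below compares two simple classifiers (k-nearest neighbour and nearest centroid) under leave-one-out cross-validation. The choice of classifiers, the validation scheme, the function names and the toy feature vectors are assumptions for illustration; they do not reproduce the protocol of the cited study.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def knn_predict(train_x: Sequence[Point], train_y: Sequence[str], x: Point, k: int = 3) -> str:
    """Majority label among the k training points closest to x."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def centroid_predict(train_x: Sequence[Point], train_y: Sequence[str], x: Point) -> str:
    """Label of the class whose mean feature vector is closest to x."""
    sums: dict = {}
    counts: Counter = Counter()
    for xi, yi in zip(train_x, train_y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] += 1
    centroids = {y: [v / counts[y] for v in acc] for y, acc in sums.items()}
    return min(centroids, key=lambda y: math.dist(centroids[y], x))


def leave_one_out_accuracy(
    predict: Callable[[Sequence[Point], Sequence[str], Point], str],
    xs: List[Point],
    ys: List[str],
) -> float:
    """Train on all samples but one, test on the held-out sample, and repeat."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Toy two-feature data for two hypothetical tumor classes "A" and "B".
features = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (3.0, 3.1), (3.2, 2.9), (2.9, 3.0)]
labels = ["A", "A", "A", "B", "B", "B"]
for name, clf in (("kNN", knn_predict), ("nearest centroid", centroid_predict)):
    print(name, leave_one_out_accuracy(clf, features, labels))
```

On well-separated toy data both classifiers classify every held-out sample correctly. On real imaging features the accuracies would differ between classifiers, and reporting them under one shared validation scheme is what makes differences such as 83% vs. 92% interpretable.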

Taken together, variations in each step along the image processing chain of tumor classification could accumulate and cause the large variations in final classification results among the studies. Therefore, optimizing and standardizing the image processing chain of tumor classification may reduce these variations, improve classification performance, and enhance the diagnostic value of MRI modalities.

Survival/outcome assessment with multivariate analysis in pediatric brain tumors

Compared with tumor classification studies, the survival assessment studies have larger sample sizes (two studies have sample sizes >300) and show less variation in their survival results. However, as indicated by the AUC (0.68) of the final multivariate survival model,²⁴ the error rate in multivariate survival analysis is not negligible. The process of survival assessment in pediatric brain tumors can also be viewed as a data processing chain, and several factors along the chain affect the final results.
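To make the AUC figure concrete: an AUC can be read as the probability that a randomly chosen patient who had the event receives a higher risk score than a randomly chosen patient who did not, so a value of 0.68 leaves roughly a third of such pairs mis-ordered. The following Python sketch computes this pairwise quantity for a binary outcome. The function name and the toy scores are hypothetical, and the cited study may have computed its AUC differently (for example, in a time-dependent way).

```python
def roc_auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5.

    scores are model risk scores; labels are 1 (event) or 0 (no event).
    """
    positives = [s for s, lab in zip(scores, labels) if lab == 1]
    negatives = [s for s, lab in zip(scores, labels) if lab == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical risk scores: one of the four pairs is ordered incorrectly.
print(roc_auc([0.9, 0.3, 0.8, 0.2], [1, 1, 0, 0]))
```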

First, data selection for analysis: A wide range of data can be used for patients’ survival assessment (patient’s age, gender, symptoms, tumor characteristics, diagnostic latency, treatments, etc.). Some data need to be extracted or computed from raw data. For example, imaging features such as ATCT (apparent transient coefficient in tumor, a measure of the gradient change of ADC from the peri-tumoral edema into the tumor core) extracted from DWI are associated with survival.¹⁹ Because each variable contributes differently to the final survival results (i.e., each carries a different weight) and some data are of low quality, the data need to be screened and selected (e.g., according to their weights) before survival analysis. Different data selections lead to different survival estimates. Thus, optimizing and standardizing data selection may reduce errors and improve survival analysis.

Second, statistical analysis: Survival assessment follows a standard approach:

Use the Kaplan-Meier method to estimate progression-free survival (PFS) and overall survival (OS), and compare OS between groups (tumor groups or grades) by univariate analysis of the Kaplan-Meier survival plots;
Use a multivariate Cox proportional hazards model to evaluate the significance of the variables (prognostic factors associated with survival) identified by the univariate analysis.
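The first of these two steps can be sketched in a few lines. The Python example below is a minimal product-limit (Kaplan-Meier) estimator, assuming follow-up times with an event indicator (1 for an observed event, 0 for a censored observation). The function name and the toy data are hypothetical, and a real analysis would normally rely on an established statistics package.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times[i] is the follow-up time of patient i; events[i] is 1 if the
    event was observed at that time and 0 if the patient was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = n_removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_removed  # events and censored cases both leave the risk set
    return curve

# Hypothetical follow-up (e.g., in years) for six patients.
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

Censored patients lower the number at risk without lowering the survival estimate, which is how the method uses incomplete follow-up. The second step, the Cox model, expresses the hazard as h(t | x) = h0(t) exp(b1 x1 + ... + bp xp), so each coefficient b describes the log hazard ratio associated with one prognostic factor.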

However, the methods used to select variables or prognostic factors may vary among studies (either in moving from univariate to multivariate analysis, or in refining the multivariate model). Bootstrapping is needed to validate the final multivariate model so that the results of the survival analysis are more accurate and the final model is more likely to generalize.²⁴

Taken together, optimizing and standardizing the data processing chain in survival assessment may reduce errors, improve survival estimation, and enhance the identification of prognostic factors in survival analysis.
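As an illustration of the bootstrapping idea, the Python sketch below resamples patients with replacement and recomputes a discrimination statistic on each resample to obtain a percentile interval. It is a generic percentile bootstrap under stated assumptions (binary outcome, fixed risk scores, a pairwise AUC as the statistic); the validation procedure of the cited study is not described here and may differ. All names and data are hypothetical.

```python
import random

def pairwise_auc(scores, labels):
    """Fraction of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_interval(scores, labels, statistic,
                       n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for statistic(scores, labels)."""
    if len(set(labels)) < 2:
        raise ValueError("both outcome classes are required")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(scores)
    estimates = []
    while len(estimates) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        sample_labels = [labels[i] for i in idx]
        if len(set(sample_labels)) < 2:
            continue  # statistic is undefined without both classes
        estimates.append(statistic([scores[i] for i in idx], sample_labels))
    estimates.sort()
    lower = estimates[int(round(alpha / 2 * n_resamples))]
    upper = estimates[int(round((1 - alpha / 2) * n_resamples)) - 1]
    return lower, upper

# Hypothetical risk scores and outcomes for ten patients.
scores = [0.9, 0.7, 0.6, 0.55, 0.4, 0.8, 0.5, 0.3, 0.35, 0.1]
labels = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
print(pairwise_auc(scores, labels))
print(bootstrap_interval(scores, labels, pairwise_auc))
```

A wide interval signals that the apparent performance of the model depends heavily on which patients happened to be in the sample.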

A method comparison between multivariate analyses in tumor classification and survival assessment is presented in Table 3. Compared with relatively complex imaging data and their processing in tumor classification, the relatively simple data and their processing in survival assessment make it possible to perform multivariate survival analysis on large samples (e.g., n>300), which makes the results of survival analysis (based on large samples) more stable and reliable.

	Tumor Classification	Survival Assessment
Sample sizes*	>=6 and <=78	>=19 and <=351
Data for analysis	MRI (T1, T2, FLAIR) and/or DWI, MRS, PWI, etc.	Age, gender, symptom(s) duration, tumor characteristics (location, focality, type, grade, size, shape, imaging intensity, whether ring enhancement on MRI, whether unclear tumor-brain interface, whether development of other second neoplasms, whether abnormality on other imaging such as DWI, MRS and PET, recurrence, etc.), diagnostic latency, treatment (whether use of chemotherapy, whether use of radiotherapy, dose, whether surgical treatment, extent of resection, histology, etc.)
Feature extraction	Features with standard computation: shape, intensity, histogram, tumor location, ADC for DWI, metabolite concentrations or ratios for MRS, CBV (cerebral blood volume) for PWI, etc.; Features with varied computation: texture features (2D vs. 3D, grey-level co-occurrence matrix-based vs. fractal dimension and multi-fractional Brownian motion-based)	Extract data such as imaging features (if necessary), e.g., ATCT (a measure of the gradient change of ADC from the peri-tumoral edema into the tumor core) (Grech-Sollars et al, 2012 [19])
Feature selection (Data selection from preliminary analysis)	Use feature selection (or reduction) method: PCA, boosting, KLD (Kullback-Leibler divergence), entropy (entropy-MDL discretization); or mRMR (maximum-relevance minimum redundancy) and feedforward methods; expectation maximization for feature fusion, etc.	Use Kaplan-Meier method to estimate progression-free survival (PFS) and overall survival (OS); Compute differences in OS between different groups (tumor groups or grades) with univariate analysis through Kaplan-Meier survival plots; Select significant prognostic factors in univariate analysis
Classification/ Multivariate analysis	Use classification method: LDA, SVM, ANN, cluster analysis, random forest, etc.	Use multivariate Cox proportional hazards model to evaluate the significance of associated data variables (prognostic factors that are associated with survival) indicated by the univariate analysis
Model validation	Cross-validation, etc.	Bootstrapping, etc.

Table 3 Method comparison of multivariate analyses in tumor classification vs. survival assessment

*Sample sizes: Based on the literature reviewed; ATCT: Apparent Transient Coefficient in Tumor; OS: Overall Survival; PFS: Progression-Free Survival.

In summary, this paper reviewed the studies that applied multivariate analysis to tumor classification (segmentation) and survival (outcome) assessment in pediatric brain tumors. The large variations in the classification results among the tumor classification studies (even in similar patient populations) may be reduced by optimizing and standardizing image processing chains of tumor classification. Similarly, optimizing and standardizing data processing chains of survival assessment may improve survival analysis and reduce errors in survival estimates and identification of prognostic factors.

As multivariate analytic approaches, data processing technologies and imaging techniques advance in the Big Data era of the 21st century, it is anticipated that the obstacles to processing complex imaging data in tumor classification will be overcome and that complex data processing will be revolutionized. This will make accurate automatic tumor classification/segmentation (for each tumor type and grade) possible, enabling early detection and treatment of tumors, guiding treatment planning, and supporting the monitoring of tumor progression and treatment effects. Together with advanced survival assessment to guide life-saving rescue and recovery planning, this may revolutionize patient care and truly benefit children with brain tumors.

Acknowledgements: None.

The author declares that there is no conflict of interest.


International Journal of Radiology & Radiation Therapy, eISSN: 2574-8084


Jing Zhang
