Research Article Volume 11 Issue 1
1Hematology Unit, Oncology Center, Faculty of Medicine, Mansoura University, Egypt
2Department of Biochemistry, Faculty of Science, Mansoura University, Egypt
3Department of Information Systems, Faculty of Computers and Information, Mansoura University, Egypt
4Department of Pharmacognosy, Faculty of Pharmacy, Mansoura University, Egypt
Correspondence: Mohamed Mabed, Hematology Unit, Oncology Center, Faculty of Medicine, Mansoura University, Egypt
Received: September 27, 2022 | Published: January 20, 2023
Citation: Mabed M, Saafan M, Ahmed MH, et al. The Potential utility of fourier transform infrared spectroscopy for the diagnosis and the prognosis of nonhodgkin lymphoma. Hematol Transfus Int. 2023;11(1):1-8. DOI: 10.15406/htij.2023.11.00292
Background: Fourier Transform Infrared (FTIR) spectroscopy can detect slight changes in the global biochemical composition of samples taken from patients with Non-Hodgkin’s lymphomas (NHL), including changes in carbohydrates, proteins, nucleic acids, lipids or even water. These small changes might enable early detection before any morphological changes occur.
Patients and methods: The study included samples from 30 patients with diffuse large B-cell lymphoma (DLBCL) and samples from 33 healthy controls. The serum samples were analyzed using FTIR spectroscopy, and the resulting data were analyzed using support vector machine (SVM) learning.
Results: Ranges 1240-1190 cm-1 and 1140-1000 cm-1 had the best true positive, false negative and accuracy results. True negative, false positive, specificity and precision were best seen in Range 1580-1480 cm-1 and sensitivity was best seen in Range 3500-2800 cm-1. Ranges 1240-1190 cm-1 and 1140-1000 cm-1 represent frequencies that are assigned to bonds found in nucleic acids.
Conclusion: The FTIR spectral ranges that could be recommended as a diagnostic and prognostic tool for DLBCL patients are 3500-2800, 1580-1480 and 1240-1000 cm-1. The finding that the most important changes occur in the nucleic acids may have therapeutic implications.
Keywords: diagnosis, fourier transform infrared, machine learning analysis, non-hodgkin lymphoma, prognosis, support vector machine
Lymphoma is a solid tumor of the immune system that develops from lymphocytes.1,2 Non-Hodgkin’s lymphomas (NHL) are a group of cancers, most of which arise from B lymphocytes and some from NK- or T-lymphocytes. The tumors are usually in lymph nodes; however, NHL can occur in any tissue with different degrees of aggressiveness, such as indolent follicular lymphoma, or aggressive Burkitt’s lymphoma and diffuse large B-cell lymphoma (DLBCL).1 DLBCL represents 30% of NHL cases.3,4 DLBCL can occur in any tissue or organ, with the gastrointestinal tract as the most commonly involved organ.5 Computerized tomography (CT) scan and magnetic resonance imaging (MRI) are the standard methods for the detection and assessment of NHL tumors, with increasing use of positron emission tomography (PET) scan with 2-[18F] fluoro-2-deoxy-D-glucose (FDG) in the management of the disease.1 Despite the use of different staging systems, such as the Ann Arbor staging system, and the International Prognostic Index (IPI) as a prognostic model, the primary assessment of NHL patients is done by the physician on the basis of the CT scan and MRI.1,3
Most techniques used for the detection of cancer are not accurate, usually detect the disease at late stages, and depend on the judgment of the physician.6 The lack of sufficiently reliable methods for early detection of cancers requires a search for new and more effective techniques for screening and prevention. Finding an appropriate technique to test risk groups would increase the chances of successful treatment and subsequently reduce mortality. Accurate, minimally invasive, rapid, sensitive and cheap diagnostic procedures for early detection of cancer are a main requirement to reduce the mortality rates caused by the disease.7 The characterization of cancer at the genomic level is promising; however, the techniques face difficulties such as the extensive heterogeneity of tumors and their high cost.8 One of the new techniques that might be promising for the early and effective detection of cancer is Fourier Transform Infrared (FTIR) spectroscopy.
FTIR can detect slight changes in the global biochemical composition of samples obtained from patients, including changes in carbohydrates, proteins, nucleic acids, lipids or even water. These small changes could be detected before any morphological changes occur.9–11 Using FTIR as an imaging tool could revolutionize the rapid and qualitative analysis of biofluids, cells and tissue in clinical routine, with the advantages of requiring no reagents and being cheap.6 FTIR instruments cover a spectral range from 4000 to 400 cm-1. These frequencies correspond to specific bonds in the analyzed compound or sample. The spectrum resulting from the bond vibration frequencies provides evidence of the different functional groups and chemical bonds in the sample. FTIR is useful for the identification of organic molecular compounds and groups because of the range of cross-links, side chains and functional groups, all of which have characteristic vibrational frequencies in the infrared range.12
Classification is a machine learning task that is commonly used in medical diagnosis. It predicts the class label for unobserved instances. In classification, objects are assigned to classes based on their extracted features. For example, an object could be normal or abnormal, benign or malignant. Classification can be single-label or multi-label. Single-label classification, in which each instance is linked to one and only one class label, comprises binary and multi-class classification. If the number of class labels equals two, the classification is binary. If there are more than two class labels, the classification is multi-class. In binary classification, the prediction is 0 or 1 (positive or negative, pathology or normal).13
Herein, we investigate the potential use of this technique as a developing clinical tool for the identification of Non-Hodgkin lymphoma and for correlation with prognosis in these patients.
This is an observational study in which FTIR spectroscopy was used to test the serum of patients with the DLBCL type of NHL and the serum of healthy persons. To distinguish patients' serum from healthy persons' serum, the absorbance was measured across wave number ranges. FTIR spectroscopy was also used to test serum samples of patients after treatment to determine the efficiency of the therapy. The study was approved by the Institutional Review Board at Mansoura Faculty of Medicine, Mansoura, Egypt. Written informed consent was obtained from each patient. The study included samples from 30 patients with Non-Hodgkin lymphoma (DLBCL type) and samples from 33 healthy controls. The study was carried out from June 2019 to July 2020.
Study population
Inclusion criteria
Exclusion criteria
Investigational plan: The following investigations were performed. The patients were on follow up in the outpatient clinic of Oncology Center, Mansoura University, Egypt.
Sampling: Five mL of whole blood was collected in a sterile empty tube following standard procedures. The samples were allowed to clot in a standing position for about 20-30 minutes, followed by centrifugation for 10 minutes at approximately 1000g. Using a clean pipette, aliquots of serum were collected in plastic screw-cap vials and stored at -80°C until the FTIR analysis.
Assessment of attenuated total reflectance (ATR)-FTIR spectroscopy: FTIR spectroscopy was performed using a Thermo-Nicolet 6700 FTIR spectrophotometer equipped with ATR and fully integrated with the OMNIC software (USA). Serum samples were lyophilized and then directly examined with standard ATR-FTIR procedures. The FTIR spectra were obtained in the range from 400 to 4000 cm-1 at room temperature and a resolution of 4 cm-1, with the co-addition of 32 scans for sample spectra and 128 scans for the background spectra. The peaks were plotted with the wave number (cm-1) on the X-axis and transmittance percentage on the Y-axis.
Analysis of FT-infrared results: Our binary classification has two major phases, training and testing. The classifier was trained on the training set using cross validation, such as 10-fold or 5-fold. The confusion matrix was then computed to evaluate the classifier in terms of accuracy, specificity and sensitivity.
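The k-fold procedure mentioned above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `k_fold_indices`, the shuffling seed and the way folds are formed are assumptions, and the sample count of 63 simply reuses the 30 patients plus 33 controls reported in the study.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Return k (train_indices, test_indices) pairs covering every sample once.

    Hypothetical helper for illustration; the article does not describe
    how its folds were constructed.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test_idx in enumerate(folds):
        # The training set is the union of all folds except the held-out one.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train_idx, test_idx))
    return splits

# 30 patients + 33 controls = 63 serum samples, 5-fold cross validation.
for train_idx, test_idx in k_fold_indices(n_samples=63, k=5):
    print(len(train_idx), "training samples,", len(test_idx), "held-out samples")
```

Each sample is held out exactly once, so the confusion matrix can be accumulated over the k held-out folds rather than computed on the data the classifier was trained on.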
The output of the classifier or the classification technique is the classification model, which can be used to predict the class labels in the testing phase. Support vector machine (SVM), K-nearest neighbors (KNN), artificial neural network (ANN) and decision tree are different types of classification techniques.14 In this work, SVM was used.
In Figure 1, the top-left panel shows three classes of data (1, 2 and 3). A multi-class SVM (MSVM) with the one-versus-rest (OVR) methodology is used. In panels (b) to (d), the classifier separates each class from the rest: each class in turn is treated as positive, while the others are treated as negative.
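The one-versus-rest scheme described for Figure 1 can be sketched as below. To keep the example self-contained, the binary learner is a simple centroid-based linear scorer standing in for a binary SVM; that substitution, the function names and the toy two-dimensional data are all assumptions made for illustration.

```python
def mean(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def fit_binary(X, y_binary):
    """Toy linear scorer w.x + b (a stand-in for a binary SVM, not an SVM)."""
    pos = mean([x for x, t in zip(X, y_binary) if t == 1])
    neg = mean([x for x, t in zip(X, y_binary) if t == 0])
    w = [p - n for p, n in zip(pos, neg)]
    mid = [(p + n) / 2 for p, n in zip(pos, neg)]
    b = -sum(wi * mi for wi, mi in zip(w, mid))
    return w, b

def fit_ovr(X, y):
    """Train one binary model per class: that class is positive, the rest negative."""
    models = {}
    for c in sorted(set(y)):
        y_binary = [1 if label == c else 0 for label in y]
        models[c] = fit_binary(X, y_binary)
    return models

def predict_ovr(models, x):
    """Assign the class whose binary model gives the highest score."""
    scores = {c: sum(wi * xi for wi, xi in zip(w, x)) + b
              for c, (w, b) in models.items()}
    return max(scores, key=scores.get)

# Three well-separated toy classes (1, 2 and 3), as in the Figure 1 description.
X = [[0.0, 0.0], [0.2, 0.1], [5.0, 0.0], [5.1, 0.3], [0.0, 5.0], [0.2, 5.2]]
y = [1, 1, 2, 2, 3, 3]
models = fit_ovr(X, y)
print([predict_ovr(models, x) for x in X])
```

With only two classes (patient versus healthy), the scheme reduces to a single binary classifier, which is the setting used in this study.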
The proposed framework
A balanced dataset was used, meaning that the number of normal cases is nearly equal to the number of previously known pathology cases used in training the classifier. The entire dataset is relational. It consists of attributes for each instance or case (Figure 2).
The classifier was trained with the training set data, and its accuracy, sensitivity and specificity were validated using the confusion matrix and the Receiver Operating Characteristic (ROC) curve, which are described below. Finally, the established classification model was used on the testing dataset to predict whether or not the cases responded to the treatment plan and drugs used. The clinical data and the physician's observations were used to validate and compare the results in order to evaluate the established classification model. The model was tested on post-treatment samples from the pathology cases used in training, and on a separate post-treatment dataset with no corresponding training data. The dataset will be illustrated in detail in the next section. The confusion matrix formulas are illustrated in Figure 3.
Three important measures derived from the confusion matrix should be defined: sensitivity, specificity and accuracy. Sensitivity, or true positive rate (TPR), is the probability that a test result will be positive when the disease is present. Specificity, or true negative rate (TNR), is the probability that a test result will be negative when the disease is not present. Both sensitivity and specificity are expressed as percentages. Accuracy is the proportion of all results that are predicted correctly, whether the disease is present or absent (true positives and true negatives). These parameters are calculated from the confusion matrix according to the formulas in Figure 3.
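For reference, the standard definitions of these measures in terms of the confusion matrix counts (true positives TP, true negatives TN, false positives FP, false negatives FN) are written out below. These are the textbook forms, given here as a convenience; precision is included because it is also reported in the results.

```latex
\begin{align}
\text{Sensitivity (TPR)} &= \frac{TP}{TP + FN} \times 100\% \\
\text{Specificity (TNR)} &= \frac{TN}{TN + FP} \times 100\% \\
\text{Precision} &= \frac{TP}{TP + FP} \times 100\% \\
\text{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN} \times 100\%
\end{align}
```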
The ROC curve is a fundamental tool for diagnostic test evaluation. In a ROC curve, the TPR (sensitivity) is plotted as a function of the FPR (100 - specificity) for different cut-off points of a parameter. Each point on the ROC curve represents a sensitivity/specificity pair corresponding to a particular decision threshold. The area under the ROC curve (AUC) is a measure of how well a parameter can distinguish between two diagnostic groups (diseased/normal). A test with perfect discrimination (no overlap in the two distributions, 100% sensitivity, 100% specificity) has a ROC curve that passes through the upper left corner (Figure 4). Therefore, the closer the ROC curve is to the upper left corner, the higher the overall accuracy of the test.
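As a small illustration of how the ROC curve and its AUC follow from these definitions, the sketch below sweeps a decision threshold over classifier scores and integrates the resulting curve with the trapezoidal rule. The function names and the toy scores are hypothetical and are not taken from the study data; rates are expressed as fractions rather than percentages.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs, one per decision threshold, starting at (0, 0).

    labels: 1 = diseased, 0 = normal; a case is called positive when its
    score is greater than or equal to the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lowering the threshold step by step can only add positive calls,
    # so both rates are non-decreasing along the returned list.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, t in zip(scores, labels) if s >= threshold and t == 1)
        fp = sum(1 for s, t in zip(scores, labels) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# No overlap between the two groups: the curve passes through the upper left corner.
perfect = auc(roc_points([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))
# One diseased case scores below one normal case: the curve moves away from the corner.
overlapping = auc(roc_points([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
print(perfect, overlapping)
```

The first toy example gives an AUC of 1.0 and the second 0.75, which matches the fraction of diseased/normal pairs that the scores rank in the correct order.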
The descriptive data for NHL patients at diagnosis
The descriptive and laboratory data of NHL patients at diagnosis are summarized in Table 1. In this study, 30 patients were examined at diagnosis: twenty-two males (73.3%) and eight females (26.7%). Mean age at diagnosis was 54.6±12.3 years. Regarding laboratory data at diagnosis, mean hemoglobin was 12±2.6 g/dL, median white blood cell (WBC) count was 6×10^9 cell/liter (range 1.4 to 14×10^9 cell/liter), median platelet count was 165×10^9 cell/liter (range 2 to 549×10^9 cell/liter), and median lymphocyte count was 1.7×10^9 cell/liter (range 0.3 to 8.9×10^9 cell/liter). Median ESR at the 1st hour was 20 mm (range 5 to 60 mm), median alanine aminotransferase (ALT) was 21 U/L (range 5 to 109 U/L), median aspartate transaminase (AST) was 30.5 U/L (range 8 to 83 U/L), median total bilirubin was 0.8 mg/dL (range 0.2 to 2 mg/dL), median direct bilirubin was 0.3 mg/dL (range 0.1 to 0.7 mg/dL), median serum uric acid was 5 mg/dL (range 1.2 to 21 mg/dL), median serum creatinine was 1.1 mg/dL (range 0.6 to 3 mg/dL), and serum albumin was 3.3±0.8 g/dL. Regarding viral markers at diagnosis, no patient had HIV infection, twenty patients had HCV infection (66.7%), and one patient had HBV infection (3.3%).
Parameters | Number of patients (i) | Percentage of patients (%) (i)
Gender: Male | 22 | 73.3
Gender: Female | 8 | 26.7
HIV infection | 0 | 0
HCV infection | 20 | 66.7
HBV infection | 1 | 3.3
Parameters | Median | Range
WBCs (×10^9 cell/liter) | 6 | 1.4-14.0
Lymphocytes (×10^9 cell/liter) (ii) | 1.7 | 0.3-8.9
Platelets (×10^9 cell/liter) | 165 | 2.0-549.0
ESR 1st hour (mm) (iii) | 20 | 5.0-60.0
ALT (U/L) | 21 | 5.0-109.0
AST (U/L) | 30.5 | 8.0-83.0
Total bilirubin (mg/dL) | 0.8 | 0.2-2.0
Direct bilirubin (mg/dL) (iv) | 0.3 | 0.1-0.7
Uric acid (mg/dL) | 5 | 1.2-21.0
Serum creatinine (mg/dL) | 1.1 | 0.6-3.0
Parameters | Mean | Standard deviation (SD)
Age at diagnosis (years) | 54.6 | 12.3
Hemoglobin (g/dL) | 12 | 2.6
Serum albumin (g/dL) (v) | 3.3 | 0.8
Table 1 Demographic and laboratory data of NHL patients at diagnosis
(i) Total number of patients is 30.
(ii) Total number of patients is 29.
(iii) Total number of patients is 21.
(iv) Total number of patients is 16.
(v) Total number of patients is 18.
At diagnosis, elevated LDH was detected in 18 patients (60.0%), nine patients were older than 60 years (31.0%), stage III or IV was diagnosed in 25 patients (83.3%), performance status was > 1 in 18 patients (10.6%), and more than one extra-nodal site was found in 11 patients (36.7%), as shown in Table 2. Regarding the International Prognostic Index (IPI) score, 7 patients were low risk (23.3%), 7 were low intermediate risk (23.3%), 13 were high intermediate risk (43.3%), and 3 were high risk (10.0%), as shown in Table 2.
Parameters | Number of patients (i) | Percentage of patients (%) (i)
Elevated LDH | 18 | 60
Age > 60 years (n=29) | 9 | 31
Stage III or IV | 25 | 83.3
Performance status > 1 | 18 | 10.6
More than one extra-nodal site | 11 | 36.7
IPI score: Low risk | 7 | 23.3
IPI score: Low intermediate risk | 7 | 23.3
IPI score: High intermediate risk | 13 | 43.3
IPI score: High risk | 3 | 10
Table 2 Risk score of NHL patients at diagnosis
(i) Total number of patients is 30.
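The IPI combines the five adverse factors listed in Table 2, one point each. The sketch below shows how such a score can be computed. The mapping from score to risk group (0-1 low, 2 low intermediate, 3 high intermediate, 4-5 high) is the commonly used standard IPI grouping and is an assumption here, since the article does not state its cut-offs; the function and parameter names are likewise hypothetical.

```python
def ipi_score(age, ldh_elevated, stage, performance_status, extranodal_sites):
    """Count the adverse factors (one point each, 0 to 5 in total).

    stage is the Ann Arbor stage as an integer from 1 to 4.
    """
    factors = [
        age > 60,
        bool(ldh_elevated),
        stage >= 3,              # stage III or IV
        performance_status > 1,
        extranodal_sites > 1,    # more than one extra-nodal site
    ]
    return sum(factors)

def ipi_risk_group(score):
    """Map a score to a risk group (assumed standard IPI cut-offs)."""
    if score <= 1:
        return "low"
    if score == 2:
        return "low intermediate"
    if score == 3:
        return "high intermediate"
    return "high"

# Hypothetical patient: 65 years, elevated LDH, stage IV, PS 1, one extra-nodal site.
score = ipi_score(age=65, ldh_elevated=True, stage=4,
                  performance_status=1, extranodal_sites=1)
print(score, ipi_risk_group(score))
```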
Staging results of NHL patients were as follows: stage IS in one patient (3.3%), stage II in two patients (6.7%), stage IIE in two patients (6.7%), stage III in six patients (20.0%), stage IIIS in three patients (10.0%), and stage IV in 16 patients (53.3%). At diagnosis, examination revealed lymphadenopathy in twenty-seven patients (90.0%), splenomegaly in seventeen patients (56.7%), and hepatomegaly in twelve patients (40.0%). By history, twelve patients were suffering from B symptoms. These data are summarized in Table 3.
Parameters | Number of patients (i) | Percentage of patients (%) (i)
Stage IS | 1 | 3.3
Stage II | 2 | 6.7
Stage IIE | 2 | 6.7
Stage III | 6 | 20
Stage IIIS | 3 | 10
Stage IV | 16 | 53.3
B symptoms (n=29) | 12 | 41.4
Lymphadenopathy | 27 | 90
Splenomegaly | 17 | 56.7
Hepatomegaly | 12 | 40
Table 3 Clinical data (Ann Arbor staging and organomegaly) of NHL patients at diagnosis
(i) Total number of patients is 30.
Patients received 4 cycles of chemotherapy before re-evaluation of response: three patients received cyclophosphamide, vincristine, and prednisone (COP) (10.3%); eleven patients received the rituximab (Rituxan), cyclophosphamide, doxorubicin hydrochloride, vincristine (Oncovin) and prednisolone (standard R-CHOP) regimen (36.0%); three patients received rituximab and reduced-dose CHOP (R-mini-CHOP) (10.3%); and twelve patients received dose-adjusted etoposide, vincristine, doxorubicin, cyclophosphamide and oral prednisone (DA-EPOCH) (41.4%), as shown in Table 4.
After 4 cycles of chemotherapy, the patients’ response to treatment was evaluated. Thirteen patients achieved complete response (48.2%), six patients achieved partial response (22.2%), seven patients showed no response or stable disease (25.9%), and two patients showed progression of the disease (7.4%), as shown in Table 4.
Parameters | Number of patients | Percentage of patients (%)
Type of treatment (i): COP | 3 | 10.3
Type of treatment (i): R-CHOP | 11 | 36
Type of treatment (i): R-mini-CHOP | 3 | 10.3
Type of treatment (i): DA-EPOCH | 12 | 41.4
Re-evaluation of response (ii): Complete response | 13 | 48.2
Re-evaluation of response (ii): Partial response | 6 | 22.2
Re-evaluation of response (ii): Progressive | 2 | 7.4
Re-evaluation of response (ii): No response or stable disease | 7 | 25.9
Table 4 Treatment type and response assessment
(i) Total number of patients is 29.
(ii) Total number of patients is 28.
The descriptive data for NHL patients post-therapy
Patients achieving complete response: Re-evaluation of the patients’ condition after chemotherapy revealed that 13 patients achieved complete response. All of them were free from B-symptoms (0.0%), and on examination none showed lymphadenopathy (0.0%), splenomegaly (0.0%) or hepatomegaly (0.0%). Among those who achieved complete response, performance status (PS) was zero in nine patients (69.2%) and one in four patients (30.8%); PS median was 1. All the patients with complete response were free from central nervous system (CNS) infiltration and extranodal site infiltration. Three patients showed elevated LDH (10.7%), as shown in Table 5.
Parameters | Complete response (i): number of patients | Complete response (i): percentage (%) | Partial response (ii): number of patients | Partial response (ii): percentage (%) | Progressive (iii): number of patients | Progressive (iii): percentage (%) | Stable disease (iv): number of patients | Stable disease (iv): percentage (%)
B-symptoms | 0 | 0.0 | 4 | 66.7 | 1 | 50.0 | 3 | 42.8
Lymphadenopathy | 0 | 0.0 | 3 | 50.0 | 2 | 100.0 | 4 | 57.0
Splenomegaly | 0 | 0.0 | 0 | 0.0 | 1 | 50.0 | 2 | 28.5
Hepatomegaly | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0
Performance status 0 | 9 | 69.2 | 1 | 16.7 | 1 | 50.0 | 2 | 28.5
Performance status 1 | 4 | 30.8 | 5 | 83.3 | 1 | 50.0 | 5 | 71.4
CNS infiltration | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0
Elevated LDH | 3 | 10.7 | 5 | 83.3 | 1 | 50.0 | 3 | 42.8
Extra nodal sites infiltration | 0 | 0.0 | 4 | 66.7 | 1 | 50.0 | 0 | 0.0
Performance status | Complete response (i) | Partial response (ii) | Progressive (iii) | Stable disease (iv)
Median (range) | 1 (0-1) | 1 (0-1) | 1.5 (0-1) | 1 (0-1)
Table 5 Clinical parameters of patients after treatment
(i) Total number of patients is 13.
(ii) Total number of patients is 6.
(iii) Total number of patients is 2.
(iv) Total number of patients is 7.
Patients achieving partial response: Re-evaluation of the patients’ condition after chemotherapy revealed that 6 patients achieved partial response, four of whom had B-symptoms (66.7%). On examination, three of them showed lymphadenopathy (50.0%), and none showed splenomegaly (0.0%) or hepatomegaly (0.0%). Among those who achieved partial response, PS was zero in one patient (16.7%) and one in five patients (83.3%); PS median was one. All the patients with partial response were free from CNS or extranodal infiltration (0.0%). Five patients showed elevated LDH (83.3%), as shown in Table 5.
Patients showing no response or stable disease: Re-evaluation of the patients’ condition after chemotherapy revealed 7 patients with no response or stable disease, three of whom had B-symptoms (42.8%). On examination, four of them showed lymphadenopathy (57.0%), two had splenomegaly (28.5%), and none had hepatomegaly (0.0%). Among those who showed no response or stable disease, PS was zero in two patients (28.5%) and one in five patients (71.4%); PS median was 1. All patients were free from CNS or extranodal infiltration (0.0%), and three patients showed elevated LDH (42.8%), as shown in Table 5.
Patients showing progression: Re-evaluation of the patients’ condition after chemotherapy revealed 2 patients with progression, one of whom was still suffering from B-symptoms (50.0%). On examination, both showed lymphadenopathy (100.0%) and one showed splenomegaly (50.0%). Both patients were free from hepatomegaly (0.0%). PS was zero in one of the patients with progressive disease (50.0%) and one in the other (50.0%); PS median was 1.5. All patients with progressive disease were free from CNS infiltration (0.0%), one patient had extranodal site infiltration, and one patient showed elevated LDH (50.0%), as shown in Table 5.
Analysis of FTIR results and its correlation with clinical data (Tables 6,7)
Using P values, the ranges 3500-2800, 1700-1600, 1580-1480, 1380-1325, 1240-1190, 1140-1000, and 1800-900 cm-1 were selected.15–17 An example of the measured FTIR spectra is illustrated in Figure 5.
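Restricting a spectrum to one of the selected ranges before training amounts to keeping only the data points whose wave number falls inside that range. The sketch below illustrates this with a placeholder spectrum: the range labels RG1 to RG7 follow Table 6, while the function name, the 4 cm-1 grid spacing and the zero-valued absorbance vector are assumptions made only so the example runs.

```python
# Selected wave number ranges in cm-1, stored as (low, high); labels as in Table 6.
RANGES_CM1 = {
    "RG1": (2800, 3500),
    "RG2": (1600, 1700),
    "RG3": (1480, 1580),
    "RG4": (1325, 1380),
    "RG5": (1190, 1240),
    "RG6": (1000, 1140),
    "RG7": (900, 1800),
}

def slice_range(wavenumbers, intensities, low, high):
    """Keep the (wave number, intensity) pairs with low <= wave number <= high."""
    return [(w, a) for w, a in zip(wavenumbers, intensities) if low <= w <= high]

# Placeholder spectrum covering 400-4000 cm-1 on a hypothetical 4 cm-1 grid.
wavenumbers = list(range(400, 4001, 4))
absorbance = [0.0 for _ in wavenumbers]  # stand-in values, not measured data

# One feature vector per range, ready to be passed to a classifier.
features = {
    name: [a for _, a in slice_range(wavenumbers, absorbance, low, high)]
    for name, (low, high) in RANGES_CM1.items()
}
print({name: len(vector) for name, vector in features.items()})
```

Narrow ranges such as RG5 yield only a handful of features per sample, which is worth keeping in mind when comparing their results with the broad RG7 window.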
Ranges (cm-1) | Range No. | True positive (TP) (%) | True negative (TN) (%) | False positive (FP) (%) | False negative (FN) (%) | Specificity (%) | Sensitivity (%) | Precision (%) | Accuracy (%)
3500-2800 | RG1 | 40.74 | 7.41 | 40.74 | 11.11 | 15.38 | 78.57 | 50.00 | 48.15
1700-1600 | RG2 | 3.70 | 11.11 | 37.04 | 48.15 | 23.08 | 7.14 | 9.09 | 14.81
1580-1480 | RG3 | 11.11 | 44.44 | 3.70 | 40.74 | 92.31 | 21.43 | 75.00 | 55.56
1380-1325 | RG4 | 29.63 | 14.81 | 33.33 | 22.22 | 30.77 | 57.14 | 47.06 | 44.44
1240-1190 | RG5 | 51.85 | 3.70 | 44.44 | 0.00 | 7.69 | 100 | 53.85 | 55.56
1140-1000 | RG6 | 51.85 | 7.41 | 40.74 | 0.00 | 15.38 | 100 | 56.00 | 59.26
1800-900 | RG7 | 29.63 | 14.81 | 33.33 | 22.22 | 30.77 | 57.14 | 47.06 | 44.44
Table 6 Confusion matrix parameters results of each range
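The derived columns of Table 6 can be recomputed from its TP, TN, FP and FN columns, which is a useful consistency check. The sketch below does this using the standard confusion matrix definitions; the function name and the tolerance are illustrative choices. Because the four input columns are percentages of the same total, the ratios are unaffected by working in percentages rather than counts. The values are consistent with 27 evaluated samples (for example, 40.74% is about 11/27), although the article does not state this count explicitly.

```python
# (TP, TN, FP, FN) in percent, copied from Table 6.
TABLE6 = {
    "RG1": (40.74, 7.41, 40.74, 11.11),
    "RG2": (3.70, 11.11, 37.04, 48.15),
    "RG3": (11.11, 44.44, 3.70, 40.74),
    "RG4": (29.63, 14.81, 33.33, 22.22),
    "RG5": (51.85, 3.70, 44.44, 0.00),
    "RG6": (51.85, 7.41, 40.74, 0.00),
    "RG7": (29.63, 14.81, 33.33, 22.22),
}

def metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, precision and accuracy, all in percent."""
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "precision": 100 * tp / (tp + fp),
        "accuracy": 100 * (tp + tn) / (tp + tn + fp + fn),
    }

for name, cells in TABLE6.items():
    m = metrics(*cells)
    print(name, {key: round(value, 2) for key, value in m.items()})
```

Small differences in the second decimal place relative to the table are expected, because the inputs here are already rounded percentages.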
Wave number ranges (cm-1) | Range no. | Biomolecular assignment
3500-2800 | RG1 | • N-H and O-H symmetric stretching of amide A band. • CH3 asymmetric stretching of lipid acyl chains. • CH2 asymmetric stretching of lipids. • C-H stretching of lipid acyl chains.
1700-1600 | RG2 | • C=O, C-N and C-N-N stretching of amide I of proteins; α-helical, β-pleated sheet, β-turns, random coils and side-chain structures. • C=O stretching of lipids.
1580-1480 | RG3 | • N-H bending and C-N stretching of amide II of proteins; α-helical, β-pleated sheet, unordered conformation structures. • CH3 bending of methyl groups of proteins.
1380-1325 | RG4 | • C-H deformation due to CH3/CH2 bending of groups in α and β anomers of lipids and proteins.
1240-1190 | RG5 | • Phosphate I asymmetric stretching of PO2 of phospholipids, nucleic acid and phosphate.
1140-1000 | RG6 | • C-O, C-C stretching, C-H bending, PO2 symmetric stretching of carbohydrates; deoxyribose/ribose, and nucleic acids; DNA, RNA.
1800-900 | RG7 | • Left-handed DNA helix (Z form). • C-O stretch of glucose. • C-C-N backbone. • C-C stretch. • Glucose. • Biomolecular assignment of RG2-6.
Table 7 Biomolecular assignments of ATR-FTIR spectrum bands of a whole serum dried film. Adapted from15–18
All datasets
The ensemble discriminant classifier with 5-fold cross validation was used, and the accuracy was 100%. The model predicted two cases as normal which, in the clinical data, were complete responses.
Range 3500-2800 cm-1 (RG1): For the range 3500 to 2800 cm-1, the linear SVM classifier was trained with 5-fold cross validation. It gave an accuracy of 48.15%, a sensitivity of 78.57% and a specificity of 15.38%. The model predicted as normal only two of the after-treatment cases that were complete responses in the clinical data, and it classified the other after-treatment cases as abnormal. In detail, three cases were detected as normal after treatment, of which two were partial responses and one was progressive in the clinical data. Training the linear SVM classifier on the same range with 4- and 10-fold cross validation gave the same results. However, with 2-fold cross validation, the results were as follows: accuracy 87.1%, TPR 89.3%, and FNR 85.3%.
Range 1700-1600 cm-1 (RG2): The SVM classifier with 5-fold cross validation was used, and it gave an accuracy of 14.81%, a sensitivity of 7.14% and a specificity of 23.08%. The model predicted 16 cases as normal; three of them were truly normal (true classes), while the others were false classes (not normal, but stationary, stable or partial response).
Range 1580-1480 cm-1 (RG3): In this range, accuracy was 55.56%, sensitivity 21.43% and specificity 92.31%. The model detected three samples as abnormal and the remaining samples as normal. Of these three, one case was a complete response and the other two were stable disease. In the clinical data, on the other hand, 12 cases were normal (complete response) and the other 9 cases were not normal.
Range 1380-1325 cm-1 (RG4): This range was used to train the SVM classifier with 5-fold cross validation. The confusion matrix results were as follows: accuracy 44.44%, sensitivity 57.14%, specificity 30.77%. The model predicted only 4 truly complete response cases and 2 partial responses as normal cases. It detected as normal two cases that were stable disease in the clinical data. Moreover, it detected as normal one case that was a partial response in the clinical data.
Range 1240-1190 cm-1 (RG5): This range was used to train the ensemble subspace discriminant with 5-fold cross validation. It gave an accuracy of 55.56%, a sensitivity of 100% and a specificity of 7.69%. The overall error was 11.3%. The model predicted only one normal case after therapy, which was actually a complete response in the clinical data.
Range 1140-1000 cm-1 (RG6): Using the ensemble discriminant with 5-fold cross validation, accuracy was 59.26%, sensitivity was 100% and specificity was 15.38%. The model detected only two truly normal cases and detected the remaining cases as abnormal.
Range 1800-900 cm-1 (RG7): Using the ensemble discriminant with 5-fold cross validation, the model gave an accuracy of 44.44%, a sensitivity of 57.14% and a specificity of 30.77%. The SVM gave the same accuracy. The model predicted ten after-therapy cases as normal, while only five cases were actually normal according to the clinical data. On the other hand, it predicted one case as normal while it was progressive. The error in this model was in three cases which were stable disease.
Regarding accuracy, the best results were seen in ranges 6 (RG6), 5 (RG5), 3 (RG3), and 1 (RG1), with percentages of 59.26%, 55.56%, 55.56% and 48.15%, respectively. The lowest accuracy was seen in range 2 (RG2), at 14.81%, as illustrated in Figure 3.
Regarding sensitivity, the best two ranges were range 6 (RG6) and range 5 (RG5), each with a sensitivity of 100%, while range 2 (RG2) had the lowest sensitivity (7.14%), as illustrated in Figure 3.
MicroRNAs (miRNAs) are less susceptible to degradation due to their small size. They act as regulators for gene expression by binding to mRNA containing their complementary sequence and suppressing the translation process.18,20 miRNAs were found to be a prognostic marker for DLBCL. miR21, miR23A, miR27A, miR34A and miR18A were found to be downregulated in patients with lower overall survival (OS), while low miR19, miR195, miRLET7G and miR181A were associated with shorter event free survival (EFS).4
In our study, RG5 and RG6 had the best TP, FN and accuracy results among all tested spectral ranges; TN, FP, specificity and precision were best for RG3; and sensitivity was best for RG1. RG5 and RG6 represent frequencies that are assigned to bonds found in nucleic acids. These findings indicate that, in whole serum samples of lymphoma patients, some of the most important changes are in the nucleic acids (see Figure 4, which illustrates the main idea of SVM).
Serum lactate dehydrogenase (LDH) and beta-2-microglobulin are often increased in DLBCL patients.5 LDH level is one of the five scoring factors used in the IPI.21 LDH structure comprises 40% alpha helices and 23% beta sheets,22 with the amide I region containing a substantial amount of C=O stretch of the polypeptide chain.23
RG2 represents the amide I region and the C=O stretches that are major structures of LDH. However, the RG2 confusion matrix parameter results were not as good as expected. This might be due to the interference of C=O stretching of lipids, along with other common protein secondary structures that are characteristic of RG2 spectra.
Beta-2-microglobulin (β2M) gene mutation was reported along with other immune-surveillance genes in DLBCL patients.24,25 About half of the amino acid residues in β2M participate in two large beta structures, one of four strands and the other of three, linked by a central disulfide bond.26 The notable confusion matrix parameter results for RG1 and RG3 might be due to the direct correlation between DLBCL and β2M.
The FTIR spectral ranges that could be recommended as a diagnostic and prognostic tool for DLBCL patients are 3500-2800, 1580-1480 and 1240-1000 cm-1. Those ranges had the most promising confusion matrix parameter results. The finding that the most important changes occur in the nucleic acids may have therapeutic implications. However, further interdisciplinary studies on larger numbers of patients are required to determine the major biomolecular changes that are related to lymphoma despite the differences in biological markers between patients according to the diseased organ.
None.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
©2023 Mabed, et al. This is an open access article distributed under the terms of the, which permits unrestricted use, distribution, and build upon your work non-commercially.