
__ MAMMOGRAM INTERPRETATION __

** Tuan Doan, Devin Rottiers, Andrea Tang **

**__Executive Summary__** This report analyzes observable characteristics in a mammogram that could be used to determine the need for a biopsy. Over 70% of biopsies are deemed unnecessary, and the proposed model would help reduce misclassifications of benign and malignant tumors. Using an Auto Classifier Node, the team attempted to determine the best model for predicting the severity of the tumors. However, the major finding was that four of the variables measured in an X-ray were all linked to a single component. The team’s hypothesis that age and mass margin are most important in determining cancer could not be sufficiently verified. Further research should be conducted using additional measures of cancer symptoms to improve the model. The team discussed these measures with medical professionals (see Appendices 1 and 2).

**__ Project Background __** A mammogram is an X-ray image of the breast used to check women (or men) for breast cancer, either periodically as a screening tool or after symptoms appear. The X-ray images make it possible to detect tumors that cannot be seen or felt, and they are also used to diagnose cancer once symptoms are present. It is recommended that most women and men perform regular breast self-exams (BSE) to find any changes in the size, shape, or feel of their breasts. Women over age 40 are advised to undergo yearly mammograms to check for changes. The exhibit below shows normal, benign, and cancerous mammograms.

Exhibit 1: Normal, benign, and cancerous mammograms

Mammograms are currently the most effective method of breast cancer screening. But, as the above exhibit shows, a benign cyst can look similar to a cancerous tumor on the mammogram and could potentially present other similar symptoms, so it is easy to see how interpretation of these results could be skewed. Annual screening mammograms miss up to 20% of breast cancers present at the time of screening. False negatives occur mainly because of breast density: the tissue that makes up the breast can have a density similar to that of tumors, making them harder to detect. Seventy percent of biopsies completed after an abnormal mammogram are said to be false positives. These occur mainly in younger women and in women with a personal or family history of breast cancer or breast cancer symptoms. A false positive mammogram can cause anxiety and other forms of psychological distress for the patient. Although it is important that healthcare professionals rule out potential cancer, it is extremely difficult for the patient to rely solely on her mammogram results and family history in the preliminary diagnosis of breast cancer.

**__ Business challenge __** A large share of secondary check-ups and biopsies end in benign outcomes. It is stated that 70% of biopsies are deemed unnecessary because the outcome is benign. Because this rate is so high, several computer-aided diagnosis systems have been proposed. These systems aim to assist physicians in deciding whether to order a breast biopsy or a short-term follow-up examination instead.

** __Business goal__ ** Our goal is to create a model that predicts how likely a mass seen in a mammogram is to be benign or malignant, based on characteristics of sample masses. The proposed model would supplement the current method of breast cancer screening and improve the accuracy of follow-up decisions after an abnormal mammogram.

**__ Hypothesis __** The team’s hypothesis is that age and mass margin are key predictors in determining whether a mass is malignant or benign.

**__ Data-Mining Goals __**

 * The project needs to be able to assist in the prediction of malignant tumors in abnormal mammograms
 * Identify the characteristics that are the best predictors of the prevalence of breast cancer
 * Determine the accuracy of the model based on Lift chart, Misclassification chart, and overall accuracy

**__Procedure__**

**Data Interpretation** The data set includes 961 instances of women who received biopsies after an abnormal mammogram. It includes 516 benign instances and 445 malignant instances. There are 6 attributes (1 goal field, 1 non-predictive, 4 predictive attributes), which are listed as follows:
1) BI-RADS Assessment: 1 to 5 (ordinal)
2) Age: patient's age in years (integer)
3) Shape: mass shape: round=1 oval=2 lobular=3 irregular=4 (nominal)
4) Mass margin: circumscribed=1 microlobulated=2 obscured=3 ill-defined=4 spiculated=5 (nominal)
5) Density: mass density high=1 iso=2 low=3 fat-containing=4 (ordinal)
6) Severity: benign=0 or malignant=1 (binominal)
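The attribute coding above can be written down as a small set of lookup tables. The following is a minimal sketch, not the team's actual tooling: the comma-separated row layout, the `?` marker for missing values, and the names `MassRecord`, `parse_record`, and `describe` are all assumptions made for illustration, and the sample row is made up.

```python
from dataclasses import dataclass
from typing import Optional

# Code tables copied from the attribute list in this report.
SHAPE = {1: "round", 2: "oval", 3: "lobular", 4: "irregular"}
MARGIN = {1: "circumscribed", 2: "microlobulated", 3: "obscured",
          4: "ill-defined", 5: "spiculated"}
DENSITY = {1: "high", 2: "iso", 3: "low", 4: "fat-containing"}
SEVERITY = {0: "benign", 1: "malignant"}


@dataclass
class MassRecord:
    """One row of the data set; None marks a missing attribute value."""
    birads: Optional[int]
    age: Optional[int]
    shape: Optional[int]
    margin: Optional[int]
    density: Optional[int]
    severity: int  # goal field


def parse_record(line: str, missing: str = "?") -> MassRecord:
    """Parse one comma-separated row (assumed layout, assumed missing marker)."""
    fields = [f.strip() for f in line.split(",")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    values = [None if f == missing else int(f) for f in fields]
    if values[5] is None:
        raise ValueError("severity (goal field) must be present")
    return MassRecord(*values)


def describe(rec: MassRecord) -> dict:
    """Translate the numeric codes into the labels used in the report."""
    return {
        "shape": SHAPE.get(rec.shape, "missing"),
        "margin": MARGIN.get(rec.margin, "missing"),
        "density": DENSITY.get(rec.density, "missing"),
        "severity": SEVERITY[rec.severity],
    }


if __name__ == "__main__":
    rec = parse_record("5,67,3,5,3,1")  # made-up example row
    print(describe(rec))
```

Keeping the codes as integers and translating them only for display leaves the ordinal attributes (BI-RADS, density) usable in numeric models while still making the nominal ones (shape, margin) readable.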

// BI-RADS (Breast Imaging-Reporting and Data System): // BI-RADS refers to a number, based on presumed malignancy, that a radiologist applies after interpreting a mammogram. It is categorized from 1 to 6:
1) Negative
2) Benign Finding(s)
3) Probably Benign
4) Suspicious Abnormality
5) Highly Suggestive of Malignancy
6) Known Biopsy-Proven Malignancy

// SHAPE //

|| Round || Oval || Lobular || Irregular ||
|| [[image:Untitled2.png width="134" height="185"]] || [[image:Untitled3.png width="187" height="187"]] || [[image:Untitled4.png width="146" height="188"]] || [[image:Untitled5.png width="149" height="186"]] ||

// MASS MARGIN //

 * Circumscribed: Well-defined
 * Microlobulated: Having many small lobes
 * Obscured: Partially hidden by tissue
 * Ill-defined: Blurry
 * Spiculated: Having small ‘needle-like’ sections

// DENSITY // Density refers to the density of the mass and the amount of fatty elements present. A mass (cyst or tumor) becomes more suspicious for breast cancer when it appears denser, because higher density suggests that the mass is composed of malignant cancer cells.

Some attribute values are missing from the data set, and records with missing values will be discarded. This will not compromise the results of the model; there is sufficient data after pre-processing.

** Data Preparation ** The data must be cleaned to achieve higher predictive accuracy across the various models tested. Data preparation requires the following steps:
1. Noise removal: Discard outliers and handle missing data
2. Transform: Examine the distribution of the data and standardize skewed variables (z-scores) if necessary
3. Feature select: Identify variables that are correlated and show significance in predicting cancer severity
4. Partition: Divide the dataset into training and validation sets
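Three of the steps above (handling missing data, standardizing, and partitioning) can be sketched in a few lines. This is an illustrative sketch only, not the stream the team built: the function names, the 70/30 split, and the fixed random seed are assumptions, and outlier removal and feature selection are left out. Note that a z-score rescales a variable to mean 0 and standard deviation 1 but does not change the shape of its distribution.

```python
import random
from statistics import mean, pstdev


def drop_missing(rows):
    """Keep only rows in which every attribute is present (None = missing)."""
    return [r for r in rows if all(v is not None for v in r)]


def zscore_columns(rows, columns):
    """Return a copy of rows with the given column indexes standardized."""
    out = [list(r) for r in rows]
    for c in columns:
        col = [r[c] for r in rows]
        mu, sigma = mean(col), pstdev(col)
        for r in out:
            # A constant column carries no information; map it to 0.0.
            r[c] = (r[c] - mu) / sigma if sigma > 0 else 0.0
    return out


def partition(rows, train_fraction=0.7, seed=0):
    """Shuffle reproducibly and split into (training, validation) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Made-up rows: (age, severity)
    rows = [(50, 1), (None, 1), (70, 0), (60, 0)]
    clean = drop_missing(rows)
    scaled = zscore_columns(clean, columns=[0])
    train, valid = partition(scaled)
    print(len(clean), len(train), len(valid))
```

Only the predictor columns should be passed to `zscore_columns`; the goal field stays as 0 or 1.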

** Model Development ** After preprocessing and cleaning the data, a total of 393 records were used in the model development. The team conducted a correlation analysis, and all variables appeared to be highly correlated except for density. Two streams were built: one with standardized data and one with non-standardized data. In the first stream, PCA Factor reduced the five variables to two components. Surprisingly, density was the second component. In the second stream, Feature Select confirmed the significance of four variables in determining the severity of a tumor. However, the value for the fifth variable, density, was close to 1, so it was not eliminated.
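A correlation analysis of this kind can be illustrated with a plain Pearson coefficient. The sketch below is an assumption about the method, since the report does not name the coefficient used, and the column values are made up. Pearson correlation on the nominal codes for shape and margin is only a rough guide, because the numeric order of those codes is a coding convention.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return sxy / sqrt(sxx * syy)


def correlation_matrix(columns):
    """Map every (name_a, name_b) pair to its Pearson correlation."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}


if __name__ == "__main__":
    # Made-up values, for illustration only.
    columns = {
        "age": [35, 48, 52, 61, 70, 74],
        "margin": [1, 1, 3, 4, 5, 5],
        "density": [3, 2, 3, 3, 2, 3],
    }
    for pair, r in correlation_matrix(columns).items():
        print(pair, round(r, 3))
```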

After partitioning the data into training and validation sets, a distribution chart indicated an imbalance toward 0 (benign) occurrences in the data, which prompted the team to apply a balance factor of 4.038 to even out the class distribution. An Auto Classifier then ran 8 different models on both the standardized data and the non-standardized data.
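The report does not say how the 4.038 factor was derived. One common way to obtain such a factor, shown here purely as an assumption, is to boost each class by the ratio of the largest class count to its own count; under that assumption a factor of 4.038 would mean the training data held roughly four benign records for every malignant one. The function names and the toy labels are hypothetical.

```python
from collections import Counter


def balance_factors(labels):
    """Boost factor per class: largest class count / this class's count."""
    counts = Counter(labels)
    largest = max(counts.values())
    return {label: largest / n for label, n in counts.items()}


def boosted_counts(labels, factors):
    """Effective class sizes after each record is weighted by its factor."""
    counts = Counter(labels)
    return {label: n * factors[label] for label, n in counts.items()}


if __name__ == "__main__":
    labels = [0] * 8 + [1] * 2  # made-up training labels
    factors = balance_factors(labels)
    print(factors)
    print(boosted_counts(labels, factors))
```

Balancing should be applied to the training partition only, so that the validation set still reflects the original class mix.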



** Model Performance Evaluation ** The model for determining whether a tumor was malignant or benign was based on observable characteristics seen on a mammogram. The Auto Classifier identified CART and Logistic Regression as the most accurate models. The key metrics used to decide which models to analyze were lift, overall accuracy, and misclassification costs.

Before balancing, we had an overall accuracy of about 90%. Because the data contained many more instances with severity classified as “0,” the accuracy of predicting a “0” instance was high, which inflated this figure. We used a classification matrix to compare the two models, one with PCA (standardized) and one with Feature Select (non-standardized). The team also examined lift charts, which indicate a model’s performance compared to a random sample.
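The metrics discussed here can be made concrete with a short sketch. It is illustrative only: the function names are hypothetical, the data are made up, and lift is computed over the top-scoring 30% of records, which is an assumption because the report does not state the percentile used. The example also shows why accuracy alone misleads on imbalanced data: a model that always predicts “0” reaches 90% accuracy when 90% of the cases are benign, while detecting no malignant tumors.

```python
def confusion_matrix(actual, predicted):
    """Counts keyed by (actual, predicted) for the classes 0 and 1."""
    m = {(a, p): 0 for a in (0, 1) for p in (0, 1)}
    for a, p in zip(actual, predicted):
        m[(a, p)] += 1
    return m


def accuracy(m):
    """Share of records on the diagonal of the classification matrix."""
    return (m[(0, 0)] + m[(1, 1)]) / sum(m.values())


def lift(actual, scores, top_fraction=0.3):
    """Malignancy rate among the top-scored records / overall rate."""
    base_rate = sum(actual) / len(actual)
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no positive cases")
    ranked = sorted(zip(scores, actual), key=lambda t: t[0], reverse=True)
    k = max(1, round(len(ranked) * top_fraction))
    top_rate = sum(a for _, a in ranked[:k]) / k
    return top_rate / base_rate


if __name__ == "__main__":
    actual = [0] * 9 + [1]      # made-up, heavily imbalanced sample
    always_benign = [0] * 10    # a model that never predicts malignant
    m = confusion_matrix(actual, always_benign)
    print(accuracy(m), m[(1, 0)])  # high accuracy, yet the malignant case is missed
```

A lift of 1.0 means the model ranks cases no better than random selection; values above 1.0 mean malignant cases are concentrated among the highest-scored records.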

**__ Analyzing Results __** ** Model Performance Comparison **

1. K-Nearest Neighbor: The K-NN model built with Feature Select had a lift of 2.208 and an overall accuracy of 77.84%. After standardizing the data, the K-NN model had a lift of 2.227 and an overall accuracy of 76.05%. We chose not to use this model because other models were more accurate in predicting severity.

2. Logistic regression: We fit a logistic regression on both the standardized data and the non-standardized data. Data that had been standardized produced a higher accuracy model compared with the non-standardized data. The model had the highest overall classification accuracy, 83.23%, and a lift of 2.338 after Feature Selection.

Logistic model misclassification chart (Feature Select)

Logistic Regression: Lift Chart

3. Classification trees: CART was among our most accurate models, with 82.64% accuracy and a lift of 2.415 for non-standardized data. The PCA Factor model had an accuracy of 81.44% and a lift of 2.22.

CART Misclassification Chart (PCA) CART Lift Chart

4. Neural Nets: The neural network model was discarded by the Auto Classifier node. This may be because there are no hidden layers between the input and output layers.

5. Discriminant Analysis: Using standardized data, the discriminant analysis produced a model with 79.641% accuracy and a lift of 2.338, higher than that of the CART model on standardized data (2.22). However, it tends to classify 1s as 0s: in 89 cases, malignant tumors (1s) were classified as benign (0s).

**Relationship, Model, and Focus** The team determined that the models are inconclusive after discovering a high level of redundancy in four of the variables. PCA Factor showed that Age, Margin, Shape, and BI-RADS load on only one component. Models using standardized data reduced the variables to only two components, which we feel is insufficient to predict whether a tumor is benign or malignant. Regardless, the CART model does indicate certain risk factors that could help a doctor assess the likelihood of cancer. For example, if a patient is over 40 years old and the shape is either round or lobular, the risk of cancer increases.

**__Project Continuation__** To conduct further research, the team met with Teresa Rosetto, RN, and Dr. Phillip Newman, M.D. Their interviews are recorded in the attached appendices. The team learned that there are multiple tests and additional information that can be gathered to further refine our model. Dr. Newman stated that high-risk groups can be determined based on medical history (pregnancies, menstruation and menopause ages, method of contraception, etc.) and family cancer history. Genetic information plays a big role in determining the onset of breast cancer; however, gathering it is a recent and expensive development. Many women with a family history of breast cancer are hesitant, for a number of reasons, to complete genetic testing to find out whether they carry a BRCA1 or BRCA2 gene mutation. An antigen called CA15-3, when elevated, can help detect breast cancer as well as ovarian cancer. Changes in the appearance, shape, or feel of the breast can also be good indicators. Based on these findings, a stronger model could be created that includes the 5 predictors used in the model, along with:


 * Age of onset of menstruation
 * Age of onset of menopause (if relevant)
 * Number of pregnancies (if relevant)
 * Contraceptive methods (if relevant)
 * Number of direct relatives with breast cancer
 * CA15-3 level
 * BRCA1/BRCA2 gene
 * White blood cell count
 * Blood Protein test results

**__Conclusion__** Through our modeling, the team has identified the most important variables in predicting malignancy of tumors. A BI-RADS of 4 and above, indicating suspicious abnormality, is likely to predict breast cancer. However, a BI-RADS assignment is subjective because it is assigned by a physician, and physicians can differ in opinion and in experience with cancerous tumors. Age seems to be a better indicator, with definite decision points at age 40 and at age 65 and older. With regard to shape, lobular and irregular masses are causes for concern and should be biopsied. However, some of the measurements seen in an X-ray or mammogram overlap, or correlate, which means that the confidence level in predicting severity is lower than originally thought. The team sought more information from health care professionals to define other predictors that could also be important in determining severity and that may not be as correlated with or dependent on each other.

According to Dr. Newman, a surgical oncologist, doctors must meet a standard of care with their surgeries. He stated that if 100% of the biopsies result in malignant diagnoses, then they are not carrying out enough surgeries. It is important to maintain a certain error rate to ensure that the doctor is taking a proactive approach, rather than waiting until it is too late to take action. He mentioned that an error rate of about 10% is a good industry average to strive for. Hopefully, with the help of a statistical or computer-aided model, the error rate of breast cancer biopsies can be significantly decreased from 70%.

Appendix 1: Interview with Teresa Rosetto, Registered Nurse, Oregon Health Science University

//Ms. Rosetto: “An oncologist can look at a lot of things like urine cytology to see how well your organs are working to determine cancer diagnosis. Blood protein tests can see how good your immunity is, basic CBCs (Complete Blood Count), as well as watching white blood cell counts also tests immunity. Urological oncologists order tests to look at liver enzymes, and there are a lot of medical imaging tests that can be completed.”//

Appendix 2: Interview with Dr. Phillip Newman, MD Surgical Oncologist – Retired 2008

Dr. Newman: “//You are looking to define cancer in women which is a moving target – all of medicine we have a decision point, especially in which operative intervention is needed, to find a standard of intervention or care. I started in the 1960s, there were no CT scanners, ultrasounds, etc. so the diagnosis resided on clinical observations mostly. Blood test - to see if white blood cell elevated which usually means that the immune system is working. With cancer, it is important to operate quickly, but you don’t want to operate on patients that don’t need it. Most hospitals gauge clinical “experiences,” for example, if 100% of the individuals you operated on had cancer, you weren’t doing enough, you were waiting too long until you were sure and if 30% had normal biopsies, you are operating on too many of false positives.//

//The standard of care, if you were evaluating the normal practices or to see if the hospital is providing adequate care is 10%. As you increase the accuracy you can hone that down to about 5%, but you always want to have some false negatives to know that you intervene soon enough, and to save the patients from full blown disease. You don’t want to operate on a lot of people. Fortunately for men, they have this sensitive PSA (Prostate Specific Antigen) in detecting prostate cancer, men over 50 have to get tested for. A huge number of men have had to get biopsies and testectomies. Further research into the natural history of prostate cancer has found that a lot of the PSA doesn’t progress to the actual cancer. There is the same problem in breast cancer in that there is an unpredictable like history, some breast cancers don’t even need to be discovered because they may not even progress, and some are already disseminated by the time you find them. So rushing to make a diagnosis might not be helpful.// //The main concept of these two diseases is that there is a mutation, and then an orderly progression to the disease in the basal membrane of the cells in the organ.//

//There are definitely some ways that you can increase efficiency of normal practices of biopsies. I would start by looking at the characteristics of the mammographic changes in the way it looks on x-ray. Also whether or not the woman has had babies and how many, or been pregnant before, or their method of contraception. You should look at the age of onset of menstruation or menopause, family cancer history, or genetic information (There also may be a link between BRCA 1 and BRCA 2, the “breast cancer gene,” with other cancer). Any of these characteristics would put women in high risk groups. There are MRI studies now for women who have really dense breasts, which are harder to decipher through a traditional mammogram or ultrasound. Clinical Characteristics are also good predictors, whether you can feel it under the skin, if it’s tender, skin changes, skin puckering over the mass. Also, there are marker studies that are used, like the PSA antigen I mentioned before, for breast cancer. These are called cancer antigen 15-3 or CA15-3, which is used for monitoring breast cancer in levels are increased, and for early detection of ovarian cancer which could potentially be correlated with breast cancer. This antigen is a normal protein produced by the breast tissue, but when cancer is present, the increase in CA 15-3 can increase the number of cancer cells.”//