
MANAGERIAL BUSINESS ANALYTICS | MGT782 |

DATASETS ASSIGNMENT

TITLE:

FACTORS THAT HELP PREDICT HEART DISEASE

Prepared by:

NAME                                        MATRIX NO
KHYIRIN KHALISAH BINTI MOHAMAD NIZAM        2022845002
NUR NABERLA NATASHA BINTI EDDI MARZUKI      2022243552

4 DECEMBER 2022

PREPARED FOR: DR TAN PECK LEONG

TABLE OF CONTENTS

INTRODUCTION

OBJECTIVES

DESCRIPTIVE STATISTIC
    A. UNIVARIATE
    B. BIVARIATE

HYPOTHESIS

CONCLUSION & RECOMMENDATION

REFERENCES

INTRODUCTION

The dataset chosen for this assignment is the heart disease prediction dataset prepared by Kamil
Pytlak. It is drawn from the 2020 annual CDC survey of adults about their health status. The data
originate from the CDC's Behavioral Risk Factor Surveillance System (BRFSS), which conducts annual
telephone surveys to collect information on Americans' health conditions. As stated by the CDC:
"BRFSS, which started out with 15 states in 1984, currently gathers data from all 50 states, the
District of Columbia, and three U.S. territories. The BRFSS is the largest continually running
health survey system in the world." Data from 2020 are included in the most recent dataset.

Heart disease is one of the leading causes of death for people of most races in the US, as stated by
the CDC. About half of all Americans (47%) have at least one of the three main risk factors for heart
disease: high blood pressure, high cholesterol, and smoking. Other significant indicators are
diabetes, obesity (high BMI), a lack of physical activity, and excessive alcohol consumption. In
the field of medicine, it is crucial to identify and combat the factors that have the biggest
influence on heart disease. Therefore, these factors are the independent variables in this dataset.

Most columns consist of questions about respondents' health status, such as "Did you have a
stroke?", "Have you ever had diabetes?" or "Do you have a lifetime cigarette smoking total of at
least 100? (5 packs equal 100 cigarettes)". Since Kamil found numerous variables (questions) in this
dataset that either directly or indirectly affect heart disease, he chose to extract the most
pertinent ones and clean the data so that it can be used in machine learning applications. Kamil has
also built an app based on machine learning models. He created this app for those who might have
concerns about the health of their heart, to assist them in making a diagnosis. Hence, this app can
quickly indicate whether someone has a high risk of developing heart disease through simple
yes-or-no questions.

The original survey data from more than 300,000 US residents in 2020 were used to build a logistic
regression model using an under-sampling strategy. According to Kamil, this model proved superior
to a random forest, and the application is built on it because it reaches an accuracy of
approximately 80%, which is fairly impressive. Moreover, the number of respondents was narrowed
down, so the output indicates 810 entries (respondents) with 18 columns. There are no null values;
14 features are numeric and 4 are categorical.
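
The under-sampling strategy mentioned above can be illustrated with a minimal sketch. It assumes simple random under-sampling of the majority class (the source does not say which variant was used), and the `undersample` helper and the toy records are hypothetical, chosen only for illustration.

```python
import random
from collections import Counter

def undersample(rows, label_of, seed=0):
    """Randomly drop rows from larger classes until every class has as
    many rows as the smallest class (simple random under-sampling)."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    smallest = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced

# Toy data (hypothetical): 3 respondents with heart disease, 9 without.
rows = [{"id": i, "heart_disease": "Yes" if i < 3 else "No"} for i in range(12)]
balanced = undersample(rows, lambda r: r["heart_disease"])
counts = Counter(r["heart_disease"] for r in balanced)
print(counts)
```

Balancing the classes this way keeps a model from simply predicting the majority answer, at the cost of discarding some majority-class records.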
OBJECTIVES

According to Nagamani et al. (2019), heart disease can be caused by an unhealthy lifestyle, smoking,
alcohol, and a high fat intake, which can lead to hypertension. A healthy lifestyle and early
detection can help prevent heart disease. The primary challenge in today's healthcare is to provide
high-quality services and accurate diagnoses (Golande & Kumar, 2019). The factors mentioned in the
dataset are smoking, alcohol drinking, diabetes, difficulty walking and climbing stairs, general
health, asthma, kidney disease, and skin cancer.

As stated in the 2014 Surgeon General's Report on Smoking and Health, smoking is a major cause
of cardiovascular disease (CVD) and accounts for one out of every four CVD deaths. Even in people
with normal heart function, binge drinking is linked to the development of acute cardiac arrhythmia,
which is an irregular heartbeat (Day & Rudd, 2019). Diabetes is related to heart disease because
high blood glucose can damage the blood vessels and the nerves that control the heart and blood
vessels; over time, this damage can lead to heart disease (CDC, 2020).

Furthermore, most cases of difficulty walking and climbing stairs are related to lifestyle habits
such as a lack of physical activity, among other causes. For general health, risk factors that
cannot be changed include age, gender, being postmenopausal, and having a premature family history
of heart disease (Agostino et al., 2020). According to Lincoff (2018), the distinction between
asthma and heart failure is significant because their treatments are not the same. However, having
asthma symptoms can double the risk of cardiovascular disease, heart attack, and stroke (Lincoff,
2018).

Objectives:

● To determine the most significant factor that contributes to heart disease.

● To determine the age at which symptoms of heart disease can be detected

● To determine which gender is more prone to getting heart disease

● To determine the pattern between general health and heart disease

● To determine the pattern between smoking and heart disease

● To determine the pattern between alcohol and heart disease

● To determine the pattern between diabetes and heart disease

● To determine the pattern between asthma and heart disease

DESCRIPTIVE STATISTIC

(A) UNIVARIATE ANALYSIS

The dataset includes observations on 810 people who responded to the survey used for heart disease
prediction. The case study shows the factors that may cause heart disease, so that they can be used
to predict or diagnose heart disease in an individual.

NUMBER OF PARTICIPANTS

Age Range        Frequency    Percentage
49 and below     67           8%
50-54            40           5%
55-59            51           6%
60-64            92           11%
65-69            141          17%
70-74            140          17%
75-79            126          16%
80 or older      153          19%
Total            810          100%

Table 1

Figure 1

For uniformity, the age ranges from 18-24 to 45-49 are grouped into one category because each had
only single-digit counts. Thus, 67 (8%) respondents are aged 49 and below. Based on Figure 1, the
largest group is respondents aged 80 or older, with 153 (19%) respondents, while the smallest is
those aged 50-54, with 40 (5%) respondents. According to the graph, the number of respondents rises
with age from there: 51 (6%) respondents are aged 55-59, followed by a major jump to 92 (11%) aged
60-64, 141 (17%) aged 65-69, and 140 (17%) aged 70-74. There is a dip at ages 75-79, with 126 (16%)
respondents, but the number increases again in the next age group.
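
The percentages in Table 1 follow directly from the frequencies. As a minimal sketch, assuming the counts are typed in by hand from the table (the variable names are illustrative, not from the original analysis):

```python
# Frequencies copied from Table 1; the youngest ranges are already pooled.
age_counts = {
    "49 and below": 67,
    "50-54": 40,
    "55-59": 51,
    "60-64": 92,
    "65-69": 141,
    "70-74": 140,
    "75-79": 126,
    "80 or older": 153,
}

total = sum(age_counts.values())
# Percentage of all respondents in each group, rounded to whole numbers.
percentages = {group: round(100 * n / total) for group, n in age_counts.items()}

for group, n in age_counts.items():
    print(f"{group:<14}{n:>5}{percentages[group]:>5}%")
print(f"{'Total':<14}{total:>5}")
```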

GENDER OF PARTICIPANTS

Gender Frequency Percentage


Female 533 66%
Male 277 34%
Total 810 100%

Table 2

Based on Table 2, there is an imbalance between the number of male and female participants.
There are 533 (66%) female respondents and 277 (34%) male respondents. This could lead to an
inaccurate representation of the male and female populations.

HEART DISEASE

Heart Disease Frequency Percentage


Yes 114 14%
No 696 86%
Total 810 100%

Table 3

Based on Table 3, 696 (86%) of the respondents have not been diagnosed with heart disease. On
the other hand, 114 (14%) of respondents answered that they are, or believe they are, diagnosed with
heart disease. It is important to note that the responses were collected through a machine
learning app: those affected were not diagnosed by a doctor, although the app's results are
reported to be about 80% accurate.

GENERAL HEALTH

General Health Frequency Percentage


Excellent 82 10%
Very Good 239 29%
Good 276 34%
Fair 150 19%
Poor 63 8%
Total 810 100%

Table 4

Based on Table 4, the largest group is respondents in good general health, at 276 (34%), while the
smallest is those in poor general health, at 63 (8%). Of the remaining categories, 82 (10%)
respondents report excellent health, 239 (29%) very good health, and 150 (19%) fair health.

SMOKING

Smoking Frequency Percentage


Yes 351 43%
No 459 57%
Total 810 100%
Table 5

Based on Table 5, a minority of respondents, 351 (43%), smoke cigarettes. On the other hand,
459 (57%) respondents do not smoke cigarettes.

ALCOHOL

Alcohol Frequency Percentage


Yes 27 3%
No 783 97%
Total 810 100%
Table 6

Based on Table 6, only a small share of respondents, 27 (3%), consume alcohol. On the other hand,
783 (97%) respondents do not consume alcohol.

DIABETIC

Diabetic Frequency Percentage


No 624 77%
Yes 186 23%
Total 810 100%
Table 7

Based on Table 7, 186 (23%) respondents have diabetes. On the other hand, 624 (77%) respondents do
not have diabetes.

ASTHMA

Asthma Frequency Percentage


No 687 85%
Yes 123 15%
Total 810 100%

Table 8

Based on Table 8, 123 (15%) respondents have asthma. On the other hand, 687 (85%) respondents do
not have asthma.

(B) BIVARIATE ANALYSIS

In this analysis, we compare the results in order to support early prediction of heart disease and
to determine which factor plays the largest part in its diagnosis. From a demographic perspective,
we look at the age and gender of respondents. Next, we consider bad habits respondents may have,
such as smoking and drinking alcohol, that can lead to a diagnosis. Lastly, the analysis looks at
general health and existing illnesses respondents may have, such as diabetes and asthma.

Age & Heart Disease

Figure 2

As shown in Figure 2, none of the respondents aged 49 and below reported heart disease. Those who do
develop it at a younger age most likely do so for genetic reasons. There are modifiable risk factors
(those that can be changed) and non-modifiable risk factors (those that cannot be changed). Risk
factors that you cannot change include your age, gender, being postmenopausal, and having a
premature family history of heart disease. However, by making heart-healthy lifestyle changes and
taking medications, you can reduce your overall risk of having a heart attack or stroke (Agostino et
al., 2020).

Based on Figure 2, we can see a steady rise in respondents answering yes to having heart disease.
There is 1 respondent aged 50-54, 4 aged 55-59, 10 aged 60-64, 18 aged 65-69, 27 aged 70-74, 30 aged
75-79, and 24 aged 80 or older. Of all the age ranges, 75-79 has the highest number of recorded
heart disease cases. According to Curtis et al. (2018), age is a major factor in the decline of
cardiovascular functionality, which leads to an increased risk of cardiovascular disease (CVD) in
older adults. Adults over the age of 65 are more likely to suffer from cardiovascular disease than
younger people. Aging can cause changes in the heart and blood vessels that increase a person's
chance of developing cardiovascular disease. Research has also examined how lifestyle factors such
as physical activity and diet affect the "rate of aging" in a healthy heart and arteries. Other
organ systems, such as the muscles, kidneys, and lungs, are likely to age and contribute to heart
disease (Rodgers et al., 2019). We conclude that symptoms of heart disease can be detected from the
age of 50 onwards.
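
Because the age groups differ in size, it can help to look at the share of each group reporting heart disease as well as the raw counts. The sketch below combines the group sizes from Table 1 with the counts read from Figure 2; the zero for the youngest group is an assumption inferred from the listed counts already summing to the 114 diagnosed respondents.

```python
age_groups = ["49 and below", "50-54", "55-59", "60-64",
              "65-69", "70-74", "75-79", "80 or older"]
respondents = [67, 40, 51, 92, 141, 140, 126, 153]   # Table 1
diagnosed = [0, 1, 4, 10, 18, 27, 30, 24]            # Figure 2 (first value inferred)

# Share of each age group that reports heart disease.
rates = {g: d / n for g, d, n in zip(age_groups, diagnosed, respondents)}
highest = max(rates, key=rates.get)

for group in age_groups:
    print(f"{group:<14}{rates[group]:6.1%}")
print("Highest rate:", highest)
```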

Factors                 Heart Disease
                        Diagnosed    Not Diagnosed
Gender
  Male                  47           230
  Female                67           466
Smoking
  Smoker                66           285
  Non-Smoker            48           411
Alcohol
  Drinker               2            25
  Non-Drinker           112          671
Diabetes
  Diagnosed             48           138
  Not Diagnosed         66           558
Asthma
  Asthmatic             22           101
  Not Asthmatic         92           595
General Health
  Excellent             2            80
  Very Good             10           229
  Good                  41           235
  Fair                  36           114
  Poor                  25           38

Table 9

Table 9 shows the bivariate analysis of each variable of interest against heart disease. The
results can help us make early predictions of a heart disease diagnosis.
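
One way to read Table 9 is to convert each row into the share of that group diagnosed with heart disease, and then compare the two levels of each factor as a ratio. The following is a sketch under that assumption, not the method used in the original analysis. The alcohol counts are entered so that they sum to the 27 drinkers and 783 non-drinkers reported in the univariate analysis.

```python
# (diagnosed, not diagnosed) per level of each two-level factor in Table 9.
table9 = {
    "Gender":   {"Male": (47, 230), "Female": (67, 466)},
    "Smoking":  {"Smoker": (66, 285), "Non-Smoker": (48, 411)},
    "Alcohol":  {"Drinker": (2, 25), "Non-Drinker": (112, 671)},
    "Diabetes": {"Diagnosed": (48, 138), "Not Diagnosed": (66, 558)},
    "Asthma":   {"Asthmatic": (22, 101), "Not Asthmatic": (92, 595)},
}

def rate(cell):
    """Share of a group diagnosed with heart disease."""
    yes, no = cell
    return yes / (yes + no)

risk_ratios = {}
for factor, levels in table9.items():
    (name_a, cell_a), (name_b, cell_b) = levels.items()
    # Ratio of the first level's rate to the second level's rate.
    risk_ratios[factor] = rate(cell_a) / rate(cell_b)
    print(f"{factor:<9}{name_a}: {rate(cell_a):.1%}  "
          f"{name_b}: {rate(cell_b):.1%}  ratio: {risk_ratios[factor]:.2f}")
```

A ratio above 1 means the first level listed has the higher share of diagnosed respondents. Rates computed this way can point in a different direction from the raw counts when the two groups differ greatly in size.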

Gender & Heart Disease


As noted earlier, there is a large difference between the numbers of male and female respondents.
Based on Table 9, 47 male respondents are diagnosed with heart disease, while 67 female respondents
are diagnosed with heart disease. For a more accurate reading, an equal number of male and female
respondents would be necessary. Based on the available counts, both men and women are at risk of
getting heart disease, with the raw numbers leaning more towards female respondents.

Smoking & Heart Disease


Based on the table above, 411 respondents are non-smokers without heart disease, while 48 are
non-smokers diagnosed with heart disease. On the other hand, 285 respondents are smokers without
heart disease, while 66 are smokers diagnosed with heart disease. That is 66 of 351 smokers (about
19%) against 48 of 459 non-smokers (about 10%). Thus, those who smoke are at higher risk of getting
heart disease.
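
The strength of the smoking association can also be summarised with an odds ratio. The sketch below is an illustration rather than part of the original analysis: it uses the smoking counts from Table 9 and assumes the common log-based (Woolf) approximation for a 95% confidence interval. The `odds_ratio_ci` helper is a hypothetical name.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio for the 2x2 table [[a, b], [c, d]] with an approximate
    confidence interval built on the log scale (z=1.96 for 95%)."""
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Smokers: 66 diagnosed, 285 not; non-smokers: 48 diagnosed, 411 not.
odds_ratio, lower, upper = odds_ratio_ci(66, 285, 48, 411)
print(f"Odds ratio: {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that lies entirely above 1 is consistent with higher odds of heart disease among smokers in this sample.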

Alcohol & Heart Disease

Based on the table above, 671 respondents do not drink alcohol and do not have heart disease, while
112 do not drink alcohol but have heart disease. On the other hand, 25 respondents drink alcohol but
do not have heart disease, while 2 drink alcohol and have heart disease. These counts add up to the
783 non-drinkers and 27 drinkers among the respondents. Thus, those who drink alcohol do not appear
prone to getting heart disease.

Diabetic & Heart Disease
Based on the table above, 558 respondents do not have diabetes and do not have heart disease, while
66 do not have diabetes but have heart disease. On the other hand, 138 respondents have diabetes
but do not have heart disease, while 48 have both diabetes and heart disease. Thus, going by the raw
counts, those who have diabetes are not shown to be prone to getting heart disease.

Asthma & Heart Disease

Based on the table above, 595 respondents do not have asthma and do not have heart disease, while
92 do not have asthma but have heart disease. On the other hand, 101 respondents have asthma but do
not have heart disease, while 22 have both asthma and heart disease. Thus, going by the raw counts,
those who have asthma are not shown to be prone to getting heart disease.

General Health & Heart Disease

Based on the table above, 80 respondents with excellent health are not diagnosed with heart disease,
while 2 with excellent health are diagnosed. Of respondents with very good health, 229 are not
diagnosed with heart disease, while 10 are. Of respondents with good health, 235 are not diagnosed,
while 41 are diagnosed, the largest count of any category. Of respondents with fair health, 114 are
not diagnosed with heart disease, while 36 are. Lastly, of respondents with poor health, 38 are not
diagnosed with heart disease, while 25 are. Thus, people of any health status are prone to getting
diagnosed with heart disease.
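
Since the health categories differ widely in size, the pattern between general health and heart disease may be easier to see as a rate per category. The sketch below uses the General Health rows of Table 9. It is an illustrative reading, not the original authors' calculation.

```python
# (diagnosed, not diagnosed) per general-health category, from Table 9.
general_health = {
    "Excellent": (2, 80),
    "Very Good": (10, 229),
    "Good": (41, 235),
    "Fair": (36, 114),
    "Poor": (25, 38),
}

# Share of each category diagnosed with heart disease, in table order.
rates = {level: yes / (yes + no) for level, (yes, no) in general_health.items()}
most_cases = max(general_health, key=lambda level: general_health[level][0])

for level, share in rates.items():
    print(f"{level:<10}{share:6.1%}")
print("Category with the most diagnosed respondents:", most_cases)
```

The category with the most diagnosed respondents and the category with the highest rate need not be the same, so reporting both gives a fuller picture.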

Based on the results shown, we can conclude that heart disease can be detected from the age of 50
years and above. The gender most likely to be diagnosed with heart disease is female. The factor
that contributes the most towards heart disease is the bad habit of smoking. The other factors,
which are drinking alcohol, diabetes and asthma, are not shown to have a relation with heart
disease. Lastly, general health is not shown to have a correlation with heart disease.

HYPOTHESES

SECTION 1: DEMOGRAPHIC PERSPECTIVE

This section compares and tests the relationship between the demographic factors of respondents and
heart disease.

Hypothesis 1
In this section, we test the correlation between age and heart disease. Figure 2 has shown a steady
increase in the number of respondents diagnosed with heart disease, with the lowest number among
those aged 49 and below and the highest among respondents aged 75 and above. The hypotheses are as
follows:

H0: There is no correlation between Age and Heart Disease


H1: There is a correlation between Age and Heart Disease

Based on the results shown, the older the age group, the more likely its members are to have heart
disease, with the most affected being those aged 75 and above. Thus, we reject H0.

Hypothesis 2
In this section, we test the correlation between gender and heart disease. Table 9 has shown a
difference in numbers, with more female than male respondents diagnosed, at 67 and 47 people
respectively. The hypotheses are as follows:

H0: There is no correlation between Gender and Heart Disease


H1: There is a correlation between Gender and Heart Disease

Based on the results shown, more female respondents than male respondents are diagnosed with heart
disease. Thus, we reject H0.
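
A hypothesis such as this one can also be checked formally with a chi-square test of independence. The sketch below is an illustration under stated assumptions, not the procedure used above: it applies the Pearson chi-square statistic (without continuity correction) to the gender counts in Table 9 and assumes a 0.05 significance level. The `chi_square_2x2` helper is a hypothetical name. Its outcome need not agree with the count-based reading above.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability reduces to erfc.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Male: 47 diagnosed, 230 not; Female: 67 diagnosed, 466 not.
stat, p_value = chi_square_2x2(47, 230, 67, 466)
ALPHA = 0.05  # assumed significance level
print(f"chi-square = {stat:.3f}, p = {p_value:.3f}")
print("Reject H0" if p_value < ALPHA else "Fail to reject H0")
```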

SECTION 2: EXISTING BAD HABITS
This section compares and tests the relationship between the bad habits respondents may have and
heart disease.

Hypothesis 3
In this section, we test the correlation between smoking and heart disease. Table 9 shows that more
smokers than non-smokers are diagnosed with heart disease. The hypotheses are as follows:

H0: There is no correlation between Smoking and Heart Disease


H1: There is a correlation between Smoking and Heart Disease

Based on the results, it shows that people who smoke cigarettes are more likely to get heart disease.
Thus, we reject H0.

Hypothesis 4
In this section, we test the correlation between drinking alcohol and heart disease. Table 9 shows
that fewer drinkers than non-drinkers are diagnosed with heart disease. The hypotheses are as
follows:

H0: There is no correlation between Drinking alcohol and Heart Disease


H1: There is a correlation between Drinking alcohol and Heart Disease

Based on the results, it shows that people who drink alcohol are less likely to get heart disease. Thus,
we accept H0.

SECTION 3: GENERAL HEALTH & EXISTING ILLNESSES
This section compares and tests the relationship between general health, the existing illnesses
respondents may have, and heart disease.

Hypothesis 5
In this section, we test the correlation between diabetes and heart disease. Table 9 shows that
fewer diabetic than non-diabetic respondents have heart disease. The hypotheses are as follows:

H0: There is no correlation between Diabetes and Heart Disease


H1: There is a correlation between Diabetes and Heart Disease

Based on the results, it shows that people with diabetes are less likely to get heart disease. Thus, we
accept H0.

Hypothesis 6
In this section, we test the correlation between asthma and heart disease. Table 9 shows that fewer
asthmatic than non-asthmatic respondents have heart disease. The hypotheses are as follows:

H0: There is no correlation between Asthma and Heart Disease


H1: There is a correlation between Asthma and Heart Disease

Based on the results shown, respondents who have asthma are not likely to get heart disease. Thus,
we accept H0.

Hypothesis 7
In this section, we test the correlation between general health and heart disease. Table 9 shows
that respondents with good health account for the most heart disease diagnoses in comparison to the
other health categories listed (Excellent, Very Good, Fair, and Poor). The hypotheses are as
follows:

H0: There is no correlation between general health and Heart Disease


H1: There is a correlation between general health and Heart Disease

Based on the results, it shows that people from different health categories can get heart disease. Thus,
we accept H0.
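
For a factor with more than two levels, such as general health, the same kind of formal check can be sketched with the general Pearson chi-square statistic. This is an illustration under stated assumptions rather than the test performed above: it uses the General Health rows of Table 9, assumes a 0.05 significance level, and compares the statistic with 9.488, the standard chi-square critical value for 4 degrees of freedom at that level. Its outcome may differ from the count-based conclusion.

```python
def chi_square_stat(rows):
    """Pearson chi-square statistic for a table given as a list of rows."""
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(col_totals)
    stat = 0.0
    for row in rows:
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n  # count expected under independence
            stat += (observed - expected) ** 2 / expected
    return stat

# (diagnosed, not diagnosed) for Excellent, Very Good, Good, Fair, Poor.
general_health = [(2, 80), (10, 229), (41, 235), (36, 114), (25, 38)]

stat = chi_square_stat(general_health)
df = (len(general_health) - 1) * (len(general_health[0]) - 1)
CRITICAL_05_DF4 = 9.488  # critical value for df = 4 at the assumed 0.05 level
reject_h0 = stat > CRITICAL_05_DF4
print(f"chi-square = {stat:.2f}, df = {df}, reject H0: {reject_h0}")
```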

CONCLUSION & RECOMMENDATION

Both men and women are at risk of a heart attack or stroke. Men are more likely than women to
develop heart disease in middle age, and the risk increases as they get older. For women, the risk
of heart attack, heart disease, high blood pressure, and stroke increases after menopause.
Cardiovascular disease, lung cancer, and musculoskeletal disorders are among the leading causes of
poor health in men in their forties. Men, for example, die at twice the rate of women from coronary
heart disease and lung cancer (ABS, 2021). Regular medical check-ups and healthy lifestyle choices
promote good health not only for the heart, but also for the lungs, brain, and muscles. Things you
can do to reduce health risks in your forties and fifties include losing weight, reducing alcohol
consumption, quitting smoking, being physically active in order to manage high blood pressure and
lower cholesterol, and eating a healthy diet.

One of the most important steps people can take to improve their health is to quit smoking. Quitting
can be hard for heavy smokers, who may face several effects after they stop, because nicotine
withdrawal can cause headaches, affect mood, and sap energy. While quitting earlier in life has
greater health benefits, quitting at any point is one of the most important steps smokers can take
to lower their risk of cardiovascular disease. According to the U.S. Department of Health and Human
Services (2020), quitting smoking has many benefits whether or not you are diagnosed with heart
disease. It enhances health status, increases life expectancy, and lowers the risk of a variety of
negative health outcomes, not just cardiovascular disease. It also lowers the risk of dying
prematurely, the risk of death from heart disease, and the risk of having a first or second heart
attack. Smokers can consider nicotine replacement therapy, which can help to reduce cravings.
Studies show that using nicotine gum, lozenges, or patches in conjunction with a quit-smoking
program increases the chances of success.

Diabetes affects heart disease because, over time, high blood sugar levels can damage the blood
vessels and the nerves that control the heart. According to the CDC, diabetes patients are also
more likely to develop heart failure. Heart failure is a severe condition, but it does not mean that
the heart has stopped beating; rather, it means that the heart is unable to pump blood adequately.
This can cause swelling in the legs and fluid buildup in the lungs, making breathing difficult.
Heart failure worsens over time, but early diagnosis and treatment can help relieve symptoms and
prevent or delay the condition from worsening. The most important thing to do to reduce heart
disease and take care of your heart comes down to one thing: a healthy lifestyle. As mentioned
before, all of the things you can do to reduce health risks are also essential in taking care of
your heart.

Last but not least, we recommend systematic preventive examination. Scheduled preventive
examinations and timely visits to a cardiologist should become the norm for people at risk of
developing pathologies of the heart and blood vessels. The same applies to people who notice an
increase in blood pressure when measuring it themselves. Do not neglect the recommendations of your
doctor. Compliance with these rules for the prevention of cardiovascular diseases will
significantly reduce the risk of their development.

REFERENCES
1. Australian Bureau of Statistics. (2021). Causes of death, Australia.
https://www.abs.gov.au/statistics/health/causes-death/causes-death-australia/2020
2. Agostino, J. W., Wong, D., Paige, E., Wade, V., Connell, C., Davey, M. E., ... & Banks, E.
(2020). Cardiovascular disease risk assessment for Aboriginal and Torres Strait Islander adults
aged under 35 years: a consensus statement. Medical Journal of Australia, 212(9), 422-427.
3. Babar, A., Lak, H., Chawla, S., Mahalwar, G., & Maroo, A. (2020). Metastatic melanoma
presenting as a ventricular arrhythmia. Cureus, 12(4).
4. Beckerman, J. (2021). Alcohol and Heart Disease.
https://www.webmd.com/heart-disease/heart-disease-alcohol-your-heart
5. Curtis, A. B., Karki, R., Hattoum, A., & Sharma, U. C. (2018). Arrhythmias in patients ≥ 80
years of age: pathophysiology, management, and outcomes. Journal of the American College
of Cardiology, 71(18), 2041-2057.
6. Day, E., & Rudd, J. H. (2019). Alcohol use disorders and the heart. Addiction, 114(9),
1670-1678.
7. Golande, A., & Kumar, P. T. (2019). Heart disease prediction using effective machine learning
techniques. International Journal of Recent Technology and Engineering (IJRTE), 8(1S4),
944-950.
8. Hopkins Medicine. (n.d.). Smoking and Cardiovascular Disease.
https://www.hopkinsmedicine.org/health/conditions-and-diseases/smoking-and-cardiovascular-disease
9. Lincoff, N. (2018). Having Asthma Could Double Your Risk of a Heart Attack.
https://www.healthline.com/health-news/asthma-could-double-your-heart-attack-risk-111614#1
10. Nagamani, T., Logeswari, S., & Gomathy, B. (2019). Heart disease prediction using data
mining with mapreduce algorithm. International Journal of Innovative Technology and
Exploring Engineering (IJITEE) ISSN, 2278-3075.
11. Australian National Health and Medical Research Council. (2018). Nutrient reference values
(NRVs) for Australia and New Zealand.
12. Rajdhan, A., Agarwal, A., Sai, M., Ravi, D., & Ghuli, P. (2020). Heart disease prediction using
machine learning. International Journal of Research and Technology, 9(04), 659-662.
13. Robinson, J. (2021). 13 Best Quit-Smoking Tips Ever.
https://www.webmd.com/smoking-cessation/ss/slideshow-13-best-quit-smoking-tips-ever
14. Rodgers, J. L., Jones, J., Bolleddu, S. I., Vanthenapalli, S., Rodgers, L. E., Shah, K., ... &
Panguluri, S. K. (2019). Cardiovascular risks associated with gender and aging. Journal of
Cardiovascular Development and Disease, 6(2), 19.
15. U.S. Department of Health and Human Services. (2020). Smoking Cessation: A Report of the
Surgeon General. Atlanta, GA: U.S. Department of Health and Human Services, Centers for
Disease Control and Prevention, National Center for Chronic Disease Prevention and Health
Promotion, Office on Smoking and Health.
