Statistics Essay Sample
For this case I found a healthcare research article on the following topic: “Caring Behaviors and Patients’ Satisfaction” (Nursing and Midwifery Studies, 2012). The article was retrieved from the following website:
The researchers investigated the issue in one of the hospitals in Iraq.
The null and alternative hypotheses are the following:
Null hypothesis: there is no significant association between caring behaviors and patients’ satisfaction.
Alternative hypothesis: there is a significant association between caring behaviors and patients’ satisfaction.
The results were “p<0.001, r=0.57” (Nursing and Midwifery Studies, 2012), which indicates a statistically significant positive correlation between the variables, so the null hypothesis is rejected. The strength of the correlation is moderate.
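As a rough aid to reading the coefficient, squaring r gives the share of variance that the two scores have in common under a linear model. This is a standard interpretation of a correlation coefficient, not a figure reported in the article:

```latex
r^2 = 0.57^2 \approx 0.32
```

Roughly a third of the variability in satisfaction scores is therefore linearly associated with caring-behavior scores, which is consistent with calling the correlation moderate.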
According to the article, caring behaviors were measured on a Likert scale from 1 to 6 for the following five aspects: “respect for others”, “assurance of human presence”, “communication and positive trend”, “professional knowledge and skills” and “attention to the experience of others” (Nursing and Midwifery Studies, 2012). This is the independent variable.
The measure of patients’ satisfaction was also scored on a Likert scale “from strongly disagree = 1 to strongly agree = 5”. This is the dependent variable, as it is affected by nurses’ caring behaviors.
To "cleanse" a correlation between two variables of the possible influence of a third, the concept of partial correlation is used (Dowdy, S, 1983).
If we examine a sufficiently large sample of men and compare their shoe size with their level of education, we can observe a small but statistically significant correlation between the two variables. This is an example of a so-called spurious correlation: the statistically significant coefficient does not reflect a causal relationship between the two variables of interest but is largely due to some third variable.
In this example, that variable is height. On the one hand, there is a slight correlation between height and level of education; on the other, there is an understandable and logical relationship between height and shoe size. Together, these two correlations produce the spurious correlation mentioned above. To exclude the distorting influence of the third variable, the so-called partial correlation must be calculated.
The same pattern appears if we compare shoe size with pulse rate in a sufficiently large sample of men: a small but statistically significant correlation can be observed. This, too, is a spurious correlation, since the significant coefficient does not reflect a causal relationship between the two variables of interest but is largely due to some third variable (Croxton, F, 1968).
Here again the third variable is height. There is a slight correlation between height and pulse rate and a logical relationship between height and shoe size, and together these two correlations produce the spurious one. Calculating the partial correlation removes the distorting influence of height.
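The idea can be illustrated with a minimal sketch. It uses the standard first-order partial correlation formula and simulated data; the sample size, coefficients and noise levels are invented for illustration and are not taken from any of the cited sources.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y with the linear influence of z removed:
    (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical simulation: shoe size and pulse both depend on height,
# but not on each other. All numbers below are assumptions.
random.seed(0)
height = [random.gauss(175, 7) for _ in range(5000)]
shoe = [0.25 * h + random.gauss(0, 1) for h in height]
pulse = [-0.3 * h + random.gauss(0, 5) for h in height]

raw = pearson_r(shoe, pulse)
controlled = partial_r(raw, pearson_r(shoe, height), pearson_r(pulse, height))
print("raw correlation:", round(raw, 3))
print("partial correlation (height held constant):", round(controlled, 3))
```

In this simulated setting the raw correlation between shoe size and pulse is clearly non-zero, while the partial correlation is close to zero, which is the behavior the spurious-correlation argument predicts.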
Suppose we want to investigate the association between alcohol abuse and smoking. We believe that those who consume alcohol on a daily basis are more likely to smoke. Assume we have performed a study and recruited 100 participants to answer two questions: “Do you smoke?” and “Do you consume alcohol on a daily basis?”
The results of the survey are given in the table below:
The independent variable in this case is Alcohol Consumption and the dependent variable is Smoking, because we are exploring the effect of alcohol abuse on the smoking habit.
Since the p-value of the test is 0.3730, we cannot say that the association between smoking and alcohol abuse is significant (at the 5% significance level). This study does not support our hypothesis.
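For a 2x2 table of this kind, a test of association can be sketched as below. The essay does not state which test produced the p-value, so the Pearson chi-square test (without continuity correction) is an assumption here, and the counts are hypothetical: they are not the survey results and are not meant to reproduce the p-value of 0.3730.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns (chi2, p_value). No continuity correction is applied.
    With 1 degree of freedom the upper tail probability of the
    chi-square distribution equals erfc(sqrt(chi2 / 2)).
    Assumes every row and column total is greater than zero.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts for 100 participants (assumed for illustration):
# rows = daily alcohol (yes, no), columns = smoker (yes, no)
counts = [[20, 30],
          [15, 35]]

chi2, p_value = chi_square_2x2(counts)
print("chi-square:", round(chi2, 4), "p-value:", round(p_value, 4))
print("significant at 5% level:", p_value < 0.05)
```

The decision rule is the one used in the text: the association is called significant only when the p-value falls below 0.05.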
In this essay we show how nonparametric statistical tests may be used for real-world problems in healthcare. The study I chose “examined the prevalence of self-reported depressive symptoms and the self reported somatic depressive symptoms” (Hindawi Publishing Corporation, 2012) and explored the impact of gender on both variables. The article was retrieved from the link below:
The dependent variables are depression prevalence and self-reported somatic depressive symptoms, and the independent variable is gender.
The chi-square test was used in this study because the chosen variables are categorical. As a rule, the values of categorical variables are labels (strings) that cannot be placed in order, which is why the common parametric tests are meaningless here. For two categorical variables, the degree of dependence is assessed with the standard statistics and criteria for contingency tables: the chi-square test, the phi coefficient, and Fisher's exact test.
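Two of the contingency-table measures named above, the phi coefficient and Fisher's exact test, can be sketched for a 2x2 table as follows. The counts are hypothetical and do not come from the cited study; the two-sided p-value uses the common convention of summing all tables that are no more probable than the observed one.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi coefficient for the 2x2 table [[a, b], [c, d]]."""
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for [[a, b], [c, d]].

    With the margins fixed, the top-left cell follows a hypergeometric
    distribution. The p-value sums the probabilities of all tables that
    are at most as likely as the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Hypothetical counts (assumed): rows = gender, columns = symptom present / absent
a, b, c, d = 3, 1, 1, 3
phi = phi_coefficient(a, b, c, d)
p_value = fisher_exact_two_sided(a, b, c, d)
print("phi:", phi, "Fisher two-sided p-value:", round(p_value, 4))
```

Fisher's exact test is usually preferred over the chi-square approximation when expected cell counts are small, as they are in this tiny illustrative table.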
According to the conclusion of the research, "results from the present study support the hypothesis that there are gender differences in depression prevalence and self-reported somatic depressive symptoms, in patients hospitalized for ACS". The results of the Chi-square test were significant (p<0.05) (Hindawi Publishing Corporation, 2012).
I have found an article which summarizes findings obtained with canonical correlation analysis. The study was performed in rural Northwest Bangladesh. The researchers investigated the relationship between “infant’s size at birth and maternal factors” (“Plos” journal, 2014). The article has been retrieved from the following website:
The null hypothesis was: There is no significant association between infant’s size and maternal factors.
The alternative hypothesis was: There is a significant association between infant’s size and maternal factors.
The researchers picked five different size measures of the infant’s body as dependent variables and various maternal socio-demographic factors as independent variables (see more in the attached article).
Multiple regression analysis was performed in this case. The assumptions of the regression analysis were not checked in this research study (or were simply omitted from the article).
The results of the analysis were significant – “MANOVA showed a significant interaction effect of preterm delivery and infant's sex on birth size; F = 161.83, p<0.001.” The researchers concluded that the infant’s size is significantly affected by maternal factors (“Plos” journal, 2014).
The example is related to diabetes diagnostics. Assume the training sample contains 768 entries with the following fields:
Number of pregnancies;
Glucose concentration;
Diastolic blood pressure, mm Hg;
Triceps skinfold thickness, mm;
2-hour serum insulin;
Body mass index;
A numeric parameter for heredity of diabetes.
The dependent variable is the diagnosis (1 - presence of the disease, 0 - absence). The distribution of the dependent variable is as follows: 500 cases of absence of the disease and 268 cases of its presence.
Here we can use logistic regression to approximate the relationship between the independent factors and the presence of diabetes (Kendall MG, 1973).
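What such a model does can be shown with a minimal sketch that fits a one-predictor logistic regression by gradient descent. The ten records, the standardized glucose values, the learning rate and the number of iterations are all assumptions made for illustration; they are not the 768-entry sample described above, and a real analysis would use all fields and a statistical package.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.1, epochs=2000):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by batch gradient
    descent on the average log-loss. Returns (b0, b1)."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(b0 + b1 * x) - y
            grad0 += error
            grad1 += error * x
        b0 -= learning_rate * grad0 / n
        b1 -= learning_rate * grad1 / n
    return b0, b1

# Hypothetical standardized glucose values and diagnoses (1 = disease present)
glucose_z = [-1.5, -1.0, -0.5, -0.2, 0.0, 0.3, 0.6, 1.0, 1.4, 2.0]
diabetes = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

b0, b1 = fit_logistic(glucose_z, diabetes)
p_low = sigmoid(b0 + b1 * -1.0)   # predicted risk at low glucose
p_high = sigmoid(b0 + b1 * 1.5)   # predicted risk at high glucose
print("intercept:", round(b0, 3), "slope:", round(b1, 3))
print("P(disease) at z=-1.0:", round(p_low, 3), "at z=1.5:", round(p_high, 3))
```

The fitted slope is positive for these invented data, so the predicted probability of disease rises with glucose, which is the kind of relationship the model is meant to approximate.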
The first method of selecting the variables is to rely on expert conclusions about the most significant factors that may affect the diagnosis of diabetes. Experienced doctors can list the most common factors that should be measured and taken into consideration in the research study.
The other method of selection is to list all the data available about the patients, recode them into proper variables, and run various correlation and association tests (for example, Pearson’s, Spearman’s and Kendall’s correlation tests) to see which factors are most strongly correlated with the dependent variable. These factors may then be included in the logistic regression model as the most significant factors (Kendall MG, 1973).
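The second selection method can be sketched as a simple screening step: compute a rank correlation between each candidate factor and the outcome, then order the factors by its absolute value. Spearman's coefficient is used here as one of the tests named above; the eight records and the two candidate factors are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(rx, ry))
    sx = math.sqrt(sum((p - mx) ** 2 for p in rx))
    sy = math.sqrt(sum((q - my) ** 2 for q in ry))
    return cov / (sx * sy)

# Hypothetical patient records (assumed for illustration)
outcome = [0, 0, 0, 1, 0, 1, 1, 1]
candidates = {
    "glucose": [85, 90, 99, 140, 105, 150, 160, 170],
    "skinfold": [30, 22, 35, 25, 20, 33, 28, 24],
}

scores = {name: spearman_rho(values, outcome) for name, values in candidates.items()}
ranking = sorted(scores, key=lambda name: abs(scores[name]), reverse=True)
for name in ranking:
    print(name, round(scores[name], 3))
```

Screening of this kind looks at one factor at a time, so it is best treated as a first pass before the factors are assessed jointly in the regression model.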
Croxton, F. (1968). Applied General Statistics. New York: Sir Isaac Pitman and Sons, p. 625.
Dowdy, S. (1983). Statistics for Research. New York: John Wiley & Sons, p. 230.
Armstrong, J. (2012). Illusions in Regression Analysis. International Journal of Forecasting (forthcoming) 28 (3), 689.
Kutner, M. (2004). Applied Linear Regression Models, 4th ed. Boston: McGraw-Hill/Irwin, p. 25.
Corder, G. (2014). Nonparametric Statistics: A Step-by-Step Approach. New York: Wiley.
Nikulin, M. (1973). Chi-square test for normality. Proceedings of the International Vilnius Conference on Probability Theory and Mathematical Statistics, (2), 119–122.
Greenwood, P. (1996). A Guide to Chi-Square Testing. New York: John Wiley & Sons.
Kendall, M. (1973). The Advanced Theory of Statistics, Volume 2 (3rd Edition), Section 27.22.
Ismail, A. (2012). Correlation Between Nurses’ Caring Behaviors and Patients’ Satisfaction. Kashan University of Medical Science: Nursing and Midwifery Studies. 2012 September; 1(1): 36-40.
Canonical Correlation Analysis of Infant's Size at Birth and Maternal Factors: A Study in Rural Northwest Bangladesh. (n.d.). Retrieved March 26, 2015, from http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0094243