Statistics And Analysis Research Paper Research Papers Example
In this paper we illustrate basic techniques of statistical analysis and hypothesis testing applied to a real-world business problem. The analysis helps company management better understand its production strategy. We found a strong association between a student's height and shoe size, and constructed a regression model to forecast shoe size from a given height. The shoe sizes of men and women also turned out to be significantly different, so the company cannot produce a single type of sport shoe for both genders and should separate its production cycles.
The data was obtained from the Internet (http://www.amstat.org/publications/jse/jse_data_archive.htm). The data set contains 408 observations of students' shoe size, height and gender, and it is useful for demonstrating correlation and regression analysis. We consider the data from the point of view of a sport shoe manufacturer. Assume that the management of a sport shoe company wants to know whether there is a relationship between height and shoe size for males and females and, if this relationship exists, how strong it is and how the results of the statistical analysis can be used to optimize productivity. For example, if there is no difference between male and female shoe size (on average), the company can produce unisex models and need not separate the production line by gender.
In this paper we investigate the following questions:
What is the relationship between height and shoe size?
Is there a significant difference between male and female shoe size?
How can the approximate shoe size of a student be determined from his or her height?
Descriptive Statistics. Probability Concepts and Distributions
Descriptive statistics is the branch of statistics that deals with describing, organizing and summarizing the data of a study. Having calculated the descriptive statistics for a data set, we can draw some preliminary conclusions about it. These statistics are conveniently visualized with different graphs. A histogram can be used to estimate the mean and the spread of the data. It should be taken into account, however, that for a distribution different from the normal one the tallest column of the histogram corresponds to the mode rather than to the arithmetic mean. For a normal distribution, as the data set grows infinitely large, the arithmetic mean, the median and the mode tend to a single value.
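The difference between the mode, the median and the mean for non-normal data can be illustrated with a short Python sketch. The sample below is hypothetical and deliberately right-skewed; it is not taken from the shoe size data set.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed sample (NOT from the shoe size data set).
sample = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

sample_mode = mode(sample)      # most frequent value: the tallest histogram column
sample_median = median(sample)  # middle of the sorted values
sample_mean = mean(sample)      # pulled upward by the long right tail

print("mode  :", sample_mode)
print("median:", sample_median)
print("mean  :", sample_mean)
```

For skewed data like this the three measures separate, with the mode lowest and the mean highest; for symmetric, bell-shaped data they would nearly coincide.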
We calculate the descriptive statistics for three variables: gender, height and shoe size. First we recode gender as a dummy variable: males are represented as “1” and females as “0”. The following table shows the results:
Descriptive Statistics: Size; Gender; Height
Variable N N* Mean SE Mean StDev Sum Minimum Q1 Median
Size 408 0 9,908 0,102 2,066 4042,500 5,000 8,000 10,000
Gender 408 0 0,5417 0,0247 0,4989 221,0000 0,0000 0,0000 1,0000
Height 408 0 68,421 0,209 4,212 27915,820 60,000 65,000 68,000
Variable Q3 Maximum Range Skewness Kurtosis
Size 11,000 15,000 10,000 0,10 -0,38
Gender 1,0000 1,0000 1,0000 -0,17 -1,98
Height 72,000 81,000 21,000 0,11 -0,55
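Several columns of this table are linked by simple identities: Mean = Sum / N, SE Mean = StDev / sqrt(N) and Range = Maximum - Minimum. The following Python sketch recomputes them from the other columns as a consistency check. The dictionary layout and the function name are illustrative assumptions; the numbers are copied from the table with decimal commas written as points.

```python
import math

# Values copied from the Minitab table (decimal commas written as points).
table = {
    "Size":   {"n": 408, "sum": 4042.5,   "stdev": 2.066,  "min": 5.0,  "max": 15.0},
    "Gender": {"n": 408, "sum": 221.0,    "stdev": 0.4989, "min": 0.0,  "max": 1.0},
    "Height": {"n": 408, "sum": 27915.82, "stdev": 4.212,  "min": 60.0, "max": 81.0},
}

def summarize(row):
    """Recompute Mean, SE Mean and Range from the other columns of one row."""
    mean = row["sum"] / row["n"]
    se_mean = row["stdev"] / math.sqrt(row["n"])
    value_range = row["max"] - row["min"]
    return mean, se_mean, value_range

results = {name: summarize(row) for name, row in table.items()}
for name, (mean, se_mean, value_range) in results.items():
    print(f"{name:<7} mean={mean:.3f}  SE mean={se_mean:.4f}  range={value_range:.1f}")
```

Because Gender is coded 0/1, its mean is simply the share of males in the sample (221 of 408).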
Size variable. The average shoe size is 9.908 with a standard deviation of 2.066. The smallest shoe size in the sample is 5 and the largest is 15. The data is almost symmetrical (skewness is 0.10) and its shape does not differ much from the Gaussian one (kurtosis is -0.38). The histogram below shows how the data fits the normal curve:
We see that the data is slightly right-skewed and, as the negative kurtosis indicates, slightly flatter than the Gaussian curve. Still, the distribution does not appear to differ significantly from the normal distribution.
Gender variable. There were 221 males and 187 females observed. This is a binary (0/1) variable, so the number of males in the sample follows a binomial distribution. It will serve as a factor variable in our research. The histogram below represents the data frequency:
Height variable. The average height in the sample is 68.421 inches with a standard deviation of 4.212 inches. The shortest student is 60 inches tall and the tallest is 81 inches. The data is almost symmetrical (skewness is 0.11) and its shape does not differ much from the Gaussian one (kurtosis is -0.55). Consider the frequency distribution of the variable:
The data seems very close to the normal curve. It does not differ significantly from the normal distribution.
Since one of the tasks set by the company management concerns the difference in shoe size and height between males and females, we split the sample by gender and compute the descriptive statistics separately for each group.
Descriptive Statistics: Size; Height
Variable Gender N N* Mean SE Mean StDev Sum Minimum Q1
Size 0 187 0 8,286 0,101 1,380 1549,500 5,000 7,500
1 221 0 11,281 0,0989 1,470 2493,000 7,000 10,500
Height 0 187 0 65,249 0,212 2,899 12201,500 60,000 63,000
1 221 0 71,106 0,212 3,149 15714,320 63,000 69,000
Variable Gender Median Q3 Maximum Range Skewness Kurtosis
Size 0 8,000 9,500 12,000 7,000 0,16 -0,44
1 11,000 12,000 15,000 8,000 0,37 0,61
Height 0 65,000 67,000 74,000 14,000 0,40 0,16
1 71,000 73,000 81,000 18,000 0,06 0,28
We can see that the average shoe size is 8.286 for females and 11.281 for males. The situation with height is the same: females are 65.249 inches tall on average, while males are taller at 71.106 inches on average. We do not overload the paper with separate frequency histograms for each group; the point is that the descriptive statistics by gender suggest a significant difference between males and females in both shoe size and height.
According to calculations in the Minitab 16 statistical software, the 95% confidence interval for the mean difference in shoe size by gender (females minus males) is the following:
95% CI for difference: (-3,272; -2,717)
Since 0 is not within this interval, the difference between female and male shoe size is significant at the 5% level of significance. Females have a smaller shoe size than males.
Here is the 95% CI for the mean difference in height by gender:
95% CI for difference: (-6,446; -5,268)
Since 0 is not within this interval, the difference between female and male height is significant at the 5% level of significance. Females are shorter than males.
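Both intervals can be approximately reproduced from the group means, standard deviations and sample sizes in the table by gender. The Python sketch below is an illustration under stated assumptions, not the Minitab procedure: it uses unpooled variances and a normal critical value in place of the t critical value, which is reasonable only because both groups are large. The function name diff_ci is hypothetical.

```python
import math
from statistics import NormalDist

def diff_ci(mean1, sd1, n1, mean2, sd2, n2, level=0.95):
    """Approximate CI for mean1 - mean2 (unpooled variances, normal critical value)."""
    diff = mean1 - mean2
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)   # standard error of the difference
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    return diff - z * se, diff + z * se

# Group 0 = females, group 1 = males (summary values from the table by gender).
size_ci = diff_ci(8.286, 1.380, 187, 11.281, 1.470, 221)
height_ci = diff_ci(65.249, 2.899, 187, 71.106, 3.149, 221)

print("shoe size: (%.3f; %.3f)" % size_ci)
print("height   : (%.3f; %.3f)" % height_ci)
```

The results agree with the reported intervals to about two decimal places; the small remaining gap comes from the rounded inputs and the normal approximation.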
We are interested in checking 2 statistical hypotheses:
There is an association between students’ height and shoe size (for the whole sample).
There is a significant difference between female and male shoe size.
Now we test each hypothesis with appropriate statistical tests.
We use correlation analysis here. First of all, we construct a scatter plot of the data. If the company management wants to know which shoe sizes to produce for different groups of students (e.g. basketball players, runners, football players), it is natural to treat Height as the predictor and Shoe Size as the response variable.
The null hypothesis is: there is no association between students’ height and shoe size.
H0: ρ = 0
Ha: ρ > 0
Set the level of significance: α = 0.05
Correlations: Height; Size
Pearson correlation of Height and Size = 0,871
P-Value = 0,000
Since the p-value is less than the alpha level, we reject the null hypothesis. At the 5% level of significance there is a significant association between the shoe size and height of students in the sample.
Pearson’s r is 0.871. This is evidence of a very strong positive linear relationship between the variables.
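The significance of a Pearson correlation is commonly assessed with the statistic t = r * sqrt(n - 2) / sqrt(1 - r^2), which has n - 2 degrees of freedom under the null hypothesis. The Python sketch below applies this standard formula to the reported r and sample size; the function name is illustrative, and r is the rounded value from the output above.

```python
import math

def correlation_t(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r**2)

r, n = 0.871, 408            # reported Pearson correlation and sample size
t_stat = correlation_t(r, n)
r_squared = r**2             # share of variance in shoe size explained by height

print(f"t = {t_stat:.2f} on {n - 2} degrees of freedom")
print(f"r^2 = {r_squared:.3f}")
```

A t statistic this far from zero corresponds to a p-value that prints as 0,000, and r squared of roughly 0.76 means that height accounts for about three quarters of the variation in shoe size.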
Given this conclusion, it is useful to construct a regression equation to predict Shoe Size from Height.
We run the regression analysis:
Regression Analysis: Size versus Height
The regression equation is
Size = - 19,3 + 0,427 Height
Predictor Coef SE Coef T P
Constant -19,3266 0,8202 -23,56 0,000
Height 0,42728 0,01196 35,71 0,000
S = 1,01664 R-Sq = 75,9% R-Sq(adj) = 75,8%
Analysis of Variance
Source DF SS MS F P
Regression 1 1318,2 1318,2 1275,39 0,000
Residual Error 406 419,6 1,0
Total 407 1737,8
Since the F-value of the ANOVA test is 1275.39 with a corresponding p-value of less than 0.001, the model is significant. The p-values of the coefficients are also less than 0.001; hence, both the constant and the Height coefficient are significant.
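The summary figures in the regression output follow from the sums of squares in the ANOVA table: F = MS Regression / MS Residual, R-Sq = SS Regression / SS Total and S = sqrt(MS Residual). The Python sketch below recomputes them as a check. The variable names are illustrative, and because the printed sums of squares are rounded, the results match the output only approximately.

```python
import math

# Sums of squares and degrees of freedom from the ANOVA table.
ss_regression, df_regression = 1318.2, 1
ss_residual, df_residual = 419.6, 406
ss_total = 1737.8

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

f_value = ms_regression / ms_residual   # overall F statistic
r_sq = ss_regression / ss_total         # coefficient of determination
s = math.sqrt(ms_residual)              # residual standard error

print(f"F = {f_value:.1f}, R-Sq = {r_sq:.1%}, S = {s:.4f}")
```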
The equation to predict shoe size by height is:
Size = -19.3266 + 0.42728 × Height
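As a sketch of how the fitted line could be used, the Python function below plugs a height in inches into the Constant and Height coefficients from the Minitab output. The function names and the rounding to the nearest half size are illustrative assumptions, not part of the original analysis. Predictions are only meaningful for heights within the observed range of 60 to 81 inches.

```python
INTERCEPT = -19.3266   # "Constant" coefficient from the regression output
SLOPE = 0.42728        # "Height" coefficient from the regression output

def predict_shoe_size(height_in):
    """Predicted shoe size for a height given in inches."""
    return INTERCEPT + SLOPE * height_in

def nearest_half_size(size):
    """Round a predicted size to the nearest half size (assumed sizing step)."""
    return round(size * 2) / 2

predicted = predict_shoe_size(70)
print(f"predicted size at 70 in: {predicted:.3f}")
print(f"nearest half size      : {nearest_half_size(predicted)}")
```

With a residual standard error of about 1.02, an individual student's shoe size can easily differ from the prediction by a full size or more.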
To test the difference between female and male shoe size we use the two-sample Student’s t-test for the difference in means.
H0: μ1 = μ2
Ha: μ1 ≠ μ2
Set the level of significance: α = 0.05
Two-Sample T-Test and CI: Size; Gender
Two-sample T for Size
Gender N Mean StDev SE Mean
0 187 8,29 1,38 0,10
1 221 11,28 1,47 0,099
Difference = mu (0) - mu (1)
Estimate for difference: -2,994
95% CI for difference: (-3,272; -2,717)
T-Test of difference = 0 (vs not =): T-Value = -21,19 P-Value = 0,000 DF =
Since the p-value of the test is less than 0.001, and therefore less than the level of significance alpha, we reject the null hypothesis. We have enough evidence to state that there is a significant difference in shoe size between males and females (at the 5% level of significance).
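The test statistic can be approximately reproduced from the group summaries alone. The Python sketch below assumes the unpooled (Welch) form of the two-sample t-test and computes the Welch-Satterthwaite degrees of freedom; the function name is hypothetical, and the p-value uses a normal approximation, which is adequate only for samples this large. The inputs are the rounded means and standard deviations from the descriptive statistics by gender, so the output need not match Minitab exactly.

```python
import math
from statistics import NormalDist

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch two-sample t statistic and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2          # squared standard errors per group
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Group 0 = females, group 1 = males.
t_value, df = welch_t(8.286, 1.380, 187, 11.281, 1.470, 221)

# Two-sided p-value, normal approximation to the t distribution (large df).
p_value = 2 * NormalDist().cdf(-abs(t_value))

print(f"T-Value = {t_value:.2f}, approx. p-value = {p_value:.3f}")
```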