Statistics And Analysis Research Paper Research Papers Example

Type of paper: Research Paper

Topic: Size, Discrimination, Height, Information, Difference, Gender, Statistics, Theory

Pages: 7

Words: 1925

Published: 2020/11/05

Summary

In this paper we illustrate basic techniques of statistical analysis and hypothesis testing applied to a real-world business problem. The analysis helps company management better understand its production strategy. We found a strong association between a student's height and shoe size, and constructed a regression model to forecast shoe size from a given height. The shoe sizes of men and women also turned out to be significantly different, so the company cannot produce a single type of sport shoe for both genders and should separate its production cycles.

Data Collection

The data were obtained from the Internet (http://www.amstat.org/publications/jse/jse_data_archive.htm). The data set contains 408 observations of students’ shoe size, height and gender, and is useful for demonstrating correlation and regression analysis. We consider the data from the point of view of a sport shoe manufacturer. Assume that the management of a sport shoe company wants to know whether there is a relationship between height and shoe size for males and females. If this relationship exists, how strong is it, and how can the results of the statistical analysis be used to optimize productivity? For example, if there is no difference between male and female shoe size on average, the company can produce unisex models and need not separate the production line by gender.

In this paper we investigate the following questions:

What is the relationship between height and shoe size?
Is there a significant difference between male and female shoe size?
How can the approximate shoe size of a student be determined from his or her height?

Descriptive Statistics. Probability Concepts and Distributions

Descriptive statistics is the branch of statistics that deals with describing, organizing, and summarizing the data under study. Once the descriptive statistics for a data set have been calculated, some preliminary conclusions about the data can be drawn. These are conveniently visualized with graphs. A histogram can be used to estimate the mean and the spread of the data. It should be kept in mind, however, that for data whose distribution differs from the normal, the tallest bar of the histogram corresponds to the mode rather than the arithmetic mean. For a normal distribution, as the data set grows infinitely large, the arithmetic mean, median and mode tend to a single value.
We calculate the descriptive statistics for three variables: gender, height and shoe size. First, we recode gender as a dummy variable: males are represented as “1” and females as “0”. The following table shows the descriptive statistics:

Descriptive Statistics: Size; Gender; Height

Variable    N  N*     Mean  SE Mean   StDev        Sum  Minimum      Q1   Median
Size      408   0    9,908    0,102   2,066   4042,500    5,000   8,000   10,000
Gender    408   0   0,5417   0,0247  0,4989   221,0000   0,0000  0,0000   1,0000
Height    408   0   68,421    0,209   4,212  27915,820   60,000  65,000   68,000

Variable      Q3  Maximum   Range  Skewness  Kurtosis
Size      11,000   15,000  10,000      0,10     -0,38
Gender    1,0000   1,0000  1,0000     -0,17     -1,98
Height    72,000   81,000  21,000      0,11     -0,55

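As a quick consistency check on the table above, the SE Mean column can be recomputed from the StDev and N columns, since the standard error of the mean is the sample standard deviation divided by the square root of N. The sketch below is illustrative only: the function name `standard_error` and the dictionary layout are assumptions, and the numbers are copied from the Minitab output with decimal commas converted to points.

```python
import math

# (N, StDev, reported SE Mean) copied from the Minitab table.
reported = {
    "Size": (408, 2.066, 0.102),
    "Gender": (408, 0.4989, 0.0247),
    "Height": (408, 4.212, 0.209),
}


def standard_error(stdev: float, n: int) -> float:
    """Standard error of the mean: sample standard deviation over sqrt(N)."""
    return stdev / math.sqrt(n)


for name, (n, stdev, se_reported) in reported.items():
    se = standard_error(stdev, n)
    print(f"{name}: computed SE = {se:.4f}, reported SE = {se_reported}")
```

The computed values should agree with the reported column up to rounding, which suggests the flattened table was read correctly.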
Size variable. The average shoe size is 9.908 with a standard deviation of 2.066. The smallest shoe size in the sample is 5 and the largest is 15. The data are almost symmetrical (skewness is 0.1) and do not differ significantly from the Gaussian shape (kurtosis is -0.38). The histogram below shows how well the data fit the normal curve:
[Figure: histogram of shoe size with fitted normal curve]
We see that the data are slightly right-skewed and, given the negative kurtosis, slightly flatter than the Gaussian curve. Still, the distribution does not appear to differ significantly from the normal distribution.
Gender variable. The sample contains 221 males and 187 females. The distribution of this variable is binomial, with a single 0/1 outcome per student. This variable serves as a factor variable in our research. The histogram below represents the frequencies:
[Figure: frequency histogram of gender]
Height variable. The average height in the sample is 68.421 inches with a standard deviation of 4.212 inches. The shortest student is 60 inches tall and the tallest is 81 inches. The data are almost symmetrical (skewness is 0.11) and do not differ significantly from the Gaussian shape (kurtosis is -0.55). Consider the frequency distribution of the variable:
[Figure: histogram of height with fitted normal curve]
The data appear very close to the normal curve and do not differ significantly from the normal distribution.
Since one of the tasks set by company management concerns the difference in shoe size and height between males and females, we split the sample by gender and report the descriptive statistics separately for each group.

Descriptive Statistics: Size; Height

Variable  Gender    N  N*    Mean  SE Mean  StDev        Sum  Minimum      Q1
Size      0       187   0   8,286    0,101  1,380   1549,500    5,000   7,500
          1       221   0  11,281   0,0989  1,470   2493,000    7,000  10,500
Height    0       187   0  65,249    0,212  2,899  12201,500   60,000  63,000
          1       221   0  71,106    0,212  3,149  15714,320   63,000  69,000

Variable  Gender  Median      Q3  Maximum   Range  Skewness  Kurtosis
Size      0        8,000   9,500   12,000   7,000      0,16     -0,44
          1       11,000  12,000   15,000   8,000      0,37      0,61
Height    0       65,000  67,000   74,000  14,000      0,40      0,16
          1       71,000  73,000   81,000  18,000      0,06      0,28

We can see that the average shoe size is 8.286 for females and 11.281 for males. The same holds for height: females are 65.249 inches tall on average, while males are taller, at 71.106 inches on average. We do not overload the paper with separate frequency histograms for each variable; the point is that the descriptive statistics by gender suggest a substantial difference between males and females in both shoe size and height.

Confidence Intervals

Using the Minitab 16 statistical software, we obtained the following 95% confidence interval for the mean difference in shoe size by gender (female mean minus male mean):

95% CI for difference: (-3,272; -2,717)

Because 0 is not within this interval, the difference between female and male mean shoe size is statistically significant at the 5% level. Both limits are negative, so females have smaller shoe sizes than males on average.

Here is the 95% CI for the mean difference in height by gender:

95% CI for difference: (-6,446; -5,268)

Because 0 is not within this interval, the difference between female and male mean height is statistically significant at the 5% level. Both limits are negative, so females are shorter than males on average.
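The two intervals can be approximately reproduced from the group means, standard deviations and sample sizes in the by-gender table. The sketch below is not Minitab's exact procedure: it assumes unequal variances and, to stay within the Python standard library, uses a normal critical value instead of the t critical value, which is a close approximation with samples this large. The function name `diff_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def diff_ci(mean1, sd1, n1, mean2, sd2, n2, confidence=0.95):
    """Approximate CI for mean1 - mean2, not assuming equal variances.

    Assumption: a normal critical value stands in for the t critical value.
    """
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    diff = mean1 - mean2
    return diff - z * se, diff + z * se


# Group 1 = females (coded 0), group 2 = males (coded 1).
size_ci = diff_ci(8.286, 1.380, 187, 11.281, 1.470, 221)
height_ci = diff_ci(65.249, 2.899, 187, 71.106, 3.149, 221)

print("Shoe size difference CI:", tuple(round(x, 3) for x in size_ci))
print("Height difference CI:   ", tuple(round(x, 3) for x in height_ci))
```

Both approximate intervals should land within a few thousandths of the reported ones, and neither contains 0.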

Statistical Hypotheses

We are interested in checking two statistical hypotheses:
There is an association between students’ height and shoe size (for the whole sample).
There is a significant difference between female and male shoe size.
We now test each hypothesis with an appropriate statistical test.

Hypothesis #1

We use correlation analysis here. First, we construct a scatter plot of the data. If company management wants to know which shoe sizes to produce for different groups of students (e.g. basketball players, runners, football players), it is natural to treat Height as the predictor and Shoe Size as the response variable.

The null hypothesis is that there is no association between students’ height and shoe size; the alternative is that the association is positive:

H0: ρ = 0
Ha: ρ > 0

Set level of significance alpha:

α=0.05

Perform testing:

Correlations: Height; Size
Pearson correlation of Height and Size = 0,871
P-Value = 0,000
Since the p-value is less than the alpha level, we reject the null hypothesis. At the 5% significance level, there is a significant association between the shoe size and height of students in the sample.
Pearson’s r is 0.871, which is evidence of a very strong positive linear relationship between the variables.
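The significance of a Pearson correlation is commonly assessed with the statistic t = r·sqrt(n − 2) / sqrt(1 − r²), which has n − 2 degrees of freedom under the null hypothesis. The sketch below applies that standard formula to the reported r = 0.871 and n = 408; the function name `correlation_t_statistic` is an assumption, and the result is only as precise as the rounded r.

```python
import math


def correlation_t_statistic(r: float, n: int) -> float:
    """t statistic for testing H0: rho = 0, given sample correlation r and size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r**2)


r, n = 0.871, 408  # values reported in the correlation output
t_stat = correlation_t_statistic(r, n)
r_squared = r**2  # share of variance in shoe size explained by height

print(f"t = {t_stat:.2f} on {n - 2} degrees of freedom")
print(f"r^2 = {r_squared:.3f}")
```

In simple linear regression this t statistic coincides with the t statistic of the slope, and r² with R-Sq, so the values here can be compared with the Height row (T = 35,71) and R-Sq = 75,9% in the regression output.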

Given this conclusion, it is useful to construct a regression equation to predict Shoe Size from a given Height.

Run regression analysis:
Regression Analysis: Size versus Height

The regression equation is
Size = - 19,3 + 0,427 Height

Predictor      Coef  SE Coef       T      P
Constant   -19,3266   0,8202  -23,56  0,000
Height      0,42728  0,01196   35,71  0,000

S = 1,01664   R-Sq = 75,9%   R-Sq(adj) = 75,8%

Analysis of Variance

Source           DF      SS      MS        F      P
Regression        1  1318,2  1318,2  1275,39  0,000
Residual Error  406   419,6     1,0
Total           407  1737,8

Since the F-value of the ANOVA test is 1275.39 with a corresponding p-value of less than 0.001, the model is significant. The p-values of the coefficients are also less than 0.001; hence, both coefficients are significant.
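The summary figures of the regression output follow from the ANOVA sums of squares: R-Sq is the regression sum of squares over the total, F is the ratio of the mean squares, and S is the square root of the residual mean square. The sketch below recomputes them from the rounded table values, so small differences from Minitab's figures are expected; the function name `anova_summary` is hypothetical.

```python
import math


def anova_summary(ss_reg: float, df_reg: int, ss_res: float, df_res: int) -> dict:
    """Derive R-squared, the F statistic and the residual standard error S."""
    ms_reg = ss_reg / df_reg  # mean square for regression
    ms_res = ss_res / df_res  # mean square for residual error
    return {
        "r_squared": ss_reg / (ss_reg + ss_res),
        "f_value": ms_reg / ms_res,
        "s": math.sqrt(ms_res),
    }


# Sums of squares and degrees of freedom from the Analysis of Variance table.
summary = anova_summary(ss_reg=1318.2, df_reg=1, ss_res=419.6, df_res=406)

for name, value in summary.items():
    print(f"{name} = {value:.4f}")
```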

The equation to predict shoe size by height is:

Shoe Size = -19.3266 + 0.42728 * Height
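A minimal sketch of how the fitted equation could be used in practice is given below. The coefficients come from the regression output; the function names, the refusal to extrapolate beyond the observed height range of 60 to 81 inches, and the rounding to the nearest half size are assumptions added for illustration.

```python
INTERCEPT = -19.3266  # Constant coefficient from the regression output
SLOPE = 0.42728       # Height coefficient (shoe sizes per inch)
MIN_HEIGHT, MAX_HEIGHT = 60.0, 81.0  # observed height range in the sample


def predict_shoe_size(height_in: float) -> float:
    """Predict shoe size from height in inches using the fitted line."""
    if not MIN_HEIGHT <= height_in <= MAX_HEIGHT:
        # Assumption: the model is not trusted outside the observed range.
        raise ValueError("height is outside the range observed in the sample")
    return INTERCEPT + SLOPE * height_in


def round_to_half_size(size: float) -> float:
    """Round a predicted size to the nearest half size (an assumed convention)."""
    return round(size * 2) / 2


predicted = predict_shoe_size(70.0)
print(f"Predicted size at 70 in: {predicted:.3f}")
print(f"Nearest half size: {round_to_half_size(predicted)}")
```

With S = 1,01664, individual predictions can easily be off by a full size or more, so the equation is better suited to planning the size mix for a group than to fitting a single customer.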

Hypothesis #2

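Here we use a two-sample t-test for the difference in means, where μ1 is the mean shoe size of females (coded 0) and μ2 is the mean shoe size of males (coded 1). The reported degrees of freedom (401 rather than 406) indicate the form of the test that does not assume equal variances.

H0: μ1 = μ2
Ha: μ1 ≠ μ2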

Set level of significance alpha:

α=0.05

Perform testing:

Two-Sample T-Test and CI: Size; Gender

Two-sample T for Size

Gender    N   Mean  StDev  SE Mean
0       187   8,29   1,38     0,10
1       221  11,28   1,47    0,099

Difference = mu (0) - mu (1)
Estimate for difference: -2,994
95% CI for difference: (-3,272; -2,717)
T-Test of difference = 0 (vs not =): T-Value = -21,19  P-Value = 0,000  DF = 401

Since the p-value of the test is less than 0.001, and therefore less than the significance level alpha, we reject the null hypothesis. We have enough evidence to state that there is a significant difference in shoe size between males and females (at the 5% level of significance).
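The T-Value and DF in the output are consistent with the unequal-variance (Welch) form of the two-sample t-test, which can be checked from the summary statistics alone. The sketch below uses the standard Welch formulas with the group means, standard deviations and sample sizes from the by-gender descriptive table; the function name `welch_t` is an assumption, and the inputs are rounded, so the last digit may differ from Minitab's.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch two-sample t statistic and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2  # squared standard errors of each mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# Group 1 = females (coded 0), group 2 = males (coded 1).
t_value, df = welch_t(8.286, 1.380, 187, 11.281, 1.470, 221)

print(f"t = {t_value:.2f}, df = {df:.1f} (truncated to {math.floor(df)})")
```

A pooled-variance test would instead report 187 + 221 − 2 = 406 degrees of freedom, so the DF of 401 supports the unequal-variance reading.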


Cite this page
"Statistics And Analysis Research Paper Research Papers Example." WePapers, 05 Nov. 2020, https://www.wepapers.com/samples/statistics-and-analysis-research-paper-research-papers-example/. Accessed 19 April 2024.