Assume we want to investigate the following claim: shoe size does not differ significantly between males and females, so unisex sport shoe models can be produced without separate male and female versions.

We begin with descriptive statistics. The gender variable is coded as a dummy variable in the following way:

Male = 1, Female = 0

Descriptive Statistics: Shoe Size; Height

| Variable | Gender | N | N* | Mean | SE Mean | StDev | Variance | Minimum | Q1 |
|---|---|---|---|---|---|---|---|---|---|
| Shoe Size | 0 | 18 | 0 | 7,111 | 0,267 | 1,132 | 1,281 | 5,000 | 6,500 |
| Shoe Size | 1 | 17 | 0 | 11,294 | 0,437 | 1,803 | 3,252 | 7,000 | 10,250 |
| Height | 0 | 18 | 0 | 66,667 | 0,812 | 3,447 | 11,882 | 60,000 | 63,750 |
| Height | 1 | 17 | 0 | 71,353 | 0,762 | 3,141 | 9,868 | 64,000 | 69,000 |

| Variable | Gender | Median | Q3 | Maximum | Range | IQR | Mode | N for Mode |
|---|---|---|---|---|---|---|---|---|
| Shoe Size | 0 | 7,000 | 7,500 | 10,000 | 5,000 | 1,000 | 6,5; 7; 7,5 | 4 |
| Shoe Size | 1 | 11,000 | 12,500 | 14,000 | 7,000 | 2,250 | 11; 12 | 3 |
| Height | 0 | 66,500 | 70,000 | 72,000 | 12,000 | 6,250 | 70 | 4 |
| Height | 1 | 72,000 | 73,000 | 77,000 | 13,000 | 4,000 | 72 | 4 |

The descriptive statistics suggest a large difference in shoe size by gender: the mean is 11.294 for men and 7.111 for women. To test this difference we use a t-test for independent samples.

H0: μ1 = μ2

Ha: μ1 ≠ μ2

α = 0.05

## Perform t-test:

Two-Sample T-Test and CI: Shoe Size; Gender

Two-sample T for Shoe Size

| Gender | N | Mean | StDev | SE Mean |
|---|---|---|---|---|
| 0 | 18 | 7,11 | 1,13 | 0,27 |
| 1 | 17 | 11,29 | 1,80 | 0,44 |

Difference = mu (0) - mu (1)

Estimate for difference: -4,183

95% CI for difference: (-5,236; -3,130)

T-Test of difference = 0 (vs not =): T-Value = -8,17 P-Value = 0,000 DF = 26

Since the p-value of the test is below the significance level α = 0.05, we reject the null hypothesis. There is enough evidence to say that shoe size differs significantly by gender (at the 5% level of significance).

That is why the management of Nyke Corporation cannot produce a single sport shoe model for both genders.
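The t-test figures can be checked by hand from the summary statistics alone. The following is a minimal sketch, assuming the unpooled (Welch) form of the two-sample t-test, which is consistent with the reported DF = 26. The function name `welch_t` and the constant `T_CRIT` are illustrative and not part of the original analysis. `T_CRIT` is the tabulated two-sided 5% critical value of Student's t with 26 degrees of freedom.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (t statistic, standard error, Welch-Satterthwaite df)."""
    var_a = sd_a ** 2 / n_a  # squared standard error of mean a
    var_b = sd_b ** 2 / n_b  # squared standard error of mean b
    se = math.sqrt(var_a + var_b)
    t = (mean_a - mean_b) / se
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return t, se, df


# Summary statistics from the descriptive table
# (group 0 = female, group 1 = male, difference = mu(0) - mu(1)).
t, se, df = welch_t(7.111, 1.132, 18, 11.294, 1.803, 17)

# Assumption: tabulated critical value t(0.975; 26 df).
T_CRIT = 2.056
diff = 7.111 - 11.294
ci = (diff - T_CRIT * se, diff + T_CRIT * se)

print("t =", round(t, 2))
print("df (truncated) =", math.floor(df))
print("95% CI =", tuple(round(x, 3) for x in ci))
```

The absolute t statistic is far above `T_CRIT`, and the confidence interval excludes zero, which agrees with the decision to reject H0.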

Now we are interested in the relationship between height and shoe size. We perform a correlation analysis between these variables:

Correlations: Shoe Size; Height

Pearson correlation of Shoe Size and Height = 0,864

P-Value = 0,000

Since the p-value of the test is less than 0.001, the relationship is statistically significant. The correlation coefficient of 0.864 indicates a strong positive linear relationship between the variables.
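The significance of a Pearson correlation is usually assessed with a t statistic on n - 2 degrees of freedom. The sketch below computes it from the reported coefficient, assuming n = 35 (the 18 + 17 observations in the two gender groups). The function name `correlation_t` is illustrative.

```python
import math


def correlation_t(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


r = 0.864  # Pearson correlation reported in the output
n = 35     # assumption: 18 females + 17 males, no missing values

t_corr = correlation_t(r, n)
print("t =", round(t_corr, 2), "on", n - 2, "degrees of freedom")
print("r squared =", round(r ** 2, 3))
```

A t value of this size on 33 degrees of freedom is consistent with the very small p-value in the output. The squared correlation is also the share of variance that a simple linear regression of one variable on the other would explain.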
The next step of our analysis is to develop a forecasting model that predicts a person's shoe size from their height.

## Regression Analysis: Shoe Size versus Height

The regression equation is

Shoe Size = - 29,1 + 0,554 Height

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | -29,057 | 3,875 | -7,50 | 0,000 |
| Height | 0,55407 | 0,05612 | 9,87 | 0,000 |

S = 1,31837 R-Sq = 74,7% R-Sq(adj) = 73,9%

## Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 1 | 169,43 | 169,43 | 97,48 | 0,000 |
| Residual Error | 33 | 57,36 | 1,74 | | |
| Total | 34 | 226,79 | | | |

The obtained regression equation is:

Shoe Size = -29.1 + 0.554 * Height
Since the ANOVA p-value is less than 0.001, the model is significant. Both regression coefficients are also significant (their p-values are below 0.001). The coefficient of determination R² is 0.747, which means that 74.7% of the variance in shoe size is explained by this model.
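The fit statistics quoted here follow directly from the sums of squares in the ANOVA table. As a quick consistency check, the sketch below recomputes them. The variable names are illustrative, and the inputs are the rounded values from the table, so small rounding differences are expected.

```python
import math

# Sums of squares and degrees of freedom from the ANOVA table.
ss_regression = 169.43
ss_residual = 57.36
ss_total = 226.79
df_regression = 1
df_residual = 33
df_total = df_regression + df_residual

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

r_squared = ss_regression / ss_total              # share of variance explained
r_squared_adj = 1 - ms_residual / (ss_total / df_total)
f_stat = ms_regression / ms_residual              # overall F test of the model
s = math.sqrt(ms_residual)                        # residual standard error

print("R-Sq =", round(100 * r_squared, 1), "%")
print("R-Sq(adj) =", round(100 * r_squared_adj, 1), "%")
print("F =", round(f_stat, 2))
print("S =", round(s, 3))
```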
In conclusion, our analysis has produced a useful tool for Nyke Company management: it is now possible to predict a person's shoe size from their height.
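As an illustration of how the fitted equation might be used, the sketch below wraps it in a small function. The function name `predict_shoe_size` and the range check are assumptions added for this example. The coefficients are the unrounded values from the regression output, and the height limits are the smallest and largest heights in the sample. Heights must be given in the same units as the original data.

```python
INTERCEPT = -29.057   # Constant coefficient from the regression output
SLOPE = 0.55407       # Height coefficient from the regression output
HEIGHT_MIN = 60.0     # smallest height observed in the sample
HEIGHT_MAX = 77.0     # largest height observed in the sample


def predict_shoe_size(height):
    """Predict shoe size from height using the fitted regression line.

    Raises ValueError outside the observed height range, because the
    linear fit has not been checked against data beyond that range.
    """
    if not HEIGHT_MIN <= height <= HEIGHT_MAX:
        raise ValueError(
            f"height {height} is outside the sampled range "
            f"[{HEIGHT_MIN}, {HEIGHT_MAX}]"
        )
    return INTERCEPT + SLOPE * height


predicted = predict_shoe_size(70)
print("Predicted shoe size for height 70:", round(predicted, 2))
```

The residual standard error of about 1.3 shoe sizes suggests that individual predictions can be off by more than a full size, so the result is best read as an estimate rather than an exact size.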

