# Good Report About Base Statistics

Published: 2020/12/18

## Create an appropriate graphical display for total box office gross.

The variable I chose for this question is the adjusted total box office gross, because it has been adjusted using the 2008 consumer price index. Total box office gross is therefore measured in 2008 dollars, and the values are comparable across films because the effect of inflation has been removed.
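As a minimal sketch of what such an adjustment involves, the usual approach is to scale the nominal amount by the ratio of the base-year CPI to the release-year CPI. The function name and all numbers below are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the data set.

```python
def adjust_to_base_year(nominal, cpi_release_year, cpi_base_year):
    """Express a nominal amount in base-year dollars using a CPI ratio."""
    return nominal * cpi_base_year / cpi_release_year


# Hypothetical figures, for illustration only (not real CPI values).
nominal_gross = 100.0  # million dollars in the release year
cpi_release = 50.0     # made-up CPI for the release year
cpi_2008 = 200.0       # made-up CPI for 2008

adjusted = adjust_to_base_year(nominal_gross, cpi_release, cpi_2008)
print(adjusted)  # gross expressed in 2008 dollars
```

With these made-up values, prices in the base year are four times higher than in the release year, so the adjusted gross is four times the nominal one.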
These data can be represented appropriately with a number of different graphs. In my opinion, the best choice is a histogram. In statistics, a histogram is used to visualize the distribution of a variable. Since in this case we want to visualize the distribution of the total box office gross of the James Bond movies, a histogram is appropriate. The labels may be the name of the film or the year of production.
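To illustrate how a histogram summarizes a distribution, the sketch below groups values into equal-width bins and prints a text bar for each bin. The `sample` values and the bin width of 200 are hypothetical placeholders, not the actual Bond figures, and the helper `bin_counts` is an illustrative name.

```python
from collections import Counter


def bin_counts(values, bin_width):
    """Count values per bin; each bin covers [left, left + bin_width)."""
    counts = Counter(int(v // bin_width) for v in values)
    # Map each bin's left edge to its count (empty bins are omitted).
    return {k * bin_width: counts[k] for k in sorted(counts)}


# Hypothetical grosses in million dollars, for illustration only.
sample = [250, 320, 410, 480, 510, 540, 560, 590, 620, 700, 910]

for left_edge, n in bin_counts(sample, 200).items():
    print(f"{left_edge:>4} to {left_edge + 200:<4} {'#' * n}")
```

The height of each bar is the number of observations in that interval, which is what makes the shape of the distribution (symmetry, skew, gaps) visible at a glance.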
The blue bell-shaped curve on the graph is the normal (Gaussian) curve. Comparing the distribution of Adj BO with the normal one shows that the distribution of Adj BO is left-skewed but close to normal.
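A visual comparison with the normal curve can be complemented by a numeric check. The sketch below computes the moment coefficient of skewness, which is zero for perfectly symmetric data, negative when the left tail is longer, and positive when the right tail is longer. The input lists are hypothetical and serve only to show the sign convention.

```python
from statistics import fmean


def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5."""
    m = fmean(values)
    m2 = fmean([(v - m) ** 2 for v in values])  # second central moment
    m3 = fmean([(v - m) ** 3 for v in values])  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical data, for illustration only.
symmetric = [1, 2, 3, 4, 5]
left_tailed = [1, 8, 9, 10, 10]   # one unusually low value

print(skewness(symmetric))
print(skewness(left_tailed))
```

Statistical packages often report a sample-adjusted version of this coefficient, so the value from a given program may differ slightly from this simple form.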
c)
In this case it is more appropriate to report the median, because the data are skewed. Among the basic measures of central tendency, the median is the best choice for data that are not symmetrical, since it is less affected by extreme values. The mean is better suited to symmetrical data.
The median of these data is \$546.5 million.
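The reasoning behind preferring the median for skewed data can be seen in a small example. In the sketch below, one value of a symmetric list is replaced by an unusually low one: the mean shifts, while the median does not. Both lists are hypothetical and are not the Bond data.

```python
from statistics import mean, median

# Hypothetical values in million dollars, for illustration only.
symmetric = [400, 450, 500, 550, 600]
skewed = [100, 450, 500, 550, 600]  # lowest value replaced by an extreme one

print(mean(symmetric), median(symmetric))  # mean and median coincide
print(mean(skewed), median(skewed))        # mean is pulled toward the low value
```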
d)
Generally, the interquartile range (IQR) and the range are less sensitive to changes in individual values than the standard deviation. The standard deviation changes whenever any value in the data changes, whereas the IQR and the range may stay the same. Moreover, when the mean is not an appropriate measure of central tendency, the standard deviation may not be a useful measure of variability, because it is computed from deviations about the mean. In this case the data are skewed, so the standard deviation is not a good characteristic. Besides, the sample in this study is quite small (n = 23), and the standard deviation is commonly considered more reliable for large samples (n > 30). That is why I recommend using the IQR and the range to characterize the variability of this sample.

| Variable | N | N* | Median | Range | IQR |
|----------|----|----|--------|-------|-------|
| Adj BO | 23 | 0 | 546.5 | 768.0 | 239.6 |
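The measures of variability discussed above can be computed along the following lines. This is a sketch on hypothetical values, not a reproduction of the reported figures: the lists are made up, `spread_summary` is an illustrative helper, and quartile conventions differ between software packages, so a given program may return slightly different quartiles for the same data.

```python
from statistics import quantiles, stdev


def spread_summary(values):
    """Return the range and the IQR of a list of numbers."""
    # quantiles(..., n=4) returns the three quartile cut points;
    # the default method is "exclusive".
    q1, _median, q3 = quantiles(values, n=4)
    return {"range": max(values) - min(values), "iqr": q3 - q1}


# Hypothetical grosses in million dollars, for illustration only.
original = [250, 320, 410, 480, 510, 540, 560, 590, 620, 700, 910]
modified = [250, 320, 410, 480, 510, 545, 560, 590, 620, 700, 910]  # 540 -> 545

print(spread_summary(original), stdev(original))
print(spread_summary(modified), stdev(modified))
```

Changing a single central value leaves the range and the IQR as they were, while the standard deviation changes. Note that this only holds for values away from the extremes and the quartiles: the range depends entirely on the smallest and largest observations.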

