Free Portugal Business Practical Analysis Report Example
Essential 1: Data Summary
Testing For Heteroscedasticity and Autocorrelation
A key assumption in linear regression is that the variance of the error term is constant. If the error term does not have a constant variance, the errors are heteroscedastic, which affects the accuracy of the analysis. The implication is that the coefficient estimates remain unbiased, but they no longer have the smallest variance among linear unbiased estimators. In addition, depending on the nature of the heteroscedasticity, significance tests could be too low or too high. According to Allison, when heteroscedasticity is present, all observations are given equal weight even though observations with large disturbances provide less information than those with small disturbances (Rao, 2008; Seber & Lee, 2003). Also, since the standard errors are biased, the test statistics and confidence intervals are adversely affected. The data was found to comply with the assumption that the variance of the error term is constant. As such, the data is suitable for making good estimates using linear regression.
Chart one: Heteroscedasticity residual data plot
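Alongside a residual plot, the constant-variance assumption can be checked numerically in an informal way. The sketch below fits a single regression line, orders the observations by the predictor, and compares the mean squared residual in the upper half with that in the lower half. It is a simplified variant of the split-sample idea behind the Goldfeld-Quandt test, not the procedure used in this report, and the function names and toy data are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def variance_ratio(x, y):
    """Mean squared residual in the upper half of x over the lower half."""
    a, b = fit_line(x, y)
    pairs = sorted(zip(x, y))  # order observations by the predictor
    resid_sq = [(yi - (a + b * xi)) ** 2 for xi, yi in pairs]
    half = len(resid_sq) // 2
    return mean(resid_sq[-half:]) / mean(resid_sq[:half])

# Hypothetical toy data: same line, two different noise patterns
x = list(range(1, 21))
y_const = [2 + 3 * xi + (1 if xi % 2 else -1) for xi in x]            # noise of constant size
y_grow = [2 + 3 * xi + 0.5 * (xi if xi % 2 else -xi) for xi in x]     # noise grows with x

ratio_const = variance_ratio(x, y_const)
ratio_grow = variance_ratio(x, y_grow)
print(round(ratio_const, 2), round(ratio_grow, 2))
```

A ratio near 1 is consistent with constant variance, while a ratio well above 1 suggests that the spread of the residuals grows with the predictor. A formal test would compare the ratio against an F distribution.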
The data being used has time series elements. As a result, autocorrelation may occur, which would undermine the use of the linear regression method. Therefore, a Durbin-Watson test is conducted to check that the error terms from different periods are not correlated. Autocorrelation leads to underestimated standard errors for the parameter estimates, making the estimates appear more significant than they are. The D-W statistic is observed to be close to 2 (Rumsey, 2007), which implies that there is no serial correlation.
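For illustration, the Durbin-Watson statistic is the sum of squared successive differences of the residuals divided by the sum of squared residuals. It ranges from 0 to 4, with values near 2 suggesting no first-order autocorrelation. The sketch below computes it directly. The residual series are hypothetical and are not taken from the report's data.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Hypothetical residual series
trending = [1, 2, 3, 4, 5, 6]               # neighbours move together
alternating = [1, -1, 1, -1, 1, -1]         # neighbours flip sign each period
mixed = [1, -1, -1, 1, 1, -1, -1, 1]        # no clear first-order pattern

dw_trending = durbin_watson(trending)
dw_alternating = durbin_watson(alternating)
dw_mixed = durbin_watson(mixed)
print(dw_trending, dw_alternating, dw_mixed)
```

Values well below 2 point to positive serial correlation and values well above 2 point to negative serial correlation, which is why a statistic close to 2 supports the use of linear regression here.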
Essential Three and Two:
Suitable statistical test to evaluate whether there are differences in the overall mean score according to:
Whether the firm is multinational or not
Type of ownership
Size of the firm
The data does not have a management score value. Therefore, this value has to be computed. The management score is measured using three parameters namely
The parameters have several observations. First, the average of each parameter is obtained by dividing the total value of all observations by the number of observations. The three values obtained are then averaged. This average of the averages of the three parameters is used as the measure of management score.
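The calculation described above can be sketched as follows. The parameter labels and observation values are hypothetical, since the three parameters are not named here.

```python
from statistics import mean

def management_score(parameter_observations):
    """Average each parameter's observations, then average those averages."""
    parameter_means = [mean(obs) for obs in parameter_observations]
    return mean(parameter_means)

# Hypothetical firm: three parameters with different numbers of observations
firm = [
    [3, 4, 5],      # parameter A (placeholder name)
    [2, 4],         # parameter B (placeholder name)
    [1, 2, 3, 4],   # parameter C (placeholder name)
]
score = management_score(firm)

# For comparison: pooling every observation into one average
pooled = mean(value for obs in firm for value in obs)
print(round(score, 3), round(pooled, 3))
```

Averaging the averages gives each parameter equal weight regardless of how many observations it has, so the result can differ from the pooled average of all observations.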
Variation table 1: Shows the range and standard deviation of the average management score variable against the who owns the firm variable.
Variation table 2: Shows the range and standard deviation of the average management score variable against the average weekly labor hours variable.
Variation table 3: Shows the range and standard deviation of the average management score variable against whether the firm is multinational or not.
Paired Samples Correlations
In order to detect variation in the data, a paired test is applied to the variables being considered. A paired test is appropriate in this study since an observation from one sample can be paired with an observation in the other sample. However, four assumptions have to be fulfilled before the test can be carried out. The assumptions are:
The dependent variables have to be measured using a continuous scale
The independent variable consists of two categorical "matched pairs" or "related groups", meaning that the same subjects are present in both groups
There must be no significant outliers in the differences between the two groups
The distribution of the differences in the dependent variable between the two related groups is approximately normal (Chalmer, 1987).
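As a minimal sketch of how a paired test works, the code below computes the t statistic from the paired differences: the mean difference divided by its standard error, with n - 1 degrees of freedom. The two samples are hypothetical, and the function name is an assumption made for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(first, second):
    """Return (t, degrees of freedom) for the mean of the paired differences."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)   # stdev uses the n - 1 denominator
    return mean(diffs) / standard_error, n - 1

# Hypothetical paired scores for the same five subjects
first_sample = [3.4, 3.0, 3.9, 3.5, 3.0]
second_sample = [3.1, 2.8, 3.6, 3.0, 2.9]

t_stat, dof = paired_t_statistic(first_sample, second_sample)
print(round(t_stat, 3), dof)
```

The resulting statistic is compared with the critical value of the t distribution for the given degrees of freedom. Because the test works on the differences, the outlier and normality assumptions listed above apply to those differences rather than to the raw scores.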
The null hypothesis being tested is H0: all Bi = 0. The alternative hypothesis is
H1: at least one Bi ≠ 0
Examining Table 5, the value of the test statistic is found to be 198.822, which is greater than the table value of F3345. Therefore, the null hypothesis is rejected and the alternative hypothesis is accepted, since the overall regression is statistically significant at the specified level of confidence. This means that, at the 95% confidence level, there is evidence of a linear relationship between the dependent variable and the independent variables (Weisberg, 2013).
Studying the table enables the researcher to develop the model that is used to predict the sales. In this case, the model is obtained as follows:
Sales value = -312987 + 71097.34X1 - 1516.514X2 + 3.751X3 + 46256.314X4
Where: X1 is the average management score
X2 is the average weekly working hours
X3 is the company fixed asset
X4 is who owns the company
Also, evaluating the t-test values, only the company fixed asset and who owns the company are statistically significant, since their t values are greater than 2 (Jackson, 2010). The result is confirmed by the significance values, which are below 0.05 for these two variables.
The model can be used to predict the effect on sales. However, the power of the model is only 69.7%, so the value obtained is approximate. For instance, for a firm with 15,000 weekly labor hours, a mean management score of 3.5, total company assets of 30,000 dollars, and which is not owned by the founder, the sales value is determined as follows:
Sales value = -312987 + 71097.34X1 - 1516.514X2 + 3.751X3 + 46256.314X4
Sales value = -312987 + 71097.34(3.5) - 1516.514(15000) + 3.751(30000) + 46256.314(4)
The company would not be making any sales, since the sales value given by the model is negative.
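The substitution above can be checked with a short calculation. The sketch below uses the coefficients from the substituted equation, with the coefficient on X2 taken as 1516.514 and the ownership variable coded as 4, as in that equation. The function and argument names are hypothetical.

```python
def predict_sales(management_score, weekly_hours, fixed_assets, ownership_code):
    """Evaluate the fitted sales model for one firm."""
    return (-312987
            + 71097.34 * management_score   # X1: average management score
            - 1516.514 * weekly_hours       # X2: average weekly working hours
            + 3.751 * fixed_assets          # X3: company fixed asset
            + 46256.314 * ownership_code)   # X4: who owns the company

sales = predict_sales(3.5, 15000, 30000, 4)
print(f"{sales:,.2f}")
```

The result is roughly -22.5 million dollars. The labor hours term alone contributes about -22.7 million, which far outweighs the positive terms.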
Further manipulating the data with a focus on the average management score, and assuming the firm is not owned by the founder, Table 7 summarizes the minimum conditions needed to make a sale of one dollar.
The table shows that a rise in the management score leads to a positive change in the value of sales. As such, the management style matters, since better management styles lead to a higher average management score and thus to better sales results.
Further manipulation was aimed at evaluating whether breaking down the management score into its three constituent parameters would affect the predictive ability of the model. The following model summary is obtained.
Table 9: Model summary with the average management score broken down into its 3 parameters
Compared with the model in which the management score is not broken down, there is a slight decrease in the value of R squared. As such, breaking down the measure of management score does not have a significant effect on the predictive capability of the model.
Whether the nature of a firm, being a multinational or not, affects the model's ability to predict sales can be examined by including this variable in the model.
The outcome shows that there is a slight negative change. As such, the nature of the firm, whether it is a multinational, local multinational or foreign, does not have any meaningful influence on the power of the model. The observation is confirmed by running a regression of this variable against sales. The following model summary is obtained.
The results so far have been based on a linear relationship. However, it is also possible to examine the data using a nonlinear relationship. In this case, management score is tested to observe whether it could have a nonlinear relationship with sales. Using curve fitting to an exponential curve, the following summary is obtained.
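One common way to fit an exponential curve y = a * exp(b * x) is to take the natural log of y and fit a straight line, since ln(y) = ln(a) + b * x. The sketch below follows that approach. It is an assumption made for illustration and may differ from the curve-fitting routine used to produce the summary, and the scores and sales figures are hypothetical.

```python
from math import exp, log
from statistics import mean

def fit_exponential(x, y):
    """Fit y = a * exp(b * x) by least squares on ln(y); requires every y > 0."""
    ln_y = [log(value) for value in y]
    mx, my = mean(x), mean(ln_y)
    b = (sum((xi - mx) * (li - my) for xi, li in zip(x, ln_y))
         / sum((xi - mx) ** 2 for xi in x))
    a = exp(my - b * mx)
    return a, b

# Hypothetical data generated from a known curve, so the fit can be checked
scores = [1.0, 2.0, 3.0, 4.0, 5.0]
sales_values = [100.0 * exp(0.4 * s) for s in scores]

a, b = fit_exponential(scores, sales_values)
print(round(a, 3), round(b, 3))
```

Fitting on the log scale minimizes errors in ln(y) rather than in y itself, so on noisy data the estimates can differ from those of a direct nonlinear least squares fit.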
At times, there are several ways in which data can be expressed. One alternative is to express the data as the natural log of the values that were obtained. As such, sales, labor hours and capital are expressed as their natural logs and a regression is carried out as per the original model.
The power of the model improves to over 70%. This shows that expressing the three variables as natural logs improves the power of the model, making this approach better. Also, this model increases the number of variables that are statistically significant and simplifies the B values, as shown in Table 13.
References
Chalmer, B. (1987). Understanding statistics. New York: M. Dekker.
Chatterjee, S., & Hadi, A. (1988). Sensitivity analysis in linear regression. New York: Wiley.
Jackson, S. (2010). Statistics: Plain and simple (2nd ed.). Belmont, CA: Wadsworth/Cengage Learning.
Montgomery, D., & Peck, E. (2012). Introduction to linear regression analysis (5th ed.). Hoboken, NJ: Wiley.
Motulsky, H., & Christopoulos, A. (2004). Fitting models to biological data using linear and nonlinear regression: A practical guide to curve fitting. Oxford: Oxford University Press.
Plichta, S., & Garzon, L. (2009). Statistics for nursing and allied health. Philadelphia: Wolters Kluwer/Lippincott Williams & Wilkins Health.
Rao, C., & Miller, J. (2007). Handbook of Statistics Epidemiology and Medical Statistics. Burlington: Elsevier.
Rao, C. (2008). Epidemiology and medical statistics. Amsterdam: Elsevier.
Rumsey, D. (2007). Intermediate statistics for dummies. Hoboken, NJ: Wiley Pub.
Seber, G., & Lee, A. (2003). Linear regression analysis (2nd ed.). Hoboken, NJ: Wiley-Interscience.
Weisberg, S. (2013). Applied linear regression. Hoboken, NJ: Wiley.