Excel Data Analysis In Management Report Examples
Type of paper: Report
Topic: Pot, Production, Temperature, Education, Information, Statistics, Value, Theory
Pages: 8
Words: 2200
Published: 2020/10/17
Summary
In this paper we have discovered the basics of descriptive statistics for statistical data of temperatures in the pots. It was determined which pot has the highest temperature on average and which has the lowest. The basic characteristics of temperature distribution (measures of central tendency and variability) for each section were given. One-sample and Two-sample Student’s t-test were applied to check the difference in AlF3 calculations methods and the amount of cryolite used in manufacturing process. It is appeared, that old and new calculation systems are significantly different from each other. Also, the hypothesis of plant management about the cryolite amount was wrong. Finally, the association between pot temperature and total production was considered and the prediction model was constructed. This model can be used to predict total production based on a given temperature in the pot. After all calculations, the necessary conclusions and recommendations from company management was given.
Introduction
In this paper, we consider the basic techniques and methods of application of mathematical statistics to a real-world problem. This case study is related to the aluminium production. According to the condition, we are given a set of data consisting of 80 observations. Each observation refers to a pot for aluminum production. All pots are divided by 4 sections and each pot is characterized by a number of variables such as pot temperature, total production, efficiency, power consumption, etc. In order to find the ways of production optimization and understanding the relationship of various production factors that influence the formation of the costs and benefits of aluminum production, we use statistical analysis.
The analysis has begun with a descriptive statistics which helps to generalize the information about the data. It shows all basic measures of central tendency and variation of the given variables. Then we check if old method of calculation of excess AlF3 is significantly different from a new method of calculation. To investigate the significance of this difference we use two-sample Student’s t-test. The next step of the analysis was related to amount of cryolite consumption per pot. We test plant manager’s hypothesis about the average cryolite consumption level and conclude if the real consumption is significantly different from hypothesized value. One sample Student’s t-test is used in this part of analysis. The last part of the analysis was related to relationship between pot temperature and total production of the pot. Correlation and regression analysis were applied to discover an association between these variables.
Step A. Descriptive statistics.
Descriptive statistics allows summarizing the initial results obtained by observation or experiment. All calculations are reduced to descriptive statistics grouping data from their values, their frequency distribution of construction, identification of central tendency and distribution, finally, to the assessment of the spread of data in relation to the central tendency found.
Presentation of descriptive statistics is usually the first step in any analysis. The purpose of presenting data in the form of descriptive statistics - to draw conclusions and make strategic (for analysis) solutions based on available data.
We provide a summary statistical analysis of the pot temperatures for each of the 4 sections separately. To do this we have to store descriptive statistics on variable “Pot temperature” by factor variable “Section number”
According to the output above we can say that there are 4 sections, 20 plots in each. The maximum pot temperature is in section #1 and the lowest is in section #4. The average temperatures are also highest in #1 and lowest in #4 The first section has the widest range of values and the forth has the lowest.
Fortunately, each indicator of skewness is within (-1,1) that’s why skewness is not substantial and the distribution of each parameter is not far from symmetrical. The distribution of first parameter is flatter distribution; others are quite close to Gaussian kurtosis.
Standard deviation and variance, as measures of variability, show how much values are dispersed from the mean value. We can see, that the most dispersed data is in Section #2 (std.dev.=62.48071) and in Section #3 the data is most close to its mean value (std.dev.=49.45173).
It is also be useful to produce a histogram of all 4 sections to visualize the data:
Step B. Old and New Calculation Methods (Two-Sample Student’s t-Test)
In this step we use two-sample Student’s t-test for difference between the mean values. This test is applied to check the difference between old calculations of excess AlF3 provided for the first and the second sections and new calculations which are now practiced for section #3 and section #4.
All varieties of the Student’s t-test are parametric tests and they based on the additional assumption of normality of the data samples. Therefore, before using the t-test is recommended to check normality. If the hypothesis of normality is rejected, you can check out other distributions, if they do not fit, then you should use the non-parametric statistical tests. Fortunately, a preliminary analysis of descriptive statistics shows that the distribution of temperature in pots is close to normal.
Null hypothesis: there is no difference in the mean pot temperature for Sections 1 and 2 combined and Sections 3 and 4 combined.
Alternative hypothesis: there is a difference in the mean pot temperature for Sections 1 and 2 combined and Sections 3 and 4 combined.
H0: μ1=μ2Ha: μ1≠μ2
Set level of significance alpha = 0.05
Perform testing in Excel:
This is two-tailed test. Since p-value of this test is lower than alpha level (0.05) we can reject the null hypothesis. We are 95% confident that there is a significant difference in the mean pot temperature for Sections 1 and 2 combined and Sections 3 and 4 combined.
Step C. Cryolite Consumption (One-Sample Student’s T-test)
We are encouraged to check the plant manager’s hypothesis about that each pot consume 2.8 kgs of cryolite during its production cycle. The purpose of this step is to compare the given value with the sample mean of our data set. Assuming normality of cryolite consumption distribution we perform one-sample Student’s t-test with a level of significance alpha of 0.05.
Null hypothesis: The average cryolite consumption is 2.8 kgs per pot.
Alternative hypothesis: The average cryolite consumption is not eqial to 2.8 kgs per pot.
H0: μ=2.8Ha: μ≠2.8
Set level of significance alpha = 0.05
Perform testing (this is two-tailed test). The test is computed in Excel:
Since p-value is lower than alpha level (0.05), we can reject null hypothesis. We claim that the average cryolite consumption is not eqial to 2.8 kgs per pot (at 5% level of significance).
Step D. Total Production and Pot Temperature (Correlation and Regression Analysis)
In the last section of our analysis we check the hypothesis about the relationship between pot temperature and its total production level. It would be logical to assume that in the case of the existence of such relationship, the response variable will be production rate and the explanatory variable will be pot temperature.
The use of the methods of correlation and regression analysis gives an opportunity to analysts to measure closeness of the connection parameters by using the correlation coefficient. In this connection are found, varying in strength (strong, weak, moderate, and others). If association will be essential, it is advisable to find it mathematical expression in the form of a regression model and evaluate the statistical significance of the model. In economics, a significant regression equation is generally used to predict the phenomenon or indicator being studied.
Therefore, the regression analysis is called the primary method of modern mathematical statistics to identify implicit and covert links between observations.
Let y = total production and x = pot temperatures. Now we use MS Excel tools to perform regression analysis on these variables. The Excel output is the following:
We have obtained that the regression equation is the following:
Production=0.0011*temperature-0.1015
The regression is significant, according to ANOVA results F-statistics is 148.2912 with p-value less than even 0.001. Moreover, the multiple R coefficient (0.809513) shows that there is a strong association between the variables.
The coefficient of pot temperature may be interpreted in the following way: the change of the temperature on 1 degree involves the corresponding change in production on 0.001107 tons.
The scatterplot above visualize data points and characterize the relationship. As we may see from the picture above, the regression line is relatively good to fit the data. The coefficient of determination R-square is 0.6553 which means that 65.53% of variance of the response variable is explained by this model.
Analyzing the coefficients of regression equation we can say that the gradient’s p-value is less than 0.05, the common level of significance alpha (see point 3, second coefficient). This means that there is an evidence of a significant relationship.
It is known that the one of the assumptions of regression analysis is the mean value of residuals must be 0. To check this assumption we construct the residual plot of the regression.
We calculate the mean value of residuals in Excel and obtain the following:
Mεi=-3,15026*10-16
As this value is very close to 0, the residuals seem to be normally distributed and the assumption is met.
In the end we show how this regression equation helps to predict total productivity level of the pot for a given temperature value.
For example, pot #118 has temperature of 878 degrees. Hence:
Production=0.0011*878-0.1015=0.8643
If pot temperature is 900 then the total production is:
Production=0.0011*900-0.1015=0.8885
If pot temperature is 850 then the total production is:
Production=0.0011*850-0.1015=0.8335
Conclusion and recommendation
The performed analysis shows that there is a real strong relationship between pot temperature and its production. That’s why management of the company must pay attention that the low temperature in pots causes decreasing of the total production level, hence, the company should avoid this issue to maximize their production level. Also it was shown that the old and new methods of calculation of excess of AlF3 is significantly different. That’s why the data must be regained using only one (old or new) method of calculation to be standardized. Finally, it is not necessary to consume 2.8 kgs of cryolite for each pot. The results of the testing show that on average, the amount of consumption is lesser than 2.8 kgs.
- APA
- MLA
- Harvard
- Vancouver
- Chicago
- ASA
- IEEE
- AMA