Good Essay About Statistics
Published: 2021/03/24
The Health and Safety Executive (HSE) has raised concerns that workers in the ceramics sector have an increased likelihood of developing silicosis as a result of prolonged use of raw materials that contain quartz, feldspar and china clay. The quartz component of the raw materials causes the formation of fine dusts whose small particles in some cases contain Respirable Crystalline Silica (RCS). Excessive exposure, especially over a long period of time, leads to pulmonary fibrosis (silicosis), which results in incapacity and sometimes death. The objective of this study is to assess whether there are alterations in the health of workers, as indicated by the amount of cell damage. The analysis further attempts to establish whether there is a relationship between length of service and the health effect.
Lactate dehydrogenase (LDH) is an enzyme that is used to establish the amount of cell damage in body organs, including the liver, heart and muscle. Increased levels of LDH release are considered to represent elevated cell membrane permeability. LDH levels in blood cells have been used to compute the percentage cell damage.
Methods
The requirements for a worker to take part in the study were: having worked in more than one industry; having worked in areas of actual industrial activity, not in an office or in areas with limited production activity; and having no past or current smoking history. Workers were randomly selected for blood testing, giving a sample of 127 respondents: 38 from the brick sector, 27 from tile, 30 from crystal glass and 32 from porcelain (Creswell and Clark, 388).
Five variables represent the data collected. They are defined below.
C1 – The identification number of the worker, made up of the sector letter (B, T, P or C) and a three-digit identification number for the individual
C2 – The sector of the employee (brick, tile, porcelain or crystal glass)
C3 – Length of service in years in the particular sector
C4 – Age of the respondent
C5 – The percentage of damaged cells for each worker in the sample
Descriptive statistics
We first examine the summary statistics to get a general idea of the data in terms of its central tendency and its dispersion. Among the measures of central tendency, the mean is important because it gives the researcher the average value of the measure. The mean is, however, distorted by outliers, which justifies also using the median as a measure of central tendency.
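The sensitivity of the mean to outliers can be illustrated with a minimal sketch. The readings below are hypothetical values invented for illustration and are not the study data; `summarize` is a helper name introduced here.

```python
from statistics import mean, median

# Hypothetical percentage-cell-damage readings (NOT the study data).
readings = [1.2, 1.4, 1.5, 1.6, 1.8]
# The same readings with one extreme value added.
with_outlier = readings + [6.0]


def summarize(values):
    """Return (mean, median), each rounded to two decimals."""
    return round(mean(values), 2), round(median(values), 2)


print("without outlier:", summarize(readings))
print("with outlier:   ", summarize(with_outlier))
```

In this toy sample a single extreme reading moves the mean from 1.50 to 2.25, while the median only moves from 1.50 to 1.55.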
Descriptive Statistics: % damaged cells
The summary statistics indicate that the mean and median percentage cell damage for employees in the crystal glass sector are higher than in the other sectors. The measures of dispersion do not vary much, which shows that the percentage cell damage was not too different among employees, and there were possibly no outliers because the measures of central tendency are close to each other.
A boxplot is used to determine whether the medians of the percentage cell damage are comparable across the sectors in which the employees work. The boxplot is a useful tool when the researcher wants to present the differences in the medians graphically and to establish whether there are outliers.
Boxplot of percentage cell damage
Figure 1: The boxplot of the percentage cell damage.
The boxplot shows no marked difference in the median cell damage across the sectors. However, the median for the tile sector appears to be lower than the others. There are also two outliers in the tile sector and one outlier in the crystal glass sector. The latter could be the reason for the high mean in the crystal glass sector, but it does not explain the high median.
Most parametric statistical tests assume that the variable under consideration follows a normal distribution. Therefore, after describing the variables through summary statistics, it is an essential step in any analysis to determine whether the probability plot of the measure of interest (in this case the percentage cell damage) matches the probability plot of a normally distributed random variable. This check is important because it determines whether the tests used in the analysis will be parametric or non-parametric. The normality plots of the variables are included below.
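A boxplot usually flags a point as an outlier when it lies more than 1.5 times the interquartile range beyond the quartiles. The sketch below assumes that convention and the default quartile method of Python's `statistics.quantiles`; the software used in the study may compute quartiles slightly differently. The sample values are hypothetical and `boxplot_outliers` is a name introduced for illustration.

```python
from statistics import quantiles


def boxplot_outliers(values):
    """Return the values outside the 1.5 * IQR fences (assumed convention)."""
    q1, _, q3 = quantiles(values, n=4)  # lower quartile, median, upper quartile
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]


# Hypothetical tile-sector readings (NOT the study data).
tile_sample = [0.9, 1.0, 1.1, 1.2, 1.2, 1.3, 1.4, 1.5, 4.8]
print(boxplot_outliers(tile_sample))
```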
Normal probability plots of percentage cell damage by sector: crystal glass, brick, tiles, porcelain.
The plotted points are the observed values of the variable of interest, and the straight line represents the idealized normal distribution. Although the plots suggest that the variables are approximately normal, the graphs are not conclusive. In most cases the p-value of a formal normality test is used to state conclusively whether the variable is normal or not. The preferred test criterion is the 0.05 level of significance because it balances the probabilities of the researcher committing a type I error and a type II error. The tests are significant for the brick and tile sectors (p-value = 0.012 and p-value = 0.005 respectively). Therefore, we reject the null hypothesis that the data follow a normal distribution and conclude that the data for the brick and tile sectors are not normally distributed. As a consequence, the statistical tests used for these variables will be non-parametric. The tests are, however, not significant for the crystal glass and porcelain sectors (p-value = 0.729 and p-value = 0.28 respectively), both of which are greater than 0.05. Therefore, we fail to reject the null hypothesis that these variables follow a normal distribution. Parametric tests will be used in the statistical analysis of the crystal glass and porcelain data.
After the normality of the data has been assessed, it is of interest to test whether the variances of the sectors are statistically equal. This is usually done using Levene's test for equality of variances. The output reports the significance of the test of means both when equality of variances can be assumed and when it cannot. The test is important because it dictates which of the two significance values for the equality of means should be read.
The p-value for Levene's test of equality of variances is not significant at the 0.05 level (0.233 > 0.05), and as such we fail to reject the hypothesis that the variances are equal for the brick, crystal glass, porcelain and tile sectors. Therefore, the significance of the equality of means will be computed under the assumption of equal variances (Kumar, 60).
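The decision rule described above can be written out as a short sketch. The p-values are the ones reported in the text; the function name `choose_test_family` is introduced here for illustration only.

```python
ALPHA = 0.05  # significance level used throughout the analysis

# Normality-test p-values as reported in the text.
normality_p_values = {
    "brick": 0.012,
    "tile": 0.005,
    "crystal glass": 0.729,
    "porcelain": 0.28,
}


def choose_test_family(p_value, alpha=ALPHA):
    """Reject normality when p < alpha and fall back to non-parametric tests."""
    return "non-parametric" if p_value < alpha else "parametric"


choices = {sector: choose_test_family(p) for sector, p in normality_p_values.items()}
for sector, family in choices.items():
    print(f"{sector}: {family}")
```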
Hypothesis testing
The objective of the study is to determine whether there are significant differences in the percentage cell damage across the different sectors.
We are testing the hypothesis:
H0 – There is no significant difference in the percentage cell damage across the sectors. That is, H0: μ1 = μ2 = μ3 = μ4
Ha – There is a significant difference in the percentage of cells damaged across the sectors. That is, Ha: not all means are equal
The level of significance used in the test is 0.05, that is, α = 0.05
The one-way ANOVA for the percentage cell damaged according to sector is given below.
One-way ANOVA: % damaged cells versus sector
S = 1.028, R-Sq = 6.87%, R-Sq(adj) = 4.60%
The value of interest is the p-value = 0.032. The p-value is less than the significance level, so the null hypothesis is rejected in favor of the alternative hypothesis.
Therefore, we conclude that the mean percentage cell damage is not equal across all the sectors. The R2 value indicates the percentage of variance of the outcome variable that can be explained by the predictor variable; here, sector accounts for 6.87% of the variance in the percentage cell damage.
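The link between the F statistic, R-Sq and adjusted R-Sq in a one-way ANOVA can be sketched as follows. The three small groups are hypothetical values invented for illustration; only the R-Sq of 6.87%, the sample size of 127 and the four sectors come from the text. The function names are introduced here and do not correspond to any particular package.

```python
def one_way_anova(groups):
    """Return (F statistic, R-squared) for a list of groups of observations."""
    values = [v for g in groups for v in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
    r_squared = ss_between / (ss_between + ss_within)
    return f_stat, r_squared


def f_from_r_squared(r_squared, n, k):
    """F statistic implied by R-squared with n observations and k groups."""
    return (r_squared / (k - 1)) / ((1 - r_squared) / (n - k))


def adjusted_r_squared(r_squared, n, k):
    """Adjusted R-squared for a one-way layout with k groups."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k)


# Hypothetical groups (NOT the study data), used only to exercise the function.
toy_groups = [[1.0, 1.2, 1.4], [1.5, 1.7, 1.9], [2.0, 2.2, 2.4]]
print(one_way_anova(toy_groups))

# Figures reported in the text: R-Sq = 6.87%, 127 workers, 4 sectors.
print(round(f_from_r_squared(0.0687, 127, 4), 2))
print(round(100 * adjusted_r_squared(0.0687, 127, 4), 2))
```

With the reported figures the implied F statistic is about 3.02 on 3 and 123 degrees of freedom, and the adjusted R-Sq works out to about 4.60%, which agrees with the reported output.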
The means are recomputed using the pooled standard deviation so that the sectors whose means are statistically different can be established.
Individual 95% CIs for Mean Based on Pooled StDev
One interval is plotted per sector on an axis running from 1.20 to 2.40.
Pooled StDev = 1.028
Grouping information using the Tukey method
Means that do not share a letter, for instance brick and tile, are significantly different.
Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of sector
Individual confidence level = 98.96%
The intervals are plotted with brick, crystal glass and porcelain each subtracted in turn from the remaining sectors, on an axis running from -0.80 to 1.60.
The simultaneous pairwise comparisons are included above and cover all pairs of sectors under consideration.
The second hypothesis concerns the relationship between the length of exposure and the percentage of damaged cells (Grove, p 81).
H0 – The null hypothesis states that there is no correlation between the length of service and the percentage of cells damaged. H0: ρ = 0
Ha – The alternative hypothesis states that there is a correlation between the length of service and the percentage of cells damaged. Ha: ρ ≠ 0
The significance level is 0.05, that is, α = 0.05
Cell Contents: Pearson correlation
The Pearson correlation coefficients among the length of service, the age and the percentage of damaged cells are significant. Therefore, we reject the null hypothesis that the correlation coefficient is equal to zero and conclude that there is a correlation between the length of service and the percentage of cells damaged (Cohen et al, p 115).
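The Pearson coefficient and the t statistic used to test H0: ρ = 0 can be computed as in the sketch below. The paired values are hypothetical and are not the study data; the function names are introduced for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


def t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


# Hypothetical length of service (years) and percentage cell damage.
years = [2, 5, 8, 12, 15, 20]
damage = [1.0, 1.3, 1.4, 1.9, 1.8, 2.4]

r = pearson_r(years, damage)
print(round(r, 3), round(t_statistic(r, len(years)), 3))
```

The t statistic is compared with the critical value of the t distribution on n - 2 degrees of freedom; a large absolute value leads to rejection of H0.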
Regression Analysis
We are interested in testing the hypothesis that the coefficients of the multiple regression equation are not equal to zero. This test is important because it will enable us to establish significant coefficients of the predictor variables which can be used in the prediction of the outcome variable (Boslaugh, p 189).
The null hypothesis:
H0 – The coefficients for the predictor variables are all equal to zero.
H0: β0 = β1 = β2 = 0
The alternative hypothesis:
Ha – The coefficients for the predictor variables are not all equal to zero.
S = 0.940804, R-Sq = 21.4%, R-Sq(adj) = 20.1%
Analysis of Variance
The test is significant for the length of service and the constant. It is, however, not significant for age, and as such we fail to reject the null hypothesis that the coefficient of age is equal to zero.
Therefore, the length of service and the constant will be used in the prediction of the percentage cell damage.
The fitted regression equation is:
Percentage cell damage = 0.899 + 0.072 (length of service)
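As a minimal sketch, the fitted equation can be used to predict the percentage cell damage for a given length of service. The coefficients are those reported above; the function name is introduced here for illustration.

```python
INTERCEPT = 0.899  # constant term reported in the text
SLOPE = 0.072      # coefficient of length of service (per year)


def predict_cell_damage(years_of_service):
    """Predicted percentage cell damage from the fitted equation."""
    return INTERCEPT + SLOPE * years_of_service


for years in (0, 10, 20):
    print(years, round(predict_cell_damage(years), 3))
```

For example, ten years of service gives a predicted value of about 1.62%. Since R-Sq is 21.4%, such predictions explain only a modest share of the variation between workers and should be read with caution.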
The 95% confidence intervals of the means of the length of service and the percentage cell damage, based on the pooled standard deviation, are:
Individual 95% CIs for mean based on pooled standard deviation
Means
The 95% confidence interval gives the limits within which the true mean is expected to lie with 95% confidence, which corresponds to the 0.05 level of significance.
Conclusions
The findings of the analysis indicate that there is a significant difference in the percentage cell damage based on sector. This indicates that there are sectors whose workers have significantly higher levels of cell damage, in particular the crystal glass sector. The correlation analysis indicates that there is a significant correlation between the length of service, the age and the percentage of cells damaged. The regression analysis shows that the length of service can be used to predict the percentage of cells damaged.
Bibliography:
Boslaugh, S., & Watters, P. A. (2008). Research design. In Statistics in a nutshell. Sebastopol, CA: O'Reilly Media, Inc. Retrieved from http://proquest.safaribooksonline.com/book/-/9781449361129
Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2013). Applied multiple regression/correlation analysis for the behavioral sciences. Routledge.
Creswell, J. W., & Clark, V. L. P. (2007). Designing and conducting mixed methods research. Australian and New Zealand Journal of Public Health, 31(4), 388–389.
Grove, S. K. (2007). Statistics for health care research: A practical workbook. Edinburgh: Elsevier Saunders.
Kumar, R. (n.d.). Research methodology. New Delhi, India: APH Publishing.