ANOVA Example Problem

Four storage procedures for milk are under study. The index of bacteria count after 60 hours of storage was tabulated for each procedure. The data are shown below:

Storage Treatment

S1   S2   S3   S4
3    4    4    5
6    7    13   6
4    9    10   8
3    2    6    7
1    5    7    3

$N$ = (5+5+5+5) = 20
$\sum X$ = (17+27+40+29) = 113
$\sum X^2$ = (71+175+370+183) = 799

1.

2. interval data

3. One-way ANOVA

4.

5.

6. Tabular value = 5.29 at the .01 level, with d.f. = (3, 16)


7. Since the computed F (2.65) is less than the tabular value (5.29), there is no significant difference between and among the four storage methods.

Summary Table

Source of Variation   SS       d.f.               MS      F
Between (SSB)         53.35    k-1 = 4-1 = 3      17.78   MSB/MSW = 17.78/6.7 = 2.65
Within (SSW)          107.2    N-k = 20-4 = 16    6.7
Total (SST)           160.55   N-1 = 20-1 = 19
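The summary table can be checked with a short script. This is a minimal sketch that assumes the usual raw-score formulas for one-way ANOVA (a correction term of (ΣX)²/N subtracted from the raw sums of squares); the function and variable names are illustrative and not part of the original notes.

```python
# One-way ANOVA computed from the raw-score formulas.
# Data copied from the storage treatment table.
groups = {
    "S1": [3, 6, 4, 3, 1],
    "S2": [4, 7, 9, 2, 5],
    "S3": [4, 13, 10, 6, 7],
    "S4": [5, 6, 8, 7, 3],
}


def one_way_anova(groups):
    """Return the sums of squares, mean squares and F ratio."""
    all_values = [x for g in groups.values() for x in g]
    n_total = len(all_values)          # N
    k = len(groups)                    # number of treatments
    correction = sum(all_values) ** 2 / n_total   # (sum of X)^2 / N

    sst = sum(x * x for x in all_values) - correction
    ssb = sum(sum(g) ** 2 / len(g) for g in groups.values()) - correction
    ssw = sst - ssb

    msb = ssb / (k - 1)                # between-groups mean square
    msw = ssw / (n_total - k)          # within-groups mean square
    return {"SSB": ssb, "SSW": ssw, "SST": sst,
            "MSB": msb, "MSW": msw, "F": msb / msw}


result = one_way_anova(groups)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

The computed F is then compared with the tabular value at the chosen significance level and degrees of freedom (k-1, N-k).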

Sample Problem / Example of Skewness and Kurtosis

Class Interval   f    CF<   CRF< (%)   Midpoint (M)   fm
96 - 98          1    44    100.00     97             97
93 - 95          2    43    97.73      94             188
90 - 92          3    41    93.18      91             273
87 - 89          4    38    86.36      88             352
84 - 86          3    34    77.27      85             255
81 - 83          5    31    70.45      82             410
78 - 80          5    26    59.09      79             395
75 - 77          6    21    47.73      76             456
72 - 74          4    15    34.09      73             292
69 - 71          4    11    25.00      70             280
66 - 68          2    7     15.91      67             134
63 - 65          2    5     11.36      64             128
60 - 62          1    3     6.82       61             61
57 - 59          2    2     4.55       58             116
                 N = 44                               sum = 3437

For Standard Deviation:

Class Interval   f    M    fm    fm²
96 - 98          1    97   97    9409
93 - 95          2    94   188   17672
90 - 92          3    91   273   24843
87 - 89          4    88   352   30976
84 - 86          3    85   255   21675
81 - 83          5    82   410   33620
78 - 80          5    79   395   31205
75 - 77          6    76   456   34656
72 - 74          4    73   292   21316
69 - 71          4    70   280   19600
66 - 68          2    67   134   8978
63 - 65          2    64   128   8192
60 - 62          1    61   61    3721
57 - 59          2    58   116   6728
                 N = 44    sum = 3437   sum = 272,591
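As a cross-check on the column totals, the mean and standard deviation of the grouped data can be sketched in a few lines. The formulas used here are the common grouped-data forms and are an assumption, since the notes do not show which version (population, dividing by N, or sample, dividing by N-1) was intended; both are computed.

```python
import math

# (midpoint M, frequency f) pairs copied from the table
classes = [
    (97, 1), (94, 2), (91, 3), (88, 4), (85, 3), (82, 5), (79, 5),
    (76, 6), (73, 4), (70, 4), (67, 2), (64, 2), (61, 1), (58, 2),
]

n = sum(f for _, f in classes)                 # N
sum_fm = sum(f * m for m, f in classes)        # sum of fm
sum_fm2 = sum(f * m * m for m, f in classes)   # sum of fm^2

mean = sum_fm / n

# Population form: sqrt(sum(fm^2)/N - mean^2)
sd_population = math.sqrt(sum_fm2 / n - mean ** 2)

# Sample form: sqrt((sum(fm^2) - (sum(fm))^2 / N) / (N - 1))
sd_sample = math.sqrt((sum_fm2 - sum_fm ** 2 / n) / (n - 1))

print(f"N = {n}, sum fm = {sum_fm}, sum fm^2 = {sum_fm2}")
print(f"mean = {mean:.2f}")
print(f"SD (population) = {sd_population:.2f}, SD (sample) = {sd_sample:.2f}")
```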

By substituting the data in the following formulas:

Spearman Rho Coefficient of Correlation

The Spearman Rho Coefficient of Correlation is used for ordinal data.

The formula is:

$$\rho = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$$

Where $\rho$ (rho) is the coefficient of correlation by the Rank-Difference Method

1 and 6 are constants (they do not change)
$\sum D^2$ is the sum of the $D^2$ column (the squared differences between paired ranks)
$N$ is the number of pairs of scores or measures.
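The rank-difference computation can be sketched as follows, assuming the standard form rho = 1 - 6ΣD²/(N(N²-1)). The helper names and the sample scores are hypothetical; tied scores are given the average of the ranks they occupy, which is a common convention rather than something stated in the notes.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the end of the run of equal values starting at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rho by the rank-difference method."""
    n = len(x)                                   # number of pairs
    rx, ry = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))   # sum of D^2
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical scores for five examinees on two tests
test_a = [1, 2, 3, 4, 5]
test_b = [2, 1, 4, 3, 5]
print(spearman_rho(test_a, test_b))
```

Note that the simple formula is exact only when there are no ties; with many ties it is an approximation.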

Pearson r Coefficient of Correlation or Product Moment Coefficient of Correlation

The Pearson r is used to find the correlation between two sets of interval data.

a. The Pearson r from Raw Scores

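The formula is:

$$r = \frac{N \sum XY - \sum X \sum Y}{\sqrt{\left[N \sum X^2 - (\sum X)^2\right]\left[N \sum Y^2 - (\sum Y)^2\right]}}$$

Where:
$N$ = no. of cases
$\sum XY$ = sum of the products of X and Y
$\sum X$ = sum of X
$\sum Y$ = sum of Y
$\sum X^2$ = sum of the squares of X
$\sum Y^2$ = sum of the squares of Y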
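A minimal sketch of the raw-score computation, assuming the standard raw-score form r = (NΣXY - ΣXΣY) / sqrt([NΣX² - (ΣX)²][NΣY² - (ΣY)²]). The function name and the sample scores are hypothetical.

```python
import math


def pearson_r_raw(x, y):
    """Pearson r computed directly from raw-score sums."""
    n = len(x)                                   # no. of cases
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))    # sum of products
    sum_x2 = sum(a * a for a in x)               # sum of squares of X
    sum_y2 = sum(b * b for b in y)               # sum of squares of Y

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Hypothetical paired scores
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r_raw(x, y), 4))
```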

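b. The Pearson r from Standard Scores

The formula is:

$$r = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2 \cdot \sum d_y^2}}$$

Where:
$r$ = is the coefficient of correlation by the Product Moment Method
$\sum d_x d_y$ = is the sum of column $d_x d_y$
$\sum d_x^2$ = is the sum of column $d_x^2$
$\sum d_y^2$ = is the sum of column $d_y^2$

Here $d_x$ and $d_y$ are the deviations of each X and Y score from their respective means.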
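The column-based version can be sketched as below, assuming that dx and dy are the deviations of each score from its own mean and that the formula is r = Σdxdy / sqrt(Σdx² · Σdy²). The function name and data are hypothetical.

```python
import math


def pearson_r_deviations(x, y):
    """Pearson r from deviation columns dx, dy, dx*dy, dx^2 and dy^2."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]                 # deviations of X
    dy = [b - mean_y for b in y]                 # deviations of Y

    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    return sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)


# Hypothetical paired scores
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r_deviations(x, y), 4))
```

For the same data this gives the same r as the raw-score method; the deviation form simply keeps the intermediate numbers smaller.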

Correlation

Meaning and Uses of Correlation

Correlation is a measure of the relationship between two variables. The coefficient of correlation is used to assess the validity, reliability and objectivity of a prepared examination. It also indicates the amount of agreement or disagreement between groups of scores, measurements, or individuals. Correlation ranges in value from +1.00 through 0.00 down to -1.00.

Interpretation of ranges is shown below:

0 - no correlation
0.01 to 0.20 - slight correlation, almost negligible relationship
0.21 to 0.40 - low correlation, definite but small relationship
0.41 to 0.70 - moderate correlation, substantial relationship
0.71 to 0.90 - high correlation, marked relationship
0.91 to 0.99 - very high correlation, very dependable relationship
1 - perfect correlation
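The table can be turned into a small lookup. This sketch makes two assumptions that the notes do not state: the ranges are applied to the absolute value of r (so -0.85 is read the same way as +0.85), and a value that falls between two listed ranges, such as 0.205, is assigned to the higher band. The function name is hypothetical.

```python
def interpret_r(r):
    """Return the verbal interpretation of a correlation coefficient."""
    size = abs(r)
    if size > 1:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    if size == 0:
        return "no correlation"
    if size <= 0.20:
        return "almost negligible relationship"
    if size <= 0.40:
        return "definite but small relationship"
    if size <= 0.70:
        return "substantial relationship"
    if size <= 0.90:
        return "marked relationship"
    if size < 1:
        return "very dependable relationship"
    return "perfect correlation"


for value in (0.0, 0.15, -0.35, 0.77, -0.95, 1.0):
    print(value, "->", interpret_r(value))
```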

Measure of Kurtosis

Kurtosis is a measure of a distribution's peakedness (or flatness).

The three (3) types of kurtosis are leptokurtic, mesokurtic and platykurtic.

Leptokurtic - distributions where values cluster heavily or pile up in the center. These are tall distributions with narrow humps and long, heavy tails. Their kurtosis (Ku) is higher than 3.

Mesokurtic - intermediate distributions which are neither too peaked nor too flat. The values are moderately distributed about the center. Their kurtosis (Ku) is equal to 3.

Platykurtic - flat distributions with values more evenly distributed about the center, with broad humps and short tails. Their kurtosis (Ku) is less than 3.

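A measure of kurtosis based on both quartiles and percentiles is given by

$$k = \frac{Q.D.}{P_{90} - P_{10}}$$

Where:
Q. D. = quartile deviation, $\frac{1}{2}(Q_3 - Q_1)$
k = percentile coefficient of kurtosis
$P_{10}$, $P_{90}$ = 10th and 90th percentiles

*For the normal distribution this has the value 0.263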
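The value 0.263 for the normal distribution can be verified numerically. This is a small sketch assuming the percentile coefficient of kurtosis is k = Q.D. / (P90 - P10) with Q.D. = (Q3 - Q1)/2; the function name is hypothetical, and the quartiles and percentiles are taken from Python's standard normal distribution.

```python
from statistics import NormalDist


def percentile_kurtosis(q1, q3, p10, p90):
    """Percentile coefficient of kurtosis: quartile deviation over P90 - P10."""
    quartile_deviation = (q3 - q1) / 2
    return quartile_deviation / (p90 - p10)


# Quartiles and percentiles of the standard normal distribution
z = NormalDist()  # mean 0, standard deviation 1
k_normal = percentile_kurtosis(
    q1=z.inv_cdf(0.25),
    q3=z.inv_cdf(0.75),
    p10=z.inv_cdf(0.10),
    p90=z.inv_cdf(0.90),
)
print(round(k_normal, 3))
```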

Skewness of Quantiles and Percentiles

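Other measures of skewness, defined in terms of quartiles and percentiles, are as follows:

$$Sk_Q = \frac{Q_3 - 2Q_2 + Q_1}{Q_3 - Q_1}$$

$$Sk_P = \frac{P_{90} - 2P_{50} + P_{10}}{P_{90} - P_{10}}$$

Where:
$Sk_Q$ = quartile coefficient of skewness
$Sk_P$ = percentile coefficient of skewness
$Q_1$ = First Quartile
$Q_2$ = Second Quartile
$Q_3$ = Third Quartile
$P_{10}$ = 10th Percentile
$P_{50}$ = 50th Percentile
$P_{90}$ = 90th Percentile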
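Both coefficients are simple to compute once the quartiles or percentiles are known. The sketch below assumes the standard definitions, (Q3 - 2Q2 + Q1)/(Q3 - Q1) and (P90 - 2P50 + P10)/(P90 - P10); the function names and the input values are hypothetical.

```python
def quartile_skewness(q1, q2, q3):
    """Quartile coefficient of skewness."""
    return (q3 - 2 * q2 + q1) / (q3 - q1)


def percentile_skewness(p10, p50, p90):
    """10-90 percentile coefficient of skewness."""
    return (p90 - 2 * p50 + p10) / (p90 - p10)


# Hypothetical values: the median sits closer to the lower quartile,
# so the longer tail is on the right and both coefficients are positive.
print(quartile_skewness(q1=10, q2=14, q3=22))
print(percentile_skewness(p10=5, p50=14, p90=35))
```

When the median lies exactly midway between the two outer values, both coefficients are zero, which corresponds to a symmetric distribution.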

Skewness Formula

Skewness is the degree of asymmetry, or departure from symmetry, of a distribution. If the frequency curve of a distribution has a longer "tail" to the right of the central maximum than to the left, the distribution is said to be skewed to the right, or to have positive skewness. If the reverse is true, it is said to be skewed to the left, or to have negative skewness. If the two tails are of equal length on either side of the center, the distribution is symmetric (zero skewness). In a symmetric distribution with a single peak, the mean, median and mode are all equal.

a.) Skewed to the left (negatively skewed): The mean and median are to the left of the mode.
b.) Symmetric (zero skewness): The mean, median and mode coincide.
c.) Skewed to the right (positively skewed): The mean and median are to the right of the mode.

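For skewed distributions the mean tends to lie on the same side of the mode as the longer tail. Thus a measure of the asymmetry is supplied by the difference (mean - mode), which is divided by the standard deviation to make it dimensionless. A second form replaces the mode with the median:

$$Sk_1 = \frac{\bar{X} - Mo}{s} \qquad Sk_2 = \frac{3(\bar{X} - Md)}{s}$$

Where:
$Sk_1$ and $Sk_2$ = skewness
$\bar{X}$ = mean
$Mo$ = mode
$Md$ = median
$s$ = standard deviation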
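A minimal sketch of the two coefficients, assuming they are Pearson's first and second coefficients of skewness, (mean - mode)/s and 3(mean - median)/s. The function names and the summary values are hypothetical.

```python
def pearson_first_skewness(mean, mode, sd):
    """Pearson's first coefficient: (mean - mode) / standard deviation."""
    return (mean - mode) / sd


def pearson_second_skewness(mean, median, sd):
    """Pearson's second coefficient: 3 * (mean - median) / standard deviation."""
    return 3 * (mean - median) / sd


# Hypothetical summary values for a distribution skewed to the right:
# the mean lies to the right of the median, which lies to the right of the mode.
mean, median, mode, sd = 50, 48, 44, 10
print(pearson_first_skewness(mean, mode, sd))
print(pearson_second_skewness(mean, median, sd))
```

A positive result indicates skewness to the right, a negative result skewness to the left, and zero a symmetric distribution.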

Note: When the population is skewed to the right or left with a very long tail, the population median might be better than the population mean as a measure of central tendency.

Sample Problem for Z-Score / Standard Scores

Suppose a night shift technician of an electric company finished his work in 7.4 hours and a day shift technician of the same company finished his job in 6.9 hours.  And suppose the mean and standard deviation of the night shift technicians’ completion time are 5.5 and 0.5 hours, respectively, while the mean and standard deviation of the day shift technicians’ completion time are 6.4 hours and 0.5 hours respectively.  Which of the two technicians is a better worker relative to the shift to which he belongs?

Solution:

Night Shift:

$$z = \frac{7.4 - 5.5}{0.5} = 3.8$$

Day Shift:

$$z = \frac{6.9 - 6.4}{0.5} = 1.0$$

Using the average performance of their respective groups as the basis of comparison, the day shift technician is the better worker: his completion time is 1 standard deviation slower than his group's mean, while the night shift technician's completion time is 3.8 standard deviations slower than his group's mean.
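The comparison can be reproduced with a short script, assuming the usual standard-score form z = (observed value - group mean) / group standard deviation. The function name is hypothetical; the figures are those given in the problem.

```python
def z_score(value, mean, sd):
    """Distance of a value from the mean, in standard deviation units."""
    return (value - mean) / sd


# Completion times in hours, with each shift's mean and standard deviation
z_night = z_score(7.4, mean=5.5, sd=0.5)
z_day = z_score(6.9, mean=6.4, sd=0.5)

print(f"night shift z = {z_night:.1f}")
print(f"day shift z   = {z_day:.1f}")

# A smaller z means a completion time closer to (or faster than) the
# group's average, so the technician with the smaller z ranks better here.
better = "day shift" if z_day < z_night else "night shift"
print("better relative to own shift:", better)
```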

Standard Scores or Z-Scores Formula

A Z-score measures the distance of an observed value from the mean, expressed in units of standard deviation.

Sample:

$$z = \frac{x_i - \bar{x}}{s}$$

Population:

$$z = \frac{x_i - \mu}{\sigma}$$

Where:
$x_i$ = value of the ith observation
$\bar{x}$ = sample mean
$\mu$ = population mean
$s$ = sample standard deviation
$\sigma$ = population standard deviation
