Laboratory work. The Chi-Square Statistic
Section: Testing Statistical Hypotheses. Topic: The Chi-Square Statistic.
The Chi-Square test is used to compare the distributions of two independent samples. Here is a short test for you.
Question 1. Statistical hypotheses are divided into the following types… The answers are given. Choose the correct answer.
Question 2. The Chi-Square test is used to compare sampled populations… Possible answers: independent, dependent, random, non-random. Choose the correct answer.
Question 3. An error of the first kind (a Type I error) is that there will be… The answers are given. Choose the correct answer.
Let us check your answers. Question 1. The correct answer is 4. Question 2. The correct answer is 1. Question 3. The correct answer is 2.
Let us consider the following example. The students’ knowledge is checked with a test system before they are admitted to a laboratory class. The test results are recorded on a nominal scale: “admitted” – 1, “not admitted” – 0. The purpose of the study is to compare the readiness of the students of faculties A and B.
The solution. We formulate the null hypothesis: “the level of training of the students of faculties A and B is the same”. The alternative hypothesis is “the level of training of the students of faculties A and B is different.”
We choose a significance level of 0.05. We present the obtained data in the form of a table. Let us calculate the Total column and keep it in mind. Note that all the cell values are greater than 10. Therefore, we can use the formula shown on the screen.
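The formula on the screen is not reproduced in this transcript. As a rough sketch, assuming it is the usual Pearson Chi-Square statistic for a 2 by 2 table, the calculation could look like this in Python. The function name and the counts are hypothetical and are not the data from the slide.

```python
def chi_square_2x2(table):
    """Pearson Chi-Square statistic for a 2x2 table of counts (no correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d                  # total number of observations
    row1, row2 = a + b, c + d          # row totals (one row per sample)
    col1, col2 = a + c, b + d          # column totals (one column per answer)
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical counts (NOT the slide data): rows are faculties A and B,
# columns are "admitted" (1) and "not admitted" (0).
example_table = [[25, 15], [20, 20]]
print(round(chi_square_2x2(example_table), 3))
```

Every cell in this made-up table is greater than 10, which matches the condition named above for using the uncorrected formula.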
The calculation is performed in MS Excel by substituting the values into the corresponding formula. The empirical value found by this formula is 1.496. To find the critical value of the criterion, go to Functions (the Statistical category) and select the CHI2.OBR.PX function (this is the Russian-localized name; in English-language Excel the function is called CHISQ.INV.RT).
Since we took a significance level of 0.05, we enter 0.05 in the Probability window. The number of degrees of freedom is equal to (the number of rows minus 1) multiplied by (the number of columns minus 1). We have a 2 by 2 table, so the number of degrees of freedom is 1. We get an approximate critical value of 3.841.
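The same critical value can be cross-checked outside Excel. The sketch below uses only the Python standard library and relies on the fact that a Chi-Square variable with 1 degree of freedom is the square of a standard normal variable, so it is valid for 1 degree of freedom only. The function names are illustrative.

```python
from statistics import NormalDist


def degrees_of_freedom(rows, cols):
    """(rows - 1) * (columns - 1) for a contingency table."""
    return (rows - 1) * (cols - 1)


def critical_value_df1(alpha):
    """Right-tail Chi-Square critical value, valid ONLY for 1 degree of freedom.

    With 1 degree of freedom the Chi-Square variable is a squared standard
    normal, so the critical value is the squared two-sided normal quantile.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * z


df = degrees_of_freedom(2, 2)
critical = critical_value_df1(0.05)
print(df, round(critical, 3))  # prints: 1 3.841
```

For tables larger than 2 by 2 this shortcut does not apply, and a general Chi-Square quantile function would be needed instead.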
Now let us make a comparison. Since the empirical value is less than the critical one (1.496 is less than 3.841), there are no grounds to reject the null hypothesis, so it is accepted.
The conclusion. Thus, no difference was found between the training of the students of faculties A and B at a significance level of 0.05, that is, at a 95% confidence level.
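An equivalent way to make this comparison is through the p-value: the null hypothesis is rejected when the p-value is below the significance level. The sketch below is a minimal illustration for 1 degree of freedom only, using the Python standard library. The function names are assumptions made for this example.

```python
from statistics import NormalDist


def p_value_df1(statistic):
    """Right-tail p-value of a Chi-Square statistic with 1 degree of freedom."""
    z = statistic ** 0.5                      # Chi-Square(1) is a squared normal
    return 2 * (1 - NormalDist().cdf(z))      # two-sided normal tail area


def reject_null(statistic, alpha=0.05):
    """True when the null hypothesis is rejected at the given level."""
    return p_value_df1(statistic) < alpha


print(reject_null(1.496))  # prints: False
```

Comparing the p-value with 0.05 always leads to the same decision as comparing the empirical value with the critical value 3.841.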
Another example. The students’ opinion on the content of the guidelines for laboratory classes in one of the academic disciplines is evaluated. The opinions of full-time and correspondence students on the statement “the methodological guidelines need to be improved” are compared. The answer “Yes” is 1, the answer “No” is 0.
The null hypothesis is “the opinions of full-time and correspondence students about the need to improve the guidelines coincide.” An alternative hypothesis is “the students’ opinions about the need to improve the guidelines do not coincide.”
Here, we again select a significance level of 0.05. Let us look at the data table. We see that the value in the first column, in the first row is 7. It is less than 10, so we use a different formula, which is presented on the screen. The calculation is performed in MS Excel. The cell references are shown here, so you can trace which cells the formula uses.
The empirical value for our task is 5.149. Next, we find the critical value of the criterion. We use the CHI2.OBR.PX function from the Statistical category. We took a significance level of 0.05, so we enter this value in the Probability cell. The number of degrees of freedom is found using the same formula: the number of rows minus 1 is multiplied by the number of columns minus 1.
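The second formula is also not reproduced in the transcript. A plausible reading, and it is only an assumption, is the Chi-Square statistic with the Yates continuity correction, which is commonly used when a cell of a 2 by 2 table is small. The counts below are not taken from the slide. They were chosen to be consistent with the figures quoted for this example (a first cell of 7 and an empirical value of about 5.149), and the assignment of rows and columns is likewise assumed.

```python
def chi_square_2x2_yates(table):
    """Chi-Square statistic for a 2x2 table with the Yates continuity correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    # Shrink |ad - bc| by n/2, never letting the corrected value go below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Assumed counts, consistent with the quoted figures but NOT confirmed by
# the source: row 1 = full-time, row 2 = correspondence students;
# column 1 = "Yes" (1), column 2 = "No" (0).
assumed_table = [[7, 22], [16, 12]]
print(round(chi_square_2x2_yates(assumed_table), 3))  # prints: 5.149
```

With these assumed counts the corrected statistic reproduces the empirical value quoted in this example, which supports (but does not prove) the reading of the on-screen formula as the Yates-corrected one.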
The table has a 2 by 2 dimension, so the degree of freedom is 1. The function value is approximately 3.841. Now let us make a comparison. Since 5.149 is greater than 3.841 (the empirical value is greater than the critical one), we can conclude that the null hypothesis is rejected, and an alternative hypothesis can be accepted.
To make sure that the alternative hypothesis can be accepted, we perform an additional check. In the meantime, we can conclude that the students’ opinions about the need to improve the guidelines differ at a significance level of 0.05, i.e. with a 5% probability of error, or at a 95% confidence level.
For the additional check, we compare the probabilities P1 and P2 by calculating the corresponding relative frequencies. We see that the ratio of Q11 to N1 is 0.24, and the ratio of Q2 to N2 is 0.57. Obviously, 0.57 is greater than 0.24. Therefore, we can conclude that P2 is greater than P1.
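This check is simply a comparison of two shares. As a small sketch, with counts that are assumed rather than taken from the slide (7 “Yes” answers out of 29 in the first group and 16 out of 28 in the second give the quoted 0.24 and 0.57), it could be written as follows. The function name is hypothetical.

```python
def yes_share(yes_count, group_size):
    """Relative frequency of "Yes" answers in one group."""
    return yes_count / group_size


# Assumed counts, chosen only to match the quoted ratios 0.24 and 0.57.
p1 = yes_share(7, 29)    # first group (assumed: full-time students)
p2 = yes_share(16, 28)   # second group (assumed: correspondence students)
print(round(p1, 2), round(p2, 2), p2 > p1)  # prints: 0.24 0.57 True
```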
Thus, the correspondence students have more difficulty than the full-time students in understanding the meaning of these instructions and, therefore, more often express their opinion about the need to improve the guidelines.
I have some tasks for you to solve. The first task is presented on the slide. Here is a table of data. Task two. The original table is not given: you need to create it on your own from the data in the problem statement, and then answer the question posed in the problem. I wish you success in solving them. Thank you for your attention.