Lecture. Testing statistical hypotheses
Let's consider testing the statistical hypothesis of the homogeneity of two samples.
The first case is when the size of each sample does not exceed 25. Here is the algorithm for the homogeneity test. You choose the significance level and formulate the null and alternative hypotheses, find the empirical value of the criterion, look up the lower critical point in the Wilcoxon table, and compute the upper critical point by the formula. Then, if the empirical value lies between the lower and upper critical points, there is no reason to reject the null hypothesis. Otherwise, the null hypothesis is rejected.
Let's consider an example. At a significance level of 0.05, test the hypothesis of the homogeneity of two samples. Here the values of the first sample and of the second sample are each highlighted in their own color, and we will use this in the solution.
Here the significance level is 0.05. The hypotheses H0 and H1 are formulated. The second step is to arrange the values of both samples in ascending order as a single variation series and to find the sum of the ordinal numbers of the values that belong to the first sample. Here the first sample is highlighted in its color. We add up the corresponding ordinal numbers and obtain an empirical value of the criterion equal to 41.
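As a rough illustration of this second step, the rank sum can be computed as in the Python sketch below. The function name `rank_sum_first_sample` and the data are hypothetical (the samples shown on the slide are not reproduced in the transcript), and the sketch assumes there are no tied values, since the lecture does not say how ties are handled.

```python
def rank_sum_first_sample(first, second):
    """Sum of the ordinal numbers of the first sample's values
    in the combined ascending variation series (no ties assumed)."""
    combined = sorted(first + second)
    if len(set(combined)) != len(combined):
        # Assumption: the lecture gives no rule for equal values.
        raise ValueError("tied values are not handled in this sketch")
    # Ordinal numbers start at 1 for the smallest value.
    position = {value: index for index, value in enumerate(combined, start=1)}
    return sum(position[value] for value in first)


# Hypothetical data, NOT the samples from the lecture slide.
first_sample = [15, 23, 25, 26, 28, 29]        # size 6
second_sample = [12, 14, 18, 20, 22, 24, 27]   # size 7
w_emp = rank_sum_first_sample(first_sample, second_sample)
print(w_emp)  # 54
```

With these made-up samples the empirical value is 54; the lecture's own data give 41.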
Let's find the lower critical point from the Wilcoxon table. The significance level of 0.05 is divided by 2, so we select the 0.025 column. The size of the first sample is 6 and the size of the second is 7, so the lower critical point is 27.
The lower critical point is 27. We substitute it into the formula and find the upper critical point: it is 57. The empirical value of 41 lies between 27 and 57, so we conclude that there is no reason to reject the null hypothesis of the homogeneity of the samples.
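The formula for the upper critical point is not reproduced in the transcript. A minimal sketch, assuming the standard relation w_upper = (n1 + n2 + 1) * n1 - w_lower, reproduces the numbers of this example. The function names are hypothetical, and the lower critical point is simply typed in from the table.

```python
def wilcoxon_upper_critical(n1, n2, w_lower):
    """Upper critical point from the lower one.
    Assumed relation: w_upper = (n1 + n2 + 1) * n1 - w_lower."""
    return (n1 + n2 + 1) * n1 - w_lower


def no_reason_to_reject(w_emp, w_lower, w_upper):
    """True when the empirical value lies strictly between the critical points."""
    return w_lower < w_emp < w_upper


n1, n2 = 6, 7      # sample sizes from the example
w_lower = 27       # Wilcoxon table value, column 0.025
w_upper = wilcoxon_upper_critical(n1, n2, w_lower)
keep_h0 = no_reason_to_reject(41, w_lower, w_upper)
print(w_upper, keep_h0)  # 57 True
```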
Now consider the second case, where the size of at least one sample exceeds 25. The algorithm consists of the same steps. We choose the significance level and formulate the hypotheses H0 and H1. We again find the empirical value of the criterion as the sum of the ordinal numbers of the values from the first sample. Then we find the lower and upper critical points.
The lower critical point is found by the formula (denoted here by *). Zcrit is found either from the table of values of the Laplace function or with the statistical function NORM.ST.OBR (NORM.S.INV in English-language Excel). The fourth step is to compare the empirical value with the lower and upper critical points. If the empirical value lies between the lower and upper critical points, there is no reason to reject the null hypothesis. Otherwise, the null hypothesis is rejected.
Let's consider an example. At a significance level of 0.05, test the hypothesis of the homogeneity of two samples of sizes 40 and 50. The empirical value is given here: it equals 1650.
The significance level according to the condition is 0.05. The hypotheses H0 and H1 are presented. The empirical value of the criterion is given in the condition, so we now find the lower critical point. To do this, we find Zcrit using the statistical function NORM.ST.OBR.
Note that in the "Probability" field we enter the confidence probability: we subtract the significance level from one, divide the result by two, and add 0.5, which gives 0.975. The critical value is approximately 1.96.
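The same calculation can be sketched in Python. This assumes the standard-library `statistics.NormalDist().inv_cdf` as a stand-in for the spreadsheet function; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
# Value entered in the "Probability" field: (1 - alpha) / 2 + 0.5
probability = (1 - alpha) / 2 + 0.5
# Inverse of the standard normal cumulative distribution function
z_crit = NormalDist().inv_cdf(probability)
print(round(probability, 3), round(z_crit, 2))  # 0.975 1.96
```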
Find the lower critical point by the formula. Here the square brackets mean that the value is rounded to an integer. The lower critical point, rounded to an integer, is 1578. We substitute it into the formula and find the upper critical point, which is 2062. The empirical value according to the condition is 1650. So the empirical value lies between the lower critical point (1578) and the upper critical point (2062). Thus, there is no reason to reject the null hypothesis of the homogeneity of the samples.
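The formula marked * is not reproduced in the transcript. The sketch below assumes the standard large-sample form, w_lower = [((n1 + n2 + 1) * n1 - 1) / 2 - Zcrit * sqrt(n1 * n2 * (n1 + n2 + 1) / 12)], and reads the square brackets as the integer part (for these numbers ordinary rounding gives the same result). The function name is hypothetical.

```python
import math
from statistics import NormalDist


def wilcoxon_critical_points_large(n1, n2, alpha):
    """Lower and upper critical points when a sample size exceeds 25.
    Assumed formula (not shown in the transcript):
      w_lower = [((n1+n2+1)*n1 - 1)/2 - z_crit*sqrt(n1*n2*(n1+n2+1)/12)]
      w_upper = (n1+n2+1)*n1 - w_lower
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    centre = ((n1 + n2 + 1) * n1 - 1) / 2
    spread = z_crit * math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    w_lower = math.floor(centre - spread)          # assumption: [.] is the integer part
    w_upper = (n1 + n2 + 1) * n1 - w_lower
    return w_lower, w_upper


w_lower, w_upper = wilcoxon_critical_points_large(40, 50, 0.05)
w_emp = 1650                                       # given in the condition
print(w_lower, w_upper, w_lower < w_emp < w_upper)  # 1578 2062 True
```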
Testing the statistical hypothesis about the comparison of the means of general populations with known variances.
The algorithm is presented here. Again we choose the significance level and formulate the hypotheses. We find the empirical value of the criterion by the formula.
We find the critical value of the criterion using the statistical function NORM.ST.OBR and compare the empirical and critical values of the criterion. I draw your attention to the fact that this criterion for comparing the means of general populations is covered in the laboratory work, where examples are given. So I do not give an example here.
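Since the formula is not reproduced in the transcript, the following Python sketch only illustrates the shape of this algorithm. It assumes the usual statistic Z = (mean(x) - mean(y)) / sqrt(D(X)/n + D(Y)/m) and a two-sided alternative; the function name, the data, and the variances are hypothetical and are not taken from the laboratory work.

```python
import math
from statistics import NormalDist, mean


def z_test_known_variances(x, y, var_x, var_y, alpha=0.05):
    """Compare two means when the population variances are known.
    Assumptions: two-sided alternative, independent samples."""
    z_emp = (mean(x) - mean(y)) / math.sqrt(var_x / len(x) + var_y / len(y))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    keep_h0 = abs(z_emp) < z_crit   # no reason to reject H0
    return z_emp, z_crit, keep_h0


# Hypothetical samples and known variances, for illustration only.
x = [10.2, 9.8, 10.5, 10.1, 9.9]
y = [9.6, 9.9, 9.7, 10.0, 9.8]
z_emp, z_crit, keep_h0 = z_test_known_variances(x, y, var_x=0.09, var_y=0.04)
print(round(z_emp, 2), round(z_crit, 2), keep_h0)  # 1.86 1.96 True
```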
Now we move on to the next test: the statistical hypothesis about the comparison of the means of general populations with unknown variances.
In this case, independent and dependent samples must be considered separately. Again, I want to warn you right away that no examples are given here, because a laboratory work on the criteria for comparing the means of general populations with unknown variances has been recorded, and it gives examples for each algorithm. Here the algorithm for independent samples is described step by step. Again we choose the significance level. The hypotheses H0 and H1 are formulated. We find the empirical value by the formula.
We also find the critical value. For this you can use the statistical function STUDENT.OBR.2X (T.INV.2T in English-language Excel). Then the empirical and critical values of the criterion must be compared.
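The formula for the empirical value is not reproduced in the transcript. A minimal sketch for independent samples, assuming equal unknown variances (the pooled form of the Student criterion) and a two-sided alternative, could look like this. The function name and the data are hypothetical, and the critical value is typed in from a Student table rather than computed.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(x, y):
    """Empirical Student value for independent samples.
    Assumption: both populations have the same unknown variance."""
    n, m = len(x), len(y)
    # Pooled estimate built from the corrected sample variances
    pooled_var = ((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2)
    t_emp = (mean(x) - mean(y)) / math.sqrt(pooled_var * (1 / n + 1 / m))
    return t_emp, n + m - 2   # statistic and degrees of freedom


# Hypothetical samples, for illustration only.
x = [12, 14, 15, 13, 16]
y = [10, 11, 13, 12, 9]
t_emp, df = pooled_t_statistic(x, y)
t_crit = 2.306   # two-sided table value for alpha = 0.05 and 8 degrees of freedom
reject_h0 = abs(t_emp) > t_crit
print(round(t_emp, 2), df, reject_h0)  # 3.0 8 True
```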
For dependent samples, the algorithm also begins by stating the hypotheses H0 and H1, which are the same as in the previous algorithm. You can choose the significance level yourself: 0.05, 0.01, or 0.1. The second step is to find the empirical value of the criterion. Note that the denominator in this formula must itself be found by an additional formula, which is presented in step 3.
The fourth step is to find the critical value of the Student criterion using the statistical function STUDENT.OBR.2X. In the last step, we compare the empirical and critical values of the Student criterion.
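The formulas for dependent samples are also missing from the transcript. The sketch below assumes the usual paired form, T = mean(d) * sqrt(n) / s_d, where d are the pairwise differences and s_d is their corrected standard deviation (presumably the "additional formula" of step 3). The function name and the data are hypothetical, and the critical value is typed in from a Student table.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """Empirical Student value for dependent (paired) samples."""
    d = [a - b for a, b in zip(first, second)]   # pairwise differences
    n = len(d)
    s_d = stdev(d)   # corrected standard deviation of the differences
    t_emp = mean(d) * math.sqrt(n) / s_d
    return t_emp, n - 1   # statistic and degrees of freedom


# Hypothetical paired measurements, for illustration only.
first = [72, 68, 75, 80, 66, 71]
second = [70, 65, 74, 76, 66, 68]
t_emp, df = paired_t_statistic(first, second)
t_crit = 2.571   # two-sided table value for alpha = 0.05 and 5 degrees of freedom
reject_h0 = abs(t_emp) > t_crit
print(round(t_emp, 2), df, reject_h0)  # 3.61 5 True
```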
Testing the statistical hypothesis about the significance of the correlation coefficient.
The algorithm.
The first step. We select the significance level and formulate the hypotheses. The null hypothesis is that the general correlation coefficient is equal to zero. The competing hypothesis H1 is that the general correlation coefficient is not equal to zero.
The second step. We find the empirical value of the criterion by the formula.
In the third step, we find the critical value of the criterion, either from the Student table or with the STUDENT.OBR.2X statistical function.
And the fourth step is to compare the empirical and critical values. If the absolute value of the empirical value is less than the critical value, there is no reason to reject the null hypothesis. Otherwise, the null hypothesis is rejected.
Let's consider an example. From a two-dimensional general population, a sample of size 100 was drawn and a sample correlation coefficient of 0.2 was found. At a significance level of 0.05, test the hypothesis of the significance of the sample correlation coefficient.
The first step. The significance level by the condition is 0.05. We formulate hypotheses H0, H1 in accordance with the algorithm.
The next step is to find the empirical value of the criterion. We substitute 100 for n and 0.2 for the sample correlation coefficient in the formula and get 2.02.
The next step. To find the critical value of the criterion, we use the statistical function STUDENT.OBR.2X. The probability is the significance level of 0.05, and the number of degrees of freedom is n-2. In our case n is 100, so we subtract 2 from 100 and get 98 degrees of freedom. The critical value is approximately 1.98.
And the last step. We found the empirical value, and it is 2.02. The critical value is 1.98. The empirical value is greater than the critical value, so we reject the null hypothesis that the general correlation coefficient is equal to zero. Therefore, we can conclude that the correlation coefficient is significantly different from zero. That is, the random variables X and Y are correlated.
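The arithmetic of this example can be checked with a short Python sketch. The formula is not reproduced in the transcript; the sketch assumes the standard statistic T = r * sqrt(n - 2) / sqrt(1 - r^2), which reproduces the value 2.02. The function name is hypothetical, and the critical value is typed in from a Student table rather than computed.

```python
import math


def correlation_t_statistic(r, n):
    """Empirical Student value for testing that the general
    correlation coefficient is zero (assumed standard formula)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


t_emp = correlation_t_statistic(r=0.2, n=100)
t_crit = 1.9845   # two-sided table value for alpha = 0.05 and 98 degrees of freedom
significant = abs(t_emp) > t_crit
print(round(t_emp, 2), significant)  # 2.02 True
```

The margin here is small: the conclusion depends on the second decimal place of both values, so it is worth keeping at least two decimals when rounding.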
I wish you success in mastering the material. Have a nice day.