Laboratory work (video). Basic concepts of mathematical statistics
Section: Fundamentals of mathematical statistics. Topic: Basic concepts of mathematical statistics.
Here is a short test for you.
Question 1. The entire population of objects that characterizes the statistical feature under study is called ... Answer choices: 1) a sample population; 2) a general population; 3) a random variable; 4) a population. Choose the correct answer.
Question 2. The sample population is ... Answer choices: 1) a randomly selected part of the general population; 2) a set of objects that characterizes the statistical feature under study; 3) all the values of a random variable; 4) a set of numbers. Choose the correct answer.
Question 3. Types of graphs for a sample population are ... Answer choices: 1) a polygon; 2) a histogram; 3) a cumulative curve; 4) all the variants listed above. Choose the correct answer.
Let’s check your answers: for question 1 the correct answer is 2, for question 2 the correct answer is 1, and for question 3 the correct answer is 4. Here we can add one more graph that can be used for the sample population: the percentile curve.
A sample distribution function can be built in Microsoft Excel using the FREQUENCY function, which is found in the Statistical category of the Function Wizard and takes two arguments (data array, interval array). It calculates how often the values of a random variable fall into each interval and returns these frequencies as an array.
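For those who want to see what FREQUENCY counts, the rule can be sketched in Python. This is only an illustration outside the Excel workflow: the function name `frequency` is hypothetical, and the sketch assumes the interval array is sorted in ascending order. Each interval includes its upper boundary, and one extra count is returned for values above the last boundary, as Excel's FREQUENCY does.

```python
from bisect import bisect_left


def frequency(data, bins):
    """Count values per interval; `bins` holds ascending upper boundaries.

    Returns len(bins) + 1 counts: one per interval (previous, boundary],
    plus a final count for values above the last boundary.
    """
    counts = [0] * (len(bins) + 1)
    for x in data:
        # bisect_left finds the first boundary that is >= x,
        # which is the interval whose upper boundary covers x.
        counts[bisect_left(bins, x)] += 1
    return counts


# Small made-up example (not the lecture data):
print(frequency([1, 2, 3, 4, 5, 6], [2, 4]))  # -> [2, 2, 2]
```

Note that a value equal to a boundary, such as 2 or 4 above, is counted in the interval that ends at that boundary.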
A graphical representation of the sample population can be produced using the Data Analysis package (the Analysis ToolPak) and its Histogram procedure. The parameters of the histogram are: Input Range, the range of the data under study; Bin Range (the "pocket interval"), the range of interval boundaries; Output Range; Cumulative Percentage; Chart Output.
Let’s consider the following example. The task is to construct an empirical distribution of students’ weight in kilograms for the following sample: 64, 57, 63, 62, 58, 61, 63, 70, 60, 61, 65, 62, 62, 40, 64, 61, 59, 59, 63, 61.
I suggest entering this data in Microsoft Excel. For the range of intervals, let’s take 40 to 70 with a step of 5, since the smallest value is 40 and the largest value is 70. We enter these data.
Now we will find the absolute frequencies. To do this, we select C2, go to the Function Wizard, open the Statistical category and select FREQUENCY. For the data array we select the first column, which holds the observations, and for the array of intervals we select the values from B2 to B8. We have obtained the absolute frequency for the first interval. Now we select the range where our frequencies will be displayed. We press F2 to activate the FREQUENCY formula, then press the three keys Ctrl, Shift and Enter simultaneously. The values we get are our absolute frequencies.
Now let’s check whether the calculations are correct. To do this, we go to the Toolbar and click AutoSum. The resulting value is 20, and indeed our sample contains 20 values.
Now let’s calculate the relative frequencies. To do this, we select D2, take the absolute frequency from C2 and divide it by the total in C9 (written as the absolute reference $C$9, so that the divisor does not shift when the formula is copied down). We confirm the entry. The resulting value is our first relative frequency. Then we drag the formula down the column. We can check this result as well: we click AutoSum in the Toolbar, and indeed we see that the sum of the relative frequencies is one.
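The same calculation can be checked outside Excel. Below is a minimal Python sketch that uses the sample and the intervals from the example; the variable names are illustrative, and the intervals are assumed to include their upper boundary, as in FREQUENCY.

```python
weights = [64, 57, 63, 62, 58, 61, 63, 70, 60, 61,
           65, 62, 62, 40, 64, 61, 59, 59, 63, 61]
bins = list(range(40, 71, 5))  # upper boundaries 40, 45, ..., 70

# Absolute frequencies: values w with lower < w <= upper for each interval.
absolute = []
lower = float("-inf")
for upper in bins:
    absolute.append(sum(1 for w in weights if lower < w <= upper))
    lower = upper

n = len(weights)  # sample size, the value AutoSum shows under the counts
relative = [count / n for count in absolute]

for upper, a, r in zip(bins, absolute, relative):
    print(f"<= {upper}: {a:2d}  {r:.2f}")

print("sum of absolute frequencies:", sum(absolute))  # 20
```

The two checks from the example carry over directly: the absolute frequencies sum to the sample size, and the relative frequencies sum to one (up to floating-point rounding).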
Now let’s calculate the accumulated frequencies. To do this, we copy the first value from D2 to E2, then in E3 we add the value of E2 to the value of D3, and then we drag the formula down. The last accumulated frequency is equal to one, which means that we have calculated everything correctly.
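The running sum can also be sketched in Python. The counts below are the absolute frequencies that upper-inclusive intervals 40, 45, ..., 70 give for the sample above. As a deliberate difference from the spreadsheet, this sketch accumulates the integer counts first and divides by the sample size at the end, which avoids small rounding drift from adding decimal fractions.

```python
from itertools import accumulate

# Absolute frequencies for the intervals ending at 40, 45, ..., 70.
absolute = [1, 0, 0, 0, 5, 13, 1]
n = sum(absolute)  # sample size

# Running totals of the counts, converted to accumulated relative frequencies.
cumulative = [running / n for running in accumulate(absolute)]

for value in cumulative:
    print(f"{value:.2f}")
```

The final value is exactly one, which is the same correctness check used in the spreadsheet.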
Now let’s build graphical representations of our sample population. To do this, we go to the Toolbar, then Insert, then Charts. The first chart is a polygon. Please note that we can fill in the axis titles and the chart title, and we can also add data labels, error bars, and so on. We can change the chart type, for example, to a histogram. Similarly, we can enter the titles of the horizontal and vertical axes and, of course, the title of the chart.
Now in the Toolbar, we select Data, then Data Analysis, then Histogram. The Input Range is the first column of data, and the Bin Range (the pocket interval) is column 2. Here we tick the Labels box because each of the columns has a header, and we set the Output Range to O1. We get the corresponding frequencies. This is the second method for calculating frequencies for the selected intervals. To display the chart, do not forget to tick the Chart Output box. Please note that the resulting histogram is identical to the histogram constructed using chart insertion, except that here the axis titles and the chart title are already filled in.
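The table of intervals and frequencies that this procedure outputs can be imitated as a plain-text bar chart. This is a minimal sketch, assuming the counts that upper-inclusive intervals 40, 45, ..., 70 give for the sample; the helper name `text_histogram` is hypothetical, and the output does not try to reproduce Excel's chart formatting.

```python
bins = [40, 45, 50, 55, 60, 65, 70]   # upper boundaries of the intervals
absolute = [1, 0, 0, 0, 5, 13, 1]     # absolute frequency per interval


def text_histogram(edges, counts, mark="#"):
    """Return one text line per interval: boundary, bar, and count."""
    lines = []
    for edge, count in zip(edges, counts):
        lines.append(f"{edge:>3} | {mark * count} {count}")
    return "\n".join(lines)


print(text_histogram(bins, absolute))
```

The longest bar belongs to the interval ending at 65, where most of the weights are concentrated.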
It is also possible to construct a chart of the relative and accumulated frequencies. To do this, we again go to Insert, then Charts. Here you must choose a different data range accordingly: for the relative frequencies it is the range from D2 to D8, and for the accumulated frequencies it is the range from E2 to E8. So we select the chart data range E1 to E8, then set the range of axis labels to B2 to B8, add a series with the values D2 to D8, and click OK. We get the following chart (see the video), where the blue graph shows the accumulated frequencies and the red graph shows the relative frequencies. The relative frequency graph is shown here as a polygon. If we want to show it as a histogram instead, we right-click it and select Change Chart Type. There we select the histogram, and the relative frequencies are then displayed as a histogram instead of a polygon.
Here are some tasks for you to solve.
Task 1. In one of the grades of an elementary school, the following data on the weight of boys were obtained: 21.8; 24; 21.8; 20; 21.8; 20; 19.3; 20.8; 19.3; 23.8; 20; 24; 20; 19.3; 20; 21.8; 20.8; 20.8; 23.8; 23.8; 20; 20.8; 24; 24.5; 23.8; 23.8; 20.8; 20; 24; 24; 24.5; 19.3; 20.8; 20; 20.8; 24.5; 20; 20. The task is to build the sample distribution function and a polygon.
Task 2. The results of measuring the height of 100 randomly selected students are presented in the form of a data table. Height in cm: 154-158, 158-162, 162-166, 166-170, 170-174, 174-178, 178-182. The number of students: 10, 14, 26, 28, 12, 8, 2, respectively. The task is to build a histogram.
I wish you success in solving these tasks. Thank you for your attention!