35 lines
1.9 KiB
Plaintext
35 lines
1.9 KiB
Plaintext
[[Statistics|Statistics]] is all about large groups of numbers.
|
|
When talking about a set of sampled data, most frequently used is their [[wp:Mean|mean value]] and [[wp:Standard_deviation|standard deviation (stddev)]].
|
|
If you have set of data <math>x_i</math> where <math>i = 1, 2, \ldots, n\,\!</math>, the mean is <math>\bar{x}\equiv {1\over n}\sum_i x_i</math>, while the stddev is <math>\sigma\equiv\sqrt{{1\over n}\sum_i \left(x_i - \bar x \right)^2}</math>.
|
|
|
|
When examining a large quantity of data, one often uses a [[wp:Histogram|histogram]], which shows the counts of data samples falling into a prechosen set of intervals (or bins).
|
|
When plotted, often as bar graphs, it visually indicates how often each data value occurs.
|
|
|
|
'''Task''' Using your language's random number routine, generate real numbers in the range of [0, 1]. It doesn't matter if you chose to use open or closed range.
|
|
Create 100 of such numbers (i.e. sample size 100) and calculate their mean and stddev.
|
|
Do so for sample size of 1,000 and 10,000, maybe even higher if you feel like.
|
|
Show a histogram of any of these sets.
|
|
Do you notice some patterns about the standard deviation?
|
|
|
|
'''Extra''' Sometimes so much data need to be processed that it's impossible to keep all of them at once. Can you calculate the mean, stddev and histogram of a trillion numbers? (You don't really need to do a trillion numbers, just show how it can be done.)
|
|
|
|
;Hint:
|
|
For a finite population with equal probabilities at all points, one can derive:
|
|
|
|
:<math>\overline{(x - \overline{x})^2} = \overline{x^2} - \overline{x}^2</math>
|
|
|
|
Or, more verbosely:
|
|
|
|
:<math>
|
|
\frac{1}{N}\sum_{i=1}^N(x_i-\overline{x})^2 = \frac{1}{N} \left(\sum_{i=1}^N x_i^2\right) - \overline{x}^2.
|
|
</math>
|
|
|
|
{{task heading|See also}}
|
|
|
|
* [[Statistics/Normal_distribution|Statistics/Normal distribution]]
|
|
|
|
{{Related tasks/Statistical measures}}
|
|
|
|
<br><hr>
|
|
|