Statistics Tutorial
Statistics Basics
Data types in a histogram
- Histogram. In a statistics or probability chart such as a histogram, the column values and heights can be interpreted in several ways, depending on the kind of data set shown: a data sample, a population, or a probability distribution.
- Population (statistics) means the data set of all members of the group being studied.
- Sample (statistics) means a subset of a population, usually randomly selected.
- Data set (statistics) means a set of numbers, e.g. [65, 98, 10, 580]. Each item can also be a pair, such as age and income, e.g. [[28, $77], [30, $40], [42, $27], [81, $33]], or name and age, e.g. [[jon, 65], [mary, 54], [joe, 565]].
- Probability distribution is a set of pairs, usually an event label and its probability. E.g. for throwing dice, each pair is [total, probability].
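As a rough illustration of the terms above, the following Python sketch builds a small data set, draws a random sample from it, and computes a probability distribution as [total, probability] pairs. The choice of two fair six-sided dice, the sample size of 2, and all variable names are assumptions made for this example, not part of the definitions.

```python
import random
from collections import Counter
from fractions import Fraction
from itertools import product

# A data set, treated here as the whole population (values taken from the text above).
population = [65, 98, 10, 580]

# A sample: a randomly selected subset of the population.
# The sample size of 2 is an arbitrary choice for illustration.
sample = random.sample(population, 2)

# A probability distribution for the total of two fair six-sided dice (assumed setup).
# Count how many of the 36 equally likely rolls produce each total.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Each entry is a pair [total, probability], kept as an exact fraction.
distribution = [[total, Fraction(count, 36)] for total, count in sorted(counts.items())]

print("sample:", sample)
for total, probability in distribution:
    print(total, probability)
```

The probabilities in a distribution always add up to 1, which is a quick sanity check for a table like this one.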
Central tendency
These are numbers that represent some kind of middle, common value, or average.
- Arithmetic mean, usually written as μ, is the average, i.e. the sum of the values of all events, divided by the total count of events.
- Median. The middle value separating the greater and lesser halves of a data set. Median of [1, 3, 3, 6, 7, 8, 9] is 6. Median of [1, 2, 3, 4, 5, 6, 8, 9] is (4+5)/2 = 4.5. The data must be sorted first. Used only for 1-dimensional data.
- Mode. The most frequent value in a data set, e.g. 2 is the mode of [1, 2, 2, 3, 4, 7, 9]. Used only for 1-dimensional data.
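The three measures above can be checked with Python's standard statistics module. This is a minimal sketch using the example data sets from the list; the variable names are made up for illustration.

```python
import statistics

# Example data sets taken from the definitions above.
odd_data = [1, 3, 3, 6, 7, 8, 9]         # 7 values: the median is the 4th value
even_data = [1, 2, 3, 4, 5, 6, 8, 9]     # 8 values: the median averages the two middle values
mode_data = [1, 2, 2, 3, 4, 7, 9]

# Arithmetic mean: sum of the values divided by their count.
mean_value = statistics.mean(odd_data)

# Median: statistics.median sorts the data internally before picking the middle.
median_odd = statistics.median(odd_data)
median_even = statistics.median(even_data)

# Mode: the most frequent value.
mode_value = statistics.mode(mode_data)

print("mean:", mean_value)
print("median (odd count):", median_odd)
print("median (even count):", median_even)
print("mode:", mode_value)
```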
Measure of Variation
- Deviation (of an event) is the difference between the event value and the arithmetic mean. (If x is the value of an event and m is the mean, then deviation = x - m.)
- Squared deviation from the mean (aka SDM) (for an event) is the square of the deviation of the event. (If x is the value of an event and m is the mean, then squared deviation = (x - m)^2.)
- Variance is the average of the SDMs. Variance is usually denoted σ^2. (If x[i] is the value of the i-th event, n is the number of events, and m is the mean, then σ^2 = (Sum[(x[i] - m)^2, {i,1,n}])/n.)
- Standard deviation (usually denoted σ) is the square root of the variance.
i.e.
Sqrt[ (Sum[(x[i] - m)^2, {i,1,n}]) /n ]
There is also the sample standard deviation, which divides by (n-1) instead of n.
- Expected value is a generalization of the weighted average. Informally, the expected value is the arithmetic mean of a large number of independently selected outcomes of a random variable.
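- Confidence interval. A range of values, computed from a sample, that is expected to contain a population parameter (such as the mean) at a stated confidence level, e.g. 95%.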
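The formulas in this section can be worked through step by step in Python. The sketch below is only an illustration: the data set, the fair six-sided die, and the variable names are assumptions chosen so the arithmetic comes out evenly, and the standard statistics module is used only as a cross-check.

```python
import math
import statistics
from fractions import Fraction

# Hypothetical data set, chosen so that the mean is a whole number.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
m = sum(data) / n                      # arithmetic mean

# Deviation of each value: x - m.
deviations = [x - m for x in data]

# Squared deviation from the mean (SDM): (x - m)^2.
squared_deviations = [d ** 2 for d in deviations]

# Variance: the average of the SDMs (divide by n).
variance = sum(squared_deviations) / n

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

# Sample standard deviation: divide by (n - 1) instead of n.
sample_std_dev = math.sqrt(sum(squared_deviations) / (n - 1))

# Expected value of a fair six-sided die (assumed example):
# the sum of each value weighted by its probability.
die = [[face, Fraction(1, 6)] for face in range(1, 7)]
expected_value = sum(value * probability for value, probability in die)

print("mean:", m)
print("variance:", variance, "stdlib:", statistics.pvariance(data))
print("standard deviation:", std_dev, "stdlib:", statistics.pstdev(data))
print("sample standard deviation:", sample_std_dev, "stdlib:", statistics.stdev(data))
print("expected value of a die:", expected_value)
```

Dividing by (n-1) always gives a slightly larger result than dividing by n, and the gap shrinks as the data set grows.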
