Discrete random variable: A random variable for which a list of all possible values could be made
Probability distribution: A list or table showing the probability of each value occurring
The sum of the probabilities in the probability distribution equals $1$, i.e. $\sum P(X = x) = 1$
Probability function: A function which provides $P(X = x)$ for all $x$
Cumulative distribution function: A function which provides $P(X \le x)$ for all $x$
Mean (the expectation of $X$): $E(X) = \sum x \, P(X = x)$
Variance: $\operatorname{Var}(X) = E(X^2) - [E(X)]^2$
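The definitions above can be checked numerically. The following is a minimal sketch, assuming the usual formulas $E(X) = \sum x \, P(X = x)$ and $\operatorname{Var}(X) = E(X^2) - [E(X)]^2$; the distribution `dist` is a made-up illustration, not one taken from these notes.

```python
from fractions import Fraction

# Hypothetical probability distribution (illustration only): value -> probability
dist = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def mean(dist):
    """E(X): sum of x * P(X = x) over all values x."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Var(X) = E(X^2) - [E(X)]^2."""
    e_x2 = sum(x * x * p for x, p in dist.items())
    return e_x2 - mean(dist) ** 2

def cdf(dist, x):
    """Cumulative distribution function P(X <= x)."""
    return sum(p for value, p in dist.items() if value <= x)

# The probabilities of a valid distribution must sum to 1.
total_probability = sum(dist.values())
```

Exact fractions are used here so that the check on the total probability is not affected by floating-point rounding.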
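In order to calculate the succession of values of $P(X = x)$ for $X \sim \text{Po}(\lambda)$: $P(X = x + 1) = \frac{\lambda}{x + 1} P(X = x)$, starting from $P(X = 0) = e^{-\lambda}$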
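Successive probabilities of this kind can be generated with a short loop. The sketch below assumes the distribution in question is Poisson with mean $\lambda$ and uses the standard recurrence $P(X = x + 1) = \frac{\lambda}{x + 1} P(X = x)$; the function name and the value `lam = 2.0` are illustrative assumptions.

```python
import math

def poisson_probabilities(lam, max_x):
    """Return [P(X=0), ..., P(X=max_x)] for X ~ Po(lam) using the recurrence."""
    probs = [math.exp(-lam)]  # P(X = 0) = e^(-lam)
    for x in range(max_x):
        # P(X = x + 1) = lam / (x + 1) * P(X = x)
        probs.append(probs[-1] * lam / (x + 1))
    return probs

# Illustrative mean, not a value from the notes.
probs = poisson_probabilities(2.0, 4)
```

Each value is obtained from the previous one with a single multiplication, which avoids recomputing factorials.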
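Two or more independent Poisson distributions can be combined as follows
If $X \sim \text{Po}(\lambda)$ and $Y \sim \text{Po}(\mu)$ are independent then $X + Y \sim \text{Po}(\lambda + \mu)$
This also shows that if $X \sim \text{Po}(\lambda)$ counts the events in one interval then the count over $n$ such intervals follows $\text{Po}(n\lambda)$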
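The additive property can be verified numerically by convolving two Poisson probability functions. This is a minimal sketch under the assumption that the two variables are independent; the rates `lam` and `mu` are illustrative values only.

```python
import math

def poisson_pmf(lam, x):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def pmf_of_sum(lam, mu, total):
    """P(X + Y = total) for independent X ~ Po(lam), Y ~ Po(mu), by convolution."""
    return sum(poisson_pmf(lam, k) * poisson_pmf(mu, total - k)
               for k in range(total + 1))

lam, mu = 1.5, 0.5  # illustrative rates
direct = poisson_pmf(lam + mu, 3)    # Po(lam + mu) evaluated at 3
convolved = pmf_of_sum(lam, mu, 3)   # same probability built from the two parts
```

The two results agree up to floating-point rounding, which is what the combination rule predicts.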
Example
At a checkpoint an average of cars pass per hour and the mean time between lorries is minutes.
Find the probability that exactly vehicles pass the checkpoint in a minute period.
cars per hour cars per minute
lorry per minutes lorries per minute
vehicles per minute
Questions on the Poisson distribution can also involve the binomial distribution.
Example (Following from the above example)
What is the probability that exactly cars pass the checkpoint in at least of the next minutes?
Probability of success = ,
Mean = Variance =
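One way to read this kind of question is in two stages: a Poisson probability gives the chance of "success" in a single minute, and a binomial distribution then counts the successful minutes. The sketch below follows that reading. All numbers in it (`lam`, `k`, `n`, `m`) are hypothetical placeholders, since the figures of the example are not reproduced here.

```python
import math

def poisson_pmf(lam, x):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def binomial_pmf(n, p, j):
    """P(Y = j) for Y ~ B(n, p)."""
    return math.comb(n, j) * p ** j * (1 - p) ** (n - j)

lam = 0.5    # hypothetical mean number of cars per minute
k = 1        # hypothetical: "exactly k cars" in one minute
n, m = 5, 3  # hypothetical: "at least m of the next n minutes"

p = poisson_pmf(lam, k)  # probability of success in a single minute
at_least_m = sum(binomial_pmf(n, p, j) for j in range(m, n + 1))

binomial_mean = n * p
binomial_variance = n * p * (1 - p)
```

The mean and variance computed at the end are those of the binomial count of successful minutes, $np$ and $np(1 - p)$.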
Continuous random variable: A random variable which can take any value within an interval, so its possible values cannot be listed
Example
Find the cumulative distribution function $F(x)$ for the following probability density function
The function must be integrated in sections
The first section is a quadratic. If then
Using the above formula,
The second section is linear. If then
$F(x)$ is then given by the piecewise function
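A rectangular distribution is given by $f(x) = \frac{1}{b - a}$ for $a \le x \le b$ (and $0$ otherwise), with
$E(X) = \frac{a + b}{2}$ and $\operatorname{Var}(X) = \frac{(b - a)^2}{12}$
Given the mean and the variance of a rectangular distribution, $a$ and $b$ can be found by solving simultaneously.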
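Solving the two equations simultaneously can be sketched as follows, assuming the standard results $E(X) = \frac{a + b}{2}$ and $\operatorname{Var}(X) = \frac{(b - a)^2}{12}$ for a rectangular distribution on $[a, b]$. The function name and the input values are illustrative.

```python
import math

def rectangular_limits(mean, variance):
    """Recover (a, b) from the mean and variance of a rectangular distribution.

    From Var(X) = (b - a)^2 / 12 we get (b - a) / 2 = sqrt(3 * Var(X)),
    and from E(X) = (a + b) / 2 the interval is centred on the mean.
    """
    half_width = math.sqrt(3 * variance)
    return mean - half_width, mean + half_width

# Illustrative values: mean 5 and variance 3 give the interval [2, 8].
a, b = rectangular_limits(5.0, 3.0)
```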
For a $p$ percent confidence interval for the mean, across a large number of different samples about $p$ percent of the confidence intervals for the mean will contain the true population mean.
Confidence intervals should be written to a high degree of accuracy in the format (lower limit, upper limit)
If the population variance $\sigma^2$ is given for a normally distributed population, the $Z$ tables are used and the confidence interval is given by: $\bar{x} \pm z \frac{\sigma}{\sqrt{n}}$
If the population variance is unknown, but the sample size is greater than 30, the $Z$ tables are used due to the central limit theorem. The above formula is still used, with the following assumption: the sample standard deviation $s$ is a good estimate of $\sigma$
If the population variance is unknown, and the sample size is less than 30, the $t$ tables are used with $n - 1$ degrees of freedom. The confidence interval is then given by: $\bar{x} \pm t_{n-1} \frac{s}{\sqrt{n}}$
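The two interval formulas can be sketched in code, assuming the standard forms $\bar{x} \pm z \frac{\sigma}{\sqrt{n}}$ and $\bar{x} \pm t \frac{s}{\sqrt{n}}$. The $t$ value is passed in by hand because it has to be read from tables; the function names, the sample, and all numbers are illustrative assumptions.

```python
import math
from statistics import NormalDist, mean, stdev

def z_interval(sample_mean, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known (or n is large)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    half_width = z * sigma / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

def t_interval(sample, t_value):
    """Confidence interval for the mean of a small sample, sigma unknown.

    t_value must be looked up in t tables with len(sample) - 1 degrees of freedom.
    """
    n = len(sample)
    half_width = t_value * stdev(sample) / math.sqrt(n)  # stdev uses n - 1
    centre = mean(sample)
    return centre - half_width, centre + half_width

# Illustrative numbers only.
low, high = z_interval(50.0, 4.0, 64, 0.95)
# Hypothetical sample of 5 readings; 2.776 is the tabulated t value
# for a 95% interval with 4 degrees of freedom.
t_low, t_high = t_interval([498, 502, 500, 501, 499], 2.776)
```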
Example
bottles are selected from a production line. The volume of liquid in each is recorded as ml
Stating any assumptions made, construct a confidence interval for the mean.
Using degrees of freedom, the $t$ value is
The confidence interval is then given by
The assumptions made were that
- The contents of the bottles are normally distributed
- The sample was selected randomly
State the null hypothesis $H_0$ and the alternative hypothesis $H_1$.
Two-tailed test: $H_0: \mu = a$, $H_1: \mu \neq a$
Choose the test statistic
For a known variance, or $n > 30$: $z = \frac{\bar{x} - a}{\sigma / \sqrt{n}}$
For an unknown variance, and $n < 30$: $t = \frac{\bar{x} - a}{s / \sqrt{n}}$
Use tables to find the critical value
If a $t$ distribution is used, there are $n - 1$ degrees of freedom
State the critical value, or draw a graph and mark it with the critical value and the test statistic
Conclude the hypothesis test
As [condition] there is / isn’t significant evidence at the [significance] level that the mean differs from $a$.
We therefore reject/accept $H_0$ and conclude that… [context].
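The steps of the test can be sketched for the known-variance, two-tailed case. This assumes the usual statistic $z = \frac{\bar{x} - a}{\sigma / \sqrt{n}}$ and takes the critical value from the standard normal distribution instead of printed tables; the function name and the numbers are illustrative.

```python
import math
from statistics import NormalDist

def two_tailed_z_test(sample_mean, a, sigma, n, alpha):
    """Test H0: mu = a against H1: mu != a at significance level alpha."""
    z = (sample_mean - a) / (sigma / math.sqrt(n))   # test statistic
    critical = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    reject_h0 = abs(z) > critical
    return z, critical, reject_h0

# Illustrative numbers only: sample mean 51.2 from 64 readings, sigma = 4.
z, critical, reject_h0 = two_tailed_z_test(51.2, 50.0, 4.0, 64, 0.05)
```

With these made-up figures the statistic lies beyond the critical value, so the conclusion would be that there is significant evidence that the mean differs from $a$.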
A type 1 error occurs when $H_0$ is rejected, and $H_1$ is accepted, when $H_0$ is in fact correct.
The probability of making a type 1 error is the level of significance of the hypothesis test.
A type 2 error occurs when $H_0$ is accepted when it is in fact false.
The probability of a type 2 error is not fixed, since it depends upon the extent to which the true value of $\mu$ deviates from the value given in $H_0$. If the true value of $\mu$ is close to the value given in $H_0$, the probability of a type 2 error is large.
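The dependence of the type 2 error probability on the true mean can be illustrated numerically. The sketch assumes a two-tailed $z$ test with known $\sigma$: $H_0$ is accepted whenever the sample mean falls inside the acceptance region, and the type 2 error probability is the chance of that happening when the true mean differs from the hypothesised one. All names and numbers are illustrative.

```python
import math
from statistics import NormalDist

def type_2_error(mu0, true_mu, sigma, n, alpha):
    """P(accept H0: mu = mu0) when the mean is really true_mu."""
    se = sigma / math.sqrt(n)                       # standard error of the mean
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower, upper = mu0 - z_crit * se, mu0 + z_crit * se  # acceptance region
    sampling = NormalDist(true_mu, se)              # true distribution of x-bar
    return sampling.cdf(upper) - sampling.cdf(lower)

# Illustrative numbers: H0 says mu = 50, sigma = 4, n = 64, 5% level.
near = type_2_error(50.0, 50.1, 4.0, 64, 0.05)  # true mean close to H0's value
far = type_2_error(50.0, 52.0, 4.0, 64, 0.05)   # true mean far from H0's value
```

Here `near` is much larger than `far`, matching the statement that a true mean close to the hypothesised value makes a type 2 error likely.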
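For a table of values, the expected frequency of each value is $E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$
If the expected frequency of a particular value is less than $5$, its row or column must be merged with another.
For the general case the test statistic is $\chi^2 = \sum \frac{(O - E)^2}{E}$
For the case where the table is $2 \times 2$ we have only one degree of freedom, and the test statistic is $\chi^2 = \sum \frac{(|O - E| - 0.5)^2}{E}$ (Yates' correction)
The degrees of freedom, $\nu$, of a chi-squared contingency table are given by $\nu = (r - 1)(c - 1)$, where $r$ and $c$ are the numbers of row and column groups in the table.
The critical value is then found from the $\chi^2$ table or from a calculator.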
Example of the comparison of two variables
Natives of England, Africa, and China were classified by blood group
          O     A     B     AB
English   235   212   79    83
African   147   106   30    51
Chinese   162   135   52    43
Is there any evidence at the level that there is a connection between blood group and nationality?
$H_0$: There is no connection between blood group and nationality
$H_1$: There is a connection between blood group and nationality
For each cell we find the expected value by multiplying the row total by the column total and then dividing by the table total
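Expected frequencies:
          O        A        B       AB
English   248.16   206.65   73.44   80.74
African   136.10   113.33   40.28   44.28
Chinese   159.74   133.02   47.27   51.97
We have $\nu = (3 - 1)(4 - 1) = 6$ degrees of freedom, and the critical value is read from the $\chi^2$ table at this level.
From the table of expected frequencies, $\chi^2 = \sum \frac{(O - E)^2}{E} \approx 8.39$
As the test statistic is less than the critical value, we do not reject $H_0$ at this level and therefore conclude that there is no significant evidence of a connection between nationality and blood group.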
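The arithmetic of this example can be reproduced with a short script. It is a minimal sketch using the observed blood-group counts given above, the expected frequency rule (row total times column total divided by table total) and the statistic $\sum \frac{(O - E)^2}{E}$; the function names are illustrative. The critical value still has to be read from tables at the chosen significance level.

```python
def expected_frequencies(observed):
    """Expected count per cell: row total * column total / table total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def chi_squared(observed):
    """Sum of (O - E)^2 / E over every cell of the table."""
    expected = expected_frequencies(observed)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))

def degrees_of_freedom(observed):
    """(rows - 1) * (columns - 1) for a contingency table."""
    return (len(observed) - 1) * (len(observed[0]) - 1)

# Observed counts from the example; columns are O, A, B, AB.
observed = [
    [235, 212, 79, 83],  # English
    [147, 106, 30, 51],  # African
    [162, 135, 52, 43],  # Chinese
]

expected = expected_frequencies(observed)
statistic = chi_squared(observed)
nu = degrees_of_freedom(observed)
```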