Discrete distributions deal with countable outcomes (e.g., number of heads in coin flips, number of defective items).
1. Binomial Distribution
Meaning: The binomial distribution calculates the probability of getting exactly k successes in n independent Bernoulli trials (experiments with only two outcomes: success or failure), where the probability of success in each trial is p.
Formula: P(X = k) = C(n, k) * p^k * (1-p)^(n-k)
Where:
X: Random variable representing the number of successes.
k: Number of successes we want to find the probability for.
n: Number of trials.
p: Probability of success in a single trial.
C(n, k): The binomial coefficient, calculated as n! / (k! * (n-k)!), representing the number of ways to choose k successes from n trials.
Example: What is the probability of getting exactly 3 heads in 5 coin flips? (p = 0.5 for a fair coin)
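The example above can be worked by plugging n = 5, k = 3, p = 0.5 into the formula: C(5, 3) = 10, so P(X = 3) = 10 * 0.5^3 * 0.5^2 = 0.3125. A minimal Python sketch of the same calculation follows; the function name binomial_pmf is only an illustrative choice, not something from the text.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Return P(X = k) for X ~ Binomial(n, p), using the formula above."""
    # comb(n, k) is the binomial coefficient C(n, k) = n! / (k! * (n-k)!)
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Exactly 3 heads in 5 flips of a fair coin
prob_3_heads = binomial_pmf(3, 5, 0.5)
print(prob_3_heads)  # 0.3125
```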
2. Poisson Distribution
Meaning: The Poisson distribution calculates the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known average rate (λ) and independently of the time since the last event.
Formula: P(X = k) = (e^(-λ) * λ^k) / k!
Where:
X: Random variable representing the number of events.
k: Number of events we want to find the probability for.
λ (lambda): Average rate of events (mean).
e: Euler's number (approximately 2.71828).
Example: If the average number of customers arriving at a store in an hour is 10, what is the probability that exactly 15 customers will arrive in an hour?
λ = 10, k = 15
P(X = 15) = (e^(-10) * 10^15) / 15! ≈ 0.0347
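As a quick check of the arithmetic above, the formula can be evaluated directly. This is a minimal sketch using only Python's standard library; poisson_pmf is a hypothetical helper name chosen for illustration.

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """Return P(X = k) for X ~ Poisson(lam), using the formula above."""
    return exp(-lam) * lam**k / factorial(k)

# Average of 10 customers per hour; probability of exactly 15 arrivals
prob_15 = poisson_pmf(15, 10)
print(round(prob_15, 4))  # 0.0347
```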
Continuous Probability Distributions
Continuous distributions deal with outcomes that can take on any value within a range (e.g., height, weight, temperature).
3. Normal Distribution
Meaning: The normal distribution is a bell-shaped, symmetrical distribution that describes many natural phenomena. It's defined by its mean (μ) and standard deviation (σ).
Probability Calculation: For a continuous distribution, the probability of any single exact value is zero. Instead, we calculate the probability of a value falling within a certain range. For the normal distribution, this is done by converting values to Z-scores and using the standard normal distribution (Z-distribution), which has mean 0 and standard deviation 1.
Z-score: A Z-score tells you how many standard deviations a value is away from the mean.
Formula: Z = (X - μ) / σ
Using the Z-table: Once you have the Z-score, you can look up the corresponding probability in a Z-table (or use statistical software). The Z-table gives the probability of a value being less than the given Z-score.
Example: Suppose the average height of adult women is 5'4" (64 inches) with a standard deviation of 2 inches. What is the probability that a randomly selected woman is between 5'2" (62 inches) and 5'6" (66 inches) tall?
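The example above works out as follows: 62 inches gives Z = (62 - 64) / 2 = -1 and 66 inches gives Z = (66 - 64) / 2 = 1, so the answer is P(Z < 1) - P(Z < -1) ≈ 0.8413 - 0.1587 = 0.6827. The sketch below does the same thing in Python, using statistics.NormalDist in place of a printed Z-table; that substitution and the variable names are assumptions made for illustration.

```python
from statistics import NormalDist

mu, sigma = 64, 2  # mean and standard deviation from the example, in inches

# Z-scores for the two ends of the range
z_low = (62 - mu) / sigma   # -1.0
z_high = (66 - mu) / sigma  #  1.0

# cdf(z) plays the role of the Z-table: it returns P(Z < z)
standard_normal = NormalDist()  # mean 0, standard deviation 1
prob_between = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)
print(round(prob_between, 4))  # 0.6827
```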
Let's break down hypothesis testing, including formulating hypotheses and performing Z-tests for population means and proportions.
Hypothesis Testing
Hypothesis testing is a formal procedure for deciding between two competing claims about a population parameter (e.g., mean, proportion).
1. Formulating Null and Alternative Hypotheses
Null Hypothesis (H₀): A statement of no effect or no difference. It's the statement we look for evidence against. It often represents the status quo.
Alternative Hypothesis (H₁ or Ha): A statement that contradicts the null hypothesis. It's the claim we look for evidence to support (a test can give evidence for it, but never proves it). It often represents the researcher's belief or theory.
Hypotheses are always about population parameters, not sample statistics.
Example: Suppose we want to test if the average height of women is different from 5'4" (64 inches).
H₀: μ = 64 (The average height is 64 inches.)
H₁: μ ≠ 64 (The average height is not 64 inches.) This is a two-tailed test. We could also have a one-tailed test: H₁: μ > 64 or H₁: μ < 64.
2. Performing Z-tests on the Mean of a Population
A Z-test is used when the population standard deviation (σ) is known, or when the sample size is large (typically n ≥ 30), in which case the sample standard deviation is used as an estimate of σ.
Steps:
State the hypotheses: As above.
Determine the significance level (α): This is the probability of rejecting the null hypothesis when it's actually true (Type I error). Common values are 0.05, 0.01, or 0.10.
Calculate the test statistic (Z-score):
Formula: Z = (x̄ - μ) / (σ / √n)
Where:
x̄: Sample mean
μ: Population mean (under H₀)
σ: Population standard deviation
n: Sample size
Determine the critical value(s) or p-value:
Critical Value Approach: Find the critical value(s) from the Z-distribution table based on α and the type of test (one-tailed or two-tailed). If the calculated Z-score falls in the rejection region (beyond the critical values), we reject H₀.
P-value Approach: Calculate the p-value, which is the probability of observing a test statistic as extreme as, or more extreme than, the one calculated, assuming H₀ is true. If the p-value is less than α, we reject H₀.
Make a decision: Reject H₀ if the test statistic falls in the rejection region (critical value approach) or if the p-value is less than α. Otherwise, fail to reject H₀.
State the conclusion: In the context of the problem.
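For the critical value approach in the steps above, the cutoffs depend only on α and on whether the test is one-tailed or two-tailed. The following is a rough sketch that computes them with statistics.NormalDist instead of a Z-table; the function name critical_values and the tail labels "two", "right", and "left" are hypothetical, chosen only for this illustration.

```python
from statistics import NormalDist

def critical_values(alpha: float, tail: str = "two") -> tuple:
    """Return the Z critical value(s) for significance level alpha."""
    z = NormalDist()  # standard normal distribution
    if tail == "two":
        # alpha is split evenly between the two tails
        c = z.inv_cdf(1 - alpha / 2)
        return (-c, c)
    if tail == "right":
        # H1: parameter > hypothesized value; reject for large Z
        return (z.inv_cdf(1 - alpha),)
    if tail == "left":
        # H1: parameter < hypothesized value; reject for small Z
        return (z.inv_cdf(alpha),)
    raise ValueError("tail must be 'two', 'right', or 'left'")

two_tailed = critical_values(0.05, "two")
right_tailed = critical_values(0.05, "right")
print([round(c, 2) for c in two_tailed])  # [-1.96, 1.96]
print(round(right_tailed[0], 3))          # 1.645
```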
Example: Suppose we have a sample of 50 women with a mean height of 65 inches. Assume the population standard deviation is 2.5 inches. Test if the average height of women is different from 64 inches (α = 0.05).
H₀: μ = 64, H₁: μ ≠ 64
α = 0.05
Z = (65 - 64) / (2.5 / √50) ≈ 2.83
Critical values: ±1.96 (for a two-tailed test at α = 0.05). Since 2.83 > 1.96, we reject H₀.
Conclusion: There is sufficient evidence to conclude that the average height of women is different from 64 inches.
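The same example can also be checked with the p-value approach. The sketch below is one possible way to do this in Python under the example's numbers; z_test_mean is an illustrative name, and the two-tailed p-value is computed as 2 * P(Z > |z|).

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(sample_mean: float, mu0: float, sigma: float, n: int):
    """Two-tailed Z-test for a population mean with known sigma."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Probability of a statistic at least this extreme in either tail under H0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

alpha = 0.05
z_stat, p_value = z_test_mean(sample_mean=65, mu0=64, sigma=2.5, n=50)
reject_h0 = p_value < alpha

print(round(z_stat, 2))  # 2.83
print(reject_h0)         # True
```

The p-value here comes out to roughly 0.005, well below α = 0.05, so both approaches lead to the same decision to reject H₀.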
3. Performing Z-tests on Population Proportion
A Z-test can also be used to test hypotheses about population proportions.
Steps: The same as for the mean test, but with a different test statistic.
State the hypotheses: H₀: p = p₀, H₁: p ≠ p₀ (or one-tailed).
Determine the significance level (α).
Calculate the test statistic (Z-score):
Formula: Z = (p̂ - p₀) / √(p₀(1 - p₀) / n)
Where:
p̂: Sample proportion
p₀: Population proportion (under H₀)
n: Sample size
Determine the critical value(s) or p-value.
Make a decision.
State the conclusion.
Example: Suppose we want to test if more than 50% of voters support a particular candidate. We take a sample of 200 voters and find that 110 support the candidate. (α = 0.05).
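The example above stops before the calculation. A minimal sketch of how it could be carried out follows, assuming the hypotheses implied by "more than 50%" are H₀: p = 0.5 and H₁: p > 0.5 (a right-tailed test); the function name z_test_proportion is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_proportion(successes: int, n: int, p0: float):
    """Right-tailed Z-test for a population proportion (H1: p > p0)."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    # Right-tailed p-value: P(Z >= z) under H0
    p_value = 1 - NormalDist().cdf(z)
    return p_hat, z, p_value

alpha = 0.05
p_hat, z_stat, p_value = z_test_proportion(successes=110, n=200, p0=0.5)
reject_h0 = p_value < alpha

print(p_hat)             # 0.55
print(round(z_stat, 2))  # 1.41
print(reject_h0)         # False
```

Under these assumptions, Z ≈ 1.41 is below the one-tailed critical value of 1.645 and the p-value is roughly 0.08, so we fail to reject H₀: the sample does not give sufficient evidence at α = 0.05 that more than 50% of voters support the candidate.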