The hypergeometric distribution is similar to the binomial distribution in that both describe the number of times a particular event occurs in a fixed number of trials. The difference is that binomial trials are independent, whereas in the hypergeometric distribution each trial changes the probability for every subsequent trial; such trials are called trials without replacement. For example, suppose a box of manufactured parts is known to contain some defective parts. You choose a part from the box, find it is defective, and remove the part from the box. If you choose another part from the box, the probability that it is defective is somewhat lower than for the first part because you have removed a defective part. If you had replaced the defective part, the probabilities would have remained the same, and the process would have satisfied the conditions for a binomial distribution.
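As a small illustration of the box-of-parts example, the sketch below compares the chance of drawing a defective part on the second draw with and without replacement. The box contents (10 parts, 3 defective) and the helper name p_defective are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def p_defective(defective: int, total: int) -> Fraction:
    """Probability that the next part drawn at random is defective."""
    return Fraction(defective, total)


# Hypothetical box: 10 parts, 3 of them defective (illustrative numbers only).
total, defective = 10, 3

p_first = p_defective(defective, total)
# Without replacement: the defective part found on the first draw is removed.
p_second_without = p_defective(defective - 1, total - 1)
# With replacement: the box is restored, so the probability is unchanged.
p_second_with = p_defective(defective, total)

print(p_first, p_second_without, p_second_with)  # 3/10 2/9 3/10
```

Exact fractions are used so the drop from 3/10 to 2/9 is visible without rounding.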
The three conditions underlying the hypergeometric distribution are:
The mathematical constructs for the hypergeometric distribution are as follows:
The number of items in the population (N), trials sampled (n), and number of items in the population that have the successful trait (Nx) are the distributional parameters. The number of successful trials is denoted x.
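For reference, the standard textbook form of the hypergeometric probability mass function, written with the parameters just defined (N_x below is the same quantity as Nx), can be sketched as follows. This is the usual form and is offered as an assumption about what is intended here, not as a quotation of the original formulas.

```latex
P(x) = \frac{\dbinom{N_x}{x}\,\dbinom{N - N_x}{n - x}}{\dbinom{N}{n}},
\qquad \max\bigl(0,\; n - (N - N_x)\bigr) \le x \le \min(n,\; N_x)

\text{mean} = \frac{n\,N_x}{N},
\qquad
\text{variance} = n\,\frac{N_x}{N}\left(1 - \frac{N_x}{N}\right)\frac{N - n}{N - 1}
```

The numerator counts the ways to pick x items with the successful trait and n - x without it; the denominator counts all ways to pick n items from the population.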
Input requirements:
Population (N) ≥ 2 and integer
Trials (n) > 0 and integer
Successes (Nx) > 0 and integer
Population > Successes
Trials < Population
Population < 1750
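The input requirements above can be checked mechanically. The following is a minimal sketch; the function name check_hypergeometric_inputs and its message strings are hypothetical, and it assumes that Successes means the number of population items with the successful trait (Nx) and Trials means the sample size (n).

```python
def check_hypergeometric_inputs(population, trials, successes) -> list:
    """Return the violated input requirements as strings (empty if all hold)."""
    problems = []
    inputs = (("Population", population), ("Trials", trials), ("Successes", successes))
    for name, value in inputs:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"{name} must be an integer")
    if problems:
        return problems  # range checks are meaningless for non-integers

    if population < 2:
        problems.append("Population must be >= 2")
    if trials <= 0:
        problems.append("Trials must be > 0")
    if successes <= 0:
        problems.append("Successes must be > 0")
    if population <= successes:
        problems.append("Population must be > Successes")
    if trials >= population:
        problems.append("Trials must be < Population")
    if population >= 1750:
        problems.append("Population must be < 1750")
    return problems


print(check_hypergeometric_inputs(20, 10, 5))  # []
print(check_hypergeometric_inputs(20, 20, 5))  # ['Trials must be < Population']
```

Returning a list of messages rather than raising on the first failure lets a caller report every violated requirement at once.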
To reiterate, for a hypergeometric distribution:
Example: Of a group of 20 Ph.D.s in Statistics, we know that 5 are highly competent, while the others had rich parents who donated heavily to the school and are incompetent. What is the probability that, of 10 selected at random, exactly 3 are highly competent?
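The example can be worked numerically with the standard hypergeometric formula. The sketch below reads the question as asking for exactly 3 highly competent people in the sample and takes Population = 20, Trials = 10, and Successes = 5; the function name hypergeom_pmf is a hypothetical helper, not part of any particular package.

```python
from math import comb


def hypergeom_pmf(x: int, population: int, trials: int, successes: int) -> float:
    """P(exactly x successes) when drawing `trials` items without replacement."""
    failures = population - successes
    # Outcomes outside the feasible range have probability zero.
    if x < 0 or x > trials or x > successes or trials - x > failures:
        return 0.0
    # Ways to pick x successes and (trials - x) failures, over all possible samples.
    return comb(successes, x) * comb(failures, trials - x) / comb(population, trials)


# 20 Ph.D.s, 5 highly competent, 10 selected at random, exactly 3 competent.
p = hypergeom_pmf(3, population=20, trials=10, successes=5)
print(round(p, 4))  # 0.3483
```

Here the count is C(5, 3) x C(15, 7) / C(20, 10) = 10 x 6435 / 184756, or roughly a 34.8% chance.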