6.5: Estimating Population Proportion

OpenStax

SECTION OBJECTIVES

Calculate and interpret confidence intervals for estimating a population proportion.
Calculate the sample size required to estimate a population proportion given a desired confidence level and margin of error.

During an election year, we see articles in the newspaper that state confidence intervals in terms of proportions or percentages. For example, a poll for a particular candidate running for president might show that the candidate has 40% of the vote within three percentage points (if the sample is large enough). Election polls are often calculated with 95% confidence, so the pollsters would be 95% confident that the true proportion of voters who favor the candidate is between 0.37 and 0.43: (0.40 – 0.03, 0.40 + 0.03).

Investors in the stock market are interested in the true proportion of stocks that go up and down each week. Businesses that sell personal computers are interested in the proportion of households in the United States that own personal computers. Confidence intervals can be calculated for the true proportion of stocks that go up or down each week and for the true proportion of households in the United States that own personal computers.

The procedure to find the confidence interval, the sample size, the error bound, and the confidence level for a proportion is similar to that for the population mean, but the formulas are different. How do you know you are dealing with a proportion problem? There is no mention of a mean or average!

To form a proportion, take \(X\), the random variable for the number of successes, and divide it by \(n\), the number of trials (or the sample size). The random variable \(\hat{p}\) (read "P hat") is that proportion,

\[\hat{p} = \dfrac{X}{n}.\nonumber \]

When \(n\) is large and \(p\) is not close to zero or one, we can use the normal distribution to approximate the number of successes.

\[X \sim N(np, \sqrt{npq})\nonumber \]

where \(q = 1 - p\) is the probability of a failure.

If we divide the random variable, the mean, and the standard deviation by \(n\), we get a normal distribution of proportions with \(\hat{p}\), called the estimated proportion, as the random variable. (Recall that a proportion is the number of successes divided by \(n\).)

\[\dfrac{X}{n} = \hat{p} \sim N\left(\dfrac{np}{n}, \dfrac{\sqrt{npq}}{n}\right)\nonumber \]

Using algebra to simplify:

\[\dfrac{\sqrt{npq}}{n} = \sqrt{\dfrac{pq}{n}}\nonumber \]

\(\hat{p}\) follows a normal distribution for proportions:

\[\dfrac{X}{n} = \hat{p} \sim N\left(p, \sqrt{\dfrac{pq}{n}}\right)\nonumber \]
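As a quick numerical check of the division by \(n\), the following Python sketch computes the exact mean and standard deviation of \(X/n\) from the binomial probabilities and compares them with \(p\) and \(\sqrt{pq/n}\). The values \(n = 100\) and \(p = 0.3\) are illustrative assumptions, not taken from the text, and the sketch checks only the mean and standard deviation, not the shape of the distribution.

```python
from math import comb, sqrt

def proportion_moments(n, p):
    """Exact mean and standard deviation of X/n when X ~ Binomial(n, p)."""
    q = 1 - p
    # Probability of each possible number of successes k = 0, 1, ..., n.
    pmf = [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]
    mean = sum((k / n) * prob for k, prob in enumerate(pmf))
    variance = sum((k / n - mean) ** 2 * prob for k, prob in enumerate(pmf))
    return mean, sqrt(variance)

n, p = 100, 0.3  # illustrative values (assumed, not from the text)
mean, sd = proportion_moments(n, p)
print(mean, sd)                   # exact moments of the sample proportion
print(p, sqrt(p * (1 - p) / n))   # values predicted by the normal model
```

The two printed pairs should agree up to floating-point rounding.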

Calculating the Error Bound

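The error bound (EBP) for a proportion is

\[EBP = z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}}, \quad \text{where } \hat{q} = 1 - \hat{p}.\nonumber \]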

This formula is similar to the error bound formula for a mean, except that the "appropriate standard deviation" is different. For a mean, when the population standard deviation is known, the appropriate standard deviation that we use is \(\dfrac{\sigma}{\sqrt{n}}\). For a proportion, the appropriate standard deviation is

\[\sqrt{\dfrac{pq}{n}}.\nonumber \]

However, in the error bound formula, we use

\[\sqrt{\dfrac{\hat{p}\hat{q}}{n}}\nonumber \]

as the standard deviation, instead of

\[\sqrt{\dfrac{pq}{n}}.\nonumber \]

In the error bound formula, the sample proportions \(\hat{p}\) and \(\hat{q}\) are estimates of the unknown population proportions \(p\) and \(q\). The estimated proportions \(\hat{p}\) and \(\hat{q}\) are used because \(p\) and \(q\) are not known. The sample proportions \(\hat{p}\) and \(\hat{q}\) are calculated from the data: \(\hat{p}\) is the estimated proportion of successes, and \(\hat{q}\) is the estimated proportion of failures.
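The error bound calculation can be sketched in Python as follows, assuming the usual form \(EBP = z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}\) with \(\hat{q} = 1 - \hat{p}\). The function name `error_bound_proportion` and the inputs in the usage line are hypothetical, chosen only for illustration; `statistics.NormalDist` comes from the Python standard library.

```python
from math import sqrt
from statistics import NormalDist

def error_bound_proportion(alpha, n, p_hat):
    """Error bound for a proportion: z_{alpha/2} * sqrt(p_hat * q_hat / n)."""
    q_hat = 1 - p_hat                        # estimated proportion of failures
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    return z * sqrt(p_hat * q_hat / n)

# Illustrative inputs (assumed, not from the text).
ebp = error_bound_proportion(alpha=0.05, n=400, p_hat=0.5)
print(round(ebp, 4))
```

Because the sample proportion is plugged in for the unknown \(p\), the result is an estimate of the error bound, not an exact value.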

Which Calculator to Use?

There are many online calculators that can be used to compute the Margin of Error. For example, you can use this one:

Also, you are encouraged to ask your instructor about which calculator is allowed/recommended for this course.

Example \(\PageIndex\)

Use the calculator provided above to verify the following statements:

When \(\alpha=0.1, n=200, \hat{p}=0.43\), the EBP is \(0.0577\).

When \(\alpha=0.05, n=100, \hat{p}=0.81\), the EBP is \(0.0768\).

When \(\alpha=0.01, n=250, \hat{p}=0.57\), the EBP is \(0.0806\).

Exercise \(\PageIndex\)

Find EBP when \(\alpha=0.07, n=168, \hat{p}=0.82\).

Answer


Constructing the Confidence Interval

The confidence interval has the form

\[(\hat{p} - EBP, \hat{p} + EBP).\nonumber \]

\(EBP\) is the error bound for the proportion.
\(\hat{p} = \dfrac{x}{n}\)
\(\hat{p} =\) the estimated proportion of successes (\(\hat{p}\) is a point estimate for \(p\), the true proportion.)
\(x =\) the number of successes
\(n =\) the size of the sample

The confidence interval can be used only if the number of successes \(n\hat{p}\) and the number of failures \(n\hat{q}\) are both greater than five.
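The interval construction and the success/failure check described above can be sketched in Python. This is a minimal illustration, not a prescribed method: the function name `proportion_confidence_interval` and the sample values in the usage line are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(x, n, confidence_level):
    """Return (p_hat - EBP, p_hat + EBP) for x successes in n trials."""
    # n * p_hat equals x and n * q_hat equals n - x, so the condition
    # "both greater than five" can be checked on the counts directly.
    if x <= 5 or n - x <= 5:
        raise ValueError("need more than five successes and five failures")
    p_hat = x / n
    q_hat = 1 - p_hat
    alpha = 1 - confidence_level
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    ebp = z * sqrt(p_hat * q_hat / n)        # error bound for the proportion
    return p_hat - ebp, p_hat + ebp

# Illustrative data (assumed): 60 successes in a sample of 200.
lower, upper = proportion_confidence_interval(x=60, n=200, confidence_level=0.95)
print(f"({lower:.4f}, {upper:.4f})")
```

Raising an error when the condition fails is one design choice; a calculator might instead print a warning and still report the interval.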

The graph gives a picture of the entire situation.

\[CL + \dfrac{\alpha}{2} + \dfrac{\alpha}{2} = CL + \alpha = 1.\nonumber \]

This is a normal distribution curve. The peak of the curve coincides with the point x-bar on the horizontal axis. The points x-bar - EBM and x-bar + EBM are labeled on the axis. Vertical lines are drawn from these points to the curve, and the region between the lines is shaded. The shaded region has area equal to \(1 - \alpha\) and represents the confidence level. Each unshaded tail has area \(\alpha/2\).

Which Calculator to Use?

There are many online calculators that can be used to compute the confidence intervals. For example, you can use this one:

Also, you are encouraged to ask your instructor about which calculator is allowed/recommended for this course.

Example \(\PageIndex\)

Use the calculator provided above to verify the following:

Confidence Level (%): \(95\)

Sample Size: \(197\)

Number of Successes: \(61\)

95% Confidence Interval: \((0.2450, 0.3742)\)

Exercise \(\PageIndex\)

Find a 90% confidence interval when the sample size is 250 and the number of successes is 85.

Answer


Writing the Interpretation

The interpretation should clearly state the confidence level (\(CL\)), explain what population parameter is being estimated (here, a population proportion), and state the confidence interval (both endpoints). "We estimate with ___% confidence that the true population proportion (include the context of the problem) is between ___ and ___ ."

Example \(\PageIndex\)

Suppose that a market research firm is hired to estimate the percent of adults living in a large city who have cell phones. Five hundred randomly selected adult residents in this city are surveyed to determine whether they have cell phones. Of the 500 people surveyed, 421 responded yes - they own cell phones. Using a 95% confidence level, compute a confidence interval estimate for the true proportion of adult residents of this city who have cell phones.

Solution

To calculate the confidence interval, you must find \(\hat{p}\), \(\hat{q}\), and \(EBP\).

\(n = 500\)
\(x =\) the number of successes \(= 421\)

\(\hat{p} = \dfrac{x}{n} = \dfrac{421}{500} = 0.842\) is the sample proportion; this is the point estimate of the population proportion.

\[\hat{q} = 1 - \hat{p} = 1 - 0.842 = 0.158\nonumber \]

Since \(CL = 0.95\), then

\[\alpha = 1 - CL = 1 - 0.95 = 0.05, \qquad \dfrac{\alpha}{2} = 0.025.\nonumber \]

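Use the TI-83, 83+, or 84+ calculator command invNorm(0.975,0,1) to find \(z_{0.025}\). Remember that the area to the right of \(z_{0.025}\) is \(0.025\) and the area to the left of \(z_{0.025}\) is \(0.975\). This can also be found using appropriate commands on other calculators, using a computer, or using a Standard Normal probability table. The result is \(z_{0.025} = 1.96\), so

\[EBP = z_{0.025}\sqrt{\dfrac{\hat{p}\hat{q}}{n}} = (1.96)\sqrt{\dfrac{(0.842)(0.158)}{500}} = 0.032\nonumber \]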
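For readers working on a computer, one option (an assumption here, since the text does not name a specific tool) is Python's standard library, where `NormalDist(mu=0, sigma=1).inv_cdf(0.975)` plays the role of invNorm(0.975,0,1). The sketch below finds the critical value and then the error bound for this example's data.

```python
from math import sqrt
from statistics import NormalDist

n = 500        # sample size from the example
x = 421        # number of successes from the example
p_hat = x / n  # 0.842
q_hat = 1 - p_hat

alpha = 1 - 0.95
# Area to the left of the critical value is 1 - alpha/2 = 0.975.
z = NormalDist(mu=0, sigma=1).inv_cdf(1 - alpha / 2)
ebp = z * sqrt(p_hat * q_hat / n)

print(round(z, 2))    # 1.96
print(round(ebp, 3))  # 0.032
```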

\[\hat{p} - EBP = 0.842 - 0.032 = 0.810\nonumber \]

\[\hat{p} + EBP = 0.842 + 0.032 = 0.874\nonumber \]

The confidence interval for the true binomial population proportion is \((\hat{p} - EBP, \hat{p} + EBP) = (0.810, 0.874)\).

Interpretation: We estimate with 95% confidence that between 81% and 87.4% of all adult residents of this city have cell phones.

Explanation of 95% Confidence Level: Ninety-five percent of the confidence intervals constructed in this way would contain the true value for the population proportion of all adult residents of this city who have cell phones.
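The explanation of the confidence level can be illustrated with a small simulation. The sketch below assumes a hypothetical true proportion of 0.84 (in practice this value is unknown), draws many samples of 500, builds a 95% interval from each, and reports the fraction of intervals that contain the true value. The function name, the seed, and the number of repetitions are arbitrary choices for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage_rate(true_p, n, confidence_level, repetitions, seed=0):
    """Fraction of simulated confidence intervals that contain true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    hits = 0
    for _ in range(repetitions):
        # Simulate one survey: count successes among n independent trials.
        x = sum(rng.random() < true_p for _ in range(n))
        p_hat = x / n
        ebp = z * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - ebp <= true_p <= p_hat + ebp:
            hits += 1
    return hits / repetitions

# Hypothetical true proportion; chosen only for the simulation.
rate = coverage_rate(true_p=0.84, n=500, confidence_level=0.95, repetitions=2000)
print(rate)
```

The printed fraction should be close to 0.95, though it will not match exactly because the simulation is random and the normal model is an approximation.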

Exercise \(\PageIndex\)

Suppose 250 randomly selected people are surveyed to determine if they own a tablet. Of the 250 surveyed, 98 reported owning a tablet. Using a 95% confidence level, compute a confidence interval estimate for the true proportion of people who own tablets.

Answer