- Base of statistics

- Central limit theorem

- Testing of hypothesis

- Time series analysis

- Probability and distribution

- Normal distribution

- Design of experiment

- Correlation and regression analysis

- Statistical quality control

- Index number

- Vital statistics

**Example:** A company contemplating the introduction of a new product wants to estimate the percentage of the market that this new product might capture. In a survey, a random sample of 100 customers was asked whether or not they would purchase this new product. Fourteen responded affirmatively. Calculate the 95% confidence interval for the population proportion of potential customers that would purchase the new product. Interpret the result.

Solution:

Sample size: n = 100

Fourteen responded affirmatively, so the sample proportion is p = 14/100 = 0.14.

95% confidence interval for the population proportion:

Applying the general formula for a confidence interval, the confidence interval for a proportion, π, is:

**p ± z σp**

where p is the proportion in the sample, z is the critical value at the 0.05 level, and σp = sqrt(p(1 - p)/n) is the standard error of the proportion.

Here σp = sqrt(0.14 × 0.86/100) ≈ 0.0347, and z for a 95% C.I. is 1.96.

So the 95% C.I. for the proportion is 0.14 ± 1.96 × 0.0347 = **(0.0720, 0.2080)**

Interpretation: we are 95% confident that between about 7.2% and 20.8% of potential customers would purchase the new product.
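The arithmetic for this example can be checked with a short Python sketch. The function name `proportion_ci` is illustrative only, and the sketch assumes the usual normal-approximation interval p ± z·sqrt(p(1 - p)/n), with the critical value taken from the standard library rather than from a printed table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation interval for a population proportion."""
    p_hat = successes / n
    # two-sided critical value, e.g. about 1.96 for 95%
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p_hat * (1 - p_hat) / n)  # standard error of the proportion
    return p_hat - z * se, p_hat + z * se

low, high = proportion_ci(14, 100, 0.95)
print(f"95% CI: ({low:.4f}, {high:.4f})")
```

With 14 successes out of 100, the limits come out near 0.072 and 0.208.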

**Example**

A simple random sample of 25 boxes of candy is selected and the contents of each box are weighed. This leads to a sample mean of 16.3 ounces with a standard deviation of 3.8 ounces. Compute a 90% confidence interval for the mean weight of the boxes of candy. Interpret the result.

Solution:

Sample size n = 25 boxes, sample mean = 16.3 ounces, and sample standard deviation s = 3.8 ounces.

The standard error of the mean is 3.8/sqrt(25) = 0.76, and the value of z for a 90% C.I. is 1.65.

So the confidence interval is 16.3 ± 1.65 × 0.76 = 16.3 ± 1.254.

Thus, **(15.046, 17.554).**

Interpretation:

We are 90% confident that the unknown population mean weight lies between 15.046 and 17.554 ounces, assuming the weights are normally distributed. Because the standard deviation is estimated from a small sample, the t critical value for 24 degrees of freedom (1.711) is strictly more appropriate; it gives the slightly wider interval (15.00, 17.60).

**Example**

A simple random sample of the body temperatures of 106 healthy humans was taken, for which

*x̄* = 98.20°F and *s* = 0.62°F.

- Determine the number of degrees of freedom for this sample size.

Ans: The sample size is n = 106, so the degrees of freedom = n - 1 = 105.

- Construct a 99% confidence interval to estimate the mean body temperature of all healthy humans.

Solution: 99% C.I. = x̄ ± t × (S.E. of x̄)

= 98.20 ± t.01 × s/sqrt(n), where t.01 ≈ 2.62 is the two-tailed 0.01 critical value for 105 degrees of freedom

= 98.20 ± 2.62 × 0.62/sqrt(106)

= 98.20 ± 0.158

**= [98.042, 98.358]**

**Example **

Suppose a student measuring the boiling temperature of a certain liquid observes the readings (in degrees Celsius) 102.5, 101.7, 103.1, 100.9, 100.5, and 102.2 on 6 different samples of the liquid. He calculates the sample mean to be 101.82. If he knows that the standard deviation for this procedure is 1.2 degrees, what is the confidence interval for the population mean at a 95% confidence level?

In other words, the student wishes to estimate the true mean boiling temperature of the liquid using the results of his measurements. If the measurements follow a normal distribution with mean μ and standard deviation σ, then the sample mean will have the distribution N(μ, σ/sqrt(n)).

Since the sample size is 6, the standard deviation of the sample mean is equal to 1.2/sqrt(6) = 0.49.

The selection of a confidence level for an interval determines the probability that the confidence interval produced will contain the true parameter value. Common choices for the confidence level *C* are 0.90, 0.95, and 0.99. These levels correspond to percentages of the area of the normal density curve. For example, a 95% confidence interval covers 95% of the normal curve; the probability of observing a value outside of this area is less than 0.05. Because the normal curve is symmetric, half of this remaining area is in the left tail of the curve, and the other half is in the right tail. For a confidence interval with level *C*, the area in each tail of the curve is equal to (1-*C*)/2. For a 95% confidence interval, the area in each tail is equal to 0.05/2 = 0.025.

The value *z*\* representing the point on the standard normal density curve such that the probability of observing a value greater than *z*\* is equal to *p* is known as the upper *p* critical value of the standard normal distribution.

For example, if *p* = 0.025, the value *z*\* such that P(Z > *z*\*) = 0.025 is 1.96. A 95% confidence interval for the mean boiling temperature is therefore 101.82 ± 1.96 × 0.49 = 101.82 ± 0.96 = (100.86, 102.78).
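The boiling-temperature example can be reproduced with a minimal Python sketch, assuming the population standard deviation of 1.2 degrees is known as stated. The helper name `z_interval` is hypothetical; the critical value comes from `statistics.NormalDist` instead of a table.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_interval(sample, sigma, confidence=0.95):
    """CI for the mean when the population standard deviation is known."""
    n = len(sample)
    # upper (1 - C)/2 critical value of the standard normal distribution
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    centre = mean(sample)
    return centre - margin, centre + margin

readings = [102.5, 101.7, 103.1, 100.9, 100.5, 102.2]
low, high = z_interval(readings, sigma=1.2, confidence=0.95)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

The limits land at roughly 100.86 and 102.78 degrees Celsius.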

**Example**

The dataset “Normal Body Temperature, Gender, and Heart Rate” contains 130 observations of body temperature, along with the gender of each individual and his or her heart rate. Using the MINITAB “DESCRIBE” command provides the following information:

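Descriptive Statistics

| Variable | N | Mean | Median | Tr Mean | St Dev | SE Mean |
|---|---|---|---|---|---|---|
| TEMP | 130 | 98.249 | 98.300 | 98.253 | 0.733 | 0.064 |

| Variable | Min | Max | Q1 | Q3 |
|---|---|---|---|---|
| TEMP | 96.300 | 100.800 | 97.800 | 98.700 |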

To find a 95% confidence interval for the mean based on the sample mean 98.249 and sample standard deviation 0.733, first find the 0.025 critical value *t*\* for 129 degrees of freedom. Tables rarely list 129 degrees of freedom, so use the critical value for 100 degrees of freedom, approximately 1.984 (found in Table E in Moore and McCabe). The estimated standard deviation for the sample mean is 0.733/sqrt(130) = 0.064, the value provided in the SE MEAN column of the MINITAB descriptive statistics. A 95% confidence interval, then, is approximately (98.249 - 1.984 × 0.064, 98.249 + 1.984 × 0.064) = (98.249 - 0.127, 98.249 + 0.127) = (98.122, 98.376).

For a more precise (and more simply achieved) result, the MINITAB “TINTERVAL” command, written as follows, gives an exact 95% confidence interval for 129 degrees of freedom:

MTB > tinterval 95 c1

Confidence Intervals

| Variable | N | Mean | St Dev | SE Mean | 95.0 % CI |
|---|---|---|---|---|---|
| TEMP | 130 | 98.2492 | 0.7332 | 0.0643 | (98.1220, 98.3765) |

According to these results, the usual assumed normal body temperature of 98.6 degrees Fahrenheit is not within a 95% confidence interval for the mean.

**Example**

A sample of Alzheimer’s patients is tested to assess the amount of time in stage IV sleep. It has been hypothesized that individuals suffering from Alzheimer’s disease may spend less time per night in the deeper stages of sleep. The number of minutes spent in stage IV sleep is recorded for sixty-one patients. The sample produced a mean of 48 minutes (S = 14 minutes) of stage IV sleep over a 24-hour period. Compute a 95 percent confidence interval for these data. What does this information tell you about a particular individual’s (an Alzheimer’s patient) stage IV sleep?

The standard error of the mean is 14/sqrt(61) = 1.79.

t = 2.000 (60 degrees of freedom)

Confidence Interval at 95 percent: 44.4 < population mean < 51.6

We are 95 percent confident that the population mean for the number of minutes an Alzheimer’s patient will spend in stage IV sleep in a 24-hour period is somewhere between 44.4 minutes and 51.6 minutes. The interval describes the population mean only; it says little about any particular individual, whose time can fall well outside this range given the standard deviation of 14 minutes.

**Example**

A university wants to know more about the knowledge of students regarding international events. They are concerned that their students are uninformed in regard to news from other countries. A standardized test is used to assess students’ knowledge of world events (national reported mean = 65, S = 5). A sample of 30 students is tested (sample mean = 58, standard error = 3.2). Compute a 99 percent confidence interval based on this sample’s data. How do these students compare to the national sample?

t = 2.756 (29 degrees of freedom)

Confidence Interval at 99 percent: 49.2 < population mean < 66.8

While the scores for these students are low in relation to the national scores, the constructed interval does include the national mean. Therefore, the university may be on par with other universities. The university may wish to replicate its survey to further validate the results.

**Example **

A sample of students from an introductory psychology class was polled regarding the number of hours they spent studying for the last exam. All students anonymously submitted the number of hours on a 3 by 5 card. There were 24 individuals in the one section of the course polled. The data were used to make inferences regarding the other students taking the course. Their data are below:

4.5, 22, 7, 14.5, 9, 9, 3.5, 8, 11, 7.5, 18, 20

7.5, 9, 10.5, 15, 19, 2.5, 5, 9, 8.5, 14, 20, 8
Compute a 95 percent confidence interval. What does this tell us?

Mean = 10.92

Standard Deviation = 5.598

Standard Error = 5.598/sqrt(24) = 1.143

t = 2.069 (23 degrees of freedom)

Confidence Interval at 95 percent: 8.55 < population mean < 13.28

We are 95 percent confident that the actual population mean for the number of hours introductory psychology students studied for the last exam was somewhere between 8.55 hours and 13.28 hours.
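The summary statistics for the study-hours data can be recomputed with a short Python sketch. It assumes the usual standard error s/sqrt(n), so its figures may differ slightly from hand calculations that divide by sqrt(n - 1). The t value is supplied as a constant taken from the example, because the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

hours = [4.5, 22, 7, 14.5, 9, 9, 3.5, 8, 11, 7.5, 18, 20,
         7.5, 9, 10.5, 15, 19, 2.5, 5, 9, 8.5, 14, 20, 8]

# Tabled two-sided 95% t value for 23 degrees of freedom (from the example)
T_CRIT_95_DF23 = 2.069

n = len(hours)
x_bar = mean(hours)
s = stdev(hours)     # sample standard deviation (n - 1 denominator)
se = s / sqrt(n)     # standard error of the mean
low = x_bar - T_CRIT_95_DF23 * se
high = x_bar + T_CRIT_95_DF23 * se

print(f"n={n}, mean={x_bar:.2f}, s={s:.3f}, SE={se:.3f}")
print(f"95% CI: ({low:.2f}, {high:.2f})")
```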

**Example** A process is known to produce bricks whose weights are normally distributed with standard deviation 0.12 pounds. A random sample of sixteen bricks from today’s output had a mean weight of 4.07 pounds.

(a) Find a 99% confidence interval for the mean weight of all bricks produced today.

A confidence interval consists of three elements: a measure of central tendency, a number of standard errors, and some measure of dispersion (e.g. a standard error). In this case the center is 4.07 (our sample mean, which is the best estimate we have of central tendency). The number of standard deviations is 2.58 (found by looking in the z table where the probability is 0.495 – you might just as reasonably use 2.57 or 2.575).

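In this case we have a sample from a normal distribution with a known standard deviation, so we can use Formula #1 (see the Confidence Interval Formulae sheet). The standard error is the basic standard error of the mean: σ/sqrt(n) = 0.12/sqrt(16) = 0.03. The 99% confidence interval is therefore 4.07 ± 2.58 × 0.03 = 4.07 ± 0.0774, or (3.9926, 4.1474).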

(b) Without doing the calculations, state whether a 95% confidence interval for the population mean would be wider than, narrower than, or the same width as that found in (a).

Narrower, because a larger alpha is associated with a narrower interval.

(c) It is decided that tomorrow a sample of twenty bricks will be taken. Without doing the calculations, state whether a correctly calculated 99% confidence interval for the mean weight of tomorrow’s output will be wider than, narrower than, or the same width as that found in (a).

Narrower, because a larger *n* makes the standard error smaller.

(d) In fact, the population standard deviation from today’s output is 0.15 pounds. Without doing the calculations, state whether a correctly calculated 99% confidence interval for the mean weight of today’s output will be wider than, narrower than, or the same width as that found in (a).

Wider, because our estimated standard error was based on a standard deviation of 0.12. A larger standard error is associated with a wider interval.
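Parts (b) through (d) can be checked numerically with a minimal Python sketch. The function `ci_width` is a hypothetical helper that returns the full width 2·z·σ/sqrt(n) of a known-σ interval; only the inputs from the brick example are used.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(sigma, n, confidence):
    """Full width of a known-sigma z interval for the mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sigma / sqrt(n)

base = ci_width(0.12, 16, 0.99)          # part (a)
lower_conf = ci_width(0.12, 16, 0.95)    # part (b): larger alpha
bigger_n = ci_width(0.12, 20, 0.99)      # part (c): larger sample
bigger_sigma = ci_width(0.15, 16, 0.99)  # part (d): larger sigma

for label, width in [("(a)", base), ("(b)", lower_conf),
                     ("(c)", bigger_n), ("(d)", bigger_sigma)]:
    print(f"{label} width = {width:.4f}")
```

Only the comparison with (a) matters here: (b) and (c) come out narrower and (d) wider.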

**Example:** A production manager knows that historically, the amounts of impurities in bags of a chemical follow a normal distribution with a standard deviation of 3.8 grams. A random sample of nine bags of the chemical yielded the following amounts of impurities in grams:

18.2, 13.7, 15.9, 17.4, 21.8, 16.6, 12.3, 18.8, 16.2

(a) Find a 90% confidence interval for the population mean weight of impurities.

First, calculate the sample mean: x̄ = 150.9/9 ≈ 16.767 grams. Again we have a sample from a normal distribution with a known standard deviation, so we can use Formula #1: 16.767 ± 1.645 × 3.8/sqrt(9) = 16.767 ± 2.084, or (14.68, 18.85).

(b) Without doing the calculations, state whether a 95% confidence interval for the population mean would be wider than, narrower than, or the same width as that found in (a).

Wider, because a smaller alpha is associated with a wider interval.

**Example :** A random sample of 1,562 undergraduates enrolled in marketing courses was asked to respond on a scale from one (strongly disagree) to seven (strongly agree) to the statement: “Most advertising insults the intelligence of the average customer.” The sample mean response was 3.92 and the sample standard deviation was 1.57.

(a) Find a 90% confidence interval for the population mean response.

In this case we have a large sample, so we can use Formula #1. The sample standard deviation *s* is used as a good estimate of the (unknown) population parameter *σ*: 3.92 ± 1.645 × 1.57/sqrt(1562) = 3.92 ± 0.065, or (3.855, 3.985).

(b) Without doing the calculations, state whether an 80% confidence interval for the population mean would be wider than, narrower than, or the same as (a).

Narrower, because a larger alpha is associated with a narrower interval. (Remember that alpha is the area outside the confidence interval; as alpha gets bigger, the interval gets smaller.)

**Example:** The Cloze readability procedure is designed to measure the effectiveness of a written communication. (A score of 57% or more on the Cloze test demonstrates adequate understanding of the written material.) A random sample of 352 certified public accountants was asked to read financial report messages. The sample mean Cloze score was 60.41% and the sample standard deviation was 11.28%. Find a 90% confidence interval for the population mean score, and comment on your result.

As was the case in the previous problem, the large sample size allows us to use Formula #1: 60.41 ± 1.645 × 11.28/sqrt(352) = 60.41 ± 0.99, or (59.42%, 61.40%).

Because the entire interval lies above the 57% threshold, we can be reasonably confident that the financial report messages are effectively written.

**Example:** A population has a normal distribution with unknown mean and unknown variance. We know how to find a confidence interval for the population mean, given a random sample of two observations. We do not, however, know how to find such a confidence interval with a random sample of only one. Why not?

We have no way to estimate the variance. Not only does the denominator of the variance formula call for (*n* – 1), which, in this case would require us to divide by zero, but the *t* statistic is undefined with zero degrees of freedom.
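The same point shows up in software: Python's standard `statistics.stdev` refuses to compute a sample standard deviation from a single observation. The sketch below uses made-up observations (4.2 and 5.0 are illustrative values, not data from the problem) and a hypothetical helper `try_sample_sd`.

```python
from statistics import StatisticsError, stdev

def try_sample_sd(sample):
    """Return the sample standard deviation, or None when it is undefined."""
    try:
        return stdev(sample)
    except StatisticsError:
        # fewer than two observations: the n - 1 denominator would be zero
        return None

one_obs = try_sample_sd([4.2])        # a single, illustrative observation
two_obs = try_sample_sd([4.2, 5.0])   # two illustrative observations

print("n = 1:", one_obs)
print("n = 2:", two_obs)
```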

**Example:** In October 1992, ownership of the San Francisco Giants baseball team considered a sale of the franchise that would lead to a move to Florida. A random sample of 610 San Francisco Bay Area taxpayers, carried out by the *San Francisco Examiner*, contained 50.7% who would be disappointed by this move. Find a 99% confidence interval for the population proportion of Bay Area taxpayers with this feeling.

This is a proportion problem with a large sample; we will use Formula #3: p̂ ± z × sqrt(p̂(1 - p̂)/n) = 0.507 ± 2.58 × sqrt(0.507 × 0.493/610) = 0.507 ± 0.052, or (0.455, 0.559).

**Example:** A random sample was taken of 189 National Basketball Association games in which the score was not tied after one quarter. In 132 of these games, the team leading after one quarter won the game.

(a) Find a 90% confidence interval for the population proportion of all occasions on which the team leading after one quarter wins the game.

Another proportion problem with a large sample. Note that p̂ = 132/189 ≈ 0.6984. The interval is 0.6984 ± 1.645 × sqrt(0.6984 × 0.3016/189) = 0.6984 ± 0.0549, or (0.6435, 0.7533).

“Of all the games that are not tied after one quarter, we are 90% sure that the proportion of games that are eventually won by the team leading after the first quarter is somewhere between 64.35% and 75.33%.”

(b) Without doing the calculations, state whether a 95% confidence interval for the population proportion would be wider than or narrower than that found in (a).

Wider, because a smaller alpha is associated with a wider interval.

**Example:** Of a random sample of 323 union members, 47.9% agreed or strongly agreed with the statement: “Union workers should refuse to work when a nonunion worker is sent to the job.” Based on this information, a statistician calculated, for the percentage of all union members with this view, a confidence interval running from 45.8% to 50.0%. Find the level of confidence associated with this interval.

This is like the previous two problems, but we are working backwards. Instead of being given a desired confidence level and having to find the upper and lower limits of the interval, we are doing the reverse.

Note that the upper and lower boundaries of the confidence interval are 2.1% (or 0.021) away from the estimated proportion of 0.479. This means that z × sqrt(0.479 × 0.521/323) = 0.021, so z = 0.021/0.0278 ≈ 0.76. The z table gives an area of 0.2764 between the mean and z = 0.76, so the interval covers 2 × 0.2764 = 0.5528 of the distribution.

The confidence level of this interval is 55.28%.
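Working backwards from a margin of error to a confidence level can be sketched in Python as follows. The function name `confidence_level` is illustrative. Because the sketch does not round z to two decimals before the lookup, it gives about 0.55 rather than the table-based 0.5528.

```python
from math import sqrt
from statistics import NormalDist

def confidence_level(p_hat, n, margin):
    """Confidence level implied by a +/- margin around a sample proportion."""
    se = sqrt(p_hat * (1 - p_hat) / n)  # standard error of the proportion
    z = margin / se                     # number of standard errors in the margin
    return 2 * NormalDist().cdf(z) - 1  # central area between -z and +z

level = confidence_level(0.479, 323, 0.021)
print(f"implied confidence level: {level:.3f}")
```

The same function applies to any interval whose centre and half-width are known.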

**Example:** A random sample was taken of 96 foreign manufacturers, with direct investment in the United States, who use independent U.S. industrial distributors. Of these sample members, 32 said the distributors were rarely or never capable of performing the advice and technical support function. Find an 80% confidence interval for the population proportion.

Another problem for Formula #3. Note that p̂ = 32/96 ≈ 0.3333. The interval is 0.3333 ± 1.28 × sqrt(0.3333 × 0.6667/96) = 0.3333 ± 0.0616, or (0.2717, 0.3949).

**Example:** For a random sample of 40 accounting students in a class using group learning techniques, the mean examination score was 322.12, and the sample standard deviation was 54.53. For an independent random sample of 61 students in the same course but in a class not using group learning techniques, the sample mean and standard deviation of the scores were 304.61 and 62.61, respectively. Find a 95% confidence interval for the difference between the two population mean scores.

We are studying the difference between two sample means, which means we will use either Formula #4 or #5. We can’t use the matched pairs formula because the two samples are independent and of different sizes, so we use #5: (x̄1 - x̄2) ± z × sqrt(s1²/n1 + s2²/n2) = (322.12 - 304.61) ± 1.96 × sqrt(54.53²/40 + 62.61²/61) = 17.51 ± 23.07, or (-5.56, 40.58).
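As a numerical check on this example, here is a minimal Python sketch. It assumes Formula #5 is the large-sample interval for two independent means with separate (unpooled) variances; the formula sheet itself is not reproduced here, and the name `diff_means_ci` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def diff_means_ci(mean1, s1, n1, mean2, s2, n2, confidence=0.95):
    """Large-sample z interval for mu1 - mu2 from independent samples."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(s1**2 / n1 + s2**2 / n2)  # standard error of the difference
    diff = mean1 - mean2
    return diff - z * se, diff + z * se

low, high = diff_means_ci(322.12, 54.53, 40, 304.61, 62.61, 61, 0.95)
print(f"95% CI for the difference: ({low:.2f}, {high:.2f})")
```

The interval runs from about -5.6 to 40.6 and includes zero, so these data alone do not establish a difference between the two teaching methods.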

**Example:** In a survey of practicing certified public accountants on women in the accounting profession, sample members were asked to respond on a scale of one (strongly disagree) to five (strongly agree) to the statement: “Women are equally acceptable to clients as are men to perform work on engagements.” For a random sample of 172 female accountants, the mean response was 3.483 and the sample standard deviation was 0.970. For an independent random sample of 186 male accountants, the sample mean and standard deviation were 3.435 and 0.894, respectively. Find a 95% confidence interval for the difference between the two population means.

Same as the previous problem; use Formula #5: (3.483 - 3.435) ± 1.96 × sqrt(0.970²/172 + 0.894²/186) = 0.048 ± 0.194, or (-0.146, 0.242).

**Example:** For a random sample of 190 firms that revalued their fixed assets, the mean ratio of debt to tangible assets was 0.517 and the sample standard deviation was 0.148. For an independent random sample of 417 firms that did not revalue their fixed assets, the mean ratio of debt to tangible assets was 0.489 and the sample standard deviation was 0.159. Find a 99% confidence interval for the difference between the two population means.

Formula #5: (0.517 - 0.489) ± 2.58 × sqrt(0.148²/190 + 0.159²/417) = 0.028 ± 0.034, or (-0.006, 0.062).

**Example:** Of a random sample of 1203 business students in 1979, 20.2% said that teaching as a career was very unappealing. Of an independent random sample of 1203 business students taken in 1989, 13.2% had this reaction to teaching as a career. Find a 99% confidence interval for the difference between the population proportions regarding teaching as very unappealing in the two years.

Now we have the difference between two proportions, so we use Formula #6: (p̂1 - p̂2) ± z × sqrt(p̂1(1 - p̂1)/n1 + p̂2(1 - p̂2)/n2) = (0.202 - 0.132) ± 2.58 × sqrt(0.202 × 0.798/1203 + 0.132 × 0.868/1203) = 0.070 ± 0.039, or (0.031, 0.109).
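A minimal Python sketch of the two-proportion calculation follows. It assumes Formula #6 is the usual large-sample interval with unpooled standard errors, and the name `diff_proportions_ci` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportions_ci(p1, n1, p2, n2, confidence=0.99):
    """Large-sample z interval for p1 - p2 from independent samples."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# 1979 versus 1989 business students, 1203 in each sample
low, high = diff_proportions_ci(0.202, 1203, 0.132, 1203, 0.99)
print(f"99% CI for the difference: ({low:.3f}, {high:.3f})")
```

Both limits are positive (about 0.031 and 0.109), which points to a real decline between 1979 and 1989 in the share finding teaching very unappealing.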

**Example:** Supermarket shoppers were observed, and questioned immediately after putting an item in their cart. Of a random sample of 570 choosing a product at the regular price, 308 claimed to check price at the point of choice. Of an independent random sample of 232 choosing a product at a special price, 157 made this claim. Find a 90% confidence interval for the difference between the two population proportions.

Formula #6, with p̂1 = 308/570 ≈ 0.5404 and p̂2 = 157/232 ≈ 0.6767: (0.5404 - 0.6767) ± 1.645 × sqrt(0.5404 × 0.4596/570 + 0.6767 × 0.3233/232) ≈ -0.1364 ± 0.0611, or (-0.1975, -0.0753).

**Example:** In a random sample of 1,158 newly promoted executives, 47.9% rated a statistics course as very important or somewhat important as part of the preparation for a career in general management.

(a) Find a 99% confidence interval for the population proportion of all newly promoted executives holding this view.

Using Formula #3, 0.479 ± 2.58 × sqrt(0.479 × 0.521/1158) = 0.479 ± 0.038, or (0.441, 0.517).

(b) Based on this sample information, a statistician computed a confidence interval for the population proportion running from 0.458 to 0.500. What is the probability content of this interval?

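This is similar to the problem above regarding 323 union members, in that we will be working backwards to arrive at a confidence level. Each of these limits is 0.021 away from 0.479, so z = 0.021/sqrt(0.479 × 0.521/1158) = 0.021/0.0147 ≈ 1.43. The z table gives an area of 0.4236 between the mean and z = 1.43, so the probability content of the interval is 2 × 0.4236 = 0.8472, or about 84.7%.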

**Example:** Of a random sample of 151 marketing executives in consumer goods manufacturing, 76.0% said that brand identification held by incumbents was an important or extremely important barrier to entering new markets. Based on this information, a statistician computed, for the population proportion with this view, the confidence interval

0.720 < p < 0.800

Find the probability content of this interval.

Same as the previous problem. Note that the center of this interval is 0.760, so each limit is 0.040 away from it. Then z = 0.040/sqrt(0.76 × 0.24/151) = 0.040/0.0348 ≈ 1.15, and the z table area of 0.3749 gives a probability content of 2 × 0.3749 = 0.7498, or about 75%.

**Example:** Of a random sample of 69 Canadian industrial firms, 43 did market research in-house. Of an independent random sample of 69 Canadian consumer goods firms, 30 did market research in-house. Find a 95% confidence interval for the difference between the population proportions of these two types of firms that do market research in-house.

Formula #6, with p̂1 = 43/69 ≈ 0.6232 and p̂2 = 30/69 ≈ 0.4348: (0.6232 - 0.4348) ± 1.96 × sqrt(0.6232 × 0.3768/69 + 0.4348 × 0.5652/69) = 0.1884 ± 0.1636, or (0.025, 0.352).

Copyright 2012