Estimation and Hypothesis Testing

… a core topic in Quantitative Methods and Atlas104

Topic description

This topic examines estimation and hypothesis testing.

The treatment of this topic on the Atlas follows almost precisely that in Chapter X, Estimation, and Chapter XI, Logic of Hypothesis Testing, of OnlineStatBook, Online Statistics Education – An Interactive Multimedia Course of Study, http://onlinestatbook.com/2/index.html, accessed 15 May 2016.

Topic learning outcome

Familiarity with the sampling distribution and how sampling distributions are used in inferential statistics, including the following core concepts and terms.

[Note: Until Atlas pages are created for individual concepts in Quantitative Methods, the links in the concepts below point directly to the relevant pages in OnlineStatBook.]

Core concepts associated with this topic
Introduction to Estimation

Degrees of Freedom

Characteristics of Estimators

Bias and Variability Simulation

Confidence Intervals

Introduction to Confidence Intervals

Confidence Interval for the Mean

t distribution

Confidence Interval Simulation

Confidence Interval for the Difference Between Means

Confidence Interval for Pearson’s Correlation

Confidence Interval for a Proportion

Introduction to Hypothesis Testing

Significance Testing

Type I and Type II Errors

One- and Two-Tailed Tests

Interpreting Significant Results

Interpreting Non-Significant Results

Steps in Hypothesis Testing

Significance Testing and Confidence Intervals

Misconceptions in Hypothesis Testing

Readings

Read and/or watch the video for each of the concept pages above, working from top to bottom.

Read the Statistical Literacy exercise, No “Large Conclusions” from “Tiny” Samples?, and answer the question.

Read the Statistical Literacy exercise, Evidence for the Higgs Boson, and answer the question.

Read the Angry Moods (AM) case study and complete the following exercises:

  • Is there a difference in how much males and females use aggressive behavior to improve an angry mood? For the “Anger-Out” scores, compute a 99% confidence interval on the difference between gender means. (relevant section)
  • Calculate the 95% confidence interval for the difference between the mean Anger-In score for the athletes and non-athletes. What can you conclude? (relevant section) 
  • Find the 95% confidence interval on the population correlation between the Anger-Out and Control-Out scores. (relevant section)

Read the Animal Research (AR) case study and answer the following questions:

  • What percentage of the women studied in this sample strongly agreed (gave a rating of 7) that using animals for research is wrong?
  • Use the proportion you computed in the previous question. Compute the 95% confidence interval on the population proportion of women who strongly agree that animal research is wrong. (relevant section)
  • Compute a 95% confidence interval on the difference between the gender means with respect to their beliefs that animal research is wrong. (relevant section)

Read the ADHD Treatment (AT) case study and answer the following questions:

  • What is the correlation between the participants’ correct number of responses after taking the placebo and their correct number of responses after taking 0.60 mg/kg of MPH? Compute the 95% confidence interval on the population correlation. (relevant section)

Read the Weapons and Aggression (WA) case study and answer the following questions:

  • Recall that the hypothesis is that a person can name an aggressive word more quickly if it is preceded by a weapon word prime than if it is preceded by a neutral word prime. The first step in testing this hypothesis is to compute the difference between (a) the naming time of aggressive words when preceded by a neutral word prime and (b) the naming time of aggressive words when preceded by a weapon word prime separately for each of the 32 participants. That is, compute an – aw for each participant.
  • Would the hypothesis of this study be supported if the difference were positive or if it were negative?
  • What is the mean of this difference score? (relevant section)
  • What is the standard deviation of this difference score? (relevant section)
  • What is the 95% confidence interval of the mean difference score? (relevant section)
  • What does the confidence interval computed in the previous question say about the hypothesis?
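
The steps in this exercise (difference scores, their mean and standard deviation, then the interval) can be sketched in Python as follows. This is an illustrative sketch only: the function name `mean_difference_ci` and the five pairs of naming times are made up and are not the case-study data, and the critical t value is passed in by hand because the Python standard library has no t distribution.

```python
import math
import statistics


def mean_difference_ci(neutral, weapon, t_crit):
    """Confidence interval on the mean of paired differences (an - aw).

    t_crit is the two-tailed critical t for N - 1 degrees of freedom,
    looked up separately (for example, with an inverse t calculator).
    """
    diffs = [an - aw for an, aw in zip(neutral, weapon)]
    mean = statistics.mean(diffs)
    sd = statistics.stdev(diffs)          # sample SD, divides by N - 1
    se = sd / math.sqrt(len(diffs))       # standard error of the mean
    return mean, sd, (mean - t_crit * se, mean + t_crit * se)


# Made-up naming times for five participants; NOT the case-study data.
an = [40, 42, 45, 39, 44]
aw = [38, 41, 40, 39, 42]

# 2.776 is the two-tailed .05 critical t for 4 degrees of freedom.
mean, sd, (low, high) = mean_difference_ci(an, aw, t_crit=2.776)
print(f"mean = {mean:.2f}, sd = {sd:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

With the real data there are 32 participants, so the critical t would be the one for 31 degrees of freedom.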

Read the Diet and Health (DH) case study and do the following exercise:

  • Compute a 95% confidence interval on the proportion of people who are healthy on the AHA diet.

                Cancers   Deaths   Nonfatal illness   Healthy   Total
AHA                  15       24                 25       239     303
Mediterranean         7       14                  8       273     302
Total                22       38                 33       512     605

Assessment questions

From http://onlinestatbook.com/2/estimation/ch8_exercises.html and http://onlinestatbook.com/2/logic_of_hypothesis_testing/ch9_exercises.html accessed 15 May 2016.

You may want to use the Analysis Lab and various calculators for some of these exercises.

Calculators:

Inverse t Distribution: Finds t for a confidence interval.
t Distribution: Computes areas of the t distribution.
Fisher’s r to z’: Computes transformations in both directions.
Inverse Normal Distribution: Use for confidence intervals.
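
For readers working offline, two of these calculators can be approximated with the Python standard library. This is a minimal sketch, not the OnlineStatBook implementation: the function names and the example inputs (r = 0.30, N = 50) are hypothetical, and the two t-distribution calculators are left out because the standard library has no t distribution.

```python
import math
from statistics import NormalDist


def inverse_normal(confidence):
    """Two-sided critical z for a confidence level such as 0.95."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)


def r_to_z(r):
    """Fisher's r to z' transformation."""
    return math.atanh(r)


def z_to_r(z):
    """Inverse transformation, from z' back to r."""
    return math.tanh(z)


def correlation_ci(r, n, confidence=0.95):
    """Approximate confidence interval on a population correlation.

    Uses the standard error 1 / sqrt(n - 3) on the z' scale.
    """
    z_crit = inverse_normal(confidence)
    se = 1 / math.sqrt(n - 3)
    center = r_to_z(r)
    return z_to_r(center - z_crit * se), z_to_r(center + z_crit * se)


# Hypothetical inputs, chosen only for illustration.
low, high = correlation_ci(0.30, 50)
print(f"z for 95%: {inverse_normal(0.95):.3f}")
print(f"95% CI on the population correlation: ({low:.3f}, {high:.3f})")
```

The interval is built on the z' scale and only then converted back to r, which is why it is not symmetric around the sample correlation.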

AQ104.08.01. When would the mean grade in a class on a final exam be considered a statistic? When would it be considered a parameter? (relevant section)

AQ104.08.02. Define bias in terms of expected value. (relevant section)

AQ104.08.03. Is it possible for a statistic to be unbiased yet very imprecise? How about being very accurate but biased? (relevant section)

AQ104.08.04. Why is a 99% confidence interval wider than a 95% confidence interval? (relevant section & relevant section)

AQ104.08.05. When you construct a 95% confidence interval, what are you 95% confident about? (relevant section)

AQ104.08.06. What is the difference in the computation of a confidence interval between cases in which you know the population standard deviation and cases in which you have to estimate it? (relevant section & relevant section)

AQ104.08.07. Assume a researcher found that the correlation between a test he or she developed and job performance was 0.55 in a study of 28 employees. If correlations under .35 are considered unacceptable, would you have any reservations about using this test to screen job applicants? (relevant section)

AQ104.08.07. What is the effect of sample size on the width of a confidence interval? (relevant section & relevant section)

AQ104.08.08. How does the t distribution compare with the normal distribution? How does this difference affect the size of confidence intervals constructed using z relative to those constructed using t? Does sample size make a difference? (relevant section)

AQ104.08.09. The effectiveness of a blood-pressure drug is being investigated. How might an experimenter demonstrate that, on average, the reduction in systolic blood pressure is 20 or more? (relevant section & relevant section)

AQ104.08.10. A population is known to be normally distributed with a standard deviation of 2.8. AQ104.08.10.1 Compute the 95% confidence interval on the mean based on the following sample of nine: 8, 9, 10, 13, 14, 16, 17, 20, 21. AQ104.08.10.2 Now compute the 99% confidence interval using the same data. (relevant section)

AQ104.08.11. A person claims to be able to predict the outcome of flipping a coin. This person is correct 16/25 times. AQ104.08.11.1 Compute the 95% confidence interval on the proportion of times this person can predict coin flips correctly. AQ104.08.11.2 What conclusion can you draw about this test of his ability to predict the future? (relevant section)

AQ104.08.12. What does it mean that the variance (computed by dividing by N) is a biased statistic? (relevant section)

AQ104.08.13. A confidence interval for the population mean computed from an N of 16 ranges from 12 to 28. A new sample of 36 observations is going to be taken. You can’t know in advance exactly what the confidence interval will be because it depends on the random sample. Even so, you should have some idea of what it will be. Give your best estimation. (relevant section)

AQ104.08.14. You take a sample of 22 from a population of test scores, and the mean of your sample is 60. AQ104.08.14.1 You know the standard deviation of the population is 10. What is the 99% confidence interval on the population mean? AQ104.08.14.2 Now assume that you do not know the population standard deviation, but the standard deviation in your sample is 10. What is the 99% confidence interval on the mean now? (relevant section)

AQ104.08.15. You read about a survey in a newspaper and find that 70% of the 250 people sampled prefer Candidate A. You are surprised by this survey because you thought that more like 50% of the population preferred this candidate. AQ104.08.15.1 Based on this sample, is 50% a possible population proportion? AQ104.08.15.2 Compute the 95% confidence interval to be sure. (relevant section)

AQ104.08.16. Heights for teenage boys and girls were calculated. The mean height for the sample of 12 boys was 174 cm and the variance was 62. For the sample of 12 girls, the mean was 166 cm and the variance was 65. AQ104.08.16.1 What is the 95% confidence interval on the difference between population means? AQ104.08.16.2 What is the 99% confidence interval on the difference between population means? AQ104.08.16.3 Do you think the mean difference in the population could be about 5? Why or why not? (relevant section)

AQ104.08.17. You were interested in how long the average psychology major at your college studies per night, so you asked 10 psychology majors to tell you the amount they study. They told you the following times: 2, 1.5, 3, 2, 3.5, 1, 0.5, 3, 2, 4. AQ104.08.17.1 Find the 95% confidence interval on the population mean. AQ104.08.17.2 Find the 90% confidence interval on the population mean. (relevant section)

AQ104.08.18. True/false: As the sample size gets larger, the probability that the confidence interval will contain the population mean gets higher. (relevant section & relevant section)

AQ104.08.19. True/false: You have a sample of 9 men and a sample of 8 women. The degrees of freedom for the t value in your confidence interval on the difference between means is 16. (relevant section & relevant section)

AQ104.08.20. True/false: Greek letters are used for statistics as opposed to parameters. (relevant section)

AQ104.08.21. True/false: In order to construct a confidence interval on the difference between means, you need to assume that the populations have the same variance and are both normally distributed. (relevant section)

AQ104.08.22. True/false: The red distribution represents the t distribution and the blue distribution represents the normal distribution. (relevant section)

You may want to use the Binomial Calculator for some of these exercises.

AQ104.08.23. An experiment is conducted to test the claim that James Bond can taste the difference between a Martini that is shaken and one that is stirred. What is the null hypothesis? (relevant section)

AQ104.08.24. The following explanation is incorrect. What three words should be added to make it correct? (relevant section)

AQ104.08.24.1 The probability value is the probability of obtaining a statistic as different from the parameter specified in the null hypothesis as the statistic obtained in the experiment.

AQ104.08.24.2 The probability value is computed assuming that the null hypothesis is true.

AQ104.08.25. Why do experimenters test hypotheses they think are false? (relevant section)

AQ104.08.26. State the null hypothesis for:

AQ104.08.26.1 An experiment testing whether echinacea decreases the length of colds.

AQ104.08.26.2 A correlational study on the relationship between brain size and intelligence.

AQ104.08.26.3 An investigation of whether a self-proclaimed psychic can predict the outcome of a coin flip.

AQ104.08.26.4 A study comparing a drug with a placebo on the amount of pain relief. (A one-tailed test was used.)
(relevant section & relevant section)

AQ104.08.27. Assume the null hypothesis is that µ = 50 and that the graph shown below is the sampling distribution of the mean (M). AQ104.08.27.1 Would a sample value of M = 60 be significant in a two-tailed test at the .05 level? AQ104.08.27.2 Roughly what value of M would be needed to be significant? (relevant section & relevant section)

AQ104.08.28. A researcher develops a new theory that predicts that vegetarians will have more of a particular vitamin in their blood than non-vegetarians. An experiment is conducted and vegetarians do have more of the vitamin, but the difference is not significant. The probability value is 0.13. Should the experimenter’s confidence in the theory increase, decrease, or stay the same? (relevant section)

AQ104.08.28. A researcher hypothesizes that the lowering in cholesterol associated with weight loss is really due to exercise. To test this, the researcher carefully controls for exercise while comparing the cholesterol levels of a group of subjects who lose weight by dieting with a control group that does not diet. The difference between groups in cholesterol is not significant. Can the researcher claim that weight loss has no effect? (relevant section)

AQ104.08.29. A significance test is performed and p = .20. Why can’t the experimenter claim that the probability that the null hypothesis is true is .20? (relevant section, relevant section & relevant section)

AQ104.08.30. For a drug to be approved by the FDA, the drug must be shown to be safe and effective. If the drug is significantly more effective than a placebo, then the drug is deemed effective. What do you know about the effectiveness of a drug once it has been approved by the FDA (assuming that there has not been a Type I error)? (relevant section)

AQ104.08.31.1 When is it valid to use a one-tailed test? AQ104.08.31.2 What is the advantage of a one-tailed test? AQ104.08.31.3 Give an example of a null hypothesis that would be tested by a one-tailed test. (relevant section)

AQ104.08.32. Distinguish between probability value and significance level. (relevant section)

AQ104.08.33. Suppose a study was conducted on the effectiveness of a class on “How to take tests.” The SAT scores of an experimental group and a control group were compared. (There were 100 subjects in each group.) The mean score of the experimental group was 503 and the mean score of the control group was 499. The difference between means was found to be significant, p = .037. What do you conclude about the effectiveness of the class? (relevant section & relevant section)

AQ104.08.34.1 Is it more conservative to use an alpha level of .01 or an alpha level of .05? AQ104.08.34.2 Would beta be higher for an alpha of .05 or for an alpha of .01? (relevant section)

AQ104.08.35. Why is “H0: M1 = M2” not a proper null hypothesis? (relevant section)

AQ104.08.36.1 An experimenter expects an effect to come out in a certain direction. Is this sufficient basis for using a one-tailed test? AQ104.08.36.2 Why or why not? (relevant section)

AQ104.08.37. How do the Type I and Type II error rates of one-tailed and two-tailed tests differ? (relevant section & relevant section)

AQ104.08.38.1 A two-tailed probability is .03. What is the one-tailed probability if the effect were in the specified direction? AQ104.08.38.2 What would it be if the effect were in the other direction? (relevant section)

AQ104.08.39. You choose an alpha level of .01 and then analyze your data. AQ104.08.39.1 What is the probability that you will make a Type I error given that the null hypothesis is true? AQ104.08.39.2 What is the probability that you will make a Type I error given that the null hypothesis is false? (relevant section)

AQ104.08.40. Why doesn’t it make sense to test the hypothesis that the sample mean is 42? (relevant section & relevant section)

AQ104.08.41. True/false: It is easier to reject the null hypothesis if the researcher uses a smaller alpha (α) level. (relevant section & relevant section)

AQ104.08.42. True/false: You are more likely to make a Type I error when using a small sample than when using a large sample. (relevant section)

AQ104.08.43. True/false: You accept the alternative hypothesis when you reject the null hypothesis. (relevant section)

AQ104.08.44. True/false: You do not accept the null hypothesis when you fail to reject it. (relevant section)

AQ104.08.45. True/false: A researcher risks making a Type I error any time the null hypothesis is rejected. (relevant section)

Questions from Case Studies:

The following questions are from the Angry Moods (AM) case study.

AQ104.08.46. Is there a difference in how much males and females use aggressive behavior to improve an angry mood? For the “Anger-Out” scores, compute a 99% confidence interval on the difference between gender means. (relevant section)

AQ104.08.47. Calculate the 95% confidence interval for the difference between the mean Anger-In score for the athletes and non-athletes. What can you conclude? (relevant section)

AQ104.08.48. Find the 95% confidence interval on the population correlation between the Anger-Out and Control-Out scores. (relevant section)

The following questions are from the Flatulence (F) case study.

AQ104.08.49. Compare men and women on the variable “perday.” Compute the 95% confidence interval on the difference between means. (relevant section)

AQ104.08.50. What is the 95% confidence interval of the mean time people wait before farting in front of a romantic partner? (relevant section)

The following questions use data from the Animal Research (AR) case study.

AQ104.08.51. What percentage of the women studied in this sample strongly agreed (gave a rating of 7) that using animals for research is wrong?

AQ104.08.52. Use the proportion you computed in AQ104.08.51. Compute the 95% confidence interval on the population proportion of women who strongly agree that animal research is wrong. (relevant section)

AQ104.08.53. Compute a 95% confidence interval on the difference between the gender means with respect to their beliefs that animal research is wrong. (relevant section)
The following question is from the ADHD Treatment (AT) case study.

AQ104.08.54. What is the correlation between the participants’ correct number of responses after taking the placebo and their correct number of responses after taking 0.60 mg/kg of MPH? Compute the 95% confidence interval on the population correlation. (relevant section)

The following question is from the Weapons and Aggression (WA) case study.

AQ104.08.55. Recall that the hypothesis is that a person can name an aggressive word more quickly if it is preceded by a weapon word prime than if it is preceded by a neutral word prime. The first step in testing this hypothesis is to compute the difference between (a) the naming time of aggressive words when preceded by a neutral word prime and (b) the naming time of aggressive words when preceded by a weapon word prime separately for each of the 32 participants. That is, compute an – aw for each participant.

AQ104.08.56. Would the hypothesis of this study be supported if the difference were positive or if it were negative?

AQ104.08.57. What is the mean of this difference score? (relevant section)

AQ104.08.58. What is the standard deviation of this difference score? (relevant section)

AQ104.08.59. What is the 95% confidence interval of the mean difference score? (relevant section)

AQ104.08.60. What does the confidence interval computed in AQ104.08.59 say about the hypothesis?

The following question is from the Diet and Health (DH) case study.

AQ104.08.61. Compute a 95% confidence interval on the proportion of people who are healthy on the AHA diet.

                Cancers   Deaths   Nonfatal illness   Healthy   Total
AHA                  15       24                 25       239     303
Mediterranean         7       14                  8       273     302
Total                22       38                 33       512     605

The following questions are from

Visit the site

AQ104.08.62. Suppose that you take a random sample of 10,000 Americans and find that 1,111 are left-handed. You perform a test of significance to assess whether the sample data provide evidence that more than 10% of all Americans are left-handed, and you calculate a test statistic of 3.70 and a p-value of .0001. Furthermore, you calculate a 99% confidence interval for the proportion of left-handers in America to be (.103,.119). Consider the following statements: The sample provides strong evidence that more than 10% of all Americans are left-handed. The sample provides evidence that the proportion of left-handers in America is much larger than 10%. Which of these two statements is the more appropriate conclusion to draw? Explain your answer based on the results of the significance test and confidence interval.
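
The figures quoted in AQ104.08.62 can be checked with a normal-approximation calculation. The sketch below assumes a one-tailed z test and a simple interval without a continuity correction. The question does not say which method produced its numbers, so this is one plausible reconstruction rather than the original computation; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 10_000   # sample size given in the question
x = 1_111    # left-handers observed
p0 = 0.10    # proportion under the null hypothesis

p_hat = x / n

# Test statistic: the standard error uses the null proportion p0.
se_null = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se_null

# One-tailed p-value for the alternative "more than 10%".
p_value = 1 - NormalDist().cdf(z)

# 99% interval: the standard error uses the sample proportion.
se_hat = math.sqrt(p_hat * (1 - p_hat) / n)
z_crit = NormalDist().inv_cdf(0.995)
ci = (p_hat - z_crit * se_hat, p_hat + z_crit * se_hat)

print(f"z = {z:.2f}, p = {p_value:.4f}")
print(f"99% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

Note that the test and the interval use different standard errors: the test assumes the null proportion, while the interval is built around the observed one.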

AQ104.08.63. A student wanted to study the ages of couples applying for marriage licenses in his county. He studied a sample of 94 marriage licenses and found that in 67 cases the husband was older than the wife. Do the sample data provide strong evidence that the husband is usually older than the wife among couples applying for marriage licenses in that county? Explain briefly and justify your answer.

AQ104.08.64. Imagine that there are 100 different researchers each studying the sleeping habits of college freshmen. Each researcher takes a random sample of size 50 from the same population of freshmen. Each researcher is trying to estimate the mean hours of sleep that freshmen get at night, and each one constructs a 95% confidence interval for the mean. Approximately how many of these 100 confidence intervals will NOT capture the true mean?

a. None

b. 1 or 2

c. 3 to 7

d. about half

e. 95 to 100

f. other

Page created by: Ian Clark, last modified on 12 June 2017.

Image: Pinterest, at https://www.pinterest.com/pin/430656783095355851/, accessed 15 May 2016.