Atlas104 Quantitative Methods

… one of the core Atlas Courses

Atlas course syllabus

This course covers the core topics and concepts in the public management subject of Quantitative Methods – the principles and techniques of quantitative methods that are most useful in analyzing public policy.

Warning to readers! This is a “long” web page. Below the topic and concept tables you will find the week-by-week readings and exercises associated with the estimated 120 hours of study required by an MPP or MPA student to master the 12 core topics and below that are several hundred assessment questions. On 16 May 2016 this web page had 23,600 words, the equivalent of a 72-page Word document in Calibri 11-point.

Learning outcomes

On successful completion of this course students will have the skills and knowledge to be able to analyze public management problems by appropriately utilizing the theories and principles in the topics and concepts noted below.

Normed topics

The topics are normed in having a volume of content capable of being taught in one course-week of instruction − nominally 3 hours of in-class work and 7 hours of outside-class reading.

  1. The Study of Quantitative Methods
  2. Describing Distributions
  3. Bivariate Data
  4. Probability Theory
  5. Research Design
  6. Normal Distributions and Advanced Graphs
  7. Sampling Distributions
  8. Estimation and Hypothesis Testing
  9. Tests of Means and Power
  10. Regression
  11. Analysis of Variance
  12. Transformation, Chi Square, Distribution Free Tests, and Effect Size

Like other normed topics on the Atlas, each of these has a topic description, links to core concepts relevant to the topic, learning outcomes, a reading list, and a series of assessment questions.

Core concepts and terms
The Study of Quantitative Methods

What are Statistics?

Importance of Statistics

Descriptive Statistics

Inferential Statistics

Variables

Percentiles

Levels of Measurement

Distributions

Summation Notation

Linear Transformations

Logarithms

Describing Distributions

Qualitative Variables

Quantitative Variables

Stem and Leaf Displays

Histograms

Frequency Polygons

Box Plots

Box Plot Demonstration

Bar Charts

Line Graphs

Dot Plots

Central Tendency

What is Central Tendency

Measures of Central Tendency

Median and Mean

Additional Measures

Comparing measures

Variability

Measures of Variability

Estimating Variance Simulation

Shape

Effects of Transformations

Variance Sum Law I

Bivariate Data

Introduction to Bivariate Data

Values of the Pearson Correlation

Guessing Correlations Simulation

Properties of Pearson’s r

Computing Pearson’s r

Restriction of Range

Variance Sum Law II

Probability Theory

Introduction to Probability

Basic Concepts

Conditional Probability

Gambler’s Fallacy

Permutations and Combinations

Birthday Simulation

Binomial Distribution

Binomial Demonstration

Poisson Distribution

Multinomial Distribution

Hypergeometric Distribution

Base Rates

Bayes’ Theorem

Monty Hall Problem

 

Research Design

Scientific Method

Measurement

Basics of Data Collection

Sampling Bias

Experimental Designs

Causation

Normal Distributions and Advanced Graphs

Introduction to Normal Distributions

History

Areas of Normal Distributions

Varieties of Normal Distributions

Standard Normal

Normal Approximation to the Binomial

Q-Q Plots

Contour Plots

3D Plots

Sampling Distributions

Introduction to Sampling

Sample Size

Central Limit Theorem

Sampling Distribution of the Mean

Sampling Distribution of Difference Between Means

Sampling Distribution of Pearson’s r

Sampling Distribution of a Proportion

Estimation and Hypothesis Testing

Introduction to Estimation

Degrees of Freedom

Characteristics of Estimators

Bias and Variability Simulation

Confidence Intervals

Introduction to Confidence Intervals

Confidence Interval for the Mean

t distribution

Confidence Interval Simulation

Confidence Interval for the Difference Between Means

Confidence Interval for Pearson’s Correlation

Confidence Interval for a Proportion

Introduction to Hypothesis Testing

Significance Testing

Type I and Type II Errors

One- and Two-Tailed Tests

Interpreting Significant Results

Interpreting Non-Significant Results

Steps in Hypothesis Testing

Significance Testing and Confidence Intervals

Misconceptions

 

Tests of Means and Power

Single Mean

t Distribution Demo

Difference between Two Means (Independent Groups)

Robustness Simulation

All Pairwise Comparisons Among Means

Specific Comparisons

Difference between Two Means (Correlated Pairs)

Correlated t Simulation

Specific Comparisons (Correlated Observations)

Pairwise Comparisons (Correlated Observations)

Power and Null Hypothesis

Factors Affecting Power

Regression

Introduction to Simple Linear Regression

Linear Fit Demo

Partitioning Sums of Squares

Standard Error of the Estimate

Inferential Statistics for b and r

Influential Observations

Regression Toward the Mean

Introduction to Multiple Regression

Analysis of Variance

Introduction to ANOVA

ANOVA Designs

One-Factor ANOVA (Between-Subjects)

One-Way Comparisons

Multi-Factor ANOVA (Between-Subjects)

Unequal Sample Sizes

Tests Supplementing ANOVA

Within-Subjects ANOVA

Power of Within-Subjects Designs

Transformation, Chi Square, Distribution Free Tests, and Effect Size

Log

Tukey’s Ladder of Powers

Box-Cox Transformations

Chi Square Distribution

One-Way Tables (Testing Goodness of Fit)

Testing Distributions Demo

Contingency Tables

2 x 2 Table Simulation

Benefits of Distribution-Free Tests

Randomization Tests – Association (Pearson’s r)

Randomized Tests – Contingency Tables (Fisher’s Exact Test)

Rank Randomization Tests – Two Conditions (Mann-Whitney U, Wilcoxon Rank Sum)

Rank Randomization Tests – Two or More Conditions (Kruskal-Wallis)

Rank Randomization Tests – Association (Spearman’s ρ)

Proportions

Difference between Means

Variance Explained

 

Sources

The topics and concepts for this course reflect those taught in the required courses of highly regarded programs. See, for example, the detailed comparison of syllabi for the first-year required courses in quantitative methods at Harvard HKS, Michigan Ford, NYU Wagner, and Wisconsin La Follette in Comparing Course Workload – Quantitative Methods. We believe that the content of these courses is well captured in the remarkable open access resource created by a team of professors led by Rice University:

OnlineStatBook, Online Statistics Education – An Interactive Multimedia Course of Study, at http://onlinestatbook.com/2/index.html, PDF version entitled Introduction to Statistics at http://onlinestatbook.com/Online_Statistics_Education.pdf, accessed 13 May 2016.

We have therefore used OnlineStatBook to structure the sequence of topics and the concepts within the topics. Indeed, until we have time to create Atlas pages for each of the concepts, we use section headings from OnlineStatBook as the concept names, with each link pointing directly to the page on OnlineStatBook. Almost all the assessment questions used in Atlas104 come from OnlineStatBook.

Two other open access resources are worth noting:

Khan Academy, Probability and Statistics, at https://www.khanacademy.org/math/probability, accessed 12 May 2016.

Saylor Academy Open Textbooks, Introductory Statistics, HTML version at https://saylordotorg.github.io/text_introductory-statistics/, PDF version at http://www.saylor.org/site/textbooks/Introductory%20Statistics.pdf, accessed 11 May 2016.

The alignment between the Atlas topics and these three sources is illustrated in the table below:

Atlas Quantitative Methods Normed Topics | OnlineStatBook Chapters | Saylor Academy Introductory Statistics Online Textbook Chapters | Khan Academy Probability and Statistics Modules
The Study of Quantitative Methods | Introduction | Chapter 1: Introduction | Independent and dependent events
Describing Distributions | Graphing Distributions; Summarizing Distributions | Chapter 2: Descriptive Statistics | Descriptive statistics
Bivariate Data | Describing Bivariate Data | Chapter 4: Discrete Random Variables; Chapter 5: Continuous Random Variables |
Probability | Probability | Chapter 3: Basic Concepts of Probability | Probability and combinatorics
Research Design | Research Design | | Statistical studies
Normal Distributions and Advanced Graphs | Normal Distribution; Advanced Graphs | | Random variables and probability distributions
Sampling Distributions | Sampling Distributions | Chapter 6: Sampling Distributions |
Estimation and Hypothesis Testing | Estimation; Logic of Hypothesis Testing | Chapter 7: Estimation; Chapter 8: Testing Hypotheses | Inferential statistics
Tests of Means and Power | Tests of Means; Power | Chapter 9: Two-Sample Problems |
Regression | Regression | Chapter 10: Correlation and Regression | Regression
Analysis of Variance | Analysis of Variance | |
Transformation, Chi Square, Distribution Free Tests, and Effect Size | Transformations; Chi Square; Distribution Free Tests; Effect Size | Chapter 11: Chi-Square Tests and F-Tests |

Recommended readings

Week 1: The Study of Quantitative Methods

Read and/or watch video for each of the concept pages in The Study of Quantitative Methods (top to bottom, starting in left column).

Read the Statistical Literacy exercises, Do Athletes Get Special Treatment? and Statistical Errors in Politics, and answer the questions.

Read the Angry Moods (AM) case study and complete the following exercises:

  • Which variables are the participant variables? (They act as independent variables in this study.) (relevant section)
  • What are the dependent variables? (relevant section)
  • Is Anger-Out a quantitative or qualitative variable? (relevant section)

Read the Teacher Ratings (TR) case study and complete the following exercise:

Read the ADHD Treatment (AT) case study and complete the following exercises:

  • What is the independent variable of this experiment? How many levels does it have? (relevant section)
  • What is the dependent variable? On what scale (nominal, ordinal, interval, ratio) was it measured? (relevant section)

Week 2: Describing Distributions

Read and/or watch video for each of the concept pages in Describing Distributions (top to bottom, starting in left column).

Read the Statistical Literacy exercises, Are Commercial Vehicles in Texas Unsafe? and Linear By Design, and answer the questions.

Read the Angry Moods (AM) case study and complete the following exercises:

  • Is there a difference in how much males and females use aggressive behavior to improve an angry mood? For the “Anger-Out” scores:

a. Create parallel box plots. (relevant section)

b. Create back-to-back stem and leaf displays. (You may have trouble finding a computer program to do this, so you may have to do it by hand.) (relevant section)

  • Create parallel box plots for the Anger-In scores by sports participation. (relevant section)
  • Plot a histogram of the distribution of the Control-Out scores. (relevant section)
  • Create a bar graph comparing the mean Control-In score for the athletes and the non-athletes. What would be a better way to display this data? (relevant section)
  • Plot parallel box plots of the Anger Expression Index by sports participation. Does it look like there are any outliers? Which group reported expressing more anger? (relevant section)

Read the Flatulence (F) case study and complete the following exercises:

  • Plot a histogram of the variable “per day.” (relevant section)
  • Based on a histogram of the variable “perday”, do you think the mean or median of this variable is larger? Calculate the mean and median to see if you are right. (relevant section & relevant section)
  • Create parallel box plots of “how long” as a function of gender. Why is the 25th percentile not showing? What can you say about the results? (relevant section)
  • Create a stem and leaf plot of the variable “how long.” What can you say about the shape of the distribution? (relevant section)

Read the Stroop (S) case study and complete the following exercise:

Read the Physicians’ Reactions (PR) case study and answer the following questions.

  • Create box plots comparing the time expected to be spent with the average-weight and overweight patients. (relevant section)
  • What is the mean expected time spent for the average-weight patients? What is the mean expected time spent for the overweight patients? (relevant section)
  • What is the difference in means between the groups? By approximately how many standard deviations do the means differ?
    (relevant section & relevant section)
  • Plot histograms of the time spent with the average-weight and overweight patients. (relevant section)
  • To which group does the patient with the highest expected time belong?

Read the Smiles and Leniency (SL) case study and answer the following questions:

  • Create parallel box plots for the four conditions. (relevant section)
  • Find the mean, median, standard deviation, and interquartile range for the leniency scores of each of the four groups. (relevant section & relevant section)
  • Create back to back stem and leaf displays for the false smile and neutral conditions. (It may be hard to find a computer program to do this for you, so be prepared to do it by hand). (relevant section)

Read the ADHD Treatment (AT) case study and complete the following exercises:

  • Create a line graph of the data. Do certain dosages appear to be more effective than others? (relevant section)
  • What is the mean number of correct responses of the participants after taking the placebo (0 mg/kg)? (relevant section)
  • Create a stem and leaf plot of the number of correct responses of the participants after taking the placebo (d0 variable). What can you say about the shape of the distribution? (relevant section)
  • What are the standard deviation and the interquartile range of the d0 condition? (relevant section)
  • Create box plots for the four conditions. You may have to rearrange the data to get a computer program to create the box plots.

Read the SAT and College GPA case study and answer the following question:

  • Create histograms and stem and leaf displays of both high-school grade point average and university grade point average. In what way(s) do the distributions differ?

Week 3: Bivariate Data

Read and/or watch video for each of the concept pages in Bivariate Data (top to bottom, starting in left column).

Read the Statistical Literacy exercises, Age and Sleep, and answer the questions.

Read the Angry Moods (AM) case study and complete the following exercises:

  • What is the correlation between the Control-In and Control-Out scores? (relevant section)
  • Would you expect the correlation between the Anger-Out and Control-Out scores to be positive or negative? Compute this correlation. (relevant section & relevant section)

Read the Flatulence (F) case study and complete the following exercise:

  • Is there a relationship between the number of male siblings and embarrassment in front of romantic interests? Create a scatterplot and compute r. (relevant section & relevant section)

Read the Stroop (S) case study and complete the following exercises:

  • Create a scatterplot showing “colors” on the Y-axis and “words” on the X-axis. (relevant section)
  • Compute the correlation between “colors” and “words.” (relevant section)
  • Sort the data by color-naming time. Choose only the 20 fastest color-namers and create a scatterplot. (relevant section)
  • (a) What is the new correlation? (relevant section)
    (b) What is the technical term for the finding that this correlation is smaller than the correlation for the full dataset? (relevant section)

Read the Animal Research (AR) case study and answer the following question:

  • What is the overall correlation between the belief that animal research is wrong and belief that animal research is necessary? (relevant section)

Read the ADHD Treatment (AT) case study and complete the following exercises:

  • What is the correlation between the participants’ correct number of responses after taking the placebo and their correct number of responses after taking 0.60 mg/kg of MPH? (relevant section)

Week 4: Probability

Read the Statistical Literacy exercises, School shooting: The warning signs, and answer the question.

Read the Diet and Health (DH) case study and answer the following questions.

  • What percentage of people on the AHA diet had some sort of illness or death?
  • What is the probability that if you randomly selected a person on the AHA diet, he or she would have some sort of illness or death? (relevant section)
  • If 3 people on the AHA diet are chosen at random, what is the probability that they will all be healthy? (relevant section)
  • What percentage of people on the Mediterranean diet had some sort of illness or death?
  • What is the probability that if you randomly selected a person on the Mediterranean diet, he or she would have some sort of illness or death? (relevant section)
  • What is the probability that if you randomly selected a person on the Mediterranean diet, he or she would have cancer? (relevant section)
  • If you randomly select five people from the Mediterranean diet, what is the probability that they would all be healthy? (relevant section)
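The "all healthy" questions above come down to multiplying the probabilities of independent events. The following is a minimal Python sketch of that arithmetic, not the case study's answer: the function name `prob_all_healthy` and the illness rate `p_ill = 0.10` are hypothetical placeholders, and the sketch assumes the selections are independent (a reasonable approximation when the group is large relative to the number of people drawn).

```python
def prob_all_healthy(p_ill: float, n: int) -> float:
    """Probability that n independently chosen people are all healthy,
    given that each has probability p_ill of illness or death."""
    if not 0.0 <= p_ill <= 1.0:
        raise ValueError("p_ill must be between 0 and 1")
    return (1.0 - p_ill) ** n


# Hypothetical illness rate; replace it with the rate computed from the data.
p_ill = 0.10
print(prob_all_healthy(p_ill, 3))  # three people, as in the AHA diet question
print(prob_all_healthy(p_ill, 5))  # five people, as in the Mediterranean diet question
```

Because each diet group has its own illness rate, the function would be called once per group with that group's rate.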

Week 5: Research Design

Read and/or watch video for each of the concept pages in Research Design (top to bottom, starting in left column).

Read the Statistical Literacy exercises, Low HDL and Niacin, and answer the question.

Week 6: Normal Distributions and Advanced Graphs

Read and/or watch video for each of the concept pages in Normal Distributions and Advanced Graphs (top to bottom, starting in left column).

Read the Statistical Literacy exercise, Evaluating “Tail Risk”? and answer the question.

Read the Statistical Literacy exercise, Reading a Weather Map, and answer the question.

Read the Angry Moods (AM) case study and complete the following exercise:

  • For this problem, use the Anger Expression (AE) scores. (a) Compute the mean and standard deviation. (b) Then, compute what the 25th, 50th and 75th percentiles would be if the distribution were normal. (c) Compare the estimates to the actual 25th, 50th, and 75th percentiles. (relevant section)
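One way to approach parts (b) and (c) is sketched below in Python using only the standard library. The scores in `ae_scores` are made up for illustration and are not the Angry Moods data; the helper name `normal_percentiles` is likewise hypothetical. Note that `statistics.quantiles` uses one particular percentile definition, which may differ slightly from the one used in OnlineStatBook.

```python
from statistics import NormalDist, mean, quantiles, stdev


def normal_percentiles(scores, probs=(0.25, 0.50, 0.75)):
    """Percentiles implied by a normal curve with the sample mean and SD."""
    dist = NormalDist(mu=mean(scores), sigma=stdev(scores))
    return [dist.inv_cdf(p) for p in probs]


# Hypothetical scores; substitute the AE column from the case-study data.
ae_scores = [18, 22, 25, 27, 30, 31, 33, 36, 40, 48]

expected = normal_percentiles(ae_scores)
actual = quantiles(ae_scores, n=4)  # sample quartiles (default "exclusive" method)

for label, e, a in zip(("25th", "50th", "75th"), expected, actual):
    print(f"{label}: normal = {e:.2f}, actual = {a:.2f}")
```

Large gaps between the two columns would suggest that the scores depart from a normal shape.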

Read the Physicians’ Reactions (PR) case study and answer the following questions.

  • For this problem, use the time spent with the overweight patients. (a) Compute the mean and standard deviation of this distribution. (b) What is the probability that if you chose an overweight participant at random, the doctor would have spent 31 minutes or longer with this person? (c) Now assume this distribution is normal (and has the same mean and standard deviation). Now what is the probability that if you chose an overweight participant at random, the doctor would have spent 31 minutes or longer with this person? (relevant section)
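Parts (b) and (c) contrast an empirical proportion with a normal-curve probability. A minimal sketch of the two calculations follows; the `times` list is hypothetical and stands in for the overweight-patient column, and the function names are illustrative only.

```python
from statistics import NormalDist, mean, stdev


def empirical_tail(times, cutoff):
    """Share of observed times at or above the cutoff."""
    return sum(t >= cutoff for t in times) / len(times)


def normal_tail(times, cutoff):
    """Same probability under a normal curve with the sample mean and SD."""
    dist = NormalDist(mu=mean(times), sigma=stdev(times))
    return 1.0 - dist.cdf(cutoff)


# Hypothetical minutes; replace with the case-study data.
times = [10, 15, 20, 20, 25, 25, 30, 31, 35, 40]
print(empirical_tail(times, 31), normal_tail(times, 31))
```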

Read the SAT and College GPA case study and do the following:

  • Create a contour plot looking at University GPA as a function of Math SAT and High School GPA. Naturally, you should use a computer to do this.
  • Create a 3D plot using the variables University GPA, Math SAT, and High School GPA. Naturally, you should use a computer to do this.

Week 7: Sampling Distributions

Read and/or watch video for each of the concept pages in Sampling Distributions (top to bottom, starting in left column).

Read the Statistical Literacy exercise, Accuracy of Employment Figures, and answer the question.

Read the Angry Moods (AM) case study and complete the following exercises:

  • How many men were sampled?
  • How many women were sampled?
  • What is the mean difference between men and women on the Anger-Out scores?
  • Suppose in the population, the Anger-Out score for men is two points higher than it is for women. The population variances for men and women are both 20. Assume the Anger-Out scores for both genders are normally distributed. Given this information about the population parameters: What is the mean of the sampling distribution of the difference between means? (relevant section)
  • What is the standard error of the difference between means? (relevant section)
  • What is the probability that you would have gotten the sample mean difference computed above, or less, in your sample? (relevant section)
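The last three questions rely on the sampling distribution of the difference between means, which is normal with mean μ₁ − μ₂ and standard error √(σ₁²/n₁ + σ₂²/n₂) when both populations are normal. The sketch below illustrates the calculation under stated assumptions: the population values (difference of 2, variances of 20) come from the exercise, while the sample sizes and the observed difference are hypothetical placeholders to be replaced with the answers to the earlier questions.

```python
import math
from statistics import NormalDist


def diff_means_sampling_dist(mu_diff, var1, n1, var2, n2):
    """Normal sampling distribution of (mean1 - mean2)."""
    se = math.sqrt(var1 / n1 + var2 / n2)
    return NormalDist(mu=mu_diff, sigma=se)


# Hypothetical values; take the real ones from the case-study data.
n_men, n_women = 25, 40
observed_diff = 1.0

dist = diff_means_sampling_dist(2.0, 20.0, n_men, 20.0, n_women)
print("mean of sampling distribution:", dist.mean)
print("standard error:", dist.stdev)
print("P(difference <= observed):", dist.cdf(observed_diff))
```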

Read the Animal Research (AR) case study and answer the following questions:

  • How many people were sampled to give their opinions on animal research?
  • What is the correlation in this sample between the belief that animal research is wrong and belief that animal research is necessary? (Ch. 4.E)
  • Suppose the correlation between the belief that animal research is wrong and the belief that animal research is necessary is -.68 in the population. Convert -.68 to z’. (relevant section)
  • Find the standard error of this sampling distribution. (relevant section)
  • Assuming the data used in this study was randomly sampled, what is the probability that you would get this correlation or stronger (closer to -1)? (relevant section).
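The conversion to z′ uses Fisher's transformation, z′ = 0.5 ln[(1 + r)/(1 − r)], whose sampling distribution is approximately normal with standard error 1/√(N − 3). A minimal Python sketch follows; the sample size `n` and the sample correlation `r_sample` are hypothetical placeholders, and the function names are illustrative.

```python
import math
from statistics import NormalDist


def fisher_z(r):
    """Fisher's z' transformation of a correlation."""
    return 0.5 * math.log((1 + r) / (1 - r))


def z_standard_error(n):
    """Standard error of z' for a sample of n pairs."""
    return 1 / math.sqrt(n - 3)


def prob_r_at_or_below(r_sample, rho, n):
    """P(sample r <= r_sample) when the population correlation is rho."""
    z = (fisher_z(r_sample) - fisher_z(rho)) / z_standard_error(n)
    return NormalDist().cdf(z)


print(round(fisher_z(-0.68), 3))  # population value given in the exercise

# Hypothetical sample size and sample correlation; use the case-study values.
n, r_sample = 50, -0.75
print(z_standard_error(n), prob_r_at_or_below(r_sample, -0.68, n))
```

For a negative correlation, "this correlation or stronger" corresponds to the lower tail, which is why the sketch returns the cumulative probability at or below the sample value.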

Week 8: Estimation and Hypothesis Testing

Read and/or watch video for each of the concept pages for Estimation and Hypothesis Testing (top to bottom, starting in left column).

Read the Statistical Literacy exercise, Accuracy of Employment Figures, and answer the question.

Read the Angry Moods (AM) case study and complete the following exercises:

  • How many men were sampled?
  • How many women were sampled?
  • What is the mean difference between men and women on the Anger-Out scores?
  • Suppose in the population, the Anger-Out score for men is two points higher than it is for women. The population variances for men and women are both 20. Assume the Anger-Out scores for both genders are normally distributed. Given this information about the population parameters: What is the mean of the sampling distribution of the difference between means? (relevant section)
  • What is the standard error of the difference between means? (relevant section)
  • What is the probability that you would have gotten the sample mean difference computed above, or less, in your sample? (relevant section)

Read the Animal Research (AR) case study and answer the following questions:

  • How many people were sampled to give their opinions on animal research?
  • What is the correlation in this sample between the belief that animal research is wrong and belief that animal research is necessary? (Ch. 4.E)
  • Suppose the correlation between the belief that animal research is wrong and the belief that animal research is necessary is -.68 in the population. Convert -.68 to z’. (relevant section)
  • Find the standard error of this sampling distribution. (relevant section)
  • Assuming the data used in this study was randomly sampled, what is the probability that you would get this correlation or stronger (closer to -1)? (relevant section).

Week 9: Tests of Means and Power

Read and/or watch video for each of the concept pages in Tests of Means and Power (top to bottom, starting in left column).

Read the Statistical Literacy exercise, Surgery for Weight Loss, and answer the question.

Read the Statistical Literacy exercise, Design of Studies for Alzheimer’s Drug, and answer the question.

Read the Angry Moods (AM) case study and complete the following exercises:

  • Do athletes or non-athletes calm down more when angry? Conduct a t test to see if the difference between groups in Control-In scores is statistically significant.
  • Do people in general have a higher Anger-Out or Anger-In score? Conduct a t test on the difference between means of these two scores. Are these two means independent or dependent? (relevant section)

Read the Smiles and Leniency (SL) case study and do the following exercises:

  • Compare each mean to the neutral mean. Be sure to control for the familywise error rate. (relevant section)
  • Does a “felt smile” lead to more leniency than other types of smiles? (a) Calculate L (the linear combination) using the following contrast weights false: -1, felt: 2, miserable: -1, neutral: 0. (b) Perform a significance test on this value of L. (relevant section)
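A contrast of this kind is the weighted sum L = Σ cᵢMᵢ of the group means. The sketch below computes L with the weights given in the exercise and then a t statistic using t = L / √(Σcᵢ² · MSE / n), which assumes independent groups of equal size n. The group means, `mse`, and `n_per_group` are hypothetical placeholders, not values from the Smiles and Leniency data.

```python
import math

# Contrast weights as given in the exercise.
WEIGHTS = {"false": -1, "felt": 2, "miserable": -1, "neutral": 0}


def contrast(means, weights):
    """Linear combination L = sum of weight * group mean."""
    if sum(weights.values()) != 0:
        raise ValueError("contrast weights should sum to zero")
    return sum(weights[g] * means[g] for g in weights)


def contrast_t(L, weights, mse, n_per_group):
    """t statistic for L, assuming equal n and independent groups."""
    se = math.sqrt(sum(w ** 2 for w in weights.values()) * mse / n_per_group)
    return L / se


# Hypothetical group means, mean square error, and group size.
means = {"false": 5.0, "felt": 5.5, "miserable": 4.5, "neutral": 4.0}
mse, n_per_group = 2.0, 30

L = contrast(means, WEIGHTS)
print(L, contrast_t(L, WEIGHTS, mse, n_per_group))
```

The resulting t would be compared against a t distribution with N − k degrees of freedom, where N is the total number of observations and k the number of groups.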

Read the Animal Research (AR) case study and answer the following questions:

  • Conduct an independent samples t test comparing males to females on the belief that animal research is necessary. (relevant section)
  • Based on the t test you conducted in the previous problem, are you able to reject the null hypothesis if alpha = 0.05? What about if alpha = 0.1? (relevant section)
  • Is there any evidence that the t test assumption of homogeneity of variance is violated in the t test you conducted above? (relevant section)

Read the ADHD Treatment (AT) case study and answer the following questions:

  • Compare each dosage with the dosage below it (compare d0 and d15, d15 and d30, and d30 and d60). Remember that the patients completed the task after every dosage. (a) If the familywise error rate is .05, what is the alpha level you will use for each comparison when doing the Bonferroni correction? (b) Which differences are significant at this level? (relevant section)
  • Does performance increase linearly with dosage?
  • Plot a line graph of this data.
  • Compute L for each patient. To do this, create a new variable where you multiply the following coefficients by their corresponding dosages and then sum up the total: (-3)d0 + (-1)d15 + (1)d30 + (3)d60 (see #8). What is the mean of L?
  • Perform a significance test on L. Compute the 95% confidence interval for L. (relevant section)
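Two pieces of arithmetic in these exercises can be sketched briefly: the Bonferroni per-comparison alpha (familywise alpha divided by the number of comparisons) and the per-patient linear trend score L built from the coefficients −3, −1, 1, 3. In the Python sketch below the three patient records are invented for illustration, and the function names are hypothetical; the t statistic is the usual one-sample t of the L scores against zero.

```python
import math
from statistics import mean, stdev

# Linear-trend coefficients as given in the exercise.
COEFFS = {"d0": -3, "d15": -1, "d30": 1, "d60": 3}


def bonferroni_alpha(familywise_alpha, n_comparisons):
    """Per-comparison alpha under the Bonferroni correction."""
    return familywise_alpha / n_comparisons


def linear_trend_scores(patients):
    """One L score per patient: sum of coefficient * score at each dosage."""
    return [sum(COEFFS[k] * p[k] for k in COEFFS) for p in patients]


def one_sample_t(values):
    """t statistic testing whether the mean of values differs from 0."""
    return mean(values) / (stdev(values) / math.sqrt(len(values)))


print(bonferroni_alpha(0.05, 3))  # three adjacent-dosage comparisons

# Hypothetical patients; replace with the case-study rows.
patients = [
    {"d0": 40, "d15": 42, "d30": 45, "d60": 50},
    {"d0": 35, "d15": 34, "d30": 38, "d60": 41},
    {"d0": 50, "d15": 49, "d30": 52, "d60": 51},
]
scores = linear_trend_scores(patients)
print(scores, mean(scores), one_sample_t(scores))
```

A 95% confidence interval for L would then be the mean of the scores plus or minus the critical t (with n − 1 degrees of freedom, from a t table) times the standard error.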

Week 10: Regression

Read and/or watch video for each of the concept pages in Regression (top to bottom, starting in left column).

Read the Statistical Literacy exercise, Regression Toward the Mean in American Football, and answer the question.

Read the Angry Moods (AM) case study and complete the following exercises:

  • Find the regression line for predicting Anger-Out from Control-Out. What is the slope? What is the intercept?
  • Is the relationship at least approximately linear? Test to see if the slope is significantly different from 0.
  • What is the standard error of the estimate?
    (relevant section, relevant section, relevant section)
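The slope, intercept, standard error of the estimate, and t test of the slope can all be computed from a few sums of squares. The following Python sketch shows one way to do so with the standard library; the `x` and `y` values are invented, the function name is hypothetical, and the standard error of the estimate uses the sample formula with N − 2 in the denominator.

```python
import math
from statistics import mean


def simple_regression(x, y):
    """Least-squares line for predicting y from x, plus basic inference."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_estimate = math.sqrt(sse / (len(x) - 2))  # sample formula (N - 2)
    se_slope = se_estimate / math.sqrt(sxx)
    return {
        "slope": slope,
        "intercept": intercept,
        "se_estimate": se_estimate,
        "t_slope": slope / se_slope,  # compare with t on N - 2 df
    }


# Hypothetical predictor and outcome values; use the case-study columns.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
result = simple_regression(x, y)
print(result)
```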

Read the SAT and GPA (SG) case study and answer the following questions:

  • Find the regression line for predicting the overall university GPA from the high school GPA. What is the slope? What is the y-intercept?
  • If someone had a 2.2 GPA in high school, what is the best estimate of his or her college GPA?
  • If someone had a 4.0 GPA in high school, what is the best estimate of his or her college GPA? (relevant section)

Read the Driving (D) case study and answer the following questions:

  • What is the correlation between age and how often the person chooses to drive in inclement weather? Is this correlation statistically significant at the .01 level? Are older people more or less likely to report that they drive in inclement weather? (relevant section, relevant section)
  • What is the correlation between how often a person chooses to drive in inclement weather and the percentage of accidents the person believes occur in inclement weather? Is this correlation significantly different from 0? (relevant section, relevant section)
  • Use linear regression to predict how often someone rides public transportation in inclement weather from what percentage of accidents that person thinks occur in inclement weather. (Pubtran by Accident) Create a scatter plot of this data and add a regression line. What is the slope? What is the intercept? Is the relationship at least approximately linear? Test if the slope is significantly different from 0. Comment on possible assumption violations for the test of the slope. What is the standard error of the estimate?
    (relevant section, relevant section, relevant section)

Week 11: Analysis of Variance

Read and/or watch video for each of the concept pages in Analysis of Variance (top to bottom, starting in left column).

Read the Statistical Literacy exercises, Testing an Alzheimer’s Drug and Weight Loss and Cancer Risk, and answer the questions.

Read the Stroop Interference case study and do the following exercises:

  • The dataset has the scores (times) for males and females on each of three tasks. a. Do a Gender (2) x Task (3) analysis of variance. b. Plot the interaction.

Read the ADHD Treatment case study and do the following exercise:

  • The data has four scores per subject. a. Is the design between-subjects or within-subjects? b. Create an ANOVA summary table.

Read the Angry Moods (AM) case study and complete the following exercise:

  • Using the Anger Expression Index as the dependent variable, perform a 2×2 ANOVA with gender and sports participation as the two factors. Do athletes and non-athletes differ significantly in how much anger they express? Do the genders differ significantly in Anger Expression Index? Is the effect of sports participation significantly different for the two genders?

Read the SAT and GPA (SG) case study and answer the following questions:

Read the Weapons and Aggression case study and complete the following exercise:

  • Compute a 2×2 ANOVA on this data with the following two factors: prime type (was the first word a weapon or not?) and word type (was the second word aggressive or non-aggressive?). Consider carefully whether the variables are between-subjects or within-subjects variables.

Read the Smiles and Leniency case study and complete the following exercise:

  • Compute the ANOVA summary table.

Week 12: Transformation, Chi Square, Distribution Free Tests, and Effect Size

Read and/or watch video for each of the concept pages in Transformation, Chi Square, Distribution Free Tests, and Effect Size (top to bottom, starting in left column).

Read the Statistical Literacy exercise, Stock Appreciation, and answer the question.

Read the Statistical Literacy exercise, A Spice Inhibits Liver Cancer, and answer the questions.

Read the Statistical Literacy exercise, Troponin Concentration and Ventricular Strain, and answer the question.

Read the Statistical Literacy exercise, Health Effects of Coffee, and answer the question.

Read the ADHD case study and do the following exercises:

  • Transform the data in the placebo condition (D0) with λ’s of .5, 0, -.5, and -1.
  • How does the skew in each of these compare to the skew in the raw data?
  • Which transformation leads to the least skew?
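The λ values in this exercise refer to the Box-Cox family, commonly defined as (x^λ − 1)/λ for λ ≠ 0 and ln(x) for λ = 0. The Python sketch below applies each λ from the exercise and reports a skew measure (the third standardized moment, which may differ slightly from the skew formula used in OnlineStatBook). The `d0` list is hypothetical and stands in for the placebo scores; the data must be positive.

```python
import math
from statistics import mean


def box_cox(x, lam):
    """Box-Cox transformation of a single positive value."""
    if x <= 0:
        raise ValueError("Box-Cox requires positive data")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1) / lam


def skewness(values):
    """Third standardized moment (population form)."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical placebo scores; replace with the D0 column.
d0 = [20, 25, 28, 30, 33, 35, 36, 38, 40, 41, 60]

print("raw skew:", round(skewness(d0), 3))
for lam in (0.5, 0, -0.5, -1):
    transformed = [box_cox(x, lam) for x in d0]
    print(f"lambda = {lam}: skew = {skewness(transformed):.3f}")
```

The λ whose skew is closest to zero would be the one that leaves the least skew.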

Read the SAT and GPA (SG) case study and do the following exercises:

  • Answer these items to determine if the math SAT scores are normally distributed. You may want to first standardize the scores. (relevant section)

(a) If these data were normally distributed, how many scores would you expect there to be in each of these brackets: (i) smaller than 1 SD below the mean, (ii) in between the mean and 1 SD below the mean, (iii) in between the mean and 1 SD above the mean, (iv) greater than 1 SD above the mean?

(b) How many scores are actually in each of these brackets?

(c) Conduct a Chi Square test to determine if the math SAT scores are normally distributed based on these expected and observed frequencies. (relevant section)

  • Compute Spearman’s ρ for the relationship between UGPA and SAT.
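For parts (a) through (c), the expected count in each bracket is the sample size times the normal-curve area for that bracket, and the Chi Square statistic is Σ(O − E)²/E. A minimal Python sketch follows; the sample size `n` and the `observed` counts are hypothetical placeholders, and the function names are illustrative.

```python
from statistics import NormalDist


def expected_bracket_counts(n):
    """Expected counts in the four brackets cut at -1, 0, and +1 SD."""
    z = NormalDist()
    cdf = [0.0] + [z.cdf(c) for c in (-1.0, 0.0, 1.0)] + [1.0]
    return [n * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]


def chi_square(observed, expected):
    """Pearson Chi Square statistic for observed vs. expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical sample size and observed bracket counts.
n = 100
observed = [14, 38, 30, 18]

expected = expected_bracket_counts(n)
print([round(e, 2) for e in expected])
print(round(chi_square(observed, expected), 3))
```

The statistic would then be compared with a Chi Square distribution to obtain the p value.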

Read the Diet and Health (DH) case study and do the following exercise:

  • Conduct a Pearson Chi Square test to determine if there is any relationship between diet and outcome. Report the Chi Square and p values and state your conclusions. (relevant section)

Read the Stereograms case study and do the following exercise.

  • Test the difference in central tendency between the two conditions using a rank-randomization test (with the normal approximation) with a one-tailed test. Give the Z and the p.

Read the Smiles and Leniency case study and do the following exercise:

  • Test the difference in central tendency between the four conditions using a rank-randomization test (with the normal approximation). Give the Chi Square and the p.

Sample assessment questions

Note: Almost all the assessment questions below come from OnlineStatBook. The detailed references are found on the Atlas topic pages.

Week 1: The Study of Quantitative Methods

AQ104.01.01. A teacher wishes to know whether the males in his/her class have more conservative attitudes than the females. A questionnaire is distributed assessing attitudes and the males and the females are compared. Is this an example of descriptive or inferential statistics? (relevant section 1, relevant section 2)

AQ104.01.02. A cognitive psychologist is interested in comparing two ways of presenting stimuli on subsequent memory. Twelve subjects are presented with each method and a memory test is given. What would be the roles of descriptive and inferential statistics in the analysis of these data? (relevant section 1 & relevant section 2)

AQ104.01.03. If you are told that you scored in the 80th percentile, from just this information would you know exactly what that means and how it was calculated? Explain. (relevant section)

AQ104.01.04. A study is conducted to determine whether people learn better with spaced or massed practice. Subjects volunteer from an introductory psychology class. At the beginning of the semester 12 subjects volunteer and are assigned to the massed-practice condition. At the end of the semester 12 subjects volunteer and are assigned to the spaced-practice condition. This experiment involves two kinds of non-random sampling: (1) Subjects are not randomly sampled from some specified population and (2) Subjects are not randomly assigned to conditions. AQ104.01.04.1 Which of the problems relates to the generality of the results? AQ104.01.04.2 Which of the problems relates to the validity of the results? AQ104.01.04.3 Which problem is more serious? (relevant section)

AQ104.01.05. Give an example of an independent and a dependent variable. (relevant section)

AQ104.01.06. Categorize the following variables as being qualitative or quantitative: (relevant section)

Rating of the quality of a movie on a 7-point scale
Age
Country you were born in
Favorite Color
Time to respond to a question

AQ104.01.07. Specify the level of measurement used for the items in AQ104.01.06. (relevant section)

AQ104.01.08. Which of the following are linear transformations? (relevant section)

Converting from meters to kilometers
Squaring each side to find the area
Converting from ounces to pounds
Taking the square root of each person’s height.
Multiplying all numbers by 2 and then adding 5
Converting temperature from Fahrenheit to Centigrade.

AQ104.01.09. The formula for finding each student’s test grade (g) from his or her raw score (s) on a test is as follows: g = 16 + 3s

AQ104.01.09.1 Is this a linear transformation? AQ104.01.09.2 If a student got a raw score of 20, what is his test grade? (relevant section)

AQ104.01.10. For the numbers 1, 2, 4, 16, compute the following: (relevant section)

AQ104.01.10.1 ΣX
AQ104.01.10.2 ΣX²
AQ104.01.10.3 (ΣX)²
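
The difference between the last two expressions is the order of operations: ΣX² squares each score and then adds, while (ΣX)² adds the scores and then squares the total. A small sketch with hypothetical scores (not the numbers in the question):

```python
# Hypothetical scores, NOT the numbers from the question.
X = [1, 2, 3]

sum_x = sum(X)                           # sum of X: add the scores
sum_of_squares = sum(x ** 2 for x in X)  # sum of X squared: square each score, then add
square_of_sum = sum(X) ** 2              # (sum of X) squared: add the scores, then square

print(sum_x, sum_of_squares, square_of_sum)  # prints: 6 14 36
```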

AQ104.01.11.1 Which of the frequency polygons has a large positive skew? AQ104.01.11.2 Which has a large negative skew? (relevant section)

AQ104.01.12. What is more likely to have a skewed distribution: time to solve an anagram problem (where the letters of a word or phrase are rearranged into another word or phrase like “dear” and “read” or “funeral” and “real fun”) or scores on a vocabulary test? (relevant section)

Questions from Case Studies:

The following questions are from the Angry Moods (AM) case study.

AQ104.01.13. (AM#1) Which variables are the participant variables? (They act as independent variables in this study.) (relevant section)

AQ104.01.14. (AM#2) What are the dependent variables? (relevant section)

AQ104.01.15. (AM#3) Is Anger-Out a quantitative or qualitative variable? (relevant section)

The following question is from the Teacher Ratings (TR) case study.

AQ104.01.16. (TR#1) What is the independent variable in this study? (relevant section)

The following questions are from the ADHD Treatment (AT) case study.

AQ104.01.17.1 (AT#1) What is the independent variable of this experiment? AQ104.01.17.2 How many levels does it have? (relevant section)

AQ104.01.18. (AT#2) What is the dependent variable? On what scale (nominal, ordinal, interval, ratio) was it measured? (relevant section)

Week 2: Describing Distributions

AQ104.02.01. Name some ways to graph quantitative variables and some ways to graph qualitative variables. (relevant section & relevant section)

AQ104.02.02. Based on the frequency polygon displayed below, the most common test grade was around what score? Explain. (relevant section)

AQ104.02.03. An experiment compared the ability of three groups of participants to remember briefly-presented chess positions. The data are shown below. The numbers represent the total number of pieces correctly remembered from three chess positions. Create side-by-side box plots for these three groups. What can you say about the differences between these groups from the box plots?
(relevant section)

Non-players   Beginners   Tournament players
22.1          32.5        40.1
22.3          37.1        45.6
26.2          39.1        51.2
29.6          40.5        56.4
31.7          45.5        58.1
33.5          51.3        71.1
38.9          52.6        74.9
39.7          55.7        75.9
43.2          55.9        80.3
43.2          57.7        85.3

AQ104.02.04. You have to decide between displaying your data with a histogram or with a stem and leaf display. What factor(s) would affect your choice? (relevant section & relevant section)

AQ104.02.05. In a box plot, what percent of the scores are between the lower and upper hinges? (relevant section)

AQ104.02.06. A student has decided to display the results of his project on the number of hours people in various countries slept per night. He compared the sleeping patterns of people from the US, Brazil, France, Turkey, China, Egypt, Canada, Norway, and Spain. He was planning on using a line graph to display this data. AQ104.02.06.1 Is a line graph appropriate? AQ104.02.06.2 What might be a better choice for a graph? (relevant section & relevant section)

AQ104.02.07. For the data from the 1977 Stat. and Biom. 200 class for eye color, construct: (relevant section)

AQ104.02.07.1 pie graph

AQ104.02.07.2 horizontal bar graph

AQ104.02.07.3 vertical bar graph

AQ104.02.07.4 a frequency table with the relative frequency of each eye color

Eye Color   Number of students
Brown       11
Blue        10
Green       4
Gray        1

(Question submitted by J. Warren, UNH)

AQ104.02.08. A graph appears below showing the number of adults and children who prefer each type of soda. There were 130 adults and kids surveyed. Discuss some ways in which the graph below could be improved. (relevant section)

AQ104.02.09.1 Which of the box plots below has a large positive skew? AQ104.02.09.2 Which has a large negative skew? (relevant section & relevant section)

Questions from Case Studies:

The following questions are from the Angry Moods (AM) case study.

AQ104.02.10.1 (AM#6) Is there a difference in how much males and females use aggressive behavior to improve an angry mood?

AQ104.02.11. For the “Anger-Out” scores:

AQ104.02.11.1 Create parallel box plots. (relevant section)

AQ104.02.11.2 Create back-to-back stem and leaf displays. (You may have trouble finding a computer program to do this, so you may have to do it by hand.) (relevant section)

AQ104.02.12. (AM#9) Create parallel box plots for the Anger-In scores by sports participation. (relevant section)

AQ104.02.13. (AM#11) Plot a histogram of the distribution of the Control-Out scores. (relevant section)

AQ104.02.14.1 (AM#14) Create a bar graph comparing the mean Control-In score for the athletes and the non-athletes. AQ104.02.14.2 What would be a better way to display this data? (relevant section)

AQ104.02.15.1 (AM#18) Plot parallel box plots of the Anger Expression Index by sports participation. AQ104.02.15.2 Does it look like there are any outliers? AQ104.02.15.3 Which group reported expressing more anger? (relevant section)

The following questions are from the Flatulence (F) case study.

AQ104.02.16. (F#1) Plot a histogram of the variable “per day.” (relevant section)

AQ104.02.17.1 (F#7) Create parallel box plots of “how long” as a function of gender. AQ104.02.17.2 Why is the 25th percentile not showing? AQ104.02.17.3 What can you say about the results? (relevant section)

AQ104.02.18.1 (F#9) Create a stem and leaf plot of the variable “how long.” AQ104.02.18.2 What can you say about the shape of the distribution? (relevant section)

The following questions are from the Physicians’ Reactions (PR) case study.

AQ104.02.19. (PR#1) Create box plots comparing the time expected to be spent with the average-weight and overweight patients. (relevant section)

AQ104.02.20. (PR#4) Plot histograms of the time spent with the average-weight and overweight patients. (relevant section)

AQ104.02.21. (PR#5) To which group does the patient with the highest expected time belong?

The following questions are from the Smiles and Leniency (SL) case study

AQ104.02.22. (SL#1) Create parallel box plots for the four conditions. (relevant section)

AQ104.02.23. (SL#3) Create back to back stem and leaf displays for the false smile and neutral conditions. (It may be hard to find a computer program to do this for you, so be prepared to do it by hand). (relevant section)

The following questions are from the ADHD Treatment (AT) case study.

AQ104.02.24.1 (AT#3) Create a line graph of the data. AQ104.02.24.2 Do certain dosages appear to be more effective than others? (relevant section)

AQ104.02.25.1 (AT#5) Create a stem and leaf plot of the number of correct responses of the participants after taking the placebo (d0 variable). AQ104.02.25.2 What can you say about the shape of the distribution? (relevant section)

AQ104.02.26. Create box plots for the four conditions. You may have to rearrange the data to get a computer program to create the box plots.

The following question is from the SAT and College GPA case study.

AQ104.02.27.1 Create histograms and stem and leaf displays of both high-school grade point average and university grade point average. AQ104.02.27.2 In what way(s) do the distributions differ?

AQ104.02.28. The April 10th issue of the Journal of the American Medical Association reports a study on the effects of anti-depressants. The study involved 340 subjects who were being treated for major depression. The subjects were randomly assigned to receive one of three treatments: St. John’s wort (an herb), Zoloft (Pfizer’s cousin of Lilly’s Prozac) or placebo for an 8-week period. The following are the mean scores (approximately) for the three groups of subjects over the eight-week experiment. The first column is the baseline. Lower scores mean less depression. Create a graph to display these means.

Placebo 22.5 19.1 17.9 17.1 16.2 15.1 12.1 12.3
Wort 23.0 20.2 18.2 18.0 16.5 16.1 14.2 13.0
Zoloft 22.4 19.2 16.6 15.5 14.2 13.1 11.8 10.5

AQ104.02.29. For the graph below, of heights of singers in a large chorus, please write a complete description of the histogram. Be sure to comment on all the important features.

AQ104.02.30. Pretend you are constructing a histogram for describing the distribution of salaries for individuals who are 40 years or older, but are not yet retired. AQ104.02.30.1 What is on the Y-axis? Explain. AQ104.02.30.2 What is on the X-axis? AQ104.02.30.3 What would be the probable shape of the salary distribution? AQ104.02.30.4 Explain why.

AQ104.02.31. Make up a dataset of 12 numbers with a positive skew. Use a statistical program to compute the skew. AQ104.02.31.1 Is the mean larger than the median as it usually is for distributions with a positive skew? AQ104.02.31.2 What is the value for skew? (relevant section & relevant section)

AQ104.02.32. Repeat AQ104.02.31. only this time make the dataset have a negative skew. (relevant section & relevant section)

AQ104.02.33. Make up three data sets with 5 numbers each that have:

AQ104.02.33.1 the same mean but different standard deviations.
AQ104.02.33.2 the same mean but different medians.
AQ104.02.33.3 the same median but different means.
(relevant section & relevant section)

AQ104.02.34. Find the mean and median for the following three variables:
(relevant section)

A B C
8 4 6
5 4 2
7 6 3
1 3 4
3 4 1

AQ104.02.35. A sample of 30 distance scores measured in yards has a mean of 7, a variance of 16, and a standard deviation of 4. AQ104.02.35.1 You want to convert all your distances from yards to feet, so you multiply each score in the sample by 3. What are the new mean, variance, and standard deviation? AQ104.02.35.2 You then decide that you only want to look at the distance past a certain point. Thus, after multiplying the original scores by 3, you decide to subtract 4 feet from each of the scores. Now what are the new mean, variance, and standard deviation? (relevant section)
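
As a sketch of the general rule behind this question: if every score is transformed as Y = bX + a, the mean becomes b·mean + a, the variance is multiplied by b², and the standard deviation is multiplied by |b|. Adding a constant shifts the mean but leaves the spread unchanged. The scores and the constants `b` and `a` below are hypothetical, chosen only to check the rule numerically.

```python
import math
import statistics

def transform(scores, b, a):
    """Apply the linear transformation Y = b*X + a to every score."""
    return [b * x + a for x in scores]

# Hypothetical scores and constants, NOT the values from the question.
scores = [2, 4, 4, 6]
b, a = 2, 5
new_scores = transform(scores, b, a)

old_mean = statistics.mean(scores)
old_var = statistics.variance(scores)
old_sd = statistics.stdev(scores)

print(statistics.mean(new_scores), b * old_mean + a)    # mean: b*mean + a
print(statistics.variance(new_scores), b ** 2 * old_var)  # variance: b^2 * variance
print(statistics.stdev(new_scores), abs(b) * old_sd)      # sd: |b| * sd
```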

AQ104.02.36. You recorded the time in seconds it took for 8 participants to solve a puzzle. These times appear below. However, when the data was entered into the statistical program, the score that was supposed to be 22.1 was entered as 21.2. You had calculated the following measures of central tendency: the mean, the median, and the mean trimmed 25%. Which of these measures of central tendency will change when you correct the recording error? (relevant section & relevant section)

15.2
18.8
19.3
19.7
20.2
21.8
22.1
29.4

AQ104.02.37. For the test scores in AQ104.02.36, which measures of variability (range, standard deviation, variance) would be changed if the 22.1 data point had been erroneously recorded as 21.2? (relevant section)

AQ104.02.38. You know the minimum, the maximum, and the 25th, 50th, and 75th percentiles of a distribution. Which of the following measures of central tendency or variability can you determine?
(relevant section, relevant section & relevant section)

mean, median, mode, trimean, geometric mean,
range, interquartile range, variance, standard deviation

AQ104.02.39. For the numbers 1, 3, 4, 6, and 12:

AQ104.02.39.1 Find the value (v) for which Σ(X-v)² is minimized.

AQ104.02.39.2 Find the value (v) for which Σ|X-v| is minimized.
(relevant section)

AQ104.02.40. Your younger brother comes home one day after taking a science test. He says that someone at school told him that “60% of the students in the class scored above the median test grade.” AQ104.02.40.1 What is wrong with this statement? AQ104.02.40.2 What if he said “60% of the students scored below the mean?” (relevant section)

AQ104.02.41. An experiment compared the ability of three groups of participants to remember briefly-presented chess positions. The data are shown below. The numbers represent the number of pieces correctly remembered from three chess positions. Compare the performance of each group. Consider spread as well as central tendency. (relevant section, relevant section & relevant section)

Non-players   Beginners   Tournament players
22.1          32.5        40.1
22.3          37.1        45.6
26.2          39.1        51.2
29.6          40.5        56.4
31.7          45.5        58.1
33.5          51.3        71.1
38.9          52.6        74.9
39.7          55.7        75.9
43.2          55.9        80.3
43.2          57.7        85.3

AQ104.02.42. True/False: A bimodal distribution has two modes and two medians. (relevant section)

AQ104.02.43. True/False: The best way to describe a skewed distribution is to report the mean. (relevant section)

AQ104.02.44. True/False: When plotted on the same graph, a distribution with a mean of 50 and a standard deviation of 10 will look more spread out than will a distribution with a mean of 60 and a standard deviation of 5. (relevant section)

AQ104.02.45. Compare the mean, median, and trimean in terms of their sensitivity to extreme scores. (relevant section)

AQ104.02.46. If the mean time to respond to a stimulus is much higher than the median time to respond, what can you say about the shape of the distribution of response times? (relevant section)

AQ104.02.47. A set of numbers is transformed by taking the log base 10 of each number. The mean of the transformed data is 1.65. What is the geometric mean of the untransformed data? (relevant section)
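
The relationship this question relies on is that the geometric mean equals the antilog of the mean of the logs: with base-10 logs, geometric mean = 10^(mean of log10 X). A small check with hypothetical values (not the data behind the question):

```python
import math
import statistics

# Hypothetical values, NOT the data from the question.
values = [1, 10, 100]

logs = [math.log10(v) for v in values]  # 0, 1, 2
mean_log = statistics.mean(logs)        # mean of the transformed data
geometric_mean = 10 ** mean_log         # back-transform with the antilog

print(mean_log, geometric_mean)
print(statistics.geometric_mean(values))  # library result for comparison
```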

AQ104.02.48. Which measure of central tendency is most often used for returns on investment?

AQ104.02.49. The histogram is in balance on the fulcrum. What are the mean, median, and mode of the distribution (approximate where necessary)?

Questions from Case Studies:

The following questions are from the Angry Moods (AM) case study.

AQ104.02.50. (AM#4) Does Anger-Out have a positive skew, a negative skew, or no skew? (relevant section)

AQ104.02.51.1 (AM#8) What is the range of the Anger-In scores? AQ104.02.51.2 What is the interquartile range? (relevant section)

AQ104.02.52.1 (AM#12) What is the overall mean Control-Out score? AQ104.02.52.2 What is the mean Control-Out score for the athletes? What is the mean Control-Out score for the non-athletes? (relevant section)

AQ104.02.53.1 (AM#15) What is the variance of the Control-In scores for the athletes? AQ104.02.53.2 What is the variance of the Control-In scores for the non-athletes? (relevant section)

The following question is from the Flatulence (F) case study.

AQ104.02.54.1 (F#2) Based on a histogram of the variable “perday”, do you think the mean or median of this variable is larger? AQ104.02.54.2 Calculate the mean and median to see if you are right. (relevant section & relevant section)

The following questions are from the Stroop (S) case study.

AQ104.02.55. (S#1) Compute the mean for “words”. (relevant section)

AQ104.02.56. (S#2) Compute the mean and standard deviation for “colors”.
(relevant section & relevant section)

The following questions are from the Physicians’ Reactions (PR) case study.

AQ104.02.57.1 (PR#2) What is the mean expected time spent for the average-weight patients? AQ104.02.57.2 What is the mean expected time spent for the overweight patients? (relevant section)

AQ104.02.58.1 (PR#3) What is the difference in means between the groups? AQ104.02.58.2 By approximately how many standard deviations do the means differ?
(relevant section & relevant section)

The following question is from the Smiles and Leniency (SL) case study.

AQ104.02.59. (SL#2) Find the mean, median, standard deviation, and interquartile range for the leniency scores of each of the four groups. (relevant section & relevant section)

The following questions are from the ADHD Treatment (AT) case study.

AQ104.02.60 (AT#4) What is the mean number of correct responses of the participants after taking the placebo (0 mg/kg)? (relevant section)

AQ104.02.61 (AT#7) What are the standard deviation and the interquartile range of the d0 condition? (relevant section)

Week 3: Bivariate Data

AQ104.03.01. Describe the relationship between variables A and C. Think of things these variables could represent in real life. (relevant section)

AQ104.03.02. Make up a data set with 10 numbers that has a positive correlation. (relevant section & relevant section)

AQ104.03.03. Make up a data set with 10 numbers that has a negative correlation. (relevant section & relevant section)

AQ104.03.04. If the correlation between weight (in pounds) and height (in feet) is 0.58, find: AQ104.03.04.1 the correlation between weight (in pounds) and height (in yards); AQ104.03.04.2 the correlation between weight (in kilograms) and height (in meters) (relevant section)

AQ104.03.05. Would you expect the correlation between High School GPA and College GPA to be higher when taken from your entire high school class or when taken from only the top 20 students? Why? (relevant section)

AQ104.03.06. For a certain class, the relationship between the amount of time spent studying and the test grade earned was examined. It was determined that as the amount of time they studied increased, so did their grades. Is this a positive or negative association? (relevant section)

AQ104.03.07. For this same class, the relationship between the amount of time spent studying and the amount of time spent socializing per week was also examined. It was determined that the more hours they spent studying, the fewer hours they spent socializing. Is this a positive or negative association? (relevant section)

AQ104.03.08. For the following data:

A B
2 8
5 5
6 2
8 4
9 1

AQ104.03.08.1 Find the deviation scores for Variable A that correspond to the raw scores of 2 and 8.

AQ104.03.08.2 Find the deviation scores for Variable B that correspond to the raw scores of 5 and 4.

AQ104.03.08.3 Just from looking at these scores, do you think variables A and B are positively or negatively correlated? AQ104.03.08.4 Why?

AQ104.03.08.5 Now calculate the correlation. Were you right?
(relevant section)
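
One way to check a hand calculation for a question like this is to compute deviation scores (each score minus its variable's mean) and then Pearson's r as Σxy / √(Σx² Σy²), where x and y are the deviation scores. The sketch below uses hypothetical data rather than the A and B values above, and the helper names are assumptions.

```python
import math

def deviation_scores(values):
    """Each raw score minus the mean of its variable."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def pearson_r(x, y):
    """Pearson correlation computed from deviation scores."""
    dx, dy = deviation_scores(x), deviation_scores(y)
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_x2 = sum(a * a for a in dx)
    sum_y2 = sum(b * b for b in dy)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)

# Hypothetical paired scores, NOT the data from the question.
x = [1, 2, 3, 4]
y = [2, 1, 4, 3]

print(deviation_scores(x))
print(round(pearson_r(x, y), 3))
```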

AQ104.03.09. Students took two parts of a test, each worth 50 points. Part A has a variance of 25, and Part B has a variance of 36. The correlation between the test scores is 0.8. AQ104.03.09.1 If the teacher adds the grades of the two parts together to form a final test grade, what would the variance of the final test grades be? AQ104.03.09.2 What would the variance of Part A – Part B be? (relevant section)
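
This question applies the variance sum law for correlated variables: the variance of X ± Y is σ²X + σ²Y ± 2ρσXσY. A minimal sketch with hypothetical variances and correlation (not the values in the question); the function names are assumptions.

```python
import math

def variance_of_sum(var_x, var_y, r):
    """Variance of X + Y when X and Y have correlation r."""
    return var_x + var_y + 2 * r * math.sqrt(var_x) * math.sqrt(var_y)

def variance_of_difference(var_x, var_y, r):
    """Variance of X - Y when X and Y have correlation r."""
    return var_x + var_y - 2 * r * math.sqrt(var_x) * math.sqrt(var_y)

# Hypothetical inputs: variances 9 and 16, correlation 0.5.
print(variance_of_sum(9, 16, 0.5))         # prints 37.0
print(variance_of_difference(9, 16, 0.5))  # prints 13.0
```

With r = 0 both functions reduce to the simple sum of the two variances, which is the independent case.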

AQ104.03.10. True/False: The correlation in real life between height and weight is r=1. (relevant section)

AQ104.03.11. True/False: It is possible for variables to have r=0 but still have a strong association. (relevant section & relevant section)

AQ104.03.12. True/False: Two variables with a correlation of 0.3 have a stronger linear relationship than two variables with a correlation of -0.7. (relevant section)

AQ104.03.13. True/False: After polling a certain group of people, researchers found a 0.5 correlation between the number of car accidents per year and the driver’s age. This means that older people get in more accidents. (relevant section)

AQ104.03.14. True/False: The correlation between R and T is the same as the correlation between T and R. (relevant section)

AQ104.03.15. True/False: To examine bivariate data graphically, the best choice is two side by side histograms. (relevant section)

AQ104.03.16. True/False: A correlation of r=1.2 is not possible. (relevant section)

Questions from Case Studies:

The following questions are from the Angry Moods (AM) case study.

AQ104.03.17. What is the correlation between the Control-In and Control-Out scores? (relevant section)

AQ104.03.18.1 Would you expect the correlation between the Anger-Out and Control-Out scores to be positive or negative? AQ104.03.18.2 Compute this correlation. (relevant section & relevant section)

The following question is from the Flatulence (F) case study.

AQ104.03.19.1 Is there a relationship between the number of male siblings and embarrassment in front of romantic interests? AQ104.03.19.2 Create a scatterplot and compute r. (relevant section & relevant section)

The following questions are from the Stroop (S) case study.

AQ104.03.20. Create a scatterplot showing “colors” on the Y-axis and “words” on the X-axis. (relevant section)

AQ104.03.21. Compute the correlation between “colors” and “words.” (relevant section)

AQ104.03.22.1 Sort the data by color-naming time. Choose only the 20 fastest color-namers and create a scatterplot. (relevant section)

AQ104.03.22.2 What is the new correlation? (relevant section)
AQ104.03.22.3 What is the technical term for the finding that this correlation is smaller than the correlation for the full dataset? (relevant section)

The following question is from the Animal Research (AR) case study.

AQ104.03.23. What is the overall correlation between the belief that animal research is wrong and belief that animal research is necessary? (relevant section)

The following question is from the ADHD Treatment (AT) case study.

AQ104.03.24. What is the correlation between the participants’ correct number of responses after taking the placebo and their correct number of responses after taking 0.60 mg/kg of MPH? (relevant section)

Week 4: Probability

 

You may want to use the Binomial Calculator for some of these exercises.

AQ104.04.01. What is the probability of rolling a pair of dice and obtaining a total score of 9 or more?

AQ104.04.02. What is the probability of rolling a pair of dice and obtaining a total score of 7? (relevant section)
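
Dice questions like the two above can be checked by listing all 36 equally likely outcomes for a pair of dice and counting the ones that satisfy the event. The sketch below does this for two different, hypothetical events; the `probability` helper is an assumption, not part of the course materials.

```python
from fractions import Fraction
from itertools import product

def probability(event):
    """Probability of an event over the 36 equally likely rolls of two dice."""
    outcomes = list(product(range(1, 7), repeat=2))
    favorable = [roll for roll in outcomes if event(roll)]
    return Fraction(len(favorable), len(outcomes))

# Hypothetical events, NOT the ones asked about in the questions.
print(probability(lambda roll: sum(roll) == 3))   # prints 1/18
print(probability(lambda roll: sum(roll) >= 11))  # prints 1/12
```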

AQ104.04.03. A box contains four black pieces of cloth, two striped pieces, and six dotted pieces. A piece is selected randomly and then placed back in the box. A second piece is selected randomly. What is the probability that:

AQ104.04.03.1 both pieces are dotted?

AQ104.04.03.2 the first piece is black and the second piece is dotted?

AQ104.04.03.3 one piece is black and one piece is striped?
(relevant section)

AQ104.04.04. A card is drawn at random from a deck. AQ104.04.04.1 What is the probability that it is an ace or a king? AQ104.04.04.2 What is the probability that it is either a red card or a black card? (relevant section)

AQ104.04.05. The probability that you will win a game is 0.45. AQ104.04.05.1 If you play the game 80 times, what is the most likely number of wins? AQ104.04.05.2 What are the mean and standard deviation of a binomial distribution with π = 0.45 and N = 80? (relevant section)
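
As an alternative to the Binomial Calculator, the binomial quantities used in this and the following questions can be sketched directly: the mean is Nπ, the standard deviation is √(Nπ(1 − π)), and the probability of exactly k successes is C(N, k) π^k (1 − π)^(N − k). The values of N and π below are hypothetical, and the function names are assumptions.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def binomial_mean_sd(n, p):
    """Mean and standard deviation of a binomial distribution."""
    return n * p, math.sqrt(n * p * (1 - p))

# Hypothetical setup: 10 trials with success probability 0.5.
n, p = 10, 0.5
mean, sd = binomial_mean_sd(n, p)
p_exactly_5 = binomial_pmf(5, n, p)
p_at_most_2 = sum(binomial_pmf(k, n, p) for k in range(0, 3))  # k = 0, 1, 2

print(mean, round(sd, 4))
print(round(p_exactly_5, 4), round(p_at_most_2, 4))
```

Cumulative questions ("at least", "or fewer", "between") follow the same pattern as `p_at_most_2`: sum the point probabilities over the relevant range of k.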

AQ104.04.06. A fair coin is flipped 9 times. What is the probability of getting exactly 6 heads? (relevant section)

AQ104.04.07. When Susan and Jessica play a card game, Susan wins 60% of the time. If they play 9 games, what is the probability that Jessica will have won more games than Susan? (relevant section)

AQ104.04.08. You flip a coin three times. AQ104.04.08.1 What is the probability of getting heads on only one of your flips? AQ104.04.08.2 What is the probability of getting heads on at least one flip? (relevant section & relevant section)

AQ104.04.09. A test correctly identifies a disease in 95% of people who have it. It correctly identifies no disease in 94% of people who do not have it. In the population, 3% of the people have the disease. What is the probability that you have the disease if you tested positive? (relevant section)
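
Questions of this kind use Bayes' theorem: P(disease | positive) = P(positive | disease)·P(disease) divided by the total probability of a positive test, which adds the true positives and the false positives. The sketch below uses hypothetical rates, not those in the question, and the function name is an assumption.

```python
import math

def posterior_given_positive(base_rate, sensitivity, specificity):
    """P(disease | positive test) via Bayes' theorem."""
    true_positive = base_rate * sensitivity
    false_positive = (1 - base_rate) * (1 - specificity)
    return true_positive / (true_positive + false_positive)

# Hypothetical rates: 10% base rate, 90% sensitivity, 80% specificity.
result = posterior_given_positive(0.10, 0.90, 0.80)
print(round(result, 3))  # prints 0.333
```

Even with a fairly accurate test, a low base rate keeps the posterior probability well below the test's accuracy, which is the point several of the True/False items in this week turn on.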

AQ104.04.10. A jar contains 10 blue marbles, 5 red marbles, 4 green marbles, and 1 yellow marble. Two marbles are chosen (without replacement). AQ104.04.10.1 What is the probability that one will be green and the other red? AQ104.04.10.2 What is the probability that one will be blue and the other yellow? (relevant section)

AQ104.04.11. You roll a fair die five times, and you get a 6 each time. What is the probability that you get a 6 on the next roll? (relevant section)

AQ104.04.12. You win a game if you roll a die and get a 2 or a 5. You play this game 60 times.

AQ104.04.12.1 What is the probability that you win between 5 and 10 times (inclusive)?

AQ104.04.12.2 What is the probability that you will win the game at least 15 times?

AQ104.04.12.3 What is the probability that you will win the game at least 40 times?

AQ104.04.12.4 What is the most likely number of wins?

AQ104.04.12.5 What is the probability of obtaining the number of wins in AQ104.04.12.4?
(relevant section)

AQ104.04.13. In a baseball game, Tommy gets a hit 30% of the time when facing this pitcher. Joey gets a hit 25% of the time. They are both coming up to bat this inning.

AQ104.04.13.1 What is the probability that Joey or Tommy (but not both) will get a hit?

AQ104.04.13.2 What is the probability that neither player gets a hit?

AQ104.04.13.3 What is the probability that they both get a hit? (relevant section)

AQ104.04.14. An unfair coin has a probability of coming up heads of 0.65. The coin is flipped 50 times. What is the probability it will come up heads 25 or fewer times? (Give answer to at least 3 decimal places). (relevant section)

AQ104.04.15. If you draw two cards from a deck, what is the probability that both of them are face cards (king, queen, or jack)?

AQ104.04.16. What is the probability that you draw two cards from a deck and both of them are hearts? (relevant section)

AQ104.04.17. True/False: You are more likely to get a pattern of HTHHHTHTTH than HHHHHHHHTT when you flip a coin 10 times. (relevant section)

AQ104.04.18 True/False: Suppose that at your regular physical exam you test positive for a relatively rare disease. You will need to start taking medicine if you have the disease, so you ask your doctor about the accuracy of the test. It turns out that the test is 98% accurate. The probability that you have Disease X is therefore 0.98 and the probability that you do not have it is .02. (relevant section)

Questions from Case Studies:

The following questions are from the Diet and Health (DH) case study

AQ104.04.19.  (DH#1) What percentage of people on the AHA diet had some sort of illness or death?

AQ104.04.20. What is the probability that if you randomly selected a person on the AHA diet, he or she would have some sort of illness or death? (relevant section)

AQ104.04.21 If 3 people on the AHA diet are chosen at random, what is the probability that they will all be healthy? (relevant section)

AQ104.04.22. (DH#2) What percentage of people on the Mediterranean diet had some sort of illness or death?

AQ104.04.23. What is the probability that if you randomly selected a person on the Mediterranean diet, he or she would have some sort of illness or death? (relevant section)

AQ104.04.24. What is the probability that if you randomly selected a person on the Mediterranean diet, he or she would have cancer? (relevant section)

AQ104.04.25 If you randomly select five people from the Mediterranean diet, what is the probability that they would all be healthy? (relevant section)

Questions AQ104.04.26 to AQ104.04.33 are from


AQ104.04.25. Five faces of a fair die are painted black, and one face is painted white. The die is rolled six times. Which of the following results is more likely?

a. Black side up on five of the rolls; white side up on the other roll
b. Black side up on all six rolls
c. a and b are equally likely

AQ104.04.26. One of the items on the student survey for an introductory statistics course was “Rate your intelligence on a scale of 1 to 10.” The distribution of this variable for the 100 women in the class is presented below. What is the probability of randomly selecting a woman from the class who has an intelligence rating that is LESS than seven (7)?

a. (12 + 24)/100 = .36

b. (12 + 24 + 38)/100 = .74

c. 38/100 = .38

d. (23 + 2 + 1)/100 = .26

e. None of the above.

AQ104.04.27. You roll 2 fair six-sided dice. Which of the following outcomes is most likely to occur on the next roll? A. Getting double 3. B. Getting a 3 and a 4. C. They are equally likely. Explain your choice.

AQ104.04.28. If Tahnee flips a coin 10 times, and records the results (Heads or Tails), which outcome below is more likely to occur, A or B? Explain your choice.

AQ104.04.28. A bowl has 100 wrapped hard candies in it. 20 are yellow, 50 are red, and 30 are blue. They are well mixed up in the bowl. Jenny pulls out a handful of 10 candies, counts the number of reds, and tells her teacher. The teacher writes the number of red candies on a list. Then, Jenny puts the candies back into the bowl, and mixes them all up again. Four of Jenny’s classmates, Jack, Julie, Jason, and Jerry do the same thing. They each pick ten candies, count the reds, and the teacher writes down the number of reds. Then they put the candies back and mix them up again each time. The teacher’s list for the number of reds is most likely to be (please select one):

a. 8,9,7,10,9
b. 3,7,5,8,5
c. 5,5,5,5,5
d. 2,4,3,4,3
e. 3,0,9,2,8

AQ104.04.29. An insurance company writes policies for a large number of newly-licensed drivers each year. Suppose 40% of these are low-risk drivers, 40% are moderate risk, and 20% are high risk. The company has no way to know which group any individual driver falls in when it writes the policies. None of the low-risk drivers will have an at-fault accident in the next year, but 10% of the moderate-risk and 20% of the high-risk drivers will have such an accident. If a driver has an at-fault accident in the next year, what is the probability that he or she is high-risk?

AQ104.04.30. You are to participate in an exam for which you had no chance to study, and for that reason cannot do anything but guess for each question (all questions being of the multiple choice type, so the chance of guessing the correct answer for each question is 1/d, d being the number of options per question; so in case of a 4-choice question, your chance is 0.25). Your instructor offers you the opportunity to choose amongst the following exam formats: I. 6 questions of the 4-choice type; you pass when 5 or more answers are correct; II. 5 questions of the 5-choice type; you pass when 4 or more answers are correct; III. 4 questions of the 10-choice type; you pass when 3 or more answers are correct. Rank the three exam formats according to their attractiveness. It should be clear that the format with the highest probability to pass is the most attractive format. Which would you choose and why?
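
Each exam format can be compared by computing a binomial upper-tail probability, assuming every answer is an independent guess. The sketch below is one way to set this up; `prob_at_least` is a hypothetical helper name.

```python
from math import comb

def prob_at_least(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

# Pass probability under pure guessing for each format in the question.
formats = {
    "I": prob_at_least(5, 6, 1 / 4),     # 6 questions, 4 choices, pass at 5+
    "II": prob_at_least(4, 5, 1 / 5),    # 5 questions, 5 choices, pass at 4+
    "III": prob_at_least(3, 4, 1 / 10),  # 4 questions, 10 choices, pass at 3+
}

# Formats ordered from most to least attractive.
ranking = sorted(formats, key=formats.get, reverse=True)
print(ranking, {name: round(value, 5) for name, value in formats.items()})
```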

AQ104.04.31. Consider the question of whether the home team wins more than half of its games in the National Basketball Association. Suppose that you study a simple random sample of 80 professional basketball games and find that 52 of them are won by the home team.

AQ104.04.32. Assuming that there is no home court advantage and that the home team therefore wins 50% of its games in the long run, determine the probability that the home team would win 65% or more of its games in a simple random sample of 80 games.

AQ104.04.33. Does the sample information (that 52 of a random sample of 80 games are won by the home team) provide strong evidence that the home team wins more than half of its games in the long run? Explain.
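
The probability asked for in the home-court questions can be approximated with the normal sampling distribution of a proportion, and checked against the exact binomial tail. This is a sketch under the stated null assumption (home team wins 50% of games); no continuity correction is applied to the normal approximation, which is a simplifying choice.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p_null = 80, 0.5
observed = 52 / n  # sample proportion of home wins, 0.65

# Normal approximation: the sample proportion has mean p and
# standard error sqrt(p * (1 - p) / n) under the null hypothesis.
se = sqrt(p_null * (1 - p_null) / n)
p_normal = 1 - NormalDist(mu=p_null, sigma=se).cdf(observed)

# Exact binomial: P(X >= 52) when X ~ Binomial(80, 0.5).
p_exact = sum(comb(n, k) for k in range(52, n + 1)) / 2**n

print(round(p_normal, 4), round(p_exact, 4))
```

The two figures differ slightly because the normal curve is a continuous approximation to a discrete count.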

AQ104.04.33. A refrigerator contains 6 apples, 5 oranges, 10 bananas, 3 pears, 7 peaches, 11 plums, and 2 mangos.

AQ104.04.33.1 Imagine you stick your hand in this refrigerator and pull out a piece of fruit at random. What is the probability that you will pull out a pear?

AQ104.04.33.2 Imagine now that you put your hand in the refrigerator and pull out a piece of fruit. You decide you do not want to eat that fruit so you put it back into the refrigerator and pull out another piece of fruit. What is the probability that the first piece of fruit you pull out is a banana and the second piece you pull out is an apple?

AQ104.04.33.3 What is the probability that you stick your hand in the refrigerator one time and pull out a mango or an orange?
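
The three refrigerator questions use the basic rules for a single draw, for independent draws with replacement, and for mutually exclusive events. A minimal sketch with exact fractions follows; the helper `p` is an illustrative name.

```python
from fractions import Fraction

# Fruit counts from the question.
fruit = {"apple": 6, "orange": 5, "banana": 10, "pear": 3,
         "peach": 7, "plum": 11, "mango": 2}
total = sum(fruit.values())

def p(kind):
    """Probability of drawing one piece of the given kind at random."""
    return Fraction(fruit[kind], total)

# Single draw.
p_pear = p("pear")

# Two draws with replacement are independent, so multiply.
p_banana_then_apple = p("banana") * p("apple")

# Mango and orange are mutually exclusive on one draw, so add.
p_mango_or_orange = p("mango") + p("orange")

print(p_pear, p_banana_then_apple, p_mango_or_orange)
```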

Week 5: Research Design

 

AQ104.05.01. To be a scientific theory, the theory must be potentially ______________.

AQ104.05.02. What is the difference between a faith-based explanation and a scientific explanation?

AQ104.05.03. What does it mean for a theory to be parsimonious?

AQ104.05.04. Define reliability in terms of parallel forms.

AQ104.05.05. Define true score.

AQ104.05.06. What is the reliability if the true score variance is 80 and the test score variance is 100?

AQ104.05.07. What statistic relates to how close a score on one test will be to a score on a parallel form?

AQ104.05.08. What is the effect of test length on the reliability of a test?

AQ104.05.09. Distinguish between predictive validity and construct validity.

AQ104.05.10. What is the theoretical maximum correlation of a test with a criterion if the test has a reliability of .81?

AQ104.05.11. An experiment solicits subjects to participate in a highly stressful experiment. What type of sampling bias is likely to occur?

AQ104.05.12. Give an example of survivorship bias not presented in this text.

AQ104.05.13. Distinguish “between-subject” variables from “within-subjects” variables.

AQ104.05.14. Of the variables “gender” and “trials,” which is likely to be a between-subjects variable and which a within-subjects variable?

AQ104.05.15. Define interaction.

AQ104.05.16. What is counterbalancing used for?

AQ104.05.17. How does randomization deal with the problem of pre-existing differences between groups?

AQ104.05.18. Give an example of the “third variable problem” other than those in this text.

Week 6: Normal Distributions and Advanced Graphs

 

You may want to use the “Calculate Area for a given X” and the “Calculate X for a given Area” applets for some of these exercises.

AQ104.06.01. If scores are normally distributed with a mean of 35 and a standard deviation of 10, what percent of the scores is: AQ104.06.01.1 greater than 34? AQ104.06.01.2 smaller than 42? AQ104.06.01.3 between 28 and 34? (relevant section)
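
As an alternative to the applets, areas under a normal curve can be computed with the `NormalDist` class in Python's standard `statistics` module (Python 3.8 or later). The sketch below uses the mean and standard deviation from the question; the variable names are illustrative.

```python
from statistics import NormalDist

# Normal distribution with mean 35 and standard deviation 10.
scores = NormalDist(mu=35, sigma=10)

# Area above 34 is one minus the cumulative area up to 34.
p_greater_34 = 1 - scores.cdf(34)

# Area below 42 is the cumulative area up to 42.
p_smaller_42 = scores.cdf(42)

# Area between two values is the difference of cumulative areas.
p_28_to_34 = scores.cdf(34) - scores.cdf(28)

print(round(p_greater_34, 4), round(p_smaller_42, 4), round(p_28_to_34, 4))
```

Multiply each proportion by 100 to express it as a percent.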

AQ104.06.02.1 What are the mean and standard deviation of the standard normal distribution? AQ104.06.02.2 What would be the mean and standard deviation of a distribution created by multiplying the standard normal distribution by 8 and then adding 75? (relevant section & here)

AQ104.06.03. The normal distribution is defined by two parameters. What are they?
(relevant section)

AQ104.06.04.1 What proportion of a normal distribution is within one standard deviation of the mean? AQ104.06.04.2 What proportion is more than 2.0 standard deviations from the mean? AQ104.06.04.3 What proportion is between 1.25 and 2.1 standard deviations above the mean? (relevant section)

AQ104.06.05.1 A test is normally distributed with a mean of 70 and a standard deviation of 8. What score would be needed to be in the 85th percentile? AQ104.06.05.2 What score would be needed to be in the 22nd percentile? (relevant section)
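
Finding the score at a given percentile is the inverse of finding an area. A minimal sketch using `NormalDist.inv_cdf` from the standard library, with the mean and standard deviation given in the question:

```python
from statistics import NormalDist

# Test scores: normal with mean 70 and standard deviation 8.
test = NormalDist(mu=70, sigma=8)

# inv_cdf returns the value below which the given proportion of scores falls.
score_85th = test.inv_cdf(0.85)
score_22nd = test.inv_cdf(0.22)

print(round(score_85th, 2), round(score_22nd, 2))
```

The same result follows by hand from X = mean + z × standard deviation, where z is the standard normal value for the percentile.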

AQ104.06.06. Assume a normal distribution with a mean of 70 and a standard deviation of 12. What limits would include the middle 65% of the cases? (relevant section)

AQ104.06.07. A normal distribution has a mean of 20 and a standard deviation of 4. Find the Z scores for the following numbers: (relevant section) (a) 28 (b) 18 (c) 10 (d) 23

AQ104.06.08. Assume the speed of vehicles along a stretch of I-10 has an approximately normal distribution with a mean of 71 mph and a standard deviation of 8 mph.

AQ104.06.08.1 The current speed limit is 65 mph. What is the proportion of vehicles less than or equal to the speed limit?

AQ104.06.08.2 What proportion of the vehicles would be going less than 50 mph?

AQ104.06.08.3 A new speed limit will be initiated such that approximately 10% of vehicles will be over the speed limit. What is the new speed limit based on this criterion?

AQ104.06.08.4 In what way do you think the actual distribution of speeds differs from a normal distribution?
(relevant section)

AQ104.06.09. A variable is normally distributed with a mean of 120 and a standard deviation of 5. One score is randomly sampled. What is the probability it is above 127? (relevant section)

AQ104.06.10. You want to use the normal distribution to approximate the binomial distribution. Explain what you need to do to find the probability of obtaining exactly 7 heads out of 12 flips. (relevant section)

AQ104.06.11. A group of students at a school takes a history test. The distribution is normal with a mean of 25 and a standard deviation of 4. AQ104.06.11.1 Everyone who scores in the top 30% of the distribution gets a certificate. What is the lowest score someone can get and still earn a certificate? AQ104.06.11.2 The top 5% of the scores get to compete in a statewide history contest. What is the lowest score someone can get and still go on to compete with the rest of the state? (relevant section)

AQ104.06.12.1 Use the normal distribution to approximate the binomial distribution and find the probability of getting 15 to 18 heads out of 25 flips. AQ104.06.12.2 Compare this to what you get when you calculate the probability using the binomial distribution. Write your answers out to four decimal places. (relevant section & relevant section)
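
The comparison between the normal approximation and the exact binomial can be sketched as follows. The approximation here applies the usual continuity correction of 0.5 on each side of the range; that choice is an assumption about the intended method.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 25, 0.5   # 25 flips of a fair coin
lo, hi = 15, 18  # range of heads, inclusive

# Normal curve with the binomial's mean and standard deviation.
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# Continuity correction: "15 to 18" becomes the interval 14.5 to 18.5.
p_approx = approx.cdf(hi + 0.5) - approx.cdf(lo - 0.5)

# Exact binomial probability, summed over 15, 16, 17 and 18 heads.
p_exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(lo, hi + 1))

print(f"{p_approx:.4f} {p_exact:.4f}")
```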

AQ104.06.13. True/false: For any normal distribution, the mean, median, and mode will have the same value. (relevant section)

AQ104.06.14. True/false: In a normal distribution, 11.5% of scores are greater than Z = 1.2. (relevant section)

AQ104.06.15. True/false: The percentile rank for the mean is 50% for any normal distribution. (relevant section)

AQ104.06.16. True/false: The larger the π, the better the normal distribution approximates the binomial distribution. (relevant section & relevant section)

AQ104.06.17. True/false: A Z-score represents the number of standard deviations above or below the mean. (relevant section)

AQ104.06.18. True/false: Abraham de Moivre, a consultant to gamblers, discovered the normal distribution when trying to approximate the binomial distribution to make his computations easier. (relevant section)

AQ104.06.19. True/false: The standard deviation of the blue distribution shown below is about 10. (relevant section)

AQ104.06.20. True/false: In the figure below, the red distribution has a larger standard deviation than the blue distribution. (relevant section)

AQ104.06.21. True/false: The red distribution has more area underneath the curve than the blue distribution does. (relevant section)

Questions from Case Studies:

The following question uses data from the Angry Moods (AM) case study.

AQ104.06.22. For this problem, use the Anger Expression (AE) scores. AQ104.06.22.1 Compute the mean and standard deviation. AQ104.06.22.2 Then, compute what the 25th, 50th and 75th percentiles would be if the distribution were normal. AQ104.06.22.3 Compare the estimates to the actual 25th, 50th, and 75th percentiles. (relevant section)

The following question uses data from the Physicians’ Reactions (PR) case study.

AQ104.06.23. For this problem, use the time spent with the overweight patients. AQ104.06.23.1 Compute the mean and standard deviation of this distribution. AQ104.06.23.2 What is the probability that if you chose an overweight participant at random, the doctor would have spent 31 minutes or longer with this person? AQ104.06.23.3 Now assume this distribution is normal (and has the same mean and standard deviation). Now what is the probability that if you chose an overweight participant at random, the doctor would have spent 31 minutes or longer with this person? (relevant section)

Questions AQ104.06.24. to AQ104.06.30. are from ARTIST

AQ104.06.24. A set of test scores are normally distributed. Their mean is 100 and standard deviation is 20. These scores are converted to standard normal z scores. What would be the mean and median of this distribution?
a. 0
b. 1
c. 50
d. 100

AQ104.06.25. Suppose that weights of bags of potato chips coming from a factory follow a normal distribution with mean 12.8 ounces and standard deviation .6 ounces. If the manufacturer wants to keep the mean at 12.8 ounces but adjust the standard deviation so that only 1% of the bags weigh less than 12 ounces, how small does he/she need to make that standard deviation?

AQ104.06.26. A student received a standardized (z) score on a test that was -.57. What does this score tell about how this student scored in relation to the rest of the class? Sketch a graph of the normal curve and shade in the appropriate area.

AQ104.06.27. Suppose you take 50 measurements on the speed of cars on Interstate 5, and that these measurements follow roughly a Normal distribution. Do you expect the standard deviation of these 50 measurements to be about 1 mph, 5 mph, 10 mph, or 20 mph? Explain.

AQ104.06.28. Suppose that combined verbal and math SAT scores follow a normal distribution with mean 896 and standard deviation 174. Suppose further that Peter finds out that he scored in the top 3% of SAT scores. Determine how high Peter’s score must have been.

AQ104.06.29. Heights of adult women in the United States are normally distributed with a population mean of μ = 63.5 inches and a population standard deviation of σ = 2.5. A medical researcher is planning to select a large random sample of adult women to participate in a future study. What is the standard value, or z-value, for an adult woman who has a height of 68.5 inches?

AQ104.06.30. An automobile manufacturer introduces a new model that averages 27 miles per gallon in the city. A person who plans to purchase one of these new cars wrote the manufacturer for the details of the tests, and found out that the standard deviation is 3 miles per gallon. Assume that in-city mileage is approximately normally distributed.

AQ104.06.30.1 What is the probability that the person will purchase a car that averages less than 20 miles per gallon for in-city driving?

AQ104.06.30.2 What is the probability that the person will purchase a car that averages between 25 and 29 miles per gallon for in-city driving?

AQ104.06.31. What are Q-Q plots useful for?

AQ104.06.32. For the following data, plot the theoretically expected z score as a function of the actual z score (a Q-Q plot).

0
0
0
0
0
0
0.1
0.1
0.1
0.1
0.1
0.2
0.2
0.3
0.3
0.4
0.5
0.6
0.6
0.6
0.6
0.6
0.6
0.6
0.6
0.6
0.7
0.7
0.8
0.8
0.8
0.8
0.8
0.9
1
1
1.1
1.1
1.2
1.2
1.2
1.2
1.2
1.2
1.3
1.3
1.3
1.3
1.3
1.4
1.4
1.5
1.6
1.7
1.7
1.7
1.8
1.8
1.9
1.9
2
2
2
2.1
2.1
2.1
2.1
2.1
2.1
2.1
2.3
2.5
2.7
3
4.2
5
5.7
12.4
15.2
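
The coordinates for a Q-Q plot can be computed before any plotting is done. The sketch below pairs each observation's actual z score with the z score expected under normality. It assumes plotting positions of (i - 0.5)/n, which is one common convention among several, and for brevity it runs on a handful of values taken from the listing above rather than the full data set. The function name `qq_points` is hypothetical.

```python
from statistics import NormalDist, mean, stdev

def qq_points(data):
    """Return (actual z, theoretical z) pairs for a normal Q-Q plot."""
    ordered = sorted(data)
    n = len(ordered)
    centre, spread = mean(ordered), stdev(ordered)
    standard_normal = NormalDist()
    points = []
    for i, x in enumerate(ordered, start=1):
        actual_z = (x - centre) / spread
        # Assumed plotting position: (i - 0.5) / n.
        theoretical_z = standard_normal.inv_cdf((i - 0.5) / n)
        points.append((actual_z, theoretical_z))
    return points

# A few values from the data above, for illustration only.
sample = [0, 0.1, 0.2, 0.6, 0.8, 1.2, 1.3, 2.1, 5.7, 15.2]
points = qq_points(sample)
for actual_z, theoretical_z in points:
    print(f"{actual_z:7.3f} {theoretical_z:7.3f}")
```

Plotting the second column against the first gives the Q-Q plot; points that stray from a straight line indicate departures from normality.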

AQ104.06.33. For the data in AQ104.06.32., describe how the data differ from a normal distribution.

AQ104.06.34. For the “SAT and College GPA” case study data, create a contour plot looking at University GPA as a function of Math SAT and High School GPA. Naturally, you should use a computer to do this.

AQ104.06.35. For the “SAT and College GPA” case study data, create a 3D plot using the variables University GPA, Math SAT, and High School GPA. Naturally, you should use a computer to do this.

Week 7: Sampling Distributions

You may want to use the “r to z’ calculator” and the “Calculate Area for a given X” applet for some of these exercises.

 

AQ104.07.01. A population has a mean of 50 and a standard deviation of 6. AQ104.07.01.1 What are the mean and standard deviation of the sampling distribution of the mean for N = 16? AQ104.07.01.2 What are the mean and standard deviation of the sampling distribution of the mean for N = 20? (relevant section)

AQ104.07.02. Given a test that is normally distributed with a mean of 100 and a standard deviation of 12, find:

AQ104.07.02.1 the probability that a single score drawn at random will be greater than 110 (relevant section)
AQ104.07.02.2 the probability that a sample of 25 scores will have a mean greater than 105 (relevant section)
AQ104.07.02.3 the probability that a sample of 64 scores will have a mean greater than 105 (relevant section)
AQ104.07.02.4 the probability that the mean of a sample of 16 scores will be either less than 95 or greater than 105 (relevant section)
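
All four parts rest on the same idea: the mean of N scores is normally distributed with the population mean and a standard error of σ/√N. A minimal sketch, with the illustrative helper `sampling_dist`:

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA = 100, 12  # population mean and standard deviation from the question

def sampling_dist(n):
    """Sampling distribution of the mean for samples of size n."""
    return NormalDist(mu=MU, sigma=SIGMA / sqrt(n))

# A single score is the special case n = 1.
p_single = 1 - sampling_dist(1).cdf(110)

# Means of 25 and of 64 scores exceeding 105.
p_n25 = 1 - sampling_dist(25).cdf(105)
p_n64 = 1 - sampling_dist(64).cdf(105)

# Mean of 16 scores below 95 or above 105: add the two tails.
d16 = sampling_dist(16)
p_n16_two_sided = d16.cdf(95) + (1 - d16.cdf(105))

print(round(p_single, 4), round(p_n25, 4), round(p_n64, 4), round(p_n16_two_sided, 4))
```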

AQ104.07.03. What term refers to the standard deviation of the sampling distribution? (relevant section)

AQ104.07.04.1 If the standard error of the mean is 10 for N = 12, what is the standard error of the mean for N = 22? AQ104.07.04.2 If the standard error of the mean is 50 for N = 25, what is it for N = 64? (relevant section)
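
Because the standard error of the mean is σ/√N, changing the sample size rescales it by the square root of the ratio of the old N to the new N. A short sketch of that relationship; the function name is hypothetical.

```python
from math import sqrt

def rescale_standard_error(se_old, n_old, n_new):
    """Standard error at n_new, given the standard error observed at n_old."""
    # se = sigma / sqrt(n), so se_new = se_old * sqrt(n_old / n_new).
    return se_old * sqrt(n_old / n_new)

se_22 = rescale_standard_error(10, 12, 22)
se_64 = rescale_standard_error(50, 25, 64)

print(round(se_22, 3), round(se_64, 3))
```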

AQ104.07.05. A questionnaire is developed to assess women’s and men’s attitudes toward using animals in research. One question asks whether animal research is wrong and is answered on a 7-point scale. Assume that in the population, the mean for women is 5, the mean for men is 4, and the standard deviation for both groups is 1.5. Assume the scores are normally distributed. If 12 women and 12 men are selected randomly, what is the probability that the mean of the women will be more than 1.5 points higher than the mean of the men? (relevant section)

AQ104.07.06. If the correlation between reading achievement and math achievement in the population of fifth graders were 0.60, what would be the probability that in a sample of 28 students, the sample correlation coefficient would be greater than 0.65? (relevant section)
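
Questions about the sampling distribution of r are usually handled with Fisher's transformation: convert r to z', which is approximately normal with standard error 1/√(N - 3). The sketch below applies that approach to the figures in the question and can stand in for the “r to z’ calculator”. It relies on `math.atanh`, which is the same function as Fisher's z'.

```python
from math import atanh, sqrt
from statistics import NormalDist

rho, n, r_cut = 0.60, 28, 0.65

# Fisher's transformation: z' = 0.5 * ln((1 + r) / (1 - r)) = atanh(r).
z_rho = atanh(rho)
z_cut = atanh(r_cut)

# Standard error of z' depends only on the sample size.
se = 1 / sqrt(n - 3)

# Probability that the sample correlation exceeds the cut-off.
p_above = 1 - NormalDist(mu=z_rho, sigma=se).cdf(z_cut)

print(round(z_rho, 4), round(z_cut, 4), round(se, 4), round(p_above, 4))
```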

AQ104.07.07. If numerous samples of N = 15 are taken from a uniform distribution and a relative frequency distribution of the means is drawn, what would be the shape of the frequency distribution? (relevant section & relevant section)

AQ104.07.08. A normal distribution has a mean of 20 and a standard deviation of 10. Two scores are sampled randomly from the distribution and the second score is subtracted from the first. What is the probability that the difference score will be greater than 5? Hint: Read the Variance Sum Law section of Chapter 3. (relevant section & relevant section)
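
The variance sum law says that the variance of the difference between two independent scores is the sum of their variances. A minimal sketch of how that feeds into the probability, using the standard deviation from the question:

```python
from math import sqrt
from statistics import NormalDist

sigma = 10  # standard deviation of each individual score

# Variance sum law for independent scores: var(X1 - X2) = var(X1) + var(X2).
diff_sd = sqrt(sigma**2 + sigma**2)

# Both scores share the same mean, so the difference has mean 0.
difference = NormalDist(mu=0, sigma=diff_sd)
p_diff_gt_5 = 1 - difference.cdf(5)

print(round(diff_sd, 3), round(p_diff_gt_5, 4))
```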

AQ104.07.09. What is the shape of the sampling distribution of r? In what way does the shape depend on the size of the population correlation? (relevant section)

AQ104.07.10. If you sample one number from a standard normal distribution, what is the probability it will be 0.5? (relevant section & relevant section)

AQ104.07.11. A variable is normally distributed with a mean of 120 and a standard deviation of 5. Four scores are randomly sampled. What is the probability that the mean of the four scores is above 127? (relevant section)

AQ104.07.12. The correlation between self esteem and extraversion is .30. A sample of 84 is taken. (a) What is the probability that the correlation will be less than 0.10? (b) What is the probability that the correlation will be greater than 0.25? (relevant section)

AQ104.07.13. The mean GPA for students in School A is 3.0; the mean GPA for students in School B is 2.8. The standard deviation in both schools is 0.25. The GPAs of both schools are normally distributed. If 9 students are randomly sampled from each school, what is the probability that:

AQ104.07.13.1 the sample mean for School A will exceed that of School B by 0.5 or more? (relevant section)
AQ104.07.13.2 the sample mean for School B will be greater than the sample mean for School A? (relevant section)

AQ104.07.14. In a city, 70% of the people prefer Candidate A. Suppose 30 people from this city were sampled. AQ104.07.14.1 What is the mean of the sampling distribution of p? AQ104.07.14.2 What is the standard error of p? AQ104.07.14.3 What is the probability that 80% or more of this sample will prefer Candidate A? AQ104.07.14.4 What is the probability that 45% or more of this sample will prefer some other candidate? (relevant section)

AQ104.07.15. When solving problems where you need the sampling distribution of r, what is the reason for converting from r to z’? (relevant section)

AQ104.07.16. In the population, the mean SAT score is 1000. Would you be more likely (or equally likely) to get a sample mean of 1200 if you randomly sampled 10 students or if you randomly sampled 30 students? Explain. (relevant section & relevant section)

AQ104.07.17. True/false: The standard error of the mean is smaller when N = 20 than when N = 10. (relevant section)

AQ104.07.18. True/false: The sampling distribution of r = .8 becomes normal as N increases. (relevant section)

AQ104.07.19. True/false: You choose 20 students from the population and calculate the mean of their test scores. You repeat this process 100 times and plot the distribution of the means. In this case, the sample size is 100. (relevant section & relevant section)

AQ104.07.20. True/false: In your school, 40% of students watch TV at night. You randomly ask 5 students every day if they watch TV at night. Every day, you would find that 2 of the 5 do watch TV at night. (relevant section & relevant section)

AQ104.07.21. True/false: The median has a sampling distribution. (relevant section)

AQ104.07.22. True/false: Refer to the figure below. The population distribution is shown in black, and its corresponding sampling distribution of the mean for N = 10 is labeled “A.” (relevant section & relevant section)

     

Questions from Case Studies:

The following questions use data from the Angry Moods (AM) case study.

AQ104.07.23.1 How many men were sampled? AQ104.07.23.2 How many women were sampled?

AQ104.07.24. What is the mean difference between men and women on the Anger-Out scores?

AQ104.07.25. Suppose in the population, the Anger-Out score for men is two points higher than it is for women. The population variances for men and women are both 20. Assume the Anger-Out scores for both genders are normally distributed. Given this information about the population parameters:

AQ104.07.25.1 What is the mean of the sampling distribution of the difference between means? (relevant section)
AQ104.07.25.2 What is the standard error of the difference between means? (relevant section)
AQ104.07.25.3 What is the probability that you would have gotten this mean difference (see AQ104.07.24.) or less in your sample? (relevant section)

The following questions use data from the Animal Research (AR) case study.

AQ104.07.26. How many people were sampled to give their opinions on animal research?

AQ104.07.27. (AR#11) What is the correlation in this sample between the belief that animal research is wrong and belief that animal research is necessary? (Ch. 4.E)

AQ104.07.28. Suppose the correlation between the belief that animal research is wrong and the belief that animal research is necessary is -.68 in the population.

AQ104.07.28.1 Convert -.68 to z’. (relevant section)
AQ104.07.28.2 Find the standard error of this sampling distribution. (relevant section)
AQ104.07.28.3 Assuming the data used in this study was randomly sampled, what is the probability that you would get this correlation or stronger (closer to -1)? (relevant section)

Week 8: Estimation and Hypothesis Testing

You may want to use the Analysis Lab and various calculators for some of these exercises.

Calculators:

Inverse t Distribution: Finds t for a confidence interval.
t Distribution: Computes areas of the t distribution.
Fisher’s r to z’: Computes transformations in both directions.
Inverse Normal Distribution: Use for confidence intervals.

AQ104.08.01. When would the mean grade in a class on a final exam be considered a statistic? When would it be considered a parameter? (relevant section)

AQ104.08.02. Define bias in terms of expected value. (relevant section)

AQ104.08.03. Is it possible for a statistic to be unbiased yet very imprecise? How about being very accurate but biased? (relevant section)

AQ104.08.04. Why is a 99% confidence interval wider than a 95% confidence interval? (relevant section & relevant section)

AQ104.08.05. When you construct a 95% confidence interval, what are you 95% confident about? (relevant section)

AQ104.08.06. What is the difference in the computation of a confidence interval between cases in which you know the population standard deviation and cases in which you have to estimate it? (relevant section & relevant section)

AQ104.08.07. Assume a researcher found that the correlation between a test he or she developed and job performance was 0.55 in a study of 28 employees. If correlations under .35 are considered unacceptable, would you have any reservations about using this test to screen job applicants? (relevant section)

AQ104.08.07. What is the effect of sample size on the width of a confidence interval? (relevant section & relevant section)

AQ104.08.08. How does the t distribution compare with the normal distribution? How does this difference affect the size of confidence intervals constructed using z relative to those constructed using t? Does sample size make a difference? (relevant section)

AQ104.08.09. The effectiveness of a blood-pressure drug is being investigated. How might an experimenter demonstrate that, on average, the reduction in systolic blood pressure is 20 or more? (relevant section & relevant section)

AQ104.08.10. A population is known to be normally distributed with a standard deviation of 2.8. AQ104.08.10.1 Compute the 95% confidence interval on the mean based on the following sample of nine: 8, 9, 10, 13, 14, 16, 17, 20, 21. AQ104.08.10.2 Now compute the 99% confidence interval using the same data. (relevant section)
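
When the population standard deviation is known, the confidence interval on the mean is the sample mean plus or minus z times σ/√N. The sketch below applies that formula to the sample in the question; `z_interval` is a hypothetical helper, and `NormalDist().inv_cdf` plays the role of the Inverse Normal Distribution calculator.

```python
from math import sqrt
from statistics import NormalDist, mean

sample = [8, 9, 10, 13, 14, 16, 17, 20, 21]
sigma = 2.8  # known population standard deviation

def z_interval(data, sigma, level):
    """Confidence interval on the mean when sigma is known."""
    m = mean(data)
    se = sigma / sqrt(len(data))
    # Two-sided critical value, e.g. about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return (m - z * se, m + z * se)

ci_95 = z_interval(sample, sigma, 0.95)
ci_99 = z_interval(sample, sigma, 0.99)

print([round(limit, 2) for limit in ci_95], [round(limit, 2) for limit in ci_99])
```

When σ must instead be estimated from the sample, the critical value comes from the t distribution rather than the normal.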

AQ104.08.11. A person claims to be able to predict the outcome of flipping a coin. This person is correct 16/25 times. AQ104.08.11.1 Compute the 95% confidence interval on the proportion of times this person can predict coin flips correctly. AQ104.08.11.2 What conclusion can you draw about this test of his ability to predict the future? (relevant section)

AQ104.08.12. What does it mean that the variance (computed by dividing by N) is a biased statistic? (relevant section)

AQ104.08.13. A confidence interval for the population mean computed from an N of 16 ranges from 12 to 28. A new sample of 36 observations is going to be taken. You can’t know in advance exactly what the confidence interval will be because it depends on the random sample. Even so, you should have some idea of what it will be. Give your best estimation. (relevant section)

AQ104.08.14. You take a sample of 22 from a population of test scores, and the mean of your sample is 60. AQ104.08.14.1 You know the standard deviation of the population is 10. What is the 99% confidence interval on the population mean? AQ104.08.14.2 Now assume that you do not know the population standard deviation, but the standard deviation in your sample is 10. What is the 99% confidence interval on the mean now? (relevant section)

AQ104.08.15. You read about a survey in a newspaper and find that 70% of the 250 people sampled prefer Candidate A. You are surprised by this survey because you thought that more like 50% of the population preferred this candidate. AQ104.08.15.1 Based on this sample, is 50% a possible population proportion? AQ104.08.15.2 Compute the 95% confidence interval to be sure. (relevant section)

AQ104.08.16. Heights for teenage boys and girls were calculated. The mean height for the sample of 12 boys was 174 cm and the variance was 62. For the sample of 12 girls, the mean was 166 cm and the variance was 65. AQ104.08.16.1 What is the 95% confidence interval on the difference between population means? AQ104.08.16.2 What is the 99% confidence interval on the difference between population means? AQ104.08.16.3 Do you think the mean difference in the population could be about 5? Why or why not? (relevant section)

AQ104.08.17. You were interested in how long the average psychology major at your college studies per night, so you asked 10 psychology majors to tell you the amount they study. They told you the following times: 2, 1.5, 3, 2, 3.5, 1, 0.5, 3, 2, 4. AQ104.08.17.1 Find the 95% confidence interval on the population mean. AQ104.08.17.2 Find the 90% confidence interval on the population mean. (relevant section)

AQ104.08.18. True/false: As the sample size gets larger, the probability that the confidence interval will contain the population mean gets higher. (relevant section & relevant section)

AQ104.08.19. True/false: You have a sample of 9 men and a sample of 8 women. The degrees of freedom for the t value in your confidence interval on the difference between means is 16. (relevant section & relevant section)

AQ104.08.20. True/false: Greek letters are used for statistics as opposed to parameters. (relevant section)

AQ104.08.21. True/false: In order to construct a confidence interval on the difference between means, you need to assume that the populations have the same variance and are both normally distributed. (relevant section)

AQ104.08.22. True/false: The red distribution represents the t distribution and the blue distribution represents the normal distribution. (relevant section)

You may want to use the Binomial Calculator for some of these exercises.

AQ104.08.23. An experiment is conducted to test the claim that James Bond can taste the difference between a Martini that is shaken and one that is stirred. What is the null hypothesis? (relevant section)

AQ104.08.24. The following explanation is incorrect. What three words should be added to make it correct? (relevant section)

AQ104.08.24.1 The probability value is the probability of obtaining a statistic as different from the parameter specified in the null hypothesis as the statistic obtained in the experiment.

AQ104.08.24.2 The probability value is computed assuming that the null hypothesis is true.

AQ104.08.25. Why do experimenters test hypotheses they think are false? (relevant section)

AQ104.08.26. State the null hypothesis for:

AQ104.08.26.1 An experiment testing whether echinacea decreases the length of colds.

AQ104.08.26.2 A correlational study on the relationship between brain size and intelligence.

AQ104.08.26.3 An investigation of whether a self-proclaimed psychic can predict the outcome of a coin flip.

AQ104.08.26.4 A study comparing a drug with a placebo on the amount of pain relief. (A one-tailed test was used.)
(relevant section & relevant section)

AQ104.08.27. Assume the null hypothesis is that µ = 50 and that the graph shown below is the sampling distribution of the mean (M). AQ104.08.27.1 Would a sample value of M = 60 be significant in a two-tailed test at the .05 level? AQ104.08.27.2 Roughly what value of M would be needed to be significant? (relevant section & relevant section)

AQ104.08.28. A researcher develops a new theory that predicts that vegetarians will have more of a particular vitamin in their blood than non-vegetarians. An experiment is conducted and vegetarians do have more of the vitamin, but the difference is not significant. The probability value is 0.13. Should the experimenter’s confidence in the theory increase, decrease, or stay the same? (relevant section)

AQ104.08.28. A researcher hypothesizes that the lowering in cholesterol associated with weight loss is really due to exercise. To test this, the researcher carefully controls for exercise while comparing the cholesterol levels of a group of subjects who lose weight by dieting with a control group that does not diet. The difference between groups in cholesterol is not significant. Can the researcher claim that weight loss has no effect? (relevant section)

AQ104.08.29. A significance test is performed and p = .20. Why can’t the experimenter claim that the probability that the null hypothesis is true is .20? (relevant section, relevant section & relevant section)

AQ104.08.30. For a drug to be approved by the FDA, the drug must be shown to be safe and effective. If the drug is significantly more effective than a placebo, then the drug is deemed effective. What do you know about the effectiveness of a drug once it has been approved by the FDA (assuming that there has not been a Type I error)? (relevant section)

AQ104.08.31.1 When is it valid to use a one-tailed test? AQ104.08.31.2 What is the advantage of a one-tailed test? AQ104.08.31.3 Give an example of a null hypothesis that would be tested by a one-tailed test. (relevant section)

AQ104.08.32. Distinguish between probability value and significance level. (relevant section)

AQ104.08.33. Suppose a study was conducted on the effectiveness of a class on “How to take tests.” The SAT scores of an experimental group and a control group were compared. (There were 100 subjects in each group.) The mean score of the experimental group was 503 and the mean score of the control group was 499. The difference between means was found to be significant, p = .037. What do you conclude about the effectiveness of the class? (relevant section & relevant section)

AQ104.08.34.1 Is it more conservative to use an alpha level of .01 or an alpha level of .05? AQ104.08.34.2 Would beta be higher for an alpha of .05 or for an alpha of .01? (relevant section)

AQ104.08.35. Why is “H0: M1 = M2” not a proper null hypothesis? (relevant section)

AQ104.08.36.1 An experimenter expects an effect to come out in a certain direction. Is this sufficient basis for using a one-tailed test? AQ104.08.36.2 Why or why not? (relevant section)

AQ104.08.37. How do the Type I and Type II error rates of one-tailed and two-tailed tests differ? (relevant section & relevant section)

AQ104.08.38.1 A two-tailed probability is .03. What is the one-tailed probability if the effect were in the specified direction? AQ104.08.38.2 What would it be if the effect were in the other direction? (relevant section)

AQ104.08.39. You choose an alpha level of .01 and then analyze your data. AQ104.08.39.1 What is the probability that you will make a Type I error given that the null hypothesis is true? AQ104.08.39.2 What is the probability that you will make a Type I error given that the null hypothesis is false? (relevant section)

AQ104.08.40. Why doesn’t it make sense to test the hypothesis that the sample mean is 42? (relevant section & relevant section)

AQ104.08.41. True/false: It is easier to reject the null hypothesis if the researcher uses a smaller alpha (α) level. (relevant section & relevant section)

AQ104.08.42. True/false: You are more likely to make a Type I error when using a small sample than when using a large sample. (relevant section)

AQ104.08.43. True/false: You accept the alternative hypothesis when you reject the null hypothesis. (relevant section)

AQ104.08.44. True/false: You do not accept the null hypothesis when you fail to reject it. (relevant section)

AQ104.08.45. True/false: A researcher risks making a Type I error any time the null hypothesis is rejected. (relevant section)

Questions from Case Studies:

The following questions are from the Angry Moods (AM) case study.

AQ104.08.46. Is there a difference in how much males and females use aggressive behavior to improve an angry mood? For the “Anger-Out” scores, compute a 99% confidence interval on the difference between gender means. (relevant section)

AQ104.08.47. Calculate the 95% confidence interval for the difference between the mean Anger-In score for the athletes and non-athletes. What can you conclude? (relevant section)

AQ104.08.48. Find the 95% confidence interval on the population correlation between the Anger-Out and Control-Out scores. (relevant section)

The following questions are from the Flatulence (F) case study.

AQ104.08.49. Compare men and women on the variable “perday.” Compute the 95% confidence interval on the difference between means. (relevant section)

AQ104.08.50. What is the 95% confidence interval of the mean time people wait before farting in front of a romantic partner? (relevant section)

The following questions use data from the Animal Research (AR) case study.

AQ104.08.51. What percentage of the women studied in this sample strongly agreed (gave a rating of 7) that using animals for research is wrong?

AQ104.08.52. Use the proportion you computed in AQ104.08.51. Compute the 95% confidence interval on the population proportion of women who strongly agree that animal research is wrong. (relevant section)

AQ104.08.53. Compute a 95% confidence interval on the difference between the gender means with respect to their beliefs that animal research is wrong. (relevant section)
The following question is from the ADHD Treatment (AT) case study.

AQ104.08.54. What is the correlation between the participants’ correct number of responses after taking the placebo and their correct number of responses after taking 0.60 mg/kg of MPH? Compute the 95% confidence interval on the population correlation. (relevant section)

The following question is from the Weapons and Aggression (WA) case study.

AQ104.08.55. Recall that the hypothesis is that a person can name an aggressive word more quickly if it is preceded by a weapon word prime than if it is preceded by a neutral word prime. The first step in testing this hypothesis is to compute the difference between (a) the naming time of aggressive words when preceded by a neutral word prime and (b) the naming time of aggressive words when preceded by a weapon word prime separately for each of the 32 participants. That is, compute an – aw for each participant.

AQ104.08.56. Would the hypothesis of this study be supported if the difference were positive or if it were negative?

AQ104.08.57. What is the mean of this difference score? (relevant section)

AQ104.08.58. What is the standard deviation of this difference score? (relevant section)

AQ104.08.59. What is the 95% confidence interval of the mean difference score? (relevant section)

AQ104.08.60. What does the confidence interval computed in AQ104.08.59 say about the hypothesis?

The following question is from the Diet and Health (WA) case study.

AQ104.08.61. Compute a 95% confidence interval on the proportion of people who are healthy on the AHA diet.

Diet            Cancers   Deaths   Nonfatal illness   Healthy   Total
AHA             15        24       25                 239       303
Mediterranean   7         14       8                  273       302
Total           22        38       33                 512       605

The following questions are from


AQ104.08.62. Suppose that you take a random sample of 10,000 Americans and find that 1,111 are left-handed. You perform a test of significance to assess whether the sample data provide evidence that more than 10% of all Americans are left-handed, and you calculate a test statistic of 3.70 and a p-value of .0001. Furthermore, you calculate a 99% confidence interval for the proportion of left-handers in America to be (.103, .119). Consider the following statements:

a. The sample provides strong evidence that more than 10% of all Americans are left-handed.

b. The sample provides evidence that the proportion of left-handers in America is much larger than 10%.

Which of these two statements is the more appropriate conclusion to draw? Explain your answer based on the results of the significance test and confidence interval.

AQ104.08.63. A student wanted to study the ages of couples applying for marriage licenses in his county. He studied a sample of 94 marriage licenses and found that in 67 cases the husband was older than the wife. Do the sample data provide strong evidence that the husband is usually older than the wife among couples applying for marriage licenses in that county? Explain briefly and justify your answer.

AQ104.08.64. Imagine that there are 100 different researchers each studying the sleeping habits of college freshmen. Each researcher takes a random sample of size 50 from the same population of freshmen. Each researcher is trying to estimate the mean hours of sleep that freshmen get at night, and each one constructs a 95% confidence interval for the mean. Approximately how many of these 100 confidence intervals will NOT capture the true mean?

a. None

b. 1 or 2

c. 3 to 7

d. about half

e. 95 to 100

f. other
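
One way to check your reasoning on AQ104.08.64 is a small simulation. The sketch below is only an illustration: the normal population, the true mean of 7 hours, the standard deviation of 1.5 hours, and the function name count_misses are all assumptions that do not appear in the question, and the interval uses the normal critical value rather than the t value.

```python
import random
import statistics


def count_misses(n_researchers=100, n=50, true_mean=7.0, sd=1.5, seed=0):
    """Count how many 95% confidence intervals fail to capture true_mean.

    The population values are hypothetical; only the number of researchers
    and the sample size come from the question.
    """
    rng = random.Random(seed)
    z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96
    misses = 0
    for _ in range(n_researchers):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        mean = statistics.fmean(sample)
        se = statistics.stdev(sample) / n ** 0.5  # estimated standard error
        if not (mean - z * se <= true_mean <= mean + z * se):
            misses += 1
    return misses


misses_out_of_100 = count_misses()
```

Changing the seed shows how much the count varies from one set of 100 researchers to the next; raising n_researchers shows the long-run miss rate.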

Week 9: Tests of Means and Power

AQ104.09.01. Define power in your own words.

AQ104.09.02. List 3 measures one can take to increase the power of an experiment. Explain why your measures result in greater power.

AQ104.09.03. Population 1 mean = 36; Population 2 mean = 45; both population standard deviations are 10; sample size (per group) = 16. AQ104.09.03.1 What is the probability that a t test will find a significant difference between means at the 0.05 level? AQ104.09.03.2 Give results for both one- and two-tailed tests. Hint: the power of a one-tailed test at 0.05 level is the power of a two-tailed test at 0.10.
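
The structure of a power calculation like the one in AQ104.09.03 can be sketched with a normal approximation. This is not the exact t-test power the question asks for (that needs the noncentral t distribution or a power calculator, and will come out somewhat lower); the function name approx_power and the use of the known population standard deviation are assumptions made for illustration.

```python
from statistics import NormalDist


def approx_power(mean1, mean2, sd, n_per_group, alpha=0.05, two_tailed=True):
    """Normal-approximation power for comparing two independent group means."""
    std_normal = NormalDist()
    se = sd * (2 / n_per_group) ** 0.5   # standard error of the difference
    shift = abs(mean2 - mean1) / se      # true difference in standard-error units
    if two_tailed:
        crit = std_normal.inv_cdf(1 - alpha / 2)
        # Probability of landing beyond either critical value.
        return std_normal.cdf(shift - crit) + std_normal.cdf(-shift - crit)
    crit = std_normal.inv_cdf(1 - alpha)
    return std_normal.cdf(shift - crit)


two_tailed_power = approx_power(36, 45, 10, 16)
one_tailed_power = approx_power(36, 45, 10, 16, two_tailed=False)
# The hint in the question: one-tailed at .05 is close to two-tailed at .10.
two_tailed_at_10 = approx_power(36, 45, 10, 16, alpha=0.10)
```

The one-tailed value at .05 and the two-tailed value at .10 differ only by the tiny probability of rejecting in the wrong direction.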

AQ104.09.04. Rank order the following in terms of power. n is the sample size per group.

   Population 1 Mean   n    Population 2 Mean   Standard Deviation
a  29                  20   43                  12
b  34                  15   40                  6
c  105                 24   50                  27
d  170                 2    120                 10

AQ104.09.05. Alan, while snooping around his grandmother’s basement, stumbled upon a shiny object protruding from under a stack of boxes. When he reached for the object, a genie miraculously materialized and stated: “You have found my magic coin. If you flip this coin an infinite number of times you will notice that heads will show 60% of the time.” Soon after his declaration the genie vanished, never to be seen again. Alan, excited about his new magical discovery, approached his friend Ken and told him about what he had found. Ken was skeptical of his friend’s story; however, he told Alan to flip the coin 100 times and to record how many flips resulted in heads. AQ104.09.05.1 What is the probability that Alan will be able to convince Ken that his coin has special powers by finding a p value below 0.05 (one-tailed)? Use the Binomial Calculator (and some trial and error). AQ104.09.05.2 If Ken told Alan to flip the coin only 20 times, what is the probability that Alan will not be able to convince Ken (by failing to reject the null hypothesis at the 0.05 level)?
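
The "trial and error" with the Binomial Calculator in AQ104.09.05 can also be sketched in code. The sketch assumes the null hypothesis is a fair coin (probability of heads .5), which the question implies but does not state; the function names are illustrative only.

```python
from math import comb


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_count(n, alpha=0.05, p_null=0.5):
    """Smallest number of heads whose one-tailed p value is below alpha."""
    for k in range(n + 1):
        if upper_tail(k, n, p_null) < alpha:
            return k
    return None  # no count is extreme enough (only happens for very small n)


def power(n, p_true=0.6, alpha=0.05):
    """Chance of reaching the critical count when heads really has probability p_true."""
    return upper_tail(critical_count(n, alpha), n, p_true)


power_100_flips = power(100)
chance_of_failing_with_20_flips = 1 - power(20)
```

The first step finds the number of heads needed to reject the null hypothesis; the second asks how often the genie's 60% coin would produce at least that many.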

AQ104.09.06. The scores of a random sample of 8 students on a physics test are as follows: 60, 62, 67, 69, 70, 72, 75, and 78.

AQ104.09.06.1 Test to see if the sample mean is significantly different from 65 at the .05 level. Report the t and p values.

AQ104.09.06.2 The researcher realizes that she accidentally recorded the score that should have been 76 as 67. Are these corrected scores significantly different from 65 at the .05 level? (relevant section)

AQ104.09.07. A (hypothetical) experiment is conducted on the effect of alcohol on perceptual motor ability. Ten subjects are each tested twice, once after having two drinks and once after having two glasses of water. The two tests were on two different days to give the alcohol a chance to wear off. Half of the subjects were given alcohol first and half were given water first. The scores of the 10 subjects are shown below. The first number for each subject is their performance in the “water” condition. Higher scores reflect better performance. Test to see if alcohol had a significant effect. Report the t and p values. (relevant section)

water alcohol
16 13
15 13
11 10
20 18
19 17
14 11
13 10
15 15
14 11
16 16

AQ104.09.07. The scores on a (hypothetical) vocabulary test of a group of 20 year olds and a group of 60 year olds are shown below.

20 yr olds   60 yr olds
27           26
26           29
21           29
24           29
15           27
18           16
17           20
12           27
13

AQ104.09.07.1 Test the mean difference for significance using the .05 level. (relevant section).

AQ104.09.07.2 List the assumptions made in computing your answer. (relevant section)

AQ104.09.08. The sampling distribution of a statistic is normally distributed with an estimated standard error of 12 (df = 20). AQ104.09.08.1 What is the probability that you would have gotten a mean of 107 (or more extreme) if the population parameter were 100? Is this probability significant at the .05 level (two-tailed)? AQ104.09.08.2 What is the probability that you would have gotten a mean of 95 or less (one-tailed)? AQ104.09.08.3 Is this probability significant at the .05 level? You may want to use the t Distribution calculator for this problem. (relevant section)

AQ104.09.09. How do you decide whether to use an independent groups t test or a correlated t test (test of dependent means)? (relevant section & relevant section)

AQ104.09.10. An experiment compared the ability of three groups of subjects to remember briefly-presented chess positions. The data are shown below.

Non-players   Beginners   Tournament players
22.1          32.5        40.1
22.3          37.1        45.6
26.2          39.1        51.2
29.6          40.5        56.4
31.7          45.5        58.1
33.5          51.3        71.1
38.9          52.6        74.9
39.7          55.7        75.9
43.2          55.9        80.3
43.2          57.7        85.3

AQ104.09.10.1 Using the Tukey HSD procedure, determine which groups are significantly different from each other at the .05 level. (relevant section)

AQ104.09.10.2 Now compare each pair of groups using t-tests. Make sure to control for the familywise error rate (at 0.05) by using the Bonferroni correction. Specify the alpha level you used.

AQ104.09.11. Below are data showing the results of six subjects on a memory test. The three scores per subject are their scores on three trials (a, b, and c) of a memory task.

AQ104.09.11.1 Are the subjects getting better each trial?

AQ104.09.11.2 Test the linear effect of trial for the data.

a b c
4 6 7
3 7 8
2 8 5
1 4 7
4 6 9
2 4 2

AQ104.09.11.3 Compute L for each subject using the contrast weights -1, 0, and 1. That is, compute (-1)(a) + (0)(b) + (1)(c) for each subject.

AQ104.09.11.4 Compute a one-sample t-test on this column (with the L values for each subject) you created. (relevant section)
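
The two steps in AQ104.09.11.3 and AQ104.09.11.4 can be sketched as follows, assuming each row of the table above is one subject's scores on trials a, b, and c. The p value still has to be read from a t table or the t Distribution calculator with the degrees of freedom shown.

```python
import statistics

# Scores on trials a, b, c for the six subjects in AQ104.09.11.
scores = [(4, 6, 7), (3, 7, 8), (2, 8, 5), (1, 4, 7), (4, 6, 9), (2, 4, 2)]
weights = (-1, 0, 1)  # linear contrast weights from the question

# L for each subject: (-1)(a) + (0)(b) + (1)(c)
L = [sum(w * x for w, x in zip(weights, row)) for row in scores]

# One-sample t test of the mean of L against zero.
mean_L = statistics.fmean(L)
se_L = statistics.stdev(L) / len(L) ** 0.5
t = mean_L / se_L
df = len(L) - 1
```

Because the middle weight is zero, L is simply each subject's trial c score minus the trial a score.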

AQ104.09.12. Participants threw darts at a target. In one condition, they used their preferred hand; in the other condition, they used their other hand. All subjects performed in both conditions (the order of conditions was counterbalanced). Their scores are shown below.

Preferred Non-preferred
12 7
7 9
11 8
13 10
10 9

AQ104.09.12.1 Which kind of t-test should be used?

AQ104.09.12.2 Calculate the two-tailed t and p values using this t test.

AQ104.09.12.3 Calculate the one-tailed t and p values using this t test.

AQ104.09.13. Assume the data in the previous problem were collected using two different groups of subjects: One group used their preferred hand and the other group used their non-preferred hand. Analyze the data and compare the results to those for the previous problem. (relevant section)

AQ104.09.14. You have 4 means, and you want to compare each mean to every other mean. AQ104.09.14.1 How many tests total are you going to compute? AQ104.09.14.2 What would be the chance of making at least one Type I error if the Type I error rate for each test was .05 and the tests were independent? (relevant section & relevant section) AQ104.09.14.3 Are the tests independent, and how does independence/non-independence affect the probability in AQ104.09.14.2?
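
The arithmetic behind familywise error can be sketched in a few lines. The function names are illustrative, and the familywise formula holds only under the independence assumption that the question itself asks you to examine.

```python
from math import comb


def familywise_error(n_tests, alpha_per_test=0.05):
    """Chance of at least one Type I error across independent tests."""
    return 1 - (1 - alpha_per_test) ** n_tests


def bonferroni_alpha(n_tests, familywise_alpha=0.05):
    """Per-test alpha that keeps the familywise rate at or below the target."""
    return familywise_alpha / n_tests


n_means = 4
n_tests = comb(n_means, 2)  # every mean compared with every other mean
error_rate = familywise_error(n_tests)
per_test_alpha = bonferroni_alpha(n_tests)
```

The same two functions apply to any of the Bonferroni questions in this week: count the comparisons first, then divide the familywise rate by that count.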

AQ104.09.15. In an experiment, participants were divided into 4 groups. There were 20 participants in each group, so the degrees of freedom (error) for this study was 80 – 4 = 76. Tukey’s HSD test was performed on the data. AQ104.09.15.1 Calculate the p value for each pair based on the Q value given below. You will want to use the Studentized Range Calculator. AQ104.09.15.2 Which differences are significant at the .05 level? (relevant section)

Comparison of Groups   Q
A – B                  3.4
A – C                  3.8
A – D                  4.3
B – C                  1.7
B – D                  3.9
C – D                  3.7

AQ104.09.16. If you have 5 groups in your study, why shouldn’t you just compute a t test of each group mean with each other group mean? (relevant section)

AQ104.09.17. You are conducting a study to see if students do better when they study all at once or in intervals. One group of 12 participants took a test after studying for one hour continuously. The other group of 12 participants took a test after studying for three twenty minute sessions. The first group had a mean score of 75 and a variance of 120. The second group had a mean score of 86 and a variance of 100.

AQ104.09.17.1 What is the calculated t value? Are the mean test scores of these two groups significantly different at the .05 level?

AQ104.09.17.2 What would the t value be if there were only 6 participants in each group? Would the scores be significant at the .05 level?
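
When only means and variances are given, as in AQ104.09.17, the independent-groups t can be computed from the summary statistics. The sketch below assumes equal population variances (a pooled estimate) and subtracts the continuous-study group from the spaced-study group; the function name pooled_t is illustrative. Critical values must still come from a t table.

```python
def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Independent-groups t statistic and degrees of freedom from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se = (pooled_var * (1 / n1 + 1 / n2)) ** 0.5  # standard error of the difference
    return (mean1 - mean2) / se, df


# Spaced-study group (mean 86, variance 100) minus continuous group (mean 75, variance 120).
t_12, df_12 = pooled_t(86, 100, 12, 75, 120, 12)
t_6, df_6 = pooled_t(86, 100, 6, 75, 120, 6)
```

Halving the group size leaves the difference between means unchanged but enlarges the standard error, so t shrinks while the critical value grows.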

AQ104.09.17.3 A new test was designed to have a mean of 80 and a standard deviation of 10. A random sample of 20 students at your school take the test, and the mean score turns out to be 85. Does this score differ significantly from 80? To answer this problem, you may want to use the Normal Distribution Calculator. (relevant section)
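
Because the population standard deviation is given in this question, the test statistic follows a normal distribution rather than a t distribution. A minimal sketch of what the Normal Distribution Calculator does here, with an illustrative function name:

```python
from statistics import NormalDist


def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """z statistic and two-tailed p value when the population SD is known."""
    se = pop_sd / n ** 0.5
    z = (sample_mean - pop_mean) / se
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed


z, p = one_sample_z(sample_mean=85, pop_mean=80, pop_sd=10, n=20)
```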

AQ104.09.18. You perform a one-sample t test and calculate a t statistic of 3.0. The mean of your sample was 1.3 and the standard deviation was 2.6. How many participants were used in this study? (relevant section)

AQ104.09.19. True/false: The contrasts (-3, 1, 1, 1) and (0, 0, -1, 1) are orthogonal. (relevant section)
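
Orthogonality of two contrasts can be checked by summing the products of corresponding weights. The sketch assumes equal group sizes, and the function name is illustrative.

```python
def are_orthogonal(contrast1, contrast2):
    """Two contrasts are orthogonal when the sum of products of their weights is zero."""
    return sum(a * b for a, b in zip(contrast1, contrast2)) == 0


result = are_orthogonal((-3, 1, 1, 1), (0, 0, -1, 1))
counterexample = are_orthogonal((1, -1, 0), (1, 0, -1))  # sum of products is 1
```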

AQ104.09.20. True/false: If you are making 4 comparisons between means, then based on the Bonferroni correction, you should use an alpha level of .01 for each test. (relevant section)

AQ104.09.21. True/false: Correlated t tests almost always have greater power than independent t tests. (relevant section)

AQ104.09.22. True/false: The graph below represents a violation of the homogeneity of variance assumption. (relevant section)

AQ104.09.23. True/false: When you are conducting a one-sample t test and you know the population standard deviation, you look up the critical t value in the table based on the degrees of freedom. (relevant section)

Questions from Case Studies:

The following questions use data from the Angry Moods (AM) case study.

AQ104.09.24. Do athletes or non-athletes calm down more when angry? Conduct a t test to see if the difference between groups in Control-In scores is statistically significant.

AQ104.09.25. Do people in general have a higher Anger-Out or Anger-In score?

AQ104.09.26. Conduct a t test on the difference between means of these two scores.

AQ104.09.27. Are these two means independent or dependent? (relevant section)

The following questions use data from the Smiles and Leniency (SL) case study.

AQ104.09.28. Compare each mean to the neutral mean. Be sure to control for the familywise error rate. (relevant section)

AQ104.09.29. Does a “felt smile” lead to more leniency than other types of smiles? AQ104.09.29.1 Calculate L (the linear combination) using the following contrast weights false: -1, felt: 2, miserable: -1, neutral: 0. AQ104.09.29.2 Perform a significance test on this value of L. (relevant section)

The following questions are from the Animal Research (AR) case study.

AQ104.09.30. Conduct an independent samples t test comparing males to females on the belief that animal research is necessary. (relevant section)

AQ104.09.31.1 Based on the t test you conducted in the previous problem, are you able to reject the null hypothesis if alpha = 0.05? AQ104.09.31.2 What about if alpha = 0.1? (relevant section)

AQ104.09.32. Is there any evidence that the t test assumption of homogeneity of variance is violated in the t test you computed in AQ104.09.30? (relevant section)

The following questions use data from the ADHD Treatment (AT) case study.

AQ104.09.33. Compare each dosage with the dosage below it (compare d0 and d15, d15 and d30, and d30 and d60). Remember that the patients completed the task after every dosage. AQ104.09.33.1 If the familywise error rate is .05, what is the alpha level you will use for each comparison when doing the Bonferroni correction? AQ104.09.33.2 Which differences are significant at this level? (relevant section)

AQ104.09.34. Does performance increase linearly with dosage? Plot a line graph of this data.

AQ104.09.34.1 Compute L for each patient. To do this, create a new variable where you multiply the following coefficients by their corresponding dosages and then sum up the total: (-3)d0 + (-1)d15 + (1)d30 + (3)d60 (see AQ104.09.33). AQ104.09.34.2 What is the mean of L?

AQ104.09.33. Perform a significance test on L. Compute the 95% confidence interval for L. (relevant section)

AQ104.09.34. The scores of a random sample of 8 students on a physics test are as follows: 60, 62, 67, 69, 70, 72, 75, and 78. Test to see if the sample mean is significantly different from 65 at the .05 level. Report the t and p values.

AQ104.09.35. The researcher realizes that she accidentally recorded the score that should have been 76 as 67. Are these corrected scores significantly different from 65 at the .05 level? (relevant section)

AQ104.09.36. A (hypothetical) experiment is conducted on the effect of alcohol on perceptual motor ability. Ten subjects are each tested twice, once after having two drinks and once after having two glasses of water. The two tests were on two different days to give the alcohol a chance to wear off. Half of the subjects were given alcohol first and half were given water first. The scores of the 10 subjects are shown below. The first number for each subject is their performance in the “water” condition. Higher scores reflect better performance. Test to see if alcohol had a significant effect. Report the t and p values. (relevant section)

water alcohol
16 13
15 13
11 10
20 18
19 17
14 11
13 10
15 15
14 11
16 16

AQ104.09.37. The scores on a (hypothetical) vocabulary test of a group of 20 year olds and a group of 60 year olds are shown below.

20 yr olds   60 yr olds
27           26
26           29
21           29
24           29
15           27
18           16
17           20
12           27
13

AQ104.09.37.1 Test the mean difference for significance using the .05 level. (relevant section).

AQ104.09.37.2 List the assumptions made in computing your answer. (relevant section)

AQ104.09.38. The sampling distribution of a statistic is normally distributed with an estimated standard error of 12 (df = 20). AQ104.09.38.1 What is the probability that you would have gotten a mean of 107 (or more extreme) if the population parameter were 100? Is this probability significant at the .05 level (two-tailed)? AQ104.09.38.2 What is the probability that you would have gotten a mean of 95 or less (one-tailed)? AQ104.09.38.3 Is this probability significant at the .05 level? You may want to use the t Distribution calculator for this problem. (relevant section)

AQ104.09.39. How do you decide whether to use an independent groups t test or a correlated t test (test of dependent means)? (relevant section & relevant section)

AQ104.09.40. An experiment compared the ability of three groups of subjects to remember briefly-presented chess positions. The data are shown below.

Non-players   Beginners   Tournament players
22.1          32.5        40.1
22.3          37.1        45.6
26.2          39.1        51.2
29.6          40.5        56.4
31.7          45.5        58.1
33.5          51.3        71.1
38.9          52.6        74.9
39.7          55.7        75.9
43.2          55.9        80.3
43.2          57.7        85.3

AQ104.09.40.1 Using the Tukey HSD procedure, determine which groups are significantly different from each other at the .05 level. (relevant section)

AQ104.09.40.2 Now compare each pair of groups using t-tests. Make sure to control for the familywise error rate (at 0.05) by using the Bonferroni correction. Specify the alpha level you used.

AQ104.09.41. Below are data showing the results of six subjects on a memory test. The three scores per subject are their scores on three trials (a, b, and c) of a memory task. Are the subjects getting better each trial? Test the linear effect of trial for the data.

a b c
4 6 7
3 7 8
2 8 5
1 4 7
4 6 9
2 4 2

AQ104.09.41.1 Compute L for each subject using the contrast weights -1, 0, and 1. That is, compute (-1)(a) + (0)(b) + (1)(c) for each subject.

AQ104.09.41.2 Compute a one-sample t-test on this column (with the L values for each subject) you created. (relevant section)

AQ104.09.42. Participants threw darts at a target. In one condition, they used their preferred hand; in the other condition, they used their other hand. All subjects performed in both conditions (the order of conditions was counterbalanced). Their scores are shown below.

Preferred Non-preferred
12 7
7 9
11 8
13 10
10 9

AQ104.09.42.1 Which kind of t-test should be used?

AQ104.09.42.2 Calculate the two-tailed t and p values using this t test.

AQ104.09.42.3 Calculate the one-tailed t and p values using this t test.

AQ104.09.43. Assume the data in the previous problem were collected using two different groups of subjects: One group used their preferred hand and the other group used their non-preferred hand. Analyze the data and compare the results to those for the previous problem (relevant section)

AQ104.09.44. You have 4 means, and you want to compare each mean to every other mean. AQ104.09.44.1 How many tests total are you going to compute? AQ104.09.44.2 What would be the chance of making at least one Type I error if the Type I error rate for each test was .05 and the tests were independent? (relevant section & relevant section) AQ104.09.44.3 Are the tests independent, and how does independence/non-independence affect the probability in AQ104.09.44.2?

AQ104.09.45. In an experiment, participants were divided into 4 groups. There were 20 participants in each group, so the degrees of freedom (error) for this study was 80 – 4 = 76. Tukey’s HSD test was performed on the data. AQ104.09.45.1 Calculate the p value for each pair based on the Q value given below. You will want to use the Studentized Range Calculator. AQ104.09.45.2 Which differences are significant at the .05 level? (relevant section)

Comparison of Groups   Q
A – B                  3.4
A – C                  3.8
A – D                  4.3
B – C                  1.7
B – D                  3.9
C – D                  3.7

AQ104.09.46. If you have 5 groups in your study, why shouldn’t you just compute a t test of each group mean with each other group mean? (relevant section)

AQ104.09.47. You are conducting a study to see if students do better when they study all at once or in intervals. One group of 12 participants took a test after studying for one hour continuously. The other group of 12 participants took a test after studying for three twenty minute sessions. The first group had a mean score of 75 and a variance of 120. The second group had a mean score of 86 and a variance of 100.

AQ104.09.47.1 What is the calculated t value? Are the mean test scores of these two groups significantly different at the .05 level?

AQ104.09.47.2 What would the t value be if there were only 6 participants in each group? Would the scores be significant at the .05 level?

AQ104.09.48. A new test was designed to have a mean of 80 and a standard deviation of 10. A random sample of 20 students at your school take the test, and the mean score turns out to be 85. Does this score differ significantly from 80? To answer this problem, you may want to use the Normal Distribution Calculator. (relevant section)

AQ104.09.49. You perform a one-sample t test and calculate a t statistic of 3.0. The mean of your sample was 1.3 and the standard deviation was 2.6. How many participants were used in this study? (relevant section)

AQ104.09.50. True/false: The contrasts (-3, 1, 1, 1) and (0, 0, -1, 1) are orthogonal. (relevant section)

AQ104.09.51. True/false: If you are making 4 comparisons between means, then based on the Bonferroni correction, you should use an alpha level of .01 for each test. (relevant section)

AQ104.09.52. True/false: Correlated t tests almost always have greater power than independent t tests. (relevant section)

AQ104.09.53. True/false: The graph below represents a violation of the homogeneity of variance assumption. (relevant section)

AQ104.09.54. True/false: When you are conducting a one-sample t test and you know the population standard deviation, you look up the critical t value in the table based on the degrees of freedom. (relevant section)

Questions from Case Studies:

The following questions use data from the Angry Moods (AM) case study.

AQ104.09.55. Do athletes or non-athletes calm down more when angry? Conduct a t test to see if the difference between groups in Control-In scores is statistically significant.

AQ104.09.56. Do people in general have a higher Anger-Out or Anger-In score? AQ104.09.56.1 Conduct a t test on the difference between means of these two scores. AQ104.09.56.2 Are these two means independent or dependent? (relevant section)

The following questions use data from the Smiles and Leniency (SL) case study.

AQ104.09.57. Compare each mean to the neutral mean. Be sure to control for the familywise error rate. (relevant section)

AQ104.09.58. Does a “felt smile” lead to more leniency than other types of smiles? AQ104.09.58.1 Calculate L (the linear combination) using the following contrast weights false: -1, felt: 2, miserable: -1, neutral: 0. AQ104.09.58.2 Perform a significance test on this value of L. (relevant section)

The following questions are from the Animal Research (AR) case study.

AQ104.09.59. Conduct an independent samples t test comparing males to females on the belief that animal research is necessary. (relevant section)

AQ104.09.60.1 Based on the t test you conducted in the previous problem, are you able to reject the null hypothesis if alpha = 0.05? AQ104.09.60.2 What about if alpha = 0.1? (relevant section)

AQ104.09.61. Is there any evidence that the t test assumption of homogeneity of variance is violated in the t test you computed in AQ104.09.59? (relevant section)

The following questions use data from the ADHD Treatment (AT) case study.

AQ104.09.62. Compare each dosage with the dosage below it (compare d0 and d15, d15 and d30, and d30 and d60). Remember that the patients completed the task after every dosage. AQ104.09.62.1 If the familywise error rate is .05, what is the alpha level you will use for each comparison when doing the Bonferroni correction? AQ104.09.62.2 Which differences are significant at this level? (relevant section)

AQ104.09.63. Does performance increase linearly with dosage? Plot a line graph of this data.

AQ104.09.64.1 Compute L for each patient. To do this, create a new variable where you multiply the following coefficients by their corresponding dosages and then sum up the total: (-3)d0 + (-1)d15 + (1)d30 + (3)d60 (see AQ104.09.62). AQ104.09.64.2 What is the mean of L?

AQ104.09.65. Perform a significance test on L. Compute the 95% confidence interval for L. (relevant section)

Week 10: Regression

AQ104.10.01.1 What is the equation for a regression line? AQ104.10.01.2 What does each term in the line refer to? (relevant section)

AQ104.10.02. The formula for a regression equation based on a sample size of 25 observations is Y’ = 2X + 9. AQ104.10.02.1 What would be the predicted score for a person scoring 6 on X? AQ104.10.02.2 If someone’s predicted score was 14, what was this person’s score on X? (relevant section)

AQ104.10.03. What criterion is used for deciding which regression line fits best? (relevant section)

AQ104.10.04.1 What does the standard error of the estimate measure? AQ104.10.04.2 What is the formula for the standard error of the estimate? (relevant section)

AQ104.10.05.1 In a regression analysis, the sum of squares for the predicted scores is 100 and the sum of squares error is 200. What is R²? AQ104.10.05.2 In a different regression analysis, 40% of the variance was explained. The sum of squares total is 1000. AQ104.10.05.3 What is the sum of squares of the predicted values? (relevant section)

AQ104.10.06. For the X,Y data below, compute:

AQ104.10.06.1 r and determine if it is significantly different from zero.
AQ104.10.06.2 the slope of the regression line and test if it differs significantly from zero.
AQ104.10.06.3 the 95% confidence interval for the slope.
(relevant section)

X Y
2 5
4 6
4 7
5 11
6 12
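
The quantities in AQ104.10.06 all come from three sums of squares and cross-products. A minimal sketch with illustrative variable names; the p value and the critical t for the confidence interval still have to be looked up with n − 2 degrees of freedom.

```python
x = [2, 4, 4, 5, 6]
y = [5, 6, 7, 11, 12]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross-products of deviations from the means.
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)

r = sxy / (sxx * syy) ** 0.5
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# t for testing r against zero; in simple regression the slope has the same t.
df = n - 2
t = r * (df / (1 - r ** 2)) ** 0.5
```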

AQ104.10.07. What assumptions are needed to calculate the various inferential statistics of linear regression? (relevant section)

AQ104.10.08. The correlation between years of education and salary in a sample of 20 people from a certain company is .4. Is this correlation statistically significant at the .05 level? (relevant section)

AQ104.10.09. A sample of X and Y scores is taken, and a regression line is used to predict Y from X. If SSY’ = 300, SSE = 500, and N = 50, what is: (relevant section & relevant section)

AQ104.10.09.1 SSY?
AQ104.10.09.2 the standard error of the estimate?
AQ104.10.09.3 R²?
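
The three quantities in AQ104.10.09 follow from the partition of the total sum of squares into predicted and error parts. The sketch below assumes the sample version of the standard error of the estimate, which divides by N − 2; the population version divides by N instead.

```python
ss_predicted = 300  # SSY'
ss_error = 500      # SSE
n = 50

ss_total = ss_predicted + ss_error          # SSY = SSY' + SSE
r_squared = ss_predicted / ss_total         # proportion of variance explained
se_estimate = (ss_error / (n - 2)) ** 0.5   # sample formula, assumed here
```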

AQ104.10.10. Using linear regression, find the predicted post-test score for someone with a score of 43 on the pre-test. (relevant section)

Pre Post
59 56
52 63
44 55
51 50
42 66
42 48
41 58
45 36
27 13
63 50
54 81
44 56
50 64
47 50
55 63
49 57
45 73
57 63
46 46
60 60
65 47
64 73
50 58
74 85
59 44
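
A least-squares prediction for AQ104.10.10 can be sketched as follows. The data are copied from the Pre/Post table above, and the function name fit_line is illustrative.

```python
# (pre, post) pairs from the table in AQ104.10.10.
data = [
    (59, 56), (52, 63), (44, 55), (51, 50), (42, 66),
    (42, 48), (41, 58), (45, 36), (27, 13), (63, 50),
    (54, 81), (44, 56), (50, 64), (47, 50), (55, 63),
    (49, 57), (45, 73), (57, 63), (46, 46), (60, 60),
    (65, 47), (64, 73), (50, 58), (74, 85), (59, 44),
]
pre = [p for p, _ in data]
post = [q for _, q in data]


def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(pre, post)
predicted_post = intercept + slope * 43  # prediction for a pre-test score of 43
```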

AQ104.10.11. The equation for a regression line predicting the number of hours of TV watched by children (Y) from the number of hours of TV watched by their parents (X) is Y’ = 4 + 1.2X. The sample size is 12.

AQ104.10.11.1 If the standard error of b is .4, is the slope statistically significant at the .05 level? (relevant section)
AQ104.10.11.2 If the mean of X is 8, what is the mean of Y? (relevant section)
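
Both parts of AQ104.10.11 are short calculations: the slope is tested with t = b divided by its standard error on N − 2 degrees of freedom, and the regression line always passes through the point (mean of X, mean of Y). A sketch with illustrative variable names; the critical t value must still be looked up.

```python
intercept, slope = 4.0, 1.2  # from Y' = 4 + 1.2X
se_slope = 0.4
n = 12

t = slope / se_slope         # test statistic for the slope
df = n - 2

mean_x = 8
mean_y = intercept + slope * mean_x  # the line passes through the two means
```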

AQ104.10.12. Based on the table below, compute the regression line that predicts Y from X. (relevant section)

MX MY sX sY r
10 12 2.5 3.0 -0.6
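
A regression line can be built from summary statistics alone: the slope is r times the ratio of the standard deviations, and the intercept follows from the means. A minimal sketch with an illustrative function name:

```python
def line_from_summary(mean_x, mean_y, sd_x, sd_y, r):
    """Slope and intercept for predicting Y from X using summary statistics."""
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = line_from_summary(mean_x=10, mean_y=12, sd_x=2.5, sd_y=3.0, r=-0.6)
```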

AQ104.10.13. Does A or B have a larger standard error of the estimate? (relevant section)

AQ104.10.14. True/false: If the slope of a simple linear regression line is statistically significant, then the correlation will also always be significant. (relevant section)

AQ104.10.15.1 True/false: If the slope of the relationship between X and Y is larger for Population 1 than for Population 2, the correlation will necessarily be larger in Population 1 than in Population 2. AQ104.10.15.2 Why or why not? (relevant section)

AQ104.10.16. True/false: If the correlation is .8, then 40% of the variance is explained. (relevant section)

AQ104.10.17. True/false: If the actual Y score was 31, but the predicted score was 28, then the error of prediction is 3. (relevant section)

Questions from Case Studies:

The following question is from the Angry Moods (AM) case study.

AQ104.10.18. Find the regression line for predicting Anger-Out from Control-Out.

AQ104.10.18.1 What is the slope?
AQ104.10.18.2 What is the intercept?
AQ104.10.18.3 Is the relationship at least approximately linear?
AQ104.10.18.4 Test to see if the slope is significantly different from 0.
AQ104.10.18.5 What is the standard error of the estimate?
(relevant section, relevant section, relevant section)

The following question is from the SAT and GPA (SG) case study.

AQ104.10.19. Find the regression line for predicting the overall university GPA from the high school GPA.

AQ104.10.19.1 What is the slope?
AQ104.10.19.2 What is the y-intercept?
AQ104.10.19.3 If someone had a 2.2 GPA in high school, what is the best estimate of his or her college GPA?
AQ104.10.19.4 If someone had a 4.0 GPA in high school, what is the best estimate of his or her college GPA?
(relevant section)

The following questions are from the Driving (D) case study.

AQ104.10.20.1 What is the correlation between age and how often the person chooses to drive in inclement weather? AQ104.10.20.2 Is this correlation statistically significant at the .01 level? AQ104.10.20.3 Are older people more or less likely to report that they drive in inclement weather? (relevant section, relevant section)

AQ104.10.21.1 What is the correlation between how often a person chooses to drive in inclement weather and the percentage of accidents the person believes occur in inclement weather? AQ104.10.21.2 Is this correlation significantly different from 0? (relevant section, relevant section)

AQ104.10.22. (D#10) Use linear regression to predict how often someone rides public transportation in inclement weather from what percentage of accidents that person thinks occur in inclement weather. (Pubtran by Accident)

AQ104.10.22.1 Create a scatter plot of this data and add a regression line.
AQ104.10.22.2 What is the slope?
AQ104.10.22.3 What is the intercept?
AQ104.10.22.4 Is the relationship at least approximately linear?
AQ104.10.22.5 Test if the slope is significantly different from 0.
AQ104.10.22.6 Comment on possible assumption violations for the test of the slope.
AQ104.10.22.7 What is the standard error of the estimate?
(relevant section, relevant section, relevant section)

Week 11: Analysis of Variance

 

AQ104.11.01. What is the null hypothesis tested by analysis of variance?

AQ104.11.02. What are the assumptions of between-subjects analysis of variance?

AQ104.11.03. What is a between-subjects variable?

AQ104.11.04. Why not just compute t-tests among all pairs of means instead of computing an analysis of variance?

AQ104.11.05. What is the difference between “N” and “n”?

AQ104.11.06. How is it that estimates of variance can be used to test a hypothesis about means?

AQ104.11.07. Explain why the variance of the sample means has to be multiplied by “n” in the computation of MSB.

AQ104.11.08. What kind of skew does the F distribution have?

AQ104.11.09. When do MSB and MSE estimate the same quantity?

AQ104.11.10. If an experiment is conducted with 6 conditions and 5 subjects in each condition, what are dfn and dfe?

AQ104.11.11. How is the shape of the F distribution affected by the degrees of freedom?

AQ104.11.12. What are the two components of the total sum of squares in a one-factor between-subjects design?

AQ104.11.13. How is the mean square computed from the sum of squares?

AQ104.11.14. An experimenter is interested in the effects of two independent variables on self-esteem. What is better about conducting a factorial experiment than conducting two separate experiments, one for each independent variable?

AQ104.11.15. An experiment is conducted on the effect of age and treatment condition (experimental versus control) on reading speed. Which statistical term (main effect, simple effect, interaction, specific comparison) applies to each of the descriptions of effects?

a. The effect of the treatment was larger for 15-year olds than it was for 5- or 10-year olds.

b. Overall, subjects in the treatment condition performed faster than subjects in the control condition.

c. The difference between the 10- and 15-year olds was significant under the treatment condition.

d. The difference between the 15-year olds and the average of the 5- and 10-year olds was significant.

e. As they grow older, children read faster.

AQ104.11.16. An A(3) x B(4) factorial design with 6 subjects in each group is analyzed. Give the source and degrees of freedom columns of the analysis of variance summary table.
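
The degrees of freedom for a balanced two-factor between-subjects design follow directly from the number of levels and the cell size. A minimal sketch, assuming equal n in every cell; the helper name `factorial_df` is hypothetical.

```python
def factorial_df(a_levels, b_levels, n_per_cell):
    """Degrees of freedom for a balanced A x B between-subjects design."""
    total_n = a_levels * b_levels * n_per_cell
    return {
        "A": a_levels - 1,
        "B": b_levels - 1,
        "A x B": (a_levels - 1) * (b_levels - 1),
        "Error": a_levels * b_levels * (n_per_cell - 1),
        "Total": total_n - 1,
    }


# A(3) x B(4) design with 6 subjects per group, as in AQ104.11.16.
df_table = factorial_df(3, 4, 6)
for source, df in df_table.items():
    print(f"{source:6s} {df}")
```

The effect and error degrees of freedom always sum to the total, which is a quick check on any summary table.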

AQ104.11.17. The following data are from a hypothetical study on the effects of age and time on scores on a test of reading comprehension. Compute the analysis of variance summary table.

            12-year olds        16-year olds
30 minutes  66, 68, 59, 72, 46  74, 71, 67, 82, 76
60 minutes  69, 61, 69, 73, 61  95, 92, 95, 98, 94
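
The summary table for AQ104.11.17 can be built from sums of squares. The sketch below is one possible approach for a balanced design only. It assumes each cell holds the five scores listed for it in the order shown in the question; the function and key names are illustrative.

```python
# Scores keyed by (time limit, age group). ASSUMPTION: five scores per cell,
# assigned to cells in the order they are listed in the question.
cells = {
    ("30 minutes", "12-year olds"): [66, 68, 59, 72, 46],
    ("30 minutes", "16-year olds"): [74, 71, 67, 82, 76],
    ("60 minutes", "12-year olds"): [69, 61, 69, 73, 61],
    ("60 minutes", "16-year olds"): [95, 92, 95, 98, 94],
}


def mean(values):
    return sum(values) / len(values)


def two_way_anova(cells):
    """Summary table for a balanced two-factor between-subjects design.

    Factor A is the first element of each key (time), factor B the second (age).
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # scores per cell
    scores = [s for v in cells.values() for s in v]
    grand = mean(scores)

    def marginal(index, level):
        # Mean of all scores at one level of factor A (index 0) or B (index 1).
        return mean([s for key, v in cells.items() if key[index] == level for s in v])

    ss_a = n * len(b_levels) * sum((marginal(0, a) - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((marginal(1, b) - grand) ** 2 for b in b_levels)
    ss_cells = n * sum((mean(v) - grand) ** 2 for v in cells.values())
    ss_ab = ss_cells - ss_a - ss_b
    ss_error = sum((s - mean(v)) ** 2 for v in cells.values() for s in v)

    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    df_error = len(scores) - len(cells)
    ms_error = ss_error / df_error

    table = {}
    for name, ss, df in [("A", ss_a, df_a), ("B", ss_b, df_b), ("A x B", ss_ab, df_a * df_b)]:
        table[name] = {"SS": ss, "df": df, "MS": ss / df, "F": (ss / df) / ms_error}
    table["Error"] = {"SS": ss_error, "df": df_error, "MS": ms_error}
    table["Total"] = {"SS": sum((s - grand) ** 2 for s in scores), "df": len(scores) - 1}
    return table


anova = two_way_anova(cells)
for source, row in anova.items():
    print(source, {k: round(v, 3) for k, v in row.items()})
```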

AQ104.11.17.1 Define “Three-way interaction”

AQ104.11.17.2 Define interaction in terms of simple effects.

AQ104.11.17.3 Plot an interaction for an A(2) x B(2) design in which the effect of B is greater at A1 than it is at A2. The dependent variable is “Number correct.” Make sure to label both axes.

AQ104.11.18. Following are two graphs of population means for 2 x 3 designs. For each graph, indicate which effect(s) (A, B, or A x B) are nonzero.

AQ104.11.19. The following data are from an A(2) x B(4) factorial design.

    B1          B2          B3          B4
A1  1, 3, 4, 5  2, 2, 4, 5  3, 4, 2, 6  4, 5, 6, 8
A2  1, 1, 2, 2  2, 3, 2, 4  4, 6, 7, 8  8, 9, 9, 8

AQ104.11.19.1 Compute an analysis of variance.

AQ104.11.19.2 Test differences among the four levels of B using the Bonferroni correction.

AQ104.11.19.3 Test the linear component of trend for the effect of B.

AQ104.11.19.4 Plot the interaction.

AQ104.11.19.5 Describe the interaction in words.

AQ104.11.20. Why are within-subjects designs usually more powerful than between-subjects designs?

AQ104.11.21.1 What source of variation is found in an ANOVA summary table for a within-subjects design that is not in an ANOVA summary table for a between-subjects design? AQ104.11.21.2 What happens to this source of variation in a between-subjects design?

AQ104.11.22. The following data contain three scores from each of five subjects. The three scores per subject are their scores on three trials of a memory task.

4 6 7
3 7 7
2 8 5
1 4 7
4 6 9

AQ104.11.22.1 Compute an ANOVA

AQ104.11.22.2 Test all pairwise differences between means using the Bonferroni test at the .01 level.

AQ104.11.22.3 Test the linear and quadratic components of trend for these data.

AQ104.11.23. Give the source and df columns of the ANOVA summary table for the following experiments:

AQ104.11.23.1 Twenty two subjects are each tested on a simple reaction time task and on a choice reaction time task.

AQ104.11.23.2 Twelve male and 12 female subjects are each tested under three levels
of drug dosage: 0 mg, 10 mg, and 20 mg.

AQ104.11.23.3 Twenty subjects are tested on a motor learning task for three trials a day for two days.

AQ104.11.23.4 An experiment is conducted in which depressed people are assigned to a drug therapy group, a behavioral therapy group, or a control group. Ten subjects are assigned to each group. The level of depression is measured once a month for four months.

Questions from Case Studies:

The following question is from the Stroop Interference case study.

AQ104.11.24. The dataset has the scores (times) for males and females on each of three tasks.

AQ104.11.24.1 Do a Gender (2) x Task (3) analysis of variance.
AQ104.11.24.2 Plot the interaction.

The following question is from the ADHD Treatment case study.

AQ104.11.25. The data has four scores per subject.

AQ104.11.25.1 Is the design between-subjects or within-subjects?

AQ104.11.25.2 Create an ANOVA summary table.

The following question is from the Angry Moods case study.

AQ104.11.26.1 Using the Anger Expression Index as the dependent variable, perform a 2×2 ANOVA with gender and sports participation as the two factors. AQ104.11.26.2 Do athletes and non-athletes differ significantly in how much anger they express? AQ104.11.26.3 Do the genders differ significantly in Anger Expression Index? AQ104.11.26.4 Is the effect of sports participation significantly different for the two genders?

The following question is from the Weapons and Aggression case study.

AQ104.11.27. Compute a 2×2 ANOVA on this data with the following two factors: prime type (was the first word a weapon or not?) and word type (was the second word aggressive or non-aggressive?). Consider carefully whether the variables are between-subjects or within-subjects variables.

The following question is from the Smiles and Leniency case study.

AQ104.11.28. Compute the ANOVA summary table.

Week 12: Transformation, Chi Square, Distribution Free Tests, and Effect Size

 

AQ104.12.01. When is a log transformation valuable?

AQ104.12.02. If the arithmetic mean of log10 transformed data were 3, what would be the geometric mean?

AQ104.12.03. Using Tukey’s ladder of transformation, transform the following data using a λ of 0.5: 9, 16, 25
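
A transformation on Tukey's ladder raises each value to the power λ. The sketch below assumes the common convention that λ = 0 stands for the log transformation; the function name `tukey_transform` is hypothetical, and some presentations additionally negate the result for negative λ to preserve the ordering of the data, which this sketch does not do.

```python
import math


def tukey_transform(values, lam):
    """Apply a Tukey ladder transformation with exponent lam to positive values.

    ASSUMPTION: lam == 0 is treated as the natural log transformation.
    """
    if lam == 0:
        return [math.log(x) for x in values]
    return [x ** lam for x in values]


# Data and λ from AQ104.12.03: λ = 0.5 is the square-root transformation.
transformed = tukey_transform([9, 16, 25], 0.5)
print(transformed)
```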

AQ104.12.04. What value of λ in Tukey’s ladder decreases skew the most?

AQ104.12.05. What value of λ in Tukey’s ladder increases skew the most?

AQ104.12.06. In the ADHD case study, transform the data in the placebo condition (D0) with λ’s of .5, 0, -.5, and -1. How does the skew in each of these compare to the skew in the raw data? Which transformation leads to the least skew?

AQ104.12.07. Which of the two Chi Square distributions shown below (A or B) has the larger degrees of freedom? How do you know? (relevant section)

AQ104.12.07. Twelve subjects were each given two flavors of ice cream to taste and then were asked whether they liked them. Two of the subjects liked the first flavor and nine of them liked the second flavor. AQ104.12.07.1 Is it valid to use the Chi Square test to determine whether this difference in proportions is significant? AQ104.12.07.2 Why or why not? (relevant section)

AQ104.12.08. A die is suspected of being biased. It is rolled 25 times with the following result:

Outcome Frequency
1 9
2 4
3 1
4 8
5 3
6 0

AQ104.12.08. Conduct a significance test to see if the die is biased. AQ104.12.08.1 What Chi Square value do you get and how many degrees of freedom does it have? AQ104.12.08.2 What is the p value? (relevant section)
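
For a one-way (goodness-of-fit) Chi Square test such as the die problem, each expected frequency is the total number of rolls divided by the number of outcomes. A minimal sketch using only the standard library follows; it computes the statistic and degrees of freedom and compares against a tabled critical value rather than computing an exact p value, which would need a Chi Square distribution function from a statistics library.

```python
# Observed frequencies for outcomes 1 through 6, from the table above.
observed = [9, 4, 1, 8, 3, 0]

total = sum(observed)
expected = total / len(observed)   # a fair die gives equal expected counts

chi_square = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 1

critical_05 = 11.07  # tabled .05 critical value of Chi Square with 5 df
print(f"Chi Square({df}) = {chi_square:.2f}, exceeds .05 critical value: {chi_square > critical_05}")
```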

AQ104.12.09. A recent experiment investigated the relationship between smoking and urinary incontinence. Of the 322 subjects in the study who were incontinent, 113 were smokers, 51 were former smokers, and 158 had never smoked. Of the 284 control subjects who were not incontinent, 68 were smokers, 23 were former smokers, and 193 had never smoked. AQ104.12.09.1 Create a table displaying this data. AQ104.12.09.2 What is the expected frequency in each cell? Conduct a significance test to see if there is a relationship between smoking and incontinence. AQ104.12.09.3 What Chi Square value do you get? AQ104.12.09.4 What p value do you get? AQ104.12.09.5 What do you conclude? (relevant section)

AQ104.12.10. At a school pep rally, a group of sophomore students organized a free raffle for prizes. They claim that they put the names of all of the students in the school in the basket and that they randomly drew 36 names out of this basket. Of the prize winners, 6 were freshmen, 14 were sophomores, 9 were juniors, and 7 were seniors. The results do not seem that random to you. You think it is a little fishy that sophomores organized the raffle and also won the most prizes. Your school is composed of 30% freshmen, 25% sophomores, 25% juniors, and 20% seniors. AQ104.12.10.1 What are the expected frequencies of winners from each class? AQ104.12.10.2 Conduct a significance test to determine whether the winners of the prizes were distributed throughout the classes as would be expected based on the percentage of students in each group. Report your Chi Square and p values. AQ104.12.10.3 What do you conclude? (relevant section)

AQ104.12.11. Some parents of the West Bay little leaguers think that they are noticing a pattern. There seems to be a relationship between the number on the kids’ jerseys and their position. These parents decide to record what they see. The hypothetical data appear below. Conduct a Chi Square test to determine if the parents’ suspicion that there is a relationship between jersey number and position is right. Report your Chi Square and p values. (relevant section)

Jersey number  Infield  Outfield  Pitcher  Total
0-9            12       5         5        22
10-19          5        10        2        17
20+            4        4         7        15
Total          21       19        14       54
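
In a test of independence, the expected frequency for each cell is (row total × column total) / grand total, and the degrees of freedom are (rows - 1)(columns - 1). The sketch below applies this to the jersey data with the standard library only; the function name is illustrative, and the p value is left to a Chi Square table or a statistics library.

```python
# Observed counts: rows are jersey ranges (0-9, 10-19, 20+),
# columns are positions (Infield, Outfield, Pitcher).
observed = [
    [12, 5, 5],
    [5, 10, 2],
    [4, 4, 7],
]


def chi_square_independence(table):
    """Return (chi_square, df, expected) for a two-way contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    expected = [[r * c / grand for c in col_totals] for r in row_totals]
    chi_square = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_square, df, expected


chi_square, df, expected = chi_square_independence(observed)
print(f"Chi Square({df}) = {chi_square:.3f}")
```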

AQ104.12.12. True/false: A Chi Square distribution with 2 df has a larger mean than a Chi Square distribution with 12 df. (relevant section)

AQ104.12.13. True/false: A Chi Square test is often used to determine if there is a significant relationship between two continuous variables. (relevant section)

AQ104.12.14. True/false: Imagine that you want to determine if the spinner shown below is biased. You spin it 50 times and write down how many times the arrow lands in each section. You will reject the null hypothesis at the .05 level and determine that this spinner is biased if you calculate a Chi Square value of 7.82 or higher. (relevant section)

Questions from Case Studies:

The following question uses data from the SAT and GPA (SG) case study.

AQ104.12.15. Answer these items to determine if the math SAT scores are normally distributed. You may want to first standardize the scores. (relevant section)

AQ104.12.15.1 If these data were normally distributed, how many scores would you expect there to be in each of these brackets: (i) smaller than 1 SD below the mean, (ii) in between the mean and 1 SD below the mean, (iii) in between the mean and 1 SD above the mean, (iv) greater than 1 SD above the mean?

AQ104.12.15.2 How many scores are actually in each of these brackets?

AQ104.12.15.3 Conduct a Chi Square test to determine if the math SAT scores are normally distributed based on these expected and observed frequencies. (relevant section)

The following questions are from the Diet and Health (DH) case study.

AQ104.12.16. Conduct a Pearson Chi Square test to determine if there is any relationship between diet and outcome. Report the Chi Square and p values and state your conclusions. (relevant section)

The following questions are from ARTIST.

AQ104.12.17. A study compared members of a medical clinic who filed complaints with a random sample of members who did not complain. The study divided the complainers into two subgroups: those who filed complaints about medical treatment and those who filed nonmedical complaints. Here are the data on the total number in each group and the number who voluntarily left the medical clinic. Set up a two-way table. Analyze these data to see if there is a relationship between complaint (no, yes – medical, yes – nonmedical) and leaving the clinic (yes or no).

AQ104.12.18. Imagine that you believe there is a relationship between a person’s eye color and where he or she prefers to sit in a large lecture hall. You decide to collect data from a random sample of individuals and conduct a chi-square test of independence. What would your two-way table look like? Use the information to construct such a table, and be sure to label the different levels of each category.

AQ104.12.19. A geologist collects hand-specimen sized pieces of limestone from a particular area. A qualitative assessment of both texture and color is made with the following results. Is there evidence of association between color and texture for these limestones? Explain your answer.

AQ104.12.20. Suppose that college students are asked to identify their preferences in political affiliation (Democrat, Republican, or Independent) and in ice cream (chocolate, vanilla, or strawberry). Suppose that their responses are represented in the following two-way table (with some of the totals left for you to calculate).

AQ104.12.20.1 What proportion of the respondents prefer chocolate ice cream?

AQ104.12.20.2 What proportion of the respondents are Independents?

AQ104.12.20.3 What proportion of Independents prefer chocolate ice cream?

AQ104.12.20.4 What proportion of those who prefer chocolate ice cream are Independents?

AQ104.12.20.5 Analyze the data to determine if there is a relationship between political party preference and ice cream preference.

AQ104.12.21. NCAA collected data on graduation rates of athletes in Division I in the mid-1980s. Among 2,332 men, 1,343 had not graduated from college, and among 959 women, 441 had not graduated.

AQ104.12.21.1 Set up a two-way table to examine the relationship between gender and graduation.

AQ104.12.21.2 Identify a test procedure that would be appropriate for analyzing the relationship between gender and graduation. Carry out the procedure and state your conclusion.

AQ104.12.22. For the following data, how many ways could the data be arranged (including the original arrangement) so that the advantage of the Experimental Group mean over the Control Group mean is as large or larger than the original arrangement?

Experimental Control
5 1
10 2
15 3
16 4
17 9

AQ104.12.23. For the data in Problem AQ104.12.22, how many ways can the data be rearranged?

AQ104.12.24. What is the one-tailed probability for a test of the difference?
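
The randomization-test questions AQ104.12.22 to AQ104.12.24 can be checked by brute-force enumeration, since the data set is small. A minimal sketch, with illustrative variable names:

```python
from itertools import combinations
from math import comb

experimental = [5, 10, 15, 16, 17]
control = [1, 2, 3, 4, 9]

pooled = experimental + control
k = len(experimental)

# Number of ways to choose which scores go in the experimental group.
total_arrangements = comb(len(pooled), k)

# The pooled total is fixed, so an experimental-group sum at least as large
# as the observed one means a mean difference at least as large.
observed_sum = sum(experimental)
as_extreme = sum(1 for group in combinations(pooled, k) if sum(group) >= observed_sum)

p_one_tailed = as_extreme / total_arrangements
print(as_extreme, total_arrangements, round(p_one_tailed, 4))
```

Enumeration is practical here because there are only a few hundred arrangements; for larger samples the count grows quickly and a random subset of arrangements is typically sampled instead.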

AQ104.12.25. For the following data, how many ways can the data be rearranged?

T1 T2 Control
7 14 0
8 19 2
11 21 5
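
With more than two groups, the number of distinct rearrangements is the multinomial coefficient N! / (n1! n2! ... nk!). A short sketch, with a hypothetical helper name:

```python
from math import factorial


def count_arrangements(group_sizes):
    """Number of ways to divide sum(group_sizes) scores into groups of the given sizes."""
    ways = factorial(sum(group_sizes))
    for size in group_sizes:
        ways //= factorial(size)  # exact integer division at every step
    return ways


# Three groups (T1, T2, Control) of three scores each, as in AQ104.12.25.
print(count_arrangements([3, 3, 3]))
```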

AQ104.12.26. In general, are rank randomization tests or randomization tests more powerful?

AQ104.12.27. What is the advantage of rank randomization tests over randomization tests?

AQ104.12.28. Test whether the difference between conditions for the data in Problem AQ104.12.22 is significant (one-tailed) at the .01 level using a rank randomization test.

Questions from Case Studies:

The following question uses data from the SAT and GPA case study.

AQ104.12.29. Compute Spearman’s ρ for the relationship between UGPA and SAT.

The following question uses data from the Stereograms case study.

AQ104.12.30. Test the difference in central tendency between the two conditions using a rank-randomization test (with the normal approximation) with a one-tailed test. Give the Z and the p.

The following question uses data from the Smiles and Leniency case study.

AQ104.12.31. Test the difference in central tendency between the four conditions using a rank-randomization test (with the normal approximation). Give the Chi Square and the p.

AQ104.12.32. If the probability of a disease is .34 without treatment and .22 with treatment, what is the

AQ104.12.32.1 absolute risk reduction
AQ104.12.32.2 relative risk reduction
AQ104.12.32.3 odds ratio
AQ104.12.32.4 Number needed to treat
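
The four effect-size measures for proportions in AQ104.12.32 can be sketched as follows. The odds ratio is computed here as the odds of disease without treatment divided by the odds with treatment; that direction is an assumption, and the reciprocal gives the ratio in the other direction.

```python
p_untreated = 0.34   # probability of disease without treatment
p_treated = 0.22     # probability of disease with treatment

absolute_risk_reduction = p_untreated - p_treated
relative_risk_reduction = absolute_risk_reduction / p_untreated

odds_untreated = p_untreated / (1 - p_untreated)
odds_treated = p_treated / (1 - p_treated)
odds_ratio = odds_untreated / odds_treated   # ASSUMPTION: untreated odds over treated odds

number_needed_to_treat = 1 / absolute_risk_reduction

print(f"ARR = {absolute_risk_reduction:.2f}")
print(f"RRR = {relative_risk_reduction:.3f}")
print(f"Odds ratio = {odds_ratio:.3f}")
print(f"NNT = {number_needed_to_treat:.2f}")
```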

AQ104.12.33. When is it meaningful to compute the proportional difference between means?

AQ104.12.34. The mean for an experimental group is 12, the mean for the control group is 8, the MSE from the ANOVA is 16, and N, the number of observations, is 20. Compute g and d.
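
A sketch of the computation for AQ104.12.34 follows. It assumes one common textbook convention: Hedges' g is the difference between means divided by the square root of MSE, and Cohen's d is g multiplied by the square root of N / (N - 2), where N is the total number of observations. Other sources define these measures slightly differently, so check the convention used in the course readings.

```python
import math

mean_experimental = 12
mean_control = 8
mse = 16        # mean square error from the ANOVA
n_total = 20    # total number of observations

# ASSUMPTION: g = (M1 - M2) / sqrt(MSE) and d = g * sqrt(N / (N - 2)).
g = (mean_experimental - mean_control) / math.sqrt(mse)
d = g * math.sqrt(n_total / (n_total - 2))

print(f"g = {g:.3f}, d = {d:.3f}")
```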

AQ104.12.35. Two experiments investigated the same variables, but one of the experiments had subjects who differed greatly from each other whereas the subjects in the other experiment were relatively homogeneous. Which experiment would likely have the larger value of g?

AQ104.12.36. Why is ω² preferable to η²?

AQ104.12.37. What is the difference between η² and partial η²?

The following questions are from the Teacher Ratings case study.

AQ104.12.38. What are the values of d and g?

AQ104.12.39. What are the values of ω² and η²?

The following question is from the Smiles and Leniency case study.

AQ104.12.40. What are the values of ω² and η²?

The following question is from the Obesity and Bias case study.

AQ104.12.41. Compute ω² and partial ω² for the effect of “Weight” in a “Weight x Relatedness” ANOVA.

Page created by: Ian Clark, last modified on 22 May 2016.

Image: Scope, at http://www.scope-mr.ch/en/services/methods/, accessed 10 March 2016.