Chapter 22 - Comparing Two Proportions

For this example, we assume that 45% of infants with a treatment similar to the Abecedarian project will enroll in college, compared to 20% in the control group. For the sampling distribution of all differences, the mean of all differences, \( \mu_{\hat{p}_1 - \hat{p}_2} \), is the difference of the means, \( \mu_{\hat{p}_1} - \mu_{\hat{p}_2} = p_1 - p_2 \).

To answer this question, we need to see how much variation we can expect in random samples if there is no difference in the rate at which serious health problems occur, so we use the sampling distribution of differences in sample proportions.

(a) Describe the shape of the sampling distribution and justify your answer.

A quality control manager takes separate random samples of 150 cars from each plant.

A paired-samples t-test is used to analyze whether there is a difference between paired scores.

Compute a statistic/metric of the drawn sample in Step 1 and save it.

The following is an excerpt from a press release on the AFL-CIO website published in October of 2003. Suppose that 20 of the Wal-Mart employees and 35 of the other employees have insurance through their employer.

Thus, the sample statistic is \( \hat{p}_{\text{boy}} - \hat{p}_{\text{girl}} = 0.40 - 0.30 = 0.10 \).

This is a proportion of 0.00003.

In Inference for One Proportion, we learned to estimate and test hypotheses regarding the value of a single population proportion.

Accessibility Statement. For more information contact us at info@libretexts.org or check out our status page at https://status.libretexts.org.

The value z* is the appropriate value from the standard normal distribution for your desired confidence level.
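As a small illustration of how z* follows from a chosen confidence level, here is a minimal sketch using Python's standard library. The function name `z_star` is an assumption introduced for this example; it is not taken from any of the sources quoted here.

```python
from statistics import NormalDist


def z_star(confidence_level: float) -> float:
    """Return the two-sided critical value z* for a level such as 0.95."""
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    # Split the leftover probability equally between the two tails.
    upper_tail = (1.0 - confidence_level) / 2.0
    # z* is the point with `upper_tail` probability above it.
    return NormalDist().inv_cdf(1.0 - upper_tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {z_star(level):.3f}")
```

For a 95% confidence level this gives a value close to the familiar 1.96.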
A normal model is a good fit for the sampling distribution if the numbers of expected successes and failures in each sample are all at least 10. Regardless of shape, the mean of the distribution of sample differences is the difference between the population proportions, \( p_1 - p_2 \).

Consider random samples of size 100 taken from the distribution.

9.4: Distribution of Differences in Sample Proportions (1 of 5)

The standard deviation of the difference in sample proportions is \( \sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}} \). Example: two drugs with cure rates of 60% and 65%.

Notice that we are sampling from populations with assumed parameter values, but we are investigating the difference in population proportions. We also need to understand how the center and spread of the sampling distribution relate to the population proportions. That is, we assume that a high-quality preschool experience will produce a 25% increase in college enrollment (25 percentage points, from 20% to 45%). The samples are independent.
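The success/failure condition for using a normal model can be checked mechanically. The sketch below is only an illustration: the helper name `normal_model_ok` is hypothetical, the proportions 0.45 and 0.20 are the assumed Abecedarian values, the sample sizes 70 and 100 are the ones quoted elsewhere in this text, and the second call uses made-up sample sizes of 20 to show a failing case.

```python
def normal_model_ok(p1: float, n1: int, p2: float, n2: int,
                    minimum: float = 10.0) -> bool:
    """Check that expected successes and failures in both samples reach `minimum`."""
    expected_counts = [
        n1 * p1,        # expected successes, sample 1
        n1 * (1 - p1),  # expected failures, sample 1
        n2 * p2,        # expected successes, sample 2
        n2 * (1 - p2),  # expected failures, sample 2
    ]
    return all(count >= minimum for count in expected_counts)


# Assumed population proportions with samples of 70 and 100.
print(normal_model_ok(0.45, 70, 0.20, 100))
# Hypothetical smaller samples of 20 each: 20 * 0.20 = 4 expected successes.
print(normal_model_ok(0.45, 20, 0.20, 20))
```

The first call passes because every expected count is at least 10; the second fails because several expected counts fall below 10.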
In other words, the standard error is a numerical value that represents the standard deviation of the sampling distribution of a statistic, such as a sample mean \( \bar{x} \) or proportion \( \hat{p} \), or the difference between two sample means \( \bar{x}_1 - \bar{x}_2 \) or proportions \( \hat{p}_1 - \hat{p}_2 \) (using either the standard deviation or the value of p), in statistical surveys and experiments.

The Christchurch Health and Development Study (Fergusson, D. M., and L. J. Horwood, "The Christchurch Health and Development Study: Review of Findings on Child and Adolescent Mental Health," Australian and New Zealand Journal of Psychiatry 35[3]:287-296), which began in 1977, suggests that the proportion of depressed females between ages 13 and 18 years is as high as 26%, compared to only 10% for males in the same age group.

The graph will show a normal distribution, and the center will be the mean of the sampling distribution, which is the mean of the entire population. (A success is just what we are counting.)

For two sample means, the test statistic is \( z = \dfrac{(\bar{x}_1 - \bar{x}_2) - \Delta}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} \), where \( \bar{x}_1 \) and \( \bar{x}_2 \) are the means of the two samples, \( \Delta \) is the hypothesized difference between the population means (0 if testing for equal means), \( \sigma_1 \) and \( \sigma_2 \) are the standard deviations of the two populations, and \( n_1 \) and \( n_2 \) are the sizes of the two samples.

The formula for the z-score is similar to the formulas for z-scores we learned previously. We want to create a mathematical model of the sampling distribution, so we need to understand when we can use a normal curve.

We write this with symbols as follows: \( \hat{p}_f - \hat{p}_m = 0.14 - 0.08 = 0.06 \).

We also acknowledge previous National Science Foundation support under grant numbers 1246120, 1525057, and 1413739.

Here the female proportion is 2.6 times the size of the male proportion (0.26/0.10 = 2.6).
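Two proportions can be compared either by subtracting or by taking a ratio. The short sketch below applies both to the depression rates quoted above (0.26 for females, 0.10 for males); the function name `compare_proportions` is an assumption made for illustration.

```python
def compare_proportions(p_a: float, p_b: float) -> tuple[float, float]:
    """Return (difference, ratio) of two proportions, with p_b as the baseline."""
    difference = p_a - p_b  # how many percentage points apart
    ratio = p_a / p_b       # how many times larger
    return difference, ratio


diff, ratio = compare_proportions(0.26, 0.10)
print(f"difference = {diff:.2f}, ratio = {ratio:.1f}")
```

The difference (0.16) is the quantity whose sampling distribution this text studies; the ratio (2.6) is an alternative description of the same gap.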
However, before introducing more hypothesis tests, we shall consider another type of statistical analysis.

Graphically, we can compare these proportions using side-by-side ribbon charts. To compare these proportions, we could describe how many times larger one proportion is than the other.

Because many patients stay in the hospital for considerably more days, the distribution of length of stay is strongly skewed to the right.

Later we investigate whether larger samples will change our conclusion.

The mean difference is the difference between the population proportions: \( \mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2 \). The standard deviation of the difference is \( \sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}} \). This standard deviation formula is exactly correct as long as the observations are independent, both between the two samples and within each sample.* 

*If we're sampling without replacement, this formula will actually overestimate the standard deviation, but it's extremely close to correct as long as each sample is less than 10% of its population.

This probability is based on random samples of 70 in the treatment group and 100 in the control group. This is a test of two population proportions.

Or could the survey results have come from populations with a 0.16 difference in depression rates?

For a difference in sample proportions, the z-score formula is \( z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}} \). We discuss conditions for use of a normal model later.

It's not about the values; it's about how they are related!

These conditions translate into the following statement: the number of expected successes and failures in both samples must be at least 10.

Notice the relationship between standard errors: since we add these terms, the standard error of differences is always larger than the standard error in the sampling distributions of individual proportions.
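To make the standard error and z-score calculation concrete, here is a minimal sketch under the assumptions stated in this text: population proportions of 0.45 and 0.20 (the assumed Abecedarian values) and sample sizes of 70 and 100. The function names are hypothetical, and the normal model is assumed to fit.

```python
from math import sqrt
from statistics import NormalDist


def se_proportion(p: float, n: int) -> float:
    """Standard error of a single sample proportion."""
    return sqrt(p * (1 - p) / n)


def se_difference(p1: float, n1: int, p2: float, n2: int) -> float:
    """Standard error of the difference between two independent sample proportions."""
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


def prob_difference_below(cutoff: float, p1: float, n1: int,
                          p2: float, n2: int) -> float:
    """P(p1_hat - p2_hat < cutoff) under a normal model."""
    z = (cutoff - (p1 - p2)) / se_difference(p1, n1, p2, n2)
    return NormalDist().cdf(z)


se1 = se_proportion(0.45, 70)
se2 = se_proportion(0.20, 100)
se_diff = se_difference(0.45, 70, 0.20, 100)
print(f"SE treatment = {se1:.4f}, SE control = {se2:.4f}, SE difference = {se_diff:.4f}")

# Chance of observing a treatment effect below 15 percentage points
# when the true effect is 25 percentage points.
print(f"P(difference < 0.15) = {prob_difference_below(0.15, 0.45, 70, 0.20, 100):.3f}")
```

Because the two variance terms are added under the square root, the standard error of the difference comes out larger than either individual standard error, and the probability printed at the end is roughly 0.08.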
(d) How would the sampling distribution change if the sample size, n, were increased?

I then compute the difference in proportions, repeat this process 10,000 times, and then find the standard deviation of the resulting distribution of differences.

The following formula gives us a confidence interval for the difference of two population proportions: \( (\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \).

Is the rate of similar health problems any different for those who don't receive the vaccine?

Conclusion: If there is a 25% treatment effect with the Abecedarian treatment, then about 8% of the time we will see a treatment effect of less than 15%.

Section 6: Difference of Two Proportions

Sampling distribution of the difference of 2 proportions. The difference of 2 sample proportions can be modeled using a normal distribution when certain conditions are met. Independence condition: the data are independent within and between the 2 groups. This is usually satisfied if the data come from 2 independent random samples or from a randomized experiment.

Draw conclusions about a difference in population proportions from a simulation.

In "Distributions of Differences in Sample Proportions," we compared two population proportions by subtracting. We cannot make judgments about whether the female and male depression rates are 0.26 and 0.10 respectively.

A USA Today article, "No Evidence HPV Vaccines Are Dangerous" (September 19, 2011), described two studies by the Centers for Disease Control and Prevention (CDC) that track the safety of the vaccine.

Let M and F be the subscripts for males and females.
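The confidence interval for a difference of two population proportions can be sketched as follows. This is an illustration only: the function name `two_proportion_ci` is hypothetical, and the counts (40 of 100 and 30 of 100) are made-up data chosen so that the sample proportions are 0.40 and 0.30; the sample sizes are not taken from any study quoted here.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1: int, n1: int, x2: int, n2: int,
                      confidence_level: float = 0.95) -> tuple[float, float]:
    """Confidence interval for p1 - p2 from two independent samples."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Critical value z* from the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    # Standard error uses the sample proportions, since p1 and p2 are unknown.
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z * se, diff + z * se


# Hypothetical data: 40 successes out of 100 versus 30 out of 100.
lower, upper = two_proportion_ci(40, 100, 30, 100)
print(f"95% CI for p1 - p2: ({lower:.4f}, {upper:.4f})")
```

With these made-up counts the interval contains 0, so samples of this size would not rule out equal population proportions even though the sample difference is 0.10.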
Distribution of Differences in Sample Proportions (5 of 5)

B and C would remain the same since 60 > 30, so the sampling distribution of sample means is normal, and the equations for the mean and standard deviation are valid.

A student conducting a study plans on taking separate random samples of 100 students and 20 professors.

Unlike the paired t-test, the 2-sample t-test requires independent groups for each sample.

We compare these distributions in the following table.

Formulas: \( \kappa = n_A / n_B \) is the matching ratio, and \( \Phi \) is the standard Normal distribution function.

In this article, we'll practice applying what we've learned about sampling distributions for the differences in sample proportions to calculate probabilities of various sample results.

The variance of all differences, \( \sigma^2_{\hat{p}_1 - \hat{p}_2} \), is the sum of the variances, \( \sigma^2_{\hat{p}_1} + \sigma^2_{\hat{p}_2} \). In other words, there is more variability in the differences.

But without a normal model, we can't say how unusual it is or state the probability of this difference occurring.

But some people carry the burden for weeks, months, or even years.

If a normal model is a good fit, we can calculate z-scores and find probabilities as we did in Modules 6, 7, and 8.

Suppose simple random samples of size \( n_1 \) and \( n_2 \) are taken from two populations.

The LibreTexts libraries are Powered by NICE CXone Expert and are supported by the Department of Education Open Textbook Pilot Project, the UC Davis Office of the Provost, the UC Davis Library, the California State University Affordable Learning Solutions Program, and Merlot.

This is equivalent to about 4 more cases of serious health problems in 100,000.
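The statement that the variance of the differences is the sum of the two variances can be checked with a simulation. The sketch below is an assumption-laden illustration: it reuses the assumed proportions 0.45 and 0.20 with samples of 70 and 100, repeats the sampling 10,000 times, and uses hypothetical helper names and a fixed seed so the run is repeatable.

```python
import random
from statistics import mean, pvariance


def sample_proportion(p: float, n: int, rng: random.Random) -> float:
    """Draw n independent success/failure trials and return the sample proportion."""
    successes = sum(rng.random() < p for _ in range(n))
    return successes / n


def simulate_differences(p1: float, n1: int, p2: float, n2: int,
                         repetitions: int = 10_000, seed: int = 0) -> list[float]:
    """Simulate many differences p1_hat - p2_hat from independent samples."""
    rng = random.Random(seed)
    return [
        sample_proportion(p1, n1, rng) - sample_proportion(p2, n2, rng)
        for _ in range(repetitions)
    ]


differences = simulate_differences(0.45, 70, 0.20, 100)
# Sum of the two individual variances, p(1 - p) / n for each sample.
theoretical_variance = 0.45 * 0.55 / 70 + 0.20 * 0.80 / 100

print(f"simulated mean of differences     = {mean(differences):.4f}")
print(f"simulated variance of differences = {pvariance(differences):.5f}")
print(f"sum of the two variances          = {theoretical_variance:.5f}")
```

The simulated mean should sit near 0.25 and the simulated variance near the sum of the two individual variances, with small run-to-run differences that depend on the seed.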
This is always true if we look at the long-run behavior of the differences in sample proportions.

Sampling distribution: the frequency distribution of a sample statistic (aka metric) over many samples drawn from the dataset[1].

Find the sample proportion. The expectation of a sample proportion or average is the corresponding population value.

Now we focus on the conditions for use of a normal model for the sampling distribution of differences in sample proportions. In other words, assume that these values are both population proportions. When we select independent random samples from the two populations, the sampling distribution of the difference between two sample proportions has the following shape, center, and spread.
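For reference, the shape, center, and spread can be collected in symbols. This summary is a sketch assembled from the conditions and formulas stated in this text, assuming independent random samples of sizes \( n_1 \) and \( n_2 \) from populations with proportions \( p_1 \) and \( p_2 \).

```latex
\[
\begin{aligned}
\text{Shape:}  &\quad \text{approximately normal when }
   n_1 p_1,\; n_1(1-p_1),\; n_2 p_2,\; n_2(1-p_2) \text{ are all at least } 10 \\[4pt]
\text{Center:} &\quad \mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2 \\[4pt]
\text{Spread:} &\quad \sigma_{\hat{p}_1 - \hat{p}_2}
   = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}
\end{aligned}
\]
```

The center and spread formulas hold regardless of shape; only the use of z-scores and normal probabilities depends on the shape condition.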