SAS & R Statistics 2 questions

Math 312 Fall 2022
HW06
1. 10.1.3
2. 10.2.9
STATISTICS FOR RESEARCH
THIRD EDITION
WILEY SERIES IN PROBABILITY AND STATISTICS
Established by WALTER A. SHEWHART and SAMUEL S. WILKS
Editors: David J. Balding, Noel A. C. Cressie, Nicholas I. Fisher,
Iain M. Johnstone, J. B. Kadane, Louise M. Ryan, David W. Scott,
Adrian F. M. Smith, Jozef L. Teugels
Editors Emeriti: Vic Barnett, J. Stuart Hunter, David G. Kendall
A complete list of the titles in this series appears at the end of this volume.
Shirley Dowdy
Stanley Wearden
West Virginia University
Department of Statistics and Computer Science
Morgantown, WV
Daniel Chilko
West Virginia University
Department of Statistics and Computer Science
Morgantown, WV
A JOHN WILEY & SONS, INC. PUBLICATION
This book is printed on acid-free paper.
Copyright © 2004 by John Wiley & Sons, Inc., Hoboken, New Jersey. All rights reserved.
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form
or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as
permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the
prior written permission of the Publisher, or authorization through payment of the appropriate
per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978)
750-8400, fax (978) 750-4744. Requests to the Publisher for permission should be addressed to the
Permissions Department, John Wiley & Sons, Inc., 605 Third Avenue, New York, NY 10158-0012,
(212) 850-6011, fax (212) 850-6008, E-Mail: PERMREQ @ WILEY.COM.
For ordering and customer service, call 1-800-CALL-WILEY.
Library of Congress Cataloging-in-Publication Data:
Dowdy, S. M.
Statistics for research / Shirley Dowdy, Stanley Wearden, Daniel Chilko.
p. cm. – (Wiley series in probability and statistics; 1345)
Includes bibliographical references and index.
ISBN 0-471-26735-X (cloth : acid-free paper)
1. Mathematical statistics. I. Wearden, Stanley, 1926– II. Chilko, Daniel M. III. Title. IV. Series.
QA276.D66 2003
519.5–dc21
2003053485
Printed in the United States of America.
10 9 8 7 6 5 4 3 2 1
CONTENTS
Preface to the Third Edition
Preface to the Second Edition
Preface to the First Edition
1 The Role of Statistics
1.1 The Basic Statistical Procedure
1.2 The Scientiﬁc Method
1.3 Experimental Data and Survey Data
1.4 Computer Usage
Review Exercises
Selected Readings
2 Populations, Samples, and Probability Distributions
2.1 Populations and Samples
2.2 Random Sampling
2.3 Levels of Measurement
2.4 Random Variables and Probability Distributions
2.5 Expected Value and Variance of a Probability Distribution
Review Exercises
Selected Readings
3 Binomial Distributions
3.1 The Nature of Binomial Distributions
3.2 Testing Hypotheses
3.3 Estimation
3.4 Nonparametric Statistics: Median Test
Review Exercises
Selected Readings
4 Poisson Distributions
4.1 The Nature of Poisson Distributions
4.2 Testing Hypotheses
4.3 Estimation
4.4 Poisson Distributions and Binomial Distributions
Review Exercises
Selected Readings
5 Chi-Square Distributions
5.1 The Nature of Chi-Square Distributions
5.2 Goodness-of-Fit Tests
5.3 Contingency Table Analysis
5.4 Relative Risks and Odds Ratios
5.5 Nonparametric Statistics: Median Test for Several Samples
Review Exercises
Selected Readings
6 Sampling Distribution of Averages
6.1 Population Mean and Sample Average
6.2 Population Variance and Sample Variance
6.3 The Mean and Variance of the Sampling Distribution of Averages
6.4 Sampling Without Replacement
Review Exercises
7 Normal Distributions
7.1 The Standard Normal Distribution
7.2 Inference From a Single Observation
7.3 The Central Limit Theorem
7.4 Inferences About a Population Mean and Variance
7.5 Using a Normal Distribution to Approximate Other Distributions
7.6 Nonparametric Statistics: A Test Based on Ranks
Review Exercises
Selected Readings
8 Student’s t Distribution
8.1 The Nature of t Distributions
8.2 Inference About a Single Mean
8.3 Inference About Two Means
8.4 Inference About Two Variances
8.5 Nonparametric Statistics: Matched-Pair and Two-Sample Rank Tests
Review Exercises
Selected Readings
9 Distributions of Two Variables
9.1 Simple Linear Regression
9.2 Model Testing
9.3 Inferences Related to Regression
9.4 Correlation
9.5 Nonparametric Statistics: Rank Correlation
9.6 Computer Usage
9.7 Estimating Only One Linear Trend Parameter
Review Exercises
Selected Readings
10 Techniques for One-way Analysis of Variance
10.1 The Additive Model
10.2 One-Way Analysis-of-Variance Procedure
10.3 Multiple-Comparison Procedures
10.4 One-Degree-of-Freedom Comparisons
10.5 Estimation
10.6 Bonferroni Procedures
10.7 Nonparametric Statistics: Kruskal–Wallis ANOVA for Ranks
Review Exercises
Selected Readings
11 The Analysis-of-Variance Model
11.1 Random Effects and Fixed Effects
11.2 Testing the Assumptions for ANOVA
11.3 Transformations
Review Exercises
Selected Readings
12 Other Analysis-of-Variance Designs
12.1 Nested Design
12.2 Randomized Complete Block Design
12.3 Latin Square Design
12.4 a × b Factorial Design
12.5 a × b × c Factorial Design
12.6 Split-Plot Design
12.7 Split Plot with Repeated Measures
Review Exercises
Selected Readings
13 Analysis of Covariance
13.1 Combining Regression with ANOVA
13.2 One-Way Analysis of Covariance
13.3 Testing the Assumptions for Analysis of Covariance
13.4 Multiple-Comparison Procedures
Review Exercises
Selected Readings
14 Multiple Regression and Correlation
14.1 Matrix Procedures
14.2 ANOVA Procedures for Multiple Regression and Correlation
14.3 Inferences About Effects of Independent Variables
14.4 Computer Usage
14.5 Model Fitting
14.6 Logarithmic Transformations
14.7 Polynomial Regression
14.8 Logistic Regression
Review Exercises
Selected Readings
Appendix of Useful Tables
Answers to Most Odd-Numbered Exercises and All Review Exercises
Index
PREFACE TO THE THIRD EDITION
In preparation for the third edition, we sent an electronic mail questionnaire to every statistics
department in the United States with a graduate program. We wanted modal opinion on what
statistical procedures should be addressed in a statistical methods course in the twenty-ﬁrst
century. Our ﬁndings can readily be summarized as a seeming contradiction. The course has
changed little since R. A. Fisher published the inaugural text in 1925, but it also has changed
greatly since then. The goals, procedures, and statistical inference needed for good research
remain unchanged, but the nearly universal availability of personal computers and statistical
computing application packages make it possible, almost daily, to do more than ever before.
The role of the computer in teaching statistical methods is a problem Fisher never had to face,
but today’s instructor must face it, fortunately without having to make an all-or-none choice.
We have always promised to avoid the black-box concept of computer analysis by
showing the actual arithmetic performed in each analysis, and we remain true to that promise.
However, except for some simple computations, with every example of a statistical procedure
in which we demonstrate the arithmetic, we also give the results of a computer analysis of the
same data. For easy comparison we often locate them near each other, but in some instances
we ﬁnd it better to have a separate section for computer analysis. Because of greater
familiarity with them, we have chosen the SAS® and JMP®, computer applications developed
by the SAS Institute.† SAS was initially written for use on large main frame computers, but
has been adapted for personal computers. JMP was designed for personal computers, and we
ﬁnd it more interactive than SAS. It is also more visually oriented, with graphics presented in
the output before any numerical values are given. But because SAS seems to remain the
computer application of choice, we present it more frequently than JMP.
Two additions to the text are due to responses to our survey. In the preface to the ﬁrst
edition, we stated our preference for discussing probability only when it is needed to explain
some aspect of statistical analysis, but many respondents felt a course in statistical methods
needs a formal discussion of probability. We have attempted to “have it both ways” by
including a very short presentation of probability in the ﬁrst chapter, but continuing to discuss
it as needed. Another frequent response was the idea that a statistical analysis course now
should include some minimal discussion of logistic regression. This caused us almost to
surrender to black-box instruction. It is fairly easy to understand the results of a computer
analysis of logistic regression, but many of our students have a mathematical background a bit
shy of that needed for performing logistic regression analysis. Thus we discuss it, with a
worked example, in the last section to make it available for those with the necessary
† SAS and JMP are registered trademarks of SAS Institute Inc., Cary, NC, USA.
mathematical background, but to avoid alarming other students who might see the
mathematics and feel they recognize themselves in Stevie Smith’s poem†:
Nobody heard him, the dead man,
But still he lay moaning:
I was much further out than you thought
And not waving but drowning.
Consulting with research workers at West Virginia University has caused us to add some
topics not found in earlier editions. Many of our examples and exercises reﬂect actual research
problems for which we provided the statistical analysis. That has not changed, but the research
areas that seek our help have become more global. In earlier years we assisted agricultural,
biological, and behavioral scientists who can design prospective studies, and in our text we
tried to meet the needs of their students. After helping researchers in areas such as health
science who must depend on retrospective studies, we made additions for the beneﬁt of their
students as well. We added examples to show how statistics is applied to health research and
now discuss risks, odds and their ratios, as well as repeated-measures analysis. While helping
researchers prepare manuscripts for publication, we learned that some journals prefer the
more conservative Bonferroni procedures, so we have added them to the discussion of mean
separation techniques in Chapter 10. We also have a discussion of ratio and difference
estimation. However, that inclusion may be self-serving to avoid yet another explanation of
“Why go to all the trouble of least squares when it is so much easier to use a ratio?” Now
we can refer the questioner to the appropriate section in Chapter 9.
There are additions to the exercises as well as the body of the text. We believe our students
enjoy hearing about the research efforts of Sir Francis Galton, that delightfully eccentric but
remarkably ingenious gentleman scientist of Victorian England. To make them suitable
exercises, we have taken a few liberties with some of his research efforts, but only to
demonstrate the breadth of ideas of a pioneer who thought everything is measurable and hence
tractable to quantitative analysis. In respect for a man who—dare we say?—“thought outside
the black box,” many of the exercises that relate to Galton will require students to think on
their own as he did. We hope that, like Galton himself, those who attempt these exercises will
accept the challenge and not be too concerned when they do not succeed.
We are pleased that Daniel M. Chilko, a long-time colleague, has joined us in this
endeavor. His talents have made it easier to update sections on computer analysis, and he will
serve as webmaster for the web site that will now accompany the text.
We wish to acknowledge the help we received from many people in preparation of this
edition. Once again, we thank SAS Institute for permission to discuss their SAS and JMP
software.
We want to express our appreciation to the many readers who called to our attention a ﬂaw
in the algorithm used to prepare the Poisson conﬁdence intervals in Table A8. Because they
alerted us, we made corrections and veriﬁed all tables generated by us for this edition.
To all who responded to our survey, we are indeed indebted. We especially thank Dr.
Marta D. Remmenga, Professor at New Mexico State University. She provided us with a
detailed account of how she uses the text to teach statistics and gave us a number of helpful
suggestions for this edition. All responses were helpful, and we do appreciate the time taken
by so many to answer our questionnaire.
† Not Waving But Drowning, The Top 500 Poems, Columbia University Press, New York.
Even without this edition, we would be indebted to long-time colleagues in the Department
of Statistics at West Virginia University. Over the years, Erdogan Gunel, E. James Harner,
and Gerald R. Hobbs have provided the congenial atmosphere and enough help and counsel to
make our task easy and joyful.
Shirley M. Dowdy
Stanley Wearden
Daniel M. Chilko
PREFACE TO THE
SECOND EDITION
From its inception, the intent of this text has been to demystify statistical procedures for those
who employ them in their research. However, between the ﬁrst and second editions, the use of
statistics in research has been radically affected by the increased availability of computers,
especially personal computers which can also serve as terminals for access to even more
powerful computers. Consequently, we now feel a new responsibility also to try to demystify
the computer output of statistical analyses.
Wherever appropriate, we have tried to include computer output for the statistical
procedures which have just been demonstrated. We have chosen the output of the SAS®
System* for this purpose. SAS was chosen not only for its relative ubiquity on campus and
research centers, but also because the SAS printout shares common features with many other
statistical analysis packages. Thus if one becomes familiar with the SAS output explained in
this text, it should not be too difﬁcult to interpret that of almost any other analysis system. In
the main, we have attempted to make the computer output relatively unobtrusive. Where it
was reasonable to do so, we placed it toward the end of each chapter and provided output of
the computer analysis of the same data for which hand-calculations had already been
discussed. For those who have ready access to computers, we have also provided exercises
containing raw data to aid in learning how to do statistics on computers.
In order to meet the new objective of demystifying computer output, we have included the
programs necessary to obtain the appropriate output from the SAS System. However, the
reader should not be misled into believing this text can serve as a substitute for the SAS
manuals. Before one can use the information provided here, it is necessary to know how to
access the particular computer system on which SAS is available, and that is likely to be
different from one research location to another. Also, to keep the discussion of computer
output from becoming too lengthy, we have not discussed a number of other topics such as
data editing, storage, and retrieval. We feel the reader who wants to begin using computer
analysis will be better served by learning how to do so with the equipment and software
available at his or her own research center.
At the request of many who used the ﬁrst edition, we now include nonparametric statistics
in the text. However, once again with the intent of keeping these procedures from seeming to
be too arcane, we have approached each nonparametric test as an analog to a previously
discussed parametric test, the difference being in the fact that data were collected on the
nominal or ordinal scale of measurement, or else transformed to either of these scales of
measurement. The test statistics are presented in such a form that they will appear as similar as
possible to their parametric counterparts, and for that reason, we consider only large samples
*SAS is a registered trademark of SAS Institute Inc., Cary, NC, USA.
for which the central limit theorem will apply. As with the coverage of computer output, the
sections on nonparametric statistics are placed near the end of each chapter as material
supplementary to statistical procedures already demonstrated.
Finally, those who have reﬂected on human nature realize that when they are told “no one
does that any more,” it is really the speaker who doesn’t want to do it any more. It is in accord
with that interpretation that we say “no one does multiple regression by hand calculations any
more,” and correspondingly present considerable revision in Chapter 14. Consistent with our
intention of avoiding any appearance of mystery, we use a very small sample to present the
computations necessary for multiple regression analysis. However, more space is devoted to
examination and explanation of the computer analyses available for multiple regression
problems.
We are indebted to the SAS Institute for permission to discuss their software. Output from
SAS procedures is printed with the permission of SAS Institute Inc., Cary NC, USA,
Copyright © 1985.
We want to thank readers of the ﬁrst edition who have so kindly written to us to advise us
of misprints and confusing statements and to make suggestions for improvement. We also
want to thank our colleagues in the department, especially Donald F. Butcher, Daniel M.
Chilko, E. James Harner, Gerald R. Hobbs, William V. Thayne and Edwin C. Townsend.
They have read what we have written, made useful suggestions, and have provided data sets
and problems. We feel fortunate to have the beneﬁt of their assistance.
Shirley Dowdy
Stanley Wearden
Morgantown, West Virginia
November 1990
PREFACE TO THE FIRST EDITION
This textbook is designed for the population of students we have encountered while teaching a
two-semester introductory statistical methods course for graduate students. These students
come from a variety of research disciplines in the natural and social sciences. Most of the
students have no prior background in statistical methods but will need to use some, or all, of
the procedures discussed in this book before they complete their studies. Therefore, we
attempt to provide not only an understanding of the concepts of statistical inference but also
the methodology for the most commonly used analytical procedures.
Experience has taught us that students ought to receive their instruction in statistics early in
their graduate program, or perhaps, even in their senior year as undergraduates. This ensures
that they will be familiar with statistical terminology when they begin critical reading of
research papers in their respective disciplines and with statistical procedures before they begin
their research. We frequently ﬁnd, however, that graduate students are poor with respect to
mathematical skills; it has been several years since they completed their undergraduate
mathematics and they have not used these skills in the subsequent years. Consequently, we
have found it helpful to give details of mathematical techniques as they are employed, and we
do so in this text.
We should like our students to be aware that statistical procedures are based on sound
mathematical theory. But we have learned from our students, and from those with whom we
consult, that research workers do not share the mathematically oriented scientists’ enthusiasm
for elegant proofs of theorems. So we deliberately avoid not only theoretical proofs but even
too much of a mathematical tone. When statistics was in its infancy, W. S. Gosset replied to an
explanation of the sampling distribution of the partial correlation coefﬁcient by R. A. Fisher:†
. . . I fear that I can’t conscientiously claim to understand it, but I take it for granted that you
know what you are talking about and thankfully use the results!
It’s not so much the mathematics, I can often say “Well, of course, that’s beyond me, but
we’ll take it as correct,” but when I come to ‘Evidently’ I know that means two hours hard
work at least before I can see why.
Considering that the original “Student” of statistics was concerned about whether he could
understand the mathematical underpinnings of the discipline, it is reasonable that today’s
students have similar misgivings. Lest this concern keep our students from appreciating
the importance of statistics in research, we consciously avoid theoretical mathematical
discussions.
† From letter No. 6, May 5, 1922, in Letters From W. S. Gosset to R. A. Fisher 1915–1936, Arthur Guinness Sons and
Company, Ltd., Dublin. Issued for private circulation.
We want to show the importance of statistics in research, and we have taken two speciﬁc
measures to accomplish this goal. First, to explain that statistics is an integral part of research,
we show from the very ﬁrst chapter of the text how it is used. We have found that our students
are impatient with textbooks that require eight weeks of preparatory work before any actual
application of statistics to relevant problems. Thus, we have eschewed the traditional
introductory discussion of probability and descriptive statistics; these topics are covered only
as they are needed. Second, we try to present a practical example of each topic as soon as
possible, often with considerable detail about the research problem. This is particularly
helpful to those who enroll in the statistical methods course before the research methods
course in their particular discipline. Many of the examples and exercises are based on actual
research situations that we have encountered in consulting with research workers. We attempt
to provide data that are reasonable but that are simplified for ease of computation. We realize
that in an actual research project a statistical package on a computer will probably be used for
the computations, and we considered including printouts of computer analyses. But the
multiplicity of the currently available packages, and the rapidity with which they are
improved and revised, makes this infeasible.
It is probable that every course has an optimum pace at which it should be taught; we are
convinced that such is the case with statistical methods. Because our students come to us
unfamiliar with inductive reasoning, we start slowly and try to explain inference in
considerable detail. The pace quickens, however, as soon as the students seem familiar with
the concepts. Then when new concepts, such as bivariate distributions, are introduced, it is
necessary to pause and reestablish the gradual acceleration. Testing helps to maintain the
pace, and we ﬁnd that our students beneﬁt from frequent testing. The exercises at the end of
each section are often taken directly from these tests.
A textbook can never replace a reference book. But, many people, because they are
familiar with the text they used when they studied statistical methods, often refer to that book
for information during later professional activities. We have kept this in mind while designing
the text and have included some features that should be helpful: Summaries of procedures are
clearly set off, references to articles and books that further develop the topics discussed are
given at the end of each chapter, and explanations on reading the statistical tables are given in
the table section.
We thank Professor Donald Butcher, Chairman of the Department of Statistics and
Computer Science at West Virginia University, for his encouragement of this project. We are
also grateful for the assistance of Professor George Trapp and computer science graduate
students Barry Miller and Benito Herrera in the production of the statistical methods with us
during the preliminary version of the text.
Shirley Dowdy
Stanley Wearden
Morgantown, West Virginia
December 1982
1 The Role of Statistics
In this chapter we informally discuss how statistics is used to attempt to answer questions
raised in research. Because probability is basic to statistical decision making, we will also
present a few probability rules to show how probabilities are computed. Since this is an
overview, we make no attempt to give precise deﬁnitions. The more formal development will
follow in later chapters.
1.1. THE BASIC STATISTICAL PROCEDURE
Scientists sometimes use statistics to describe the results of an experiment or an investigation.
This process is referred to as data analysis or descriptive statistics. Scientists also use
statistics another way; if the entire population of interest is not accessible to them for some
reason, they often observe only a portion of the population (a sample) and use statistics to
answer questions about the whole population. This process is called inferential statistics.
Statistical inference is the main focus of this book.
Inferential statistics can be deﬁned as the science of using probability to make decisions.
Before explaining how this is done, a quick review of the “laws of chance” is in order. Only
four probability rules will be discussed here, those for (1) simple probability, (2) mutually
exclusive events, (3) independent events, and (4) conditional probability. For anyone wanting
more than is covered here, Johnson and Kuby (2000) as well as Bennett, Briggs, and Triola
(2003) provide more detailed discussion.
Early study of probability was greatly inﬂuenced by games of chance. Wealthy games
players consulted mathematicians to learn if their losses during a night of gaming were due
to bad luck or because they did not know how to compute their chances of winning. (Of
course, there was always the possibility of chicanery, but that seemed a matter better
settled with dueling weapons than mathematical computations.) Stephen Stigler (1986)
states that formal study of probability began in 1654 with the exchange of letters between
two famous French mathematicians, Blaise Pascal and Pierre de Fermat, regarding a
question posed by a French nobleman about a dice game. The problem can be found in
Exercise 1.1.5.
In games of chance, as in experiments, we are interested in the outcomes of a random
phenomenon that cannot be predicted with certainty because usually there is more than one
outcome and each is subject to chance. The probability of an outcome is a measure of how
likely that outcome is to occur. The random outcomes associated with games of chance should
be equally likely to occur if the gambling device is fair, controlled by chance alone. Thus the
probability of getting a head on a single toss of a fair coin and the probability of getting an
even number when we roll a fair die are both 1/2.
Statistics for Research, Third Edition, Edited by Shirley Dowdy, Stanley Wearden, and Daniel Chilko.
ISBN 0-471-26735-X © 2004 John Wiley & Sons, Inc.
Because of the early association between probability and games of chance, we label some
collection of equally likely outcomes as a success. A collection of outcomes is called an event.
If success is the event of an even number of pips on a fair die, then the event consists of
outcomes 2, 4, and 6. An event may consist of only one outcome, as the event head on a single
toss of a coin. The probability of a success is found by the following probability rule:
\[ \text{probability of success} = \frac{\text{number of successful outcomes}}{\text{total number of outcomes}} \]
In symbols,
\[ P(\text{success}) = P(S) = \frac{n_S}{N} \]
where n_S is the number of outcomes in the event designated as success and N is the total
number of possible outcomes. Thus the simple probability rule for equally likely outcomes is
to count the number of ways a success can be obtained and divide it by the total number of
outcomes.
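As a rough illustration of this counting rule, the short Python sketch below lists the equally likely outcomes of a fair die and a fair coin and divides the number of successes by the total. The function name `simple_probability` and the use of exact fractions are our own choices for illustration; they do not come from the text.

```python
from fractions import Fraction

def simple_probability(outcomes, is_success):
    """Return P(S) = n_S / N for a finite list of equally likely outcomes."""
    n_s = sum(1 for outcome in outcomes if is_success(outcome))  # successful outcomes
    return Fraction(n_s, len(outcomes))                          # divided by N

die = [1, 2, 3, 4, 5, 6]   # equally likely faces of a fair die
coin = ["head", "tail"]    # equally likely sides of a fair coin

p_even = simple_probability(die, lambda pips: pips % 2 == 0)
p_head = simple_probability(coin, lambda side: side == "head")
print(p_even, p_head)
```

Both probabilities come out as 1/2, in agreement with the values stated earlier for a fair die and a fair coin.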
Example 1.1. Simple Probability Rule for Equally Likely Outcomes
There is a game, often played at charity events, that involves tossing a coin such as a 25-cent
piece. The quarter is tossed so that it bounces off a board and into a chute to land in one of nine
glass tumblers, only one of which is red. If the coin lands in the red tumbler, the player wins
$1; otherwise the coin is lost. In the language of probability, there are N = 9 possible
outcomes for the toss and only one of these can lead to a success. Assuming skill is not a factor
in this game, all nine outcomes are equally likely and P(success) = 1/9.
In the game described above, P(win) = 1/9 and P(loss) = 8/9. We observe there is only
one way to win $1 and eight ways to lose 25¢. A related idea from the early history of
probability is the concept of odds. The odds for winning are P(win)/P(loss). Here we say,
“The odds for winning are one to eight” or, more pessimistically, “The odds against winning
are eight to one.” In general,
\[ \text{odds for success} = \frac{P(\text{success})}{1 - P(\text{success})} \]
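The conversion from a probability to odds can be sketched in a few lines of Python. The helper name `odds_for` is hypothetical, and exact fractions are used only so that "one to eight" appears as 1/8 rather than as a rounded decimal.

```python
from fractions import Fraction

def odds_for(p_success):
    """Odds for success: P(success) / (1 - P(success))."""
    return p_success / (1 - p_success)

p_win = Fraction(1, 9)        # one red tumbler out of nine
odds_win = odds_for(p_win)    # odds for winning
odds_against = 1 / odds_win   # odds against winning
print(odds_win, odds_against)
```

For the tumbler game this gives odds of 1/8 for winning and 8 against, matching "one to eight" and "eight to one."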
We need to stress that the simple probability rule above applies only to an experiment with
a discrete number of equally likely outcomes. There is a similarity in computing probabilities
for continuous variables for which there is a distribution curve for measures of the variable. In
this case
\[ P(\text{success}) = \frac{\text{area under the curve where the measure is called a success}}{\text{total area under the curve}} \]
A simple example is provided by the “spinner” that comes with many board games. The
spinner is an arrow that spins freely around an axle attached to the center of a circle. Suppose
that the circle is divided into quadrants marked 1, 2, 3, and 4 and play on the board is
determined by the quadrant in which the spinner comes to rest. If no skill is involved in
spinning the arrow, the outcomes can be considered uniformly distributed over the 360° of the
circle. If it is a success to land in the third quadrant of the circle, a spin is a success when the
arrow stops anywhere in the 90° of the third quadrant and
\[ P(\text{success}) = \frac{\text{area in third quadrant}}{\text{total area}} = \frac{90}{360} = \frac{1}{4} \]
While only a little geometry is needed to calculate probabilities for a uniform distribution,
knowledge of calculus is required for more complex distributions. However, ﬁnding
probabilities for many continuous variables is possible by using simple tables. This will be
explained in later chapters.
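For the spinner, the area rule reduces to a ratio of angles, and a quick simulation offers an informal check. The sketch below is only illustrative: the function name, the seed, the number of spins, and the convention that the third quadrant spans 180° to 270° are all assumptions of ours, not details given in the text.

```python
import random

def uniform_arc_probability(arc_degrees, total_degrees=360.0):
    """Probability that a uniformly distributed angle falls in an arc of the given size."""
    return arc_degrees / total_degrees

p_exact = uniform_arc_probability(90.0)   # 90 / 360

# Informal check by simulation (assumed: third quadrant covers 180 to 270 degrees).
rng = random.Random(2004)                 # fixed seed so the run is repeatable
spins = [rng.uniform(0.0, 360.0) for _ in range(100_000)]
hits = sum(1 for angle in spins if 180.0 <= angle < 270.0)
p_simulated = hits / len(spins)
print(p_exact, round(p_simulated, 3))
```

The simulated proportion should land close to 1/4, though it will not match exactly because it is based on a finite number of random spins.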
The next rule involves events that are mutually exclusive, meaning one event excludes the
possibility of another. For instance, if two dice are rolled and the event is that the sum of spots
is y = 7, then y cannot possibly be another value as well. However, there are six ways that the
spots, or pips, on two dice can produce a sum of 7, and each of these is mutually exclusive of
the others. To see how this is so, imagine that the pair consists of one red die and one green;
then we can detail all the possible outcomes for the event y = 7:
Red die:    1  2  3  4  5  6
Green die:  6  5  4  3  2  1
Sum:        7  7  7  7  7  7
If a success depends only on a value of y = 7, then by the simple probability rule the number
of possible successes is n_S = 6; the number of possible outcomes is N = 36 because each of
the six outcomes of the red die can be paired with each of the six outcomes of the green die and
the total number of outcomes is 6 × 6 = 36. Thus P(success) = n_S/N = 6/36 = 1/6.
However, we need a more general statement to cover mutually exclusive events, whether or
not they are equally likely, and that is the addition rule.
If a success is any of k mutually exclusive events E_1, E_2, ..., E_k, then the addition rule for
mutually exclusive events is P(success) = P(E_1) + P(E_2) + ... + P(E_k). This holds true with
the dice; if E_1 is the event that the red die shows 1 and the green die shows 6, then P(E_1) =
1/36. Then, because each of the k = 6 events has the same probability,
\[ P(\text{success}) = \frac{1}{36} + \frac{1}{36} + \frac{1}{36} + \frac{1}{36} + \frac{1}{36} + \frac{1}{36} = \frac{6}{36} = \frac{1}{6} \]
Here 1/36 is the common probability for all events, but the addition rule for mutually exclusive
events still holds true even when the probability values are not the same for all events.
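The dice calculation can be checked by brute-force enumeration. The following Python sketch is not part of the original text, and its variable names are illustrative assumptions. It lists all 36 outcomes for a red and a green die, picks out those that sum to 7, and compares the simple probability rule with the addition rule.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (red, green) outcomes for two six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# The mutually exclusive ways of rolling a sum of 7.
sevens = [(red, green) for red, green in outcomes if red + green == 7]

# Simple probability rule: number of successes over number of outcomes.
p_simple = Fraction(len(sevens), len(outcomes))

# Addition rule: add P(E_i) = 1/36 once for each mutually exclusive event.
p_addition = sum(Fraction(1, len(outcomes)) for _ in sevens)

print(sevens)
print(p_simple, p_addition)
```

Exact fractions are used instead of floating-point numbers so that the two rules can be compared for equality without rounding error.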
Example 1.2. Addition Rule for Mutually Exclusive Events
To see how this rule applies to events that are not equally likely, suppose a coin-operated
gambling device is programmed to provide, on random plays, winnings with the following
probabilities:
Event          P(Event)
Win 10 coins   0.001
Win 5 coins    0.010
Win 3 coins    0.040
Win 1 coin     0.359
Lose 1 coin    0.590
Because most players consider it a success if any coins are won, P(success) =
0.001 + 0.010 + 0.040 + 0.359 = 0.410, and the odds for winning are 0.41/0.59 =
0.695, while the odds against a win are 0.59/0.41 = 1.44.
We might ask why we bother to add 0.001 + 0.010 + 0.040 + 0.359 to obtain
P(success) = 0.41 when we can obtain it just from knowledge of P(no success). On a play at
the coin machine, one either wins or loses, so there is the probability of a success,
P(S) = 0.41, and the probability of no success, P(no success) = 0.59. The opposite of a
success is called its complement, and its probability is symbolized as P(S̄). Because every play at the
machine ends in either a win or a loss, P(S) + P(S̄) = 1.0, so rather than
adding the probabilities of the four ways to win it is easier to find P(S) = 1.0 − P(S̄) = 1.0 − 0.59 = 0.41. Note
that in the computation of the odds for winning we used the ratio of the probability of a win to
that of its complement, P(S)/P(S̄).
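The arithmetic of Example 1.2 can be reproduced in a few lines. This is a minimal Python sketch rather than anything from the original text. The dictionary keys and variable names are assumptions chosen for illustration, and the probabilities are those listed in the table of the example.

```python
from fractions import Fraction

# Payout probabilities from the table in Example 1.2.
p_event = {
    "win 10 coins": Fraction("0.001"),
    "win 5 coins": Fraction("0.010"),
    "win 3 coins": Fraction("0.040"),
    "win 1 coin": Fraction("0.359"),
    "lose 1 coin": Fraction("0.590"),
}

# Addition rule: sum the probabilities of the mutually exclusive winning events.
p_win = sum(p for event, p in p_event.items() if event.startswith("win"))

# Complement rule: one minus the probability of the single losing event.
p_win_by_complement = 1 - p_event["lose 1 coin"]

# Odds are the ratio of a probability to that of its complement.
odds_for = p_win / (1 - p_win)
odds_against = (1 - p_win) / p_win

print(float(p_win), float(p_win_by_complement))
print(round(float(odds_for), 3), round(float(odds_against), 2))
```

Both routes give the same probability of a win, which is the point of the complement shortcut.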
At games of chance, people who have had a string of losses are encouraged to continue to
play with such remarks as “Your luck is sure to change” or “Odds favor your winning now,”
but is that so? Not if the plays, or events, are independent. A play in a game of chance has no
memory of what happened on previous plays. So using the results of Example 1.2, suppose we
try the machine three times. The probability of a win on the first play is P(S1) = 0.41, but the
second coin played has no memory of the fate of its predecessor, so P(S2) = 0.41, and
likewise P(S3) = 0.41. Thus we could insert 100 coins in the machine and lose on the first 99
plays, but the probability that our last coin will win remains P(S100) = 0.41. However, we
would have good reason to suspect the honesty of the machine rather than bad luck, for with
an honest machine for which the probability of a win is 0.41, we would expect about 41 wins
in 100 plays.
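To see why 99 straight losses would cast doubt on the machine, the probability of such a run can be computed directly. The Python sketch below is illustrative only. It assumes an honest machine with P(win) = 0.41 on every play and anticipates the multiplication rule for independent events stated later in this section.

```python
p_win = 0.41          # probability of a win on any single play (Example 1.2)
p_lose = 1 - p_win    # complement: probability of a loss on any single play

# Independent plays: the chance of 99 losses in a row is p_lose multiplied
# by itself 99 times.
p_99_losses = p_lose ** 99

# Expected number of wins in 100 plays of an honest machine.
expected_wins = 100 * p_win

print(p_99_losses)
print(expected_wins)
```

The run of losses is so improbable under the stated win probability that a dishonest machine is the more plausible explanation, even though the chance of winning on the 100th play is still 0.41 if the machine is honest.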
When dealing with independent events, we often need to ﬁnd the joint probability that two
or more of them will all occur simultaneously. If the total number of possible outcomes (N) is
small, we can always compile tables, so with the N = 52 cards in a standard deck, we can
classify each card by color (red or black) and as to whether or not it is an honor card (ace, king,
queen, or jack). Then we can sort and count the cards in each of four groups to get the
following table:
            Color
Honor    Black   Red   Total
No         18     18     36
Yes         8      8     16
Total      26     26     52
If a card is dealt at random from such a deck, we can find the joint probability that it will be
red and an honor by noting that there are 8 such cards in the deck of 52; hence P(red and
honor) = P(RH) = 8/52 = 2/13. This is easy enough when the total number of outcomes is
small or when they have already been tabulated, but in many cases there are too many or there
is a process such as the slot machine capable of producing an infinite number of outcomes.
Fortunately there is a probability rule for such situations.
The multiplication rule for finding the joint probability of k independent events E1,
E2, . . . , Ek is
$$P(E_1 \text{ and } E_2 \text{ and } \ldots \text{ and } E_k) = P(E_1) \times P(E_2) \times \cdots \times P(E_k)$$
With the cards, k is 2, E1 is a red card, and E2 is an honor card, so P(E1E2) =
P(E1) × P(E2) = (26/52) × (16/52) = (1/2) × (4/13) = 4/26 = 2/13.
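The card example is small enough to verify by building the deck and counting. The Python sketch below is an illustration, not part of the original text. The helper names (`prob`, `is_red`, `is_honor`) are assumptions, and the deck is the standard 52 cards with ace, king, queen, and jack as honors.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suit_color = {"spades": "black", "clubs": "black",
              "hearts": "red", "diamonds": "red"}
honors = {"A", "K", "Q", "J"}

# The 52 equally likely cards as (rank, suit) pairs.
deck = list(product(ranks, suit_color))


def prob(event):
    """Probability of an event as (cards satisfying it) / (cards in deck)."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))


def is_red(card):
    return suit_color[card[1]] == "red"


def is_honor(card):
    return card[0] in honors


p_red = prob(is_red)
p_honor = prob(is_honor)
p_red_and_honor = prob(lambda card: is_red(card) and is_honor(card))

# Multiplication rule for independent events: joint = product of marginals.
print(p_red_and_honor, p_red * p_honor)
```

Counting the joint event directly and multiplying the two marginal probabilities give the same fraction, as the rule requires for independent events.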
Example 1.3. The Multiplication Rule for Independent Events
Gender and handedness are independent, and if P(female) = 0.50 and P(left handed) = 0.15,
then the probability that the first child of a couple will be a left-handed girl is
$$P(\text{female and left handed}) = P(\text{female}) \times P(\text{left handed}) = 0.50 \times 0.15 = 0.075$$
If the probability values P(female) and P(left handed) are realistic, the computation is easier
than the alternative of trying to tabulate the outcomes of all ﬁrst births. We know the
biological mechanism for determining gender but not handedness, so it was only estimated
here. However, the value we would obtain from a tabulation of a large number of births would
also be only an estimate. We will see in Chapter 3 how to make estimates and how to say
scientiﬁcally, “The probability that the ﬁrst child will be a left-handed girl is likely
somewhere around 0.075.”
The multiplication rule is very convenient when events are independent, but frequently
we encounter events that are not independent but rather are at least partially related. Thus
we need to understand these and how to deal with them in probability. When told that a
person is from Sweden or some other Nordic country, we might immediately assume that
he or she has blue eyes, or conversely dark eyes if from a Mediterranean country. In our
encounters with people from these areas, we think we have found that the probability of
eye color P(blue) is not the same for both those geographic regions but rather depends, or
is conditioned, on the region from which a person comes. Conditional probability is
symbolized as P(E2|E1), and we say “The probability of event 2 given event 1.” In the case
of eye color, it would be the probability of blue eyes given that one is from a Nordic
country.
The conditional probability rule for finding the conditional probability of event 2 given
event 1 is
$$P(E_2 \mid E_1) = \frac{P(E_1 E_2)}{P(E_1)}$$
In the deck of cards, the probability a randomly dealt card will be red and an honor card is
P(red and honor) = 8/52, while the probability it is red is P(R) = 26/52, so the probability
that it will be an honor card, given that it is a red card, is P(RH)/P(R) = 8/26 = 4/13, which
is the same as P(H) because the two are independent rather than related. Hence independent
events can be defined as satisfying P(E2|E1) = P(E2).
Example 1.4. The Conditional Probability Rule
Suppose an oncologist is suspicious that cancer of the gum may be associated with use of
smokeless tobacco. It would be ideal if he also had data on the use of smokeless tobacco by
those free of cancer, but the only data immediately available are from 100 of his own cancer
patients, so he tabulates them to obtain the following:
                 Smokeless Tobacco
Cancer Site      No     Yes    Total
Gum               5      20      25
Elsewhere        60      15      75
Total            65      35     100
There are 25 cases of gum cancer in his database and 20 of those patients had used smokeless
tobacco, so we see that his best estimate of the probability that a randomly drawn gum cancer
patient was a user of smokeless tobacco is 20/25 = 0.80. This probability could also be found
by the conditional probability rule. If P(gum) = P(G) and P(user) = P(U), then
$$P(U \mid G) = \frac{P(GU)}{P(G)} = \frac{20/100}{25/100} = \frac{20}{25} = 0.80$$
Are gum cancer and use of smokeless tobacco independent? They are if P(U|G) = P(U), and
from the data set, the best estimate of users among all cancer patients is P(U) = 35/100 = 0.35.
The two estimates differ markedly: 0.80 for gum cancer patients compared to 0.35 for
all patients. This leads us to believe that gum cancer and smokeless tobacco usage are related
rather than independent. In Chapter 5, we will see how to test to see whether or not two
variables are independent.
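The conditional probability in Example 1.4 can be computed straight from the table of counts. The Python sketch below is illustrative and not from the original text. The nested dictionary layout and the variable names are assumptions. Comparing the two estimates for exact equality is only a rough check, since the formal test of independence is deferred to Chapter 5.

```python
from fractions import Fraction

# Counts from Example 1.4: counts[cancer site][smokeless tobacco use].
counts = {
    "gum": {"no": 5, "yes": 20},
    "elsewhere": {"no": 60, "yes": 15},
}
total = sum(sum(row.values()) for row in counts.values())

p_gum = Fraction(sum(counts["gum"].values()), total)                  # P(G)
p_gum_and_user = Fraction(counts["gum"]["yes"], total)                # P(GU)
p_user = Fraction(sum(row["yes"] for row in counts.values()), total)  # P(U)

# Conditional probability rule: P(U|G) = P(GU) / P(G).
p_user_given_gum = p_gum_and_user / p_gum

# Independence would require P(U|G) = P(U).
looks_independent = p_user_given_gum == p_user

print(float(p_user_given_gum), float(p_user), looks_independent)
```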
Odds obtained from medical data sets similar to but much larger than that in Example 1.4
are frequently cited in the news. Had the odds been the same in a data set of hundreds or
thousands of gum cancer patients, we would report that the odds were 0.80/0.20 = 4.0 for
smokeless tobacco use among gum cancer patients, and 0.35/0.65 = 0.538 for smokeless tobacco use among all cancer patients.
Then, for sake of comparison, we would report the odds ratio, which is the ratio of the two
odds, 4.0/0.538 = 7.435. This ratio gives the relative frequency of smokeless tobacco users
among gum cancer patients to smokeless tobacco users among all cancer patients, and the
medical implications are ominous. For comparison, it would be helpful to have data on the
usage of smokeless tobacco in a cancer-free population, but ﬁrst information about an
association such as that in Example 1.4 usually comes from medical records for those with a
disease.
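The odds and the odds ratio can be written as a small function. The Python sketch below is an illustration under the probabilities estimated in Example 1.4, and the function name `odds` is an assumption. With exact fractions the ratio is 52/7, about 7.43. The figure of 7.435 quoted in the text arises from dividing by the rounded odds 0.538.

```python
from fractions import Fraction


def odds(p):
    """Odds in favor of an event: its probability over that of its complement."""
    return p / (1 - p)


odds_gum = odds(Fraction(80, 100))   # tobacco use among gum cancer patients
odds_all = odds(Fraction(35, 100))   # tobacco use among all cancer patients

odds_ratio = odds_gum / odds_all

print(float(odds_gum), round(float(odds_all), 3))
print(odds_ratio, round(float(odds_ratio), 3))

# Reproducing the text's arithmetic with the rounded odds 0.538.
print(round(4.0 / 0.538, 3))
```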
Caution is necessary when trying to interpret odds ratios, especially those based on very
low incidences of occurrence. To show a totally meaningless odds ratio, suppose we have two
data sets, one containing 20 million broccoli eaters and the other of 10 million who do not eat
the vegetable. Then, if we examine the health records of those in each group, we ﬁnd there are
two in each group suffering from chronic bladder infections. The odds ratio is 2.0, but we
would garner strange looks rather than prestige if we attempted to claim that the odds for
chronic bladder infection are twice as great for those who do not eat broccoli as for those who
eat the vegetable. To use statistics in research is happily more than just to compute and
report numbers.
The basic process in inferential statistics is to assign probabilities so that we can reach
conclusions. The inferences we make are either decisions or estimates about the population.
The tool for making inferences is probability (Figure 1.1).
FIGURE 1.1. Statistical inference.
We can illustrate this process by the following example.
Example 1.5. Using Probabilities to Make a Decision
A sociologist has two large sets of cards, set A and set B, containing data for her research. The
sets each consist of 10,000 cards. Set A concerns a group of people, half of whom are women.
In set B, 80% of the cards are for women. The two ﬁles look alike. Unfortunately, the
sociologist loses track of which is A and which is B. She does not want to sort and count the
cards, so she decides to use probability to identify the sets. The sociologist selects a set. She
draws a card at random from the selected set, notes whether or not it concerns a woman,
replaces the card, and repeats this procedure 10 times. She ﬁnds that all 10 cards contain data
about women. She must now decide between two possible conclusions:
1. This is set B.
2. This is set A, but an unlikely sample of cards has been chosen.
In order to decide in favor of one of these conclusions, she computes the probabilities of
obtaining 10 cards all for females:
$$P(\text{10 females}) = P(\text{first is female}) \times P(\text{second is female}) \times \cdots \times P(\text{tenth is female})$$
The multiplication rule is used because each choice is independent of the others. For set A,
the probability of selecting 10 cards for females is (0.50)^10 = 0.00098 (rounded to two
significant digits). For set B, the probability of 10 cards for females is (0.80)^10 = 0.11 (again
rounded to two significant digits). Since the probability of all 10 of the cards being for women
if the set is B is about 100 times the probability if the set is A, she decides that the set is B, that
is, she decides in favor of the conclusion with the higher probability.
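The sociologist's reasoning in Example 1.5 amounts to comparing two powers. The Python sketch below is illustrative, with variable names chosen here rather than taken from the text. It computes the probability of drawing 10 women's cards in 10 independent draws from each set and picks the set that makes the observed sample more probable.

```python
n_draws = 10

# Proportion of cards for women in each set (Example 1.5).
p_woman = {"A": 0.50, "B": 0.80}

# Multiplication rule for independent draws with replacement.
p_all_women = {name: p ** n_draws for name, p in p_woman.items()}

# How many times more probable the sample is under set B than under set A.
ratio = p_all_women["B"] / p_all_women["A"]

# Decide in favor of the set with the higher probability.
decision = max(p_all_women, key=p_all_women.get)

print(p_all_women)
print(ratio, decision)
```

The decision rule is only a comparison of two probabilities. It does not rule out set A, which would still produce such a sample occasionally.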
When we use a strategy based on probability, we are not guaranteed success every time.
However, if we repeat the strategy, we will be correct more often than mistaken. In the above
example, the sociologist could make the wrong decision because 10 cards chosen at random
from set A could all be cards for women. In fact, in repeated experiments using set A, 10 cards
for females will appear approximately 0.098% of the time, that is, almost once in every
thousand 10-card samples.
The example of the ﬁles is artiﬁcial and oversimpliﬁed. In real life, we use statistical
methods to reach conclusions about some signiﬁcant aspect of research in the natural,
physical, or social sciences. Statistical procedures do not furnish us with proofs, as do many
mathematical techniques. Rather, statistical procedures establish probability bases on which
we can accept or reject certain hypotheses.
Example 1.6. Using Probability to Reach a Conclusion in Science
A real example of the use of statistics in science is the analysis of the effectiveness of Salk’s
polio vaccine.
A great deal of work had to be done prior to the actual experiment and the statistical
analysis. Dr. Jonas Salk ﬁrst had to gather enough preliminary information and experience in
his ﬁeld to know which of the three polio viruses to use. He had to solve the problem of how to
culture that virus. He also had to determine how long to treat the virus with formaldehyde so
that it would die but retain its protein shell in the same form as the live virus; the shell could
then act as an antigen to stimulate the human body to develop antibodies. At this point, Dr.
Salk could conjecture that the dead virus might be used as a vaccine to give patients immunity
to paralytic polio.
Finally, Dr. Salk had to decide on the type of experiment that would adequately test his
conjecture. He decided on a double-blind experiment in which neither patient nor doctor knew
whether the patient received the vaccine or a saline solution. The patients receiving the saline
solution would form the control group, the standard for comparison. Only after all these
preliminary steps could the experiment be carried out.
When Dr. Salk speculated that patients inoculated with the dead virus would be immune to
paralytic polio, he was formulating the experimental hypothesis: the expected outcome if the
experimenter’s speculation is true. Dr. Salk wanted to use statistics to make a decision about
this experimental hypothesis. The decision was to be made solely on the basis of probability.
He made the decision in an indirect way; instead of considering the experimental hypothesis
itself, he considered a statistical hypothesis called the null hypothesis—the expected outcome
if the vaccine is ineffective and only chance differences are observed between the two sample
groups, the inoculated group and the control group. The null hypothesis is often called the
hypothesis of no difference, and it is symbolized H0. In Dr. Salk’s experiment, the null
hypothesis is that the incidence of paralytic polio in the general population will be the same
whether it receives the proposed vaccine or the saline solution. In symbols†
$$H_0: \pi_I = \pi_C$$
in which π_I is the proportion of cases of paralytic polio in the general population if it were
inoculated with the vaccine and π_C is the proportion of cases if it received the saline solution.
If the null hypothesis is true, then the two sample groups in the experiment should be alike
except for chance differences of exposure and contraction of the disease.
† The use of the symbol π has nothing to do with the geometry of circles or the irrational number 3.1416 . . . .
The experimental results were as follows:
                     Proportion with     Number in
                     Paralytic Polio     Study
Inoculated Group     0.0001603           200,745
Control Group        0.0005703           201,229
The incidence of paralytic polio in the control group was about 3.6 times that in the
inoculated group, or in other words the odds ratio was 0.0005703/0.0001603 = 3.56.
Dr. Salk then found the probability that these experimental results or more extreme ones
could have happened with a true null hypothesis. If the null hypothesis were true, the probability
that chance alone would produce a difference between the two experimental groups at least this large was less than 1 in
10,000,000, so Salk rejected the null hypothesis and decided that he had found an effective
vaccine for the general public.†
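A rough check of the size of this probability is possible from the reported proportions and group sizes alone. The Python sketch below is an assumption-laden illustration, not the calculation actually used for the study. It applies a two-proportion z test with a pooled proportion, whose squared statistic equals the chi-square statistic for a 2 × 2 table without continuity correction, and a two-sided normal tail probability. The variable names are illustrative.

```python
import math

# Reported proportions of paralytic polio and group sizes (Example 1.6).
p_inoc, n_inoc = 0.0001603, 200_745
p_ctrl, n_ctrl = 0.0005703, 201_229

# Ratio of the two incidences, as quoted in the text.
ratio = p_ctrl / p_inoc

# Pooled proportion under the null hypothesis of equal population proportions.
pooled = (p_inoc * n_inoc + p_ctrl * n_ctrl) / (n_inoc + n_ctrl)

# Standard error of the difference in sample proportions under the null.
std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_inoc + 1 / n_ctrl))

z = (p_ctrl - p_inoc) / std_err
chi_square = z ** 2                        # 1 degree of freedom
p_value = math.erfc(z / math.sqrt(2))      # two-sided normal tail area

print(round(ratio, 2), round(z, 2), round(chi_square, 1))
print(p_value)
```

Under these assumptions the tail probability falls far below 1 in 10,000,000, in line with the conclusion reported for the experiment.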
Usually when we experiment, the results are not as conclusive as the result obtained by Dr.
Salk. The probabilities will always fall between 0 and 1, and we have to establish a level
below which we reject the null hypothesis and above which we accept the null hypothesis. If
the probability associated with the null hypothesis is small, we reject the null hypothesis and
accept an alternative hypothesis (usually the experimental hypothesis). When the probability
associated with the null hypothesis is large, we accept the null hypothesis. This is one of the
basic procedures of statistical methods—to ask: What is the probability that we would get
these experimental results (or more extreme ones) with a true null hypothesis?
Since the experiment has already taken place, it may seem after the fact to ask for the
probability that only chance caused the difference between the observed results and the null
hypothesis. Actually, when we calculate the probability associated with the null hypothesis,
we are asking: If this experiment were performed over and over, what is the probability that
chance will produce experimental results as different as are these results from what is
expected on the basis of the null hypothesis?
We should also note that Salk was interested not only in the samples of 401,974 people
who took part in the study; he was also interested in all people, then and in the future, who
could receive the vaccine. He wanted to make an inference to the entire population from the
portion of the population that he was able to observe. The entire population is called the target population, the
population about which the inference is intended.
Sometimes in science the inference we should like to make is not in the form of a decision
about a hypothesis; but rather it consists of an estimate. For example, perhaps we want to
estimate the proportion of adult Americans who approve of the way in which the president is
handling the economy, and we want to include some statement about the amount of error
possibly related to this estimate. Estimation of this type is another kind of inference, and
it also depends on probability. For simplicity, we focus on tests of hypotheses in this
introductory chapter. The first example of inference in the form of estimation is discussed in
Chapter 3.
† This probability is found using a chi-square test (see Section 5.3).
EXERCISES
1.1.1. A trial mailing is made to advertise a new science dictionary. The trial mailing list is
made up of random samples of current mailing lists of several popular magazines. The
number of advertisements mailed and the number of people who ordered the dictionary
are as follows:
Magazine:   A     B     C      D     E
Mailed:     900   810   1100   890   950
Ordered:    18    15    10     30    45
a. Estimate the probability and the odds that a subscriber to each of the magazines will
buy the dictionary.
b. Make a decision about the mailing list that will probably produce the highest
percentage of sales if the entire list is used.
1.1.2. In Examples 1.5 and 1.6, probability was used to make decisions and odds ratios could
have been used to further support the decisions. To do so:
a. For the data in Example 1.5, compute the odds ratio for the two sets of cards.
b. For the data in Example 1.6, compute the odds ratio of getting polio for those
vaccinated as opposed to those not vaccinated.
1.1.3. If 60% of the population of the United States need to have their vision corrected, we
say that the probability that an individual chosen at random from the population needs
vision correction is P(C) = 0.60.
a. Estimate the probability that an individual chosen at random does not need vision
correction. Hint: Use the complement of a probability.
b. If 3 people are chosen at random from the population, what is the probability that all
3 need correction, P(CCC)? Hint: Use the multiplication law of probability for
independent events.
c. If 3 people are chosen at random from the population, what is the probability that
the second person does not need correction but the ﬁrst and the third do, P(CNC)?
d. If 3 people are chosen at random from the population, what is the probability that 1
out of the 3 needs correction, P(CNN or NCN or NNC)? Hint: Use the addition law
of probability for mutually exclusive events.
e. Assuming no association between vision and gender, what is the probability that a
randomly chosen female needs vision correction, P(C|F)?
1.1.4. On a single roll of 2 dice (think of one green and the other red to keep track of all
outcomes) in the game of craps, ﬁnd the probabilities for:
a. A sum of 6, P(y = 6)
b. A sum of 8, P(y = 8)
c. A win on the first roll; that is, a sum of 7 or 11, P(y = 7 or 11)
d. A loss on the first roll; that is, a sum of 2, 3, or 12, P(y = 2, 3, or 12)
1.1.5. The dice game about which Pascal and de Fermat were asked consisted in throwing a
pair of dice 24 times. The problem was to decide whether or not to bet even money on
the occurrence of at least one “double 6” during the 24 throws of a pair of dice. Because
it is easier to solve this problem by ﬁnding the complement, take the following steps:
a. What is the probability of not a double 6 on a roll, P(E) = P(y ≠ 12)?
b. What is the probability that y ≠ 12 on all 24 rolls, P(E1E2 . . . E24)?
c. What is the probability of at least one double 6?
d. What are the odds of a win in this game?
1.1.6. Sir Francis Galton (1822–1911) was educated as a physician but had the time, money,
and inclination for research on whatever interested him, and almost everything did.
Though not the ﬁrst to notice that he could ﬁnd no two people with the same
ﬁngerprints, he was the ﬁrst to develop a system for categorizing ﬁngerprints and to
persuade Scotland Yard to use ﬁngerprints in criminal investigation. He supported his
argument with ﬁngerprints of friends and volunteers solicited through the newspapers,
and for all comparisons P(fingerprints match) = 0. To compute the number of events
associated with Galton’s data:
a. Suppose ﬁngerprints on only 10 individuals are involved.
i. How many comparisons between individuals can be made? Hint: Fingerprints
of the ﬁrst individual can be compared to those of the other 9. However, for the
second individual there are only 8 additional comparisons because his
ﬁngerprints have already been compared to the ﬁrst.
ii. How many comparisons between ﬁngers can be made? Assume these are
between corresponding ﬁngers of both individuals in a comparison, right thumb
of one versus right thumb of the other, and so on.
b. Suppose ﬁngerprints are available on 11 individuals rather than 10. Use the results
already obtained to simplify computations in ﬁnding the number of comparisons
among people and among ﬁngers.
1.2. THE SCIENTIFIC METHOD
The natural, physical, and social scientists who use statistical methods to reach conclusions all
approach their problems by the same general procedure, the scientiﬁc method. The steps
involved in the scientiﬁc method are:
1. State the problem.
2. Formulate the hypothesis.
3. Design the experiment or survey.
4. Make observations.
5. Interpret the data.
6. Draw conclusions.
We use statistics mainly in step 5, “interpret the data.” In an indirect way we also use
statistics in steps 2 and 3, since the formulation of the hypothesis and the design of the
experiment or survey must take into consideration the type of statistical procedure to be used
in analyzing the data.
The main purpose of this book is to examine step 5. We frequently discuss the other steps,
however, because an understanding of the total procedure is important. A statistical analysis
may be ﬂawless, but it is not valid if data are gathered incorrectly. A statistical analysis may
not even be possible if a question is formulated in such a way that a statistical hypothesis
cannot be tested. Considering all of the steps also helps those who study statistical methods
before they have had much practical experience in using the scientiﬁc method. A full
discussion of the scientiﬁc method is outside the scope of this book, but in this section we
make some comments on these steps.
STEP 1. STATE THE PROBLEM. Sometimes, when we read reports of research, we get the
impression that research is a very orderly analytic process. Nothing could be further from the
truth. A great deal of hidden work and also a tremendous amount of intuition are involved
before a solvable problem can even be stated. Technical information and experience are
indispensable before anyone can hope to formulate a reasonable problem, but they are not
sufﬁcient. The mediocre scientist and the outstanding scientist may be equally familiar with
their ﬁeld; the difference between them is the intuitive insight and skill that the outstanding
scientist has in identifying relevant problems that he or she can reasonably hope to solve.
One simple technique for getting a problem in focus is to formulate a clear and explicit
statement of the problem and put the statement in writing. This may seem like an unnecessary
instruction for a research scientist; however, it is frequently not followed. The consequence is
a vagueness and lack of focus that make it almost impossible to proceed. It leads to the
collection of unnecessary information or the failure to collect essential information.
Sometimes the original question is even lost as the researcher gets involved in the details of
the experiment.
STEP 2. FORMULATE THE HYPOTHESIS. The “hypothesis” in this step is the experimental
hypothesis, the expected outcome if the experimenter’s speculations are true. The
experimental hypothesis must be stated in a precise way so that an experiment can be
carried out that will lead to a decision about the hypothesis. A good experimental hypothesis is
comprehensive enough to explain a phenomenon and predict unknown facts and yet is stated
in a simple way. Classic examples of good experimental hypotheses are Mendel’s laws, which
can be used to explain hereditary characteristics (such as the color of ﬂowers) and to predict
what form the characteristics will take in the future.
Although the null hypothesis is not used in a formal way until the data are being
interpreted, it is appropriate to formulate the null hypothesis at this time in order to verify that
the experimental hypothesis is stated in such a way that it can be tested by statistical
techniques.
Several experimental hypotheses may be connected with a single problem. Once these
hypotheses are formulated in a satisfactory way, the investigator should do a literature search
to see whether the problem has already been solved, whether or not there is hope of solving it,
and whether or not the answer will make a worthwhile contribution to the ﬁeld.
STEP 3. DESIGN THE EXPERIMENT OR SURVEY. Included in this step are several
decisions. What treatments or conditions should be placed on the objects or subjects of the
investigation in order to test the hypothesis? What are the variables of interest, that is,
what variables should be measured? How will this be done? With how much precision?
Each of these decisions is complex and requires experience and insight into the particular
area of investigation.
Another group of decisions involves the choice of the sample, that portion of the
population of interest that will be used in the study. The investigator usually tries to utilize
samples that are:
(a) Random
(b) Representative
(c) Sufﬁciently large
In order to make a decision based on probability, it is necessary that the sample be random.
Random samples make it possible to determine the probabilities associated with the study.
A sample is random if it is just as likely that it will be picked from the population of interest as
any other sample of that size. Strictly speaking, statistical inference is not possible unless
random samples are used. (Speciﬁc methods for achieving random samples are discussed in
Section 2.2.)
Random, however, does not mean haphazard. Haphazard processes often have hidden
factors that inﬂuence the outcome. For example, one scientist using guinea pigs thought that
time could be saved in choosing a treatment group and a control group by drawing the
treatment group of animals from a box without looking. The scientist drew out half of the
guinea pigs for testing and reserved the rest for the control group. It was noticed, however, that
most of the animals in the treatment group were larger than those in the control group. For
some reason, perhaps because they were larger, or slower, the heavier guinea pigs were drawn
ﬁrst. Instead of this haphazard selection, the experimenter could have recorded the animals’
ear-tattoo numbers on plastic disks and drawn the disks at random from a box.
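Drawing numbered disks from a box has a direct software analogue. The Python sketch below is a hypothetical illustration: the ear-tattoo numbers, the function name `assign_groups`, and the seed value are all assumptions made for the example. It shuffles the animals' identifiers with a pseudorandom generator and splits them into a treatment group and a control group, so that size, speed, or any other hidden factor cannot influence who is chosen.

```python
import random


def assign_groups(tattoo_numbers, seed=None):
    """Randomly split identifiers into (treatment, control) halves."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(tattoo_numbers)
    rng.shuffle(shuffled)          # every ordering is equally likely
    half = len(shuffled) // 2
    return sorted(shuffled[:half]), sorted(shuffled[half:])


# Hypothetical ear-tattoo numbers for 20 guinea pigs.
animals = list(range(101, 121))
treatment, control = assign_groups(animals, seed=2004)

print("treatment:", treatment)
print("control:  ", control)
```

Recording the seed lets the assignment be reproduced later, which is useful when the design of the experiment has to be documented.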
Unfortunately, in many ﬁelds of investigation random sampling is not possible, for
example, meteorology, some medical research, and certain areas of economics. Random
samples are the ideal, but sometimes only nonrandom data are available. In these cases the
investigator may decide to proceed with statistical inference, realizing, of course, that it is
somewhat risky. Any ﬁnal report of such a study should include a statement of the author’s
awareness that the requirement of randomness for inference has not been met.
The second condition that an investigator often seeks in a sample is that it be
representative. Usually we do not know how to ﬁnd truly representative samples. Even when
we think we can ﬁnd them, we are often governed by a subconscious bias.
A classic example of a subconscious bias occurred at a Midwestern agricultural station in
the early days of statistics. Agronomists were trying to predict the yield of a certain crop in a
field. To make their prediction, they chose several 6-ft × 6-ft sections of the field which they
felt were representative of the crop. They harvested those sections, calculated the arithmetic
average of the yields, then multiplied this average by the number of 36-ft² sections in the field
to estimate the total yield. A statistician assigned to the station suggested that instead they
should have picked random sections. After harvesting several random sections, a second
average was calculated and used to predict the total yield. At harvest time, the actual yield of
the ﬁeld was closer to the yield predicted by the statistician. The agronomists had predicted a
much larger yield, probably because they chose sections that looked like an ideal crop. An
entire ﬁeld, of course, is not ideal. The unconscious bias of the agronomists prevented them
from picking a representative sample. Such unconscious bias cannot occur when experimental
units are chosen at random.
Although representativeness is an intuitively desirable property, in practice it is usually
an impossible one to meet. How can a sample of 30 possibly contain all the properties of a
population of 2000 individuals? The 2000 certainly have more characteristics than can
possibly be proportionately reﬂected in 30 individuals. So although representativeness
seems necessary for proper reasoning from the sample to the population, statisticians
do not rely on representative samples—rather, they rely on random samples. (Large
random samples will very likely be representative.) If we do manage to deliberately
construct a sample that is representative but is not random, we will be unable to compute
probabilities related to the sample and, strictly speaking, we will be unable to do statistical
inference.
It is also necessary that samples be sufﬁciently large. No one would question the necessity
of repetition in an experiment or survey. We all know the danger of generalizing from a single
observation. Sufﬁciently large, however, does not mean massive repetition. When we use
statistics, we are trying to get information from relatively small samples. Determining a
reasonable sample size for an investigation is often difﬁcult. The size depends upon the
magnitude of the difference we are trying to detect, the variability of the variable of interest,
the type of statistical procedure we are using, the seriousness of the errors we might make, and
the cost involved in sampling. (We make further remarks on sample size as we discuss various
procedures throughout this text.)
STEP 4. MAKE OBSERVATIONS. Once the procedure for the investigation has been decided
upon, the researcher must see that it is carried out in a rigorous manner. The study should be
free from all errors except random measurement errors, that is, slight variations that are due to
the limitations of the measuring instrument.
Care should be taken to avoid bias. Bias is a tendency for a measurement on a variable to
be affected by an external factor. For example, bias could occur from an instrument out of
calibration, an interviewer who inﬂuences the answers of a respondent, or a judge who sees
the scores given by other judges. Equipment should not be changed in the middle of an
experiment, and judges should not be changed halfway through an evaluation.
The data should be examined for unusual values, outliers, which do not seem to be
consistent with the rest of the observations. Each outlier should be checked to see whether
or not it is due to a recording error. If it is an error, it should be corrected. If it cannot
be corrected, it should be discarded. If an outlier is not an error, it should be given
special attention when the data are analyzed. For further discussion, see Barnett and Lewis
(2002).
Finally, the investigator should keep a complete, legible record of the results of the
investigation. All original data should be kept until the analysis is completed and the ﬁnal
report written. Summaries of the data are often not sufﬁcient for a proper statistical analysis.
STEP 5. INTERPRET THE DATA. The general statistical procedure was illustrated in
Example 1.6, in which the Salk vaccine experiment was discussed. To interpret the data, we
set up the null hypothesis and then decide whether the experimental results are a rare outcome
if the null hypothesis is true. That is, we decide whether the difference between the
experimental outcome and the null hypothesis is due to more than chance; if so, this indicates
that the null hypothesis should be rejected.
If the results of the experiment are unlikely when the null hypothesis is true, we reject the
null hypothesis; if they are expected, we accept the null hypothesis. We must remember,
however, that statistics does not prove anything. Even Dr. Salk’s result, with a probability of
less than 1 in 10,000,000 that chance was causing the difference between the experimental
outcome and the null hypothesis, does not prove that the null hypothesis is false. An extremely
small probability, however, does make the scientist believe that the difference is not due to
chance alone and that some additional mechanism is operating.
Two slightly different approaches are used to evaluate the null hypothesis. In practice,
they are often intermingled. Some researchers compute the probability that the
experimental results, or more extreme values, could occur if the null hypothesis is true;
then they use that probability to make a judgment about the null hypothesis. In research
articles this is often reported as the observed signiﬁcance level, or the signiﬁcance level, or
the P value. If the P value is large, they conclude that the data are consistent with the null
hypothesis. If the P value is small, then either the null hypothesis is false or the null
hypothesis is true and a rare event has occurred. (This was the approach used in the Salk
vaccine example.)
Other researchers prefer a second, more decisive approach. Before the experiment they
decide on a rejection level, the probability of an unlikely event (sometimes this is also called
the signiﬁcance level). An experimental outcome, or a more extreme one, that has a
probability below this level is considered to be evidence that the null hypothesis is false. Some
research articles are written with this approach. It has the advantage that only a limited
number of probability tables are necessary. Without a computer, it is often difﬁcult to
determine the exact P value needed for the ﬁrst approach. For this reason the second approach
became popular in the early days of statistics. It is still frequently used.
The sequence in this second procedure is:
(a) Assume H0 is true and determine the probability P that the experimental outcome or a
more extreme one would occur.
(b) Compare the probability to a preset rejection level symbolized by α (the Greek letter
alpha).
(c) If P ≤ α, reject H0. If P > α, accept H0.
If P > α, we say, “Accept the null hypothesis.” Some statisticians prefer not to use that
expression, since in the absence of evidence to reject the null hypothesis, they choose simply
to withhold judgment about it. This group would say, “The null hypothesis may be true” or
“There is no evidence that the null hypothesis is false.”
If the probability associated with the null hypothesis is very close to α, more extensive
testing may be desired. Notice that this is a blend of the two approaches.
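Steps (a) through (c) can be sketched in a few lines of code. This is a minimal illustration under stated assumptions: the outcome is a count of successes in independent trials, "more extreme" means "at least as many successes," and the function names and the 8-trial example are hypothetical choices made here, not part of the procedure itself.

```python
from fractions import Fraction
from math import comb

def upper_tail_p(n, observed, p0=Fraction(1, 2)):
    """Step (a): assuming H0 (success probability p0) is true, return the
    probability of `observed` or more successes in n independent trials."""
    return sum(comb(n, k) * p0**k * (1 - p0)**(n - k)
               for k in range(observed, n + 1))

def decide(p_value, alpha):
    """Steps (b) and (c): compare P with the preset rejection level alpha."""
    return "reject H0" if p_value <= alpha else "accept H0"

# Hypothetical illustration: 7 successes in 8 trials, H0: p = 1/2, alpha = 0.05.
alpha = Fraction(5, 100)
p_value = upper_tail_p(8, 7)
print(p_value, float(p_value), decide(p_value, alpha))
```

Exact fractions are used so that the comparison with α is not affected by rounding.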
An example of the total procedure follows.
Example 1.7. Using a Statistical Procedure to Interpret Data
A manufacturer of baby food gives samples of two types of baby cereal, A and B, to a random
sample of four mothers. Type A is the manufacturer’s brand, type B a competitor’s. The
mothers are asked to report which type they prefer. The manufacturer wants to detect any
preference for their cereal if it exists.
The null hypothesis, or the hypothesis of no difference, is H0: p = 1/2, in which p is the
proportion of mothers in the general population who prefer type A. The experimental
hypothesis, which often corresponds to a second statistical hypothesis called the alternative
hypothesis, is that there is a preference for cereal A, Ha: p > 1/2.
Suppose that four mothers are asked to choose between the two cereals. If there is no
preference, the following 16 outcomes are possible with equal probability:
AAAA
BAAA  ABAA  AABA  AAAB
BBAA  BABA  BAAB  ABBA  ABAB  AABB
BBBA  BBAB  BABB  ABBB
BBBB
The manufacturer feels that only 1 of these 16 cases, AAAA, is very different from what
would be expected to occur under random sampling, when the null hypothesis of no
preference is true. Since the unusual case would appear only 1 time out of 16 times when the
null hypothesis is true, α (the rejection level) is set equal to 1/16 = 0.0625.
If the outcome of the experiment is in fact four choices of type A, then P = P(AAAA) =
1/16, and the manufacturer can say that the results are in the region of rejection, or the results
are significant, and the null hypothesis is rejected. If the outcome is three choices of type
A, however, then P = P(3 or more A’s) = P(AAAB or AABA or ABAA or BAAA or
AAAA) = 5/16 > 1/16, and he does not reject the null hypothesis. (Notice that P is the
probability of this type of outcome or a more extreme one in the direction of the alternative
hypothesis, so AAAA must be included.)
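The counting in Example 1.7 can be checked by listing the 16 equally likely outcomes directly. The sketch below is only a verification aid; the helper name p_at_least is ours.

```python
from fractions import Fraction
from itertools import product

# All sequences of four choices, each either A or B: 2**4 = 16 outcomes.
outcomes = ["".join(choice) for choice in product("AB", repeat=4)]

def p_at_least(k):
    """P(k or more mothers choose A) when all 16 outcomes are equally likely."""
    favorable = [o for o in outcomes if o.count("A") >= k]
    return Fraction(len(favorable), len(outcomes))

alpha = Fraction(1, 16)  # rejection level chosen in the example
for k in (4, 3):
    p = p_at_least(k)
    verdict = "reject H0" if p <= alpha else "do not reject H0"
    print(f"{k} or more A's: P = {p} -> {verdict}")
```

Four choices of A give P = 1/16, which is in the region of rejection; three or more give P = 5/16, which is not.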
The way in which we set the rejection level α depends on the field of research, on the
seriousness of an error, on cost, and to a great degree on tradition. In the example above, the
sample size is 4, so an α smaller than 1/16 is impossible. Later (in Section 3.2), we discuss
using the seriousness of errors to determine a reasonable α. If the possible errors are not
serious and cost is not a consideration, traditional values are often used.
Experimental statistics began about 1920 and was not used much until 1940, but it is
already tradition bound. In the early part of the twentieth century Karl Pearson had his
students at University College, London, compute tables of probabilities for reasonably rare
events. Now computers are programmed to produce these tables, but the traditional levels
used by Pearson persist for the most part. Tables are usually calculated for α equal to 0.10,
0.05, and 0.01. Many times there is no justification for the use of one of these values except
tradition and the availability of tables. If an α close to but less than or equal to 0.05 were
desired in the example above, a sample size of at least 5 would be necessary; then α =
1/32 = 0.03125 if the only extreme case is AAAAA.
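The link between sample size and the attainable rejection level in the cereal example can be made concrete. The sketch below assumes, as in the example, that the only outcome counted as extreme is "every mother chooses A," so the attainable level is (1/2)^n; the function name is hypothetical.

```python
from fractions import Fraction

def smallest_sample_size(target_alpha):
    """Return (n, alpha) for the smallest n with (1/2)**n <= target_alpha,
    where the only extreme outcome is that all n mothers choose A."""
    n = 1
    while Fraction(1, 2**n) > target_alpha:
        n += 1
    return n, Fraction(1, 2**n)

n, alpha = smallest_sample_size(Fraction(5, 100))
print(n, alpha, float(alpha))
```

With a target of 0.05 the search stops at n = 5, where α = 1/32 = 0.03125, in agreement with the text; with n = 4 the smallest attainable level is 1/16.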
STEP 6. DRAW CONCLUSIONS. If the procedure just outlined is followed, then our
decisions will be based solely on probability and will be consistent with the data from the
experiment. If our experimental results are not unusual for the null hypothesis, P > α, then
the null hypothesis seems to be right and we should not reject it. If they are unusual,
P ≤ α, then the null hypothesis seems to be wrong and we should reject it. We repeat
that our decision could be incorrect, since there is a small probability α that we will reject
a null hypothesis when in fact that null hypothesis is true; there is also a possibility
that a false null hypothesis will be accepted. (These possible errors are discussed in
Section 3.2.)
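Both kinds of incorrect decision can be given numbers in the cereal example. The sketch below is illustrative only: it assumes the mothers choose independently, keeps the rejection region "all four choose A," and uses a purely hypothetical true proportion of 4/5 to show how a false null hypothesis can be accepted.

```python
from fractions import Fraction

N_MOTHERS = 4  # sample size from Example 1.7

def p_reject(p_true, n=N_MOTHERS):
    """Probability that all n mothers choose A (the rejection region) when
    the true proportion preferring A is p_true."""
    return p_true ** n

# Rejecting a true null hypothesis: H0 says p = 1/2.
alpha = p_reject(Fraction(1, 2))

# Accepting a false null hypothesis: suppose (hypothetically) p is really 4/5.
p_hypothetical = Fraction(4, 5)
p_accept_false_null = 1 - p_reject(p_hypothetical)

print("P(reject H0 | H0 true)     =", alpha, "=", float(alpha))
print("P(accept H0 | p really 4/5) =", p_accept_false_null,
      "=", float(p_accept_false_null))
```

Even with a strong hypothetical preference for A, a sample of only four mothers would fail to reject the null hypothesis more often than not, which is one reason sample size matters.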
In some instances, the conclusion of the study and the statistical decision about the null
hypothesis are the same. The conclusion merely states the statistical decision in speciﬁc
terms. In many situations, the conclusion goes further than the statistical decision. For
example, suppose that an orthodontist makes a study of malocclusion due to crowding of
the adult lower front teeth. The orthodontist hypothesizes that the incidence is as common
in males as in females, H0: p_M = p_F. (Note that in this example the experimental
hypothesis coincides with the null hypothesis.) In the data gathered, however, there is a
preponderance of males and P ≤ α. The statistical decision is to reject the null hypothesis,
but this is not the ﬁnal statement. Having rejected the null hypothesis, the orthodontist
concludes the report by stating that this condition occurs more frequently in males than in
females and advises family dentists of the need to watch more closely for tendencies of
this condition in boys than in girls.
EXERCISES
1.2.1. Put the example of the cereals in the framework of the scientiﬁc method, elaborating on
each of the six steps.
1.2.2. State null and alternative hypotheses for the example of the file cards in Section 1.1,
Example 1.5.
1.2.3. In the Salk experiment described in Example 1.6 of Section 1.1:
a. Why should Salk not be content just to reject the null hypothesis?
b. What conclusion could be drawn from the experiment?
1.2.4. Two college roommates decide to perform an experiment in extrasensory perception
(ESP). Each produces a snapshot of his home-town girl friend, and one snapshot is
placed in each of two identical brown envelopes. One of the roommates leaves the
room and the other places the two envelopes side by side on the desk. The ﬁrst
roommate returns to the room and tries to pick the envelope that contains his girl
friend’s picture. The experiment is repeated 10 times. If the one who places the
envelopes on the desk tosses a coin to decide which picture will go to the left and which
to the right, the probabilities for correct decisions are listed below.
Number of Correct Decisions    Probability
 0                             1/1024
 1                             10/1024
 2                             45/1024
 3                             120/1024
 4                             210/1024
 5                             252/1024
 6                             210/1024
 7                             120/1024
 8                             45/1024
 9                             10/1024
10                             1/1024
a. State the null hypothesis based on chance as the determining factor in a correct
decision. (Make the statement in words and symbols.)
b. State an alternative hypothesis based on the power of love.
c. If α is set as near 0.05 as possible, what is the region of rejection, that is, what
numbers of correct decisions would provide evidence for ESP?
d. What is the region of acceptance, that is, those numbers of correct decisions that
would not provide evidence of ESP?
e. Suppose the ﬁrst roommate is able to pick the envelope containing his girl friend’s
picture 10 times out of 10; which of the following statements are true?
i. The null hypothesis should be rejected.
ii. He has demonstrated ESP.
iii. Chance is not likely to produce such a result.
iv. Love is more powerful than chance.
v. There is sufﬁcient evidence to suspect that something other than chance was
guiding his selections.
vi. With his luck he should raise some money and go to Las Vegas.
1.2.5. The mortality rate of a certain disease is 50% during the ﬁrst year after diagnosis. The
chance probabilities for the number of deaths within a year from a group of six persons
with the disease are:
Number of deaths:   0      1      2      3      4      5      6
Probability:        1/64   6/64   15/64  20/64  15/64  6/64   1/64
A new drug has been found that is helpful in cases of this disease, and it is hoped that it
will lower the death rate. The drug is given to 6 persons who have been diagnosed as
having the disease. After a year, a statistical test is performed on the outcome in order
to make a decision about the effectiveness of the drug.
a. What is the null hypothesis, in words and symbols?
b. What is the alternative hypothesis, based on the prior evidence that the drug is of
some help?
c. What is the region of rejection if α is set as close to 0.10 as possible?
d. What is the region of acceptance?
e. Suppose that 4 of the 6 persons die within one year. What decision should be made
about the drug?
1.2.6. A company produces a new kind of decaffeinated coffee which is thought to have a
taste superior to the three currently most popular brands. In a preliminary random
sample, 20 consumers are presented with all 4 kinds of coffee (in unmarked containers
and in random order), and they are asked to report which one tastes best. If all 4 taste
equally good, there is a 1-in-4 chance that a consumer will report that the new product
tastes best. If there is no difference, the probabilities for various numbers of consumers
indicating by chance that the new product is best are:
Number picking new product:   0      1      2      3      4
Probability:                  0.003  0.021  0.067  0.134  0.190

Number picking new product:   5      6      7      8      9
Probability:                  0.202  0.169  0.112  0.061  0.027

Number picking new product:   10     11     12     13–20
Probability:                  0.010  0.003  0.001  <0.001
a. State the null and alternative hypotheses, in words and symbols.
b. If α is set as near 0.05 as possible, what is the region of rejection? What is the region
of acceptance?
c. Suppose that 6 of the 20 consumers indicate that they prefer the new product. Which
of the following statements is correct?
i. The null hypothesis should be rejected.
ii. The new product has a superior taste.
iii. The new product is probably inferior because fewer than half of the people
selected it.
iv. There is insufﬁcient evidence to support the claim that the new product has a
superior taste.
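The chance probabilities tabulated in Exercises 1.2.4 through 1.2.6 can be reproduced by computer. The sketch below assumes that each trial is independent with a fixed chance of success (1/2, 1/2, and 1/4, respectively), which is the binomial model; the exercises imply this model but do not name it, and the function name is ours. The code only rebuilds the tables and does not answer the exercises.

```python
from fractions import Fraction
from math import comb

def chance_table(n, p):
    """Probability of k successes, k = 0..n, in n independent trials that
    each succeed with probability p."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

# Exercise 1.2.4: 10 guesses, chance of a correct guess 1/2 (entries over 1024).
esp = chance_table(10, Fraction(1, 2))
print([int(prob * 1024) for prob in esp])

# Exercise 1.2.5: 6 patients, chance of death 1/2 (entries over 64).
deaths = chance_table(6, Fraction(1, 2))
print([int(prob * 64) for prob in deaths])

# Exercise 1.2.6: 20 consumers, chance of picking the new product 1/4.
coffee = chance_table(20, Fraction(1, 4))
print([round(float(prob), 3) for prob in coffee[:13]])
```

Because exact fractions are used, each table sums to exactly 1.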
1.3. EXPERIMENTAL DATA AND SURVEY DATA
An experiment involves the collection of measurements or observations about populations
that are treated or controlled by the experimenter. A survey, in contrast to an experiment, is an
examination of a system in operation in which the investigator does not have an opportunity to
assign different conditions to the objects of the study. Both of these methods of data collection
may be the subject of statistical analysis; however, in the case of surveys some cautions are in
order.
We might use a survey to compare two countries with different types of economic
systems. If there is a signiﬁcant difference in some economic measure, such as per-capita
income, it does not mean that the economic system of one country is superior to the other.
The survey takes conditions as they are and cannot control other variables that may affect
the economic measure, such as comparative richness of natural resources, population
health, or level of literacy. All that can be concluded is that at this particular time a
signiﬁcant difference exists in the economic measure. Unfortunately, surveys of this type
are frequently misinterpreted.
A similar mistake could have been made in a survey of the life expectancy of men and
women. The life expectancy was found to be 74.1 years for men and 79.5 years for women.
Without control for risk factors—smoking, drinking, physical inactivity, stressful occupation,
obesity, poor sleeping patterns, and poor life satisfaction—these results would be of little
value. Fortunately, the investigators gathered information on these factors and found that
women have more high-risk characteristics than men but still live longer. Because this was a
carefully planned survey, the investigators were able to conclude that women biologically
have greater longevity.
Surveys in general do not give answers that are as clear-cut as those of experiments. If an
experiment is possible, it is preferred. For example, in order to determine which of two
methods of teaching reading is more effective, we might conduct a survey of two schools that
are each using a different one of the methods. But the results would be more reliable if we
could conduct an experiment and set up two balanced groups within one school, teaching each
group by a different method.
From this brief discussion it should not be inferred that surveys are not trustworthy. Most
of the data presented as evidence for an association between heavy smoking and lung cancer
come from surveys. Surveys of voter preference cause certain people to seek the presidency
and others to decide not to enter the campaign. Quantitative research in many areas of social,
biological, and behavioral science would be impossible without surveys. However, in surveys
we must be alert to the possibility that our measurements may be affected by variables that are
not of primary concern. Since we do not have as much control over these variables as we have
in an experiment, we should record all concomitant information of pertinence for each
observation. We can then study the effects of these other variables on the variable of interest
and possibly adjust for their effects.
EXERCISES
1.3.1. In each of the research situations described below, determine whether the researcher is
conducting an experiment or a survey.
a. Traps are set out in a grain ﬁeld to determine whether rabbits or raccoons are the
more frequently found pests.
b. A graduate student in English literature uses random 500-word passages from the
writings of Shakespeare and Marlowe to determine which author uses the
conditional tense more frequently.
c. A random sample of hens is divided into 2 groups at random. The ﬁrst group is
given minute quantities of an insecticide containing an organic phosphorus
compound; the second group acts as a control group. The average difference in
eggshell thickness between the 2 groups is then determined.
d. To determine whether honeybees have a color preference in ﬂowers, an apiarist
mixes a sugar-and-water solution and puts equal amounts in 2 equal-sized sets of
vials of different colors. Bees are introduced into a cage containing the vials, and the
frequency with which bees visit vials of each color is recorded.
1.3.2. In each of the following surveys, what besides the mechanism under study could have
contributed to the result?
a. An estimation of per-capita wealth for a city is made from a random sample of
people listed in the city’s telephone directory.
b. Political preference is determined by an interviewer taking a random sample of
Monday morning bank customers.
c. The average length of ﬁsh in a lake is estimated by:
i. The average length of ﬁsh caught, reported by anglers
ii. The average length of dead ﬁsh found ﬂoating in the water
d. The average number of words in the working vocabulary of first-grade children in a
given county is estimated by a vocabulary test given to a random sample of first-grade
children in the largest school in the county.
e. The proportion of people who can distinguish between two similar tones is
estimated on the basis of a test given to a random sample of university students in a
music appreciation class.
1.3.3. Time magazine once reported that El Paso’s water was heavily laced with lithium, a
tranquilizing chemical, whereas Dallas had a low lithium level. Time also reported that
FBI statistics showed that El Paso had 2889 known crimes per 100,000 population and
Dallas had 5970 known crimes per 100,000 population. The article reported that a
University of Texas biochemist felt that the reason for the lower crime rate in El Paso
lay in El Paso’s water. Comment on the biochemist’s conjecture.
1.4. COMPUTER USAGE
The practice of statistics has been radically changed now that computers and high-quality
statistical software are readily available and relatively inexpensive. It is no longer necessary to
spend large amounts of time doing the numerous calculations that are part of a statistical
analysis. We need only enter the data correctly, choose the appropriate procedure, and then
have the computer take care of the computational details.
Because the computer can do so much for us, it might seem that it is now unnecessary to
study statistics. Nothing could be further from the truth. Now more than ever the researcher
needs a solid understanding of statistical analysis. The computer does not choose the
statistical procedure or make the ﬁnal interpretation of the results; these steps are still in the
hands of the investigator.
Statistical software can quickly produce a large variety of analyses on data regardless of
whether these analyses correspond to the way in which the data were collected. An
inappropriate analysis yields results that are meaningless. Therefore, the researcher must learn
the conditions under which it is valid to use the various analyses so that the selection can be
made correctly.
The computer program will produce a numerical output. It will not indicate what the
numbers mean. The researcher must draw the statistical conclusion and then translate it into
the concrete terms of the investigation. Statistical analysis can best be described as a search
for evidence. What the evidence means and how much weight to give to it must be decided by
the researcher.
In this text we have included some computer output to illustrate how the output could be
used to perform some of the analyses that are discussed. Several exercises have computer
output to assist the user with analyzing the data. Additional output illustrating nearly all the
procedures discussed is available on an Internet website.
Many different comprehensive statistical software packages are available and the outputs
are very similar. A researcher familiar with the output of one package will probably ﬁnd it
easy to understand the output of a different package. We have used two particular packages,
the SAS system and JMP, for the illustrations in the text. The SAS system was designed
originally for batch use on the large mainframe computers of the 1970’s. JMP was originally
designed for interactive use on the personal computers of the 1980’s. SAS made it possible to
analyze very large sets of data simply and efﬁciently. JMP made it easy to visualize smaller
sets of data. Because the distinction between large and small is frequently unclear, it is useful
to know about both programs.
The computer could be used to do many of the exercises in the text; however, some
calculations by the reader are still necessary in order to keep the computer from becoming a
magic box. It is easier for the investigator to select the right procedure and to make a proper
interpretation if the method of computation is understood.
REVIEW EXERCISES
Decide whether each of the following statements is true or false. If a statement is false, explain
why.
1.1. To say that the null hypothesis is rejected does not necessarily mean it is false.
1.2. In a practical situation, the null hypothesis, alternative hypothesis, and level of rejection
should be speciﬁed before the experimentation.
1.3. The probability of choosing a random sample of 3 persons in which the ﬁrst 2 say “yes”
and the last person says “no” from a population in which P(yes) = 0.7 is (0.7)(0.7)(0.3).
1.4. If the experimental hypothesis is true, chance does not enter into the outcome of the
experiment.
1.5. The alternative hypothesis is often the experimental hypothesis.
1.6. A decision made on the basis of a statistical procedure will always be correct.
1.7. The probability of choosing a random sample of 3 persons in which exactly 2 say “yes”
from a population with P(yes) = 0.6 is (0.6)(0.6)(0.4).
1.8. In the total process of investigating a question, the very ﬁrst thing a scientist does is
state the problem.
1.9. A scientist completes an experiment and then forms a hypothesis on the basis of the
results of the experiment.
1.10. In an experiment, the scientist should always collect as large an amount of data as is
humanly possible.
1.11. Even a specialist in a ﬁeld may not be capable of picking a sample that is truly
representative, so it is better to choose a random sample.
1.12. If in an experiment P(success) = 1/3, then the odds against success are 3 to 1.
1.13. One of the main reasons for using random sampling is to ﬁnd the probability that an
experiment could yield a particular outcome by chance if the null hypothesis is true.
1.14. The α level in a statistical procedure depends on the field of investigation, the cost, and
the seriousness of error; however, traditional levels are often used.
1.15. A conclusion reached on the basis of a correctly applied statistical procedure is based
solely on probability.
1.16. The null hypothesis may be the same as the experimental hypothesis.
1.17. The “α level” and the “region of rejection” are two expressions for the same thing.
1.18. If a correct statistical procedure is used, it is possible to reject a true null hypothesis.
1.19. The probability of rolling two 6’s on two dice is 1/6 + 1/6 = 1/3.
1.20. A weakness of many surveys is that there is little control of secondary variables.
SELECTED READINGS
Anscombe, F. J. (1960). Rejection of outliers. Technometrics, 2, 123–147.
Barnard, G. A. (1947). The meaning of a signiﬁcance level. Biometrika, 34, 179–182.
Barnett, V., and T. Lewis (2002). Outliers in Statistical Data, 3rd ed. Wiley, New York.
Bennett, J. O., W. L. Briggs, and M. F. Triola (2003). Statistical Reasoning for Everyday Life, 2nd ed.
Addison-Wesley, New York.
Berkson, J. (1942). Tests of signiﬁcance considered as evidence. Journal of the American Statistical
Association, 37, 325–335.
Box, G. E. P. (1976). Science and statistics. Journal of the American Statistical Association, 71, 791–799.
Cox, D. R. (1958). Planning of Experiments. Wiley, New York.
Duggan, T. J., and C. W. Dean (1968). Common misinterpretation of signiﬁcance levels in sociology
journals. American Sociologist, 3, 45–46.
Edgington, E. S. (1966). Statistical inference and nonrandom samples. Psychological Bulletin, 66, 485–487.
Edwards, W. (1965). Tactical note on the relation between scientiﬁc and statistical hypotheses.
Psychological Bulletin, 63, 400–402.
Ehrenberg, A. S. C. (1982). Writing technical papers or reports. American Statistician, 36, 326–329.
Gibbons, J. D., and J. W. Pratt (1975). P-values: Interpretation and methodology. American Statistician,
29, 20–25.
Gold, D. (1969). Statistical tests and substantive signiﬁcance. American Sociologist, 4, 42–46.
Greenberg, B. G. (1951). Why randomize? Biometrics, 7, 309–322.
Johnson, R., and P. Kuby (2000). Elementary Statistics, 8th ed. Duxbury Press, Paciﬁc Grove, California.
Labovitz, S. (1968). Criteria for selecting a signiﬁcance level: A note on the sacredness of .05. American
Sociologist, 3, 220–222.
McGinnis, R. (1958). Randomization and inference in sociological research. American Sociological
Review, 23, 408–414.
Meier, P. (1990). Polio trial: an early efﬁcient clinical trial. Statistics in Medicine, 9, 13–16.
Plutchik, R. (1974). Foundations of Experimental Research, 2nd ed. Harper & Row, New York.
Rosenberg, M. (1968). The Logic of Survey Analysis. Basic Books, New York.
Royall, R. M. (1986). The effect of sample size on the meaning of signiﬁcance tests. American
Statistician, 40, 313–315.
Selvin, H. C. (1957). A critique of tests of signiﬁcance in survey research. American Sociological Review,
22, 519–527.
Stigler, S. M. (1986). The History of Statistics. Harvard University Press, Cambridge.
2 Populations, Samples, and
Probability Distributions
In Chapter 1 we showed that statistics often plays a role in the scientiﬁc method; it is used to
make inference about some characteristic of a population that is of interest. In this chapter we
deﬁne some terms that are needed to explain more formally how inference is carried out in
various situations.
2.1. POPULATIONS AND SAMPLES
We use the term population rather broadly in research. A population is commonly understood
to be a natural, geographical, or political collection of people, animals, plants, or objects.
Some statisticians use the word in the more restricted sense of the set of measurements of
some attribute of such a collection; thus they might speak of “the population of heights of
male college students.” Or they might use the word to designate a set of categories of some
attribute of a collection, for example, “the population of religious afﬁliations of U.S.
government employees.”
In statistical discussions, we often refer to the physical collection of interest as well as to
the collection of measurements or categories derived from the physical collection. In order to
clarify which type of collection is being discussed, in this book we use the term population as
it is used by the research scientist: The population is the physical collection. The derived set of
measurements or categories is called the set of values of the variable of interest. Thus, in the
ﬁrst example above, we speak of “the set of all values of the variable height for the population
of male college students.”
This distinction may seem overly precise, but it is important because in a given research
situation more than one variable may be of interest in relation to the population under
consideration. For example, an economist might wish to learn about the economic condition
of Appalachian farmers. He ﬁrst deﬁnes the population. Involved in this is specifying the
geographical area “Appalachia” and deciding whether a “farmer” is the person who owns land
suitable for farming, the person who works on it, or the person who makes managerial
decisions about how the land is to be used. The economist’s decision depends on the group in
which he is interested. After he has speciﬁed the population, he must decide on the variable or
variables, that characteristic or set of characteristics of these people, that will give him
information about their economic condition. These characteristics might be money in savings
accounts, indebtedness in mortgages or farm loans, income derived from the sale of livestock,
or any of a number of other economic variables. The choice of variables will depend on the
objectives of his study, the speciﬁc questions he is trying to answer. The problem of choosing
characteristics that pertain to an issue is not trivial and requires a great deal of insight and
experience in the relevant ﬁeld.
Once the population and the related variable or variables are speciﬁed, we must be careful
to restrict our conclusions to this population and these variables. For example, if the above
study reveals that Appalachian farm managers are heavily in debt, it cannot be inferred that
owners of Kansas wheat farms are carrying heavy mortgages. Nor if Appalachian farm
workers are underpaid can it be inferred that they are suffering from malnutrition, poor health,
or any other condition that was not directly measured in the study.
After we have deﬁned the population and the appropriate variable, we usually ﬁnd it
impractical, if not impossible, to observe all the values of the variable. For example, all the values
of the variable miles per gallon in city driving for this year’s model of a certain type of car could
not be obtained since some of the cars probably are yet to be produced. Even if they all existed,
obtaining a measurement from each car would not be feasible. In another example, the values of the
variable condition of all packaged bandages (sterile or contaminated) produced on a particular
day by a certain firm could be obtained, but this is not desirable since the bandages would be made
useless in the process of testing. Instead, we consider a sample (a portion of the population), obtain
measurements or observations from this sample (the sample data), and then use statistics to make
an inference about the entire set of values. To carry out this inference, the sample must be random.
We discussed the need for randomness in Chapter 1; in the next section we outline the mechanics.
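The idea of inferring a population value from a random sample can be sketched in a few lines of Python. This is only an illustration of the bandage example above, not the procedure described in the next section: the population size (10,000), the number of contaminated bandages (200), the sample size (150), and the function names are all hypothetical choices made for the sketch.

```python
import random


def draw_simple_random_sample(population, sample_size, seed=None):
    """Return sample_size items drawn without replacement, each subset equally likely."""
    rng = random.Random(seed)  # seeded generator so the draw can be reproduced
    return rng.sample(population, sample_size)


def sample_proportion(sample):
    """Fraction of sampled items flagged True (here: contaminated)."""
    return sum(sample) / len(sample)


# Hypothetical population: 10,000 bandages, 200 of them contaminated (True).
population = [True] * 200 + [False] * 9800
true_fraction = sum(population) / len(population)

# Test only 150 bandages instead of destroying the whole day's production.
sample = draw_simple_random_sample(population, sample_size=150, seed=1)
estimate = sample_proportion(sample)

print(f"population fraction contaminated: {true_fraction:.3f}")
print(f"sample estimate from {len(sample)} bandages: {estimate:.3f}")
```

In practice the population fraction is unknown; it is computed here only so the sample estimate can be compared with it. Changing the seed gives a different sample and usually a slightly different estimate, which is the sampling variability that statistical inference has to account for.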
EXERCISES
2.1.1. In each of the following examples identify the population, the sample, and the research
variable.
a. To determine the total amount of error in all students’ bills, a large university
selects 150 accounts for a special check of accuracy.
b. A wildlife biologist collects information on the sex of the 28 surviving California
condors.
c. An organic chemist repeats the synthesis of a certain compound 5 times using the
same procedure and each time determines the percentage of yield.
d. The Census Bureau distributes a special questionnaire to 1 out of every 20
households in the census and among other questions inquires about the number of
rooms in the dwelling.
e. A manufacturer examines the records of each of its employees to determine how
long each one has worked for the company.
2.1.2. Identify 3 different research variables that might be investigated for each of the
following populations.
a. All adults living in Colorado
b. All patients of a certain ophthalmologist
c. All farms in Oklahoma
d. All veterans’ hospitals
2.1.3. For two years Francis Galton explored unmapped areas of South Africa. Thereafter, he
tried to explore unmapped areas of science. In both Africa and science, however, he
made some wrong turns. One of them was in the sampling procedure he used in his
study of the inheritance of genius. To simplify his study, he evaluated the number and
quality of academic, artistic, musical, and other worthy “abilities” a notable person
displayed in his life, and the variable of interest was the man’s score on the scale
Galton used (see Exercise 2.3.5). He would then examine the life of that man’s father
and score his abilities in the same fashion. After gathering data on …
