PM 510 Homework 5
Problems not marked with [SPSS] should be done by hand, as similar problems could appear
on the Midterm and/or Final. (Of course, feel free to check your work using SPSS!)
Problems marked with [SPSS] are intended to be done with SPSS. For these problems, please
provide the output file.
1. Suppose that you have a set of 12 measurements, 6 drawn from one population and 6 from
another, with each measurement in one population corresponding to one in the other
population. You wish to use the Wilcoxon Signed Rank Test to evaluate the hypothesis that
the population median of the differences is equal to 0.
(a) Suppose that the sum of the positive ranks is 15 and the sum of the negative ranks is –6.
What is the p-value from the normal approximation of the two-sided test?
(Note that the normal approximation to the Wilcoxon Signed Rank Test statistic is
typically recommended when the sample size is larger than 12. The normal approximation
in this question is for practice purposes.)
(b) How does your answer change if you are doing a one-sided test instead? Do you have
sufficient information to answer this question?
2. Suppose you have a Wilcoxon Signed Rank Test with n = 25 and T = 23 and that T comes from
the positive ranks. Calculate the p-value from the normal approximation in each of the
following cases.
(a) H0: Md = M0 vs. H1: Md ≠ M0
(b) H0: Md ≤ M0 vs. H1: Md > M0
(c) H0: Md ≥ M0 vs. H1: Md < M0
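The normal approximation used in Problems 1 and 2 can be sketched in Python as a way to check hand calculations. This is a minimal sketch, assuming no continuity correction and no tie correction (the course may use a different convention); the function names and the example values of T and n are hypothetical and are not taken from any problem in this assignment.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def signed_rank_z(t, n):
    """z statistic for a signed-rank sum t (entered as a non-negative
    magnitude) based on n nonzero differences.
    Assumption: no continuity correction and no tie correction."""
    mu = n * (n + 1) / 4.0
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    return (t - mu) / sigma

def two_sided_p(z):
    """P(|Z| >= |z|)."""
    return 2.0 * (1.0 - normal_cdf(abs(z)))

def upper_tail_p(z):
    """P(Z >= z)."""
    return 1.0 - normal_cdf(z)

def lower_tail_p(z):
    """P(Z <= z)."""
    return normal_cdf(z)

# Hypothetical example values (not from this assignment).
z = signed_rank_z(t=10, n=10)
print("z =", round(z, 3))
print("two-sided p =", round(two_sided_p(z), 4))
print("upper-tail p =", round(upper_tail_p(z), 4))
print("lower-tail p =", round(lower_tail_p(z), 4))
```

Which tail matches a given one-sided alternative depends on whether T was computed from the positive or the negative ranks, so that choice is left to the reader.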
3. Suppose you have a Wilcoxon Rank Sum Test with n1 = 10, n2 = 11, and W = 83.5. Calculate
the p-value from the normal approximation.
(a) H0: M1 = M2 vs. H1: M1 ≠ M2
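For the Wilcoxon Rank Sum Test, the same idea applies with a different mean and standard deviation. The sketch below assumes W is the sum of the ranks in the sample of size n1, and again applies no continuity or tie correction; the function names and the example values are hypothetical and are not taken from this assignment.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def rank_sum_z(w, n1, n2):
    """z statistic for a rank sum w.
    Assumption: w is the sum of ranks in the sample of size n1,
    with no continuity correction and no tie correction."""
    mu = n1 * (n1 + n2 + 1) / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (w - mu) / sigma

def two_sided_p(z):
    """P(|Z| >= |z|)."""
    return 2.0 * (1.0 - normal_cdf(abs(z)))

# Hypothetical example values (not from this assignment).
z = rank_sum_z(w=50, n1=8, n2=9)
print("z =", round(z, 3))
print("two-sided p =", round(two_sided_p(z), 4))
```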
4. The table below contains resting energy expenditure (REE) data for patients with cystic fibrosis
(CF) and healthy individuals matched to the patients on age, sex, height, and weight. Using
the Wilcoxon Signed Rank Test, test whether the median of the differences between the two
populations is different from 0. Calculate the p-value from the normal approximation.
REE (kcal/day)
Pair CF Healthy
1 1153 996
2 1132 1080
3 1165 1182
4 1460 1460
5 1634 1162
6 1493 1619
7 1358 1140
8 1453 1123
9 1185 1113
10 1824 1824
11 1793 1632
12 1930 1614
13 2075 1836
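When ranking paired differences by hand, zero differences and tied absolute differences are the usual sources of error. The sketch below shows one common convention, which is an assumption here: zero differences are dropped before ranking, and tied absolute differences receive the average of the ranks they occupy. The function names and the data are hypothetical and are not the REE values above.

```python
def average_ranks(values):
    """Rank values from smallest to largest; tied values share
    the average of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def signed_rank_sums(x, y):
    """Return (n, sum of positive ranks, sum of negative ranks),
    where n counts nonzero differences x - y.
    Assumption: zero differences are dropped before ranking."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    pos = sum(r for d, r in zip(diffs, ranks) if d > 0)
    neg = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return len(diffs), pos, neg

# Hypothetical paired data (not from this assignment).
x = [12, 15, 9, 20, 14, 11]
y = [10, 15, 12, 14, 16, 8]
n, pos, neg = signed_rank_sums(x, y)
print(n, pos, neg)
```

A quick sanity check on any hand calculation is that the two rank sums add up to n(n + 1)/2, where n is the number of nonzero differences.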
5. The characteristics of low birth weight children dying of sudden infant death syndrome were
examined for both females and males. The ages in days at time of death for samples of 11 girls
and 16 boys are in the table below.
Age (days)
Females   Males
53        20    115
56        21    133
60        24    134
60        46    167
78        52    175
87        58
102       59
117       77
134       78
160       103
277       114
You believe the population median ages are different for boys and girls. Use the Wilcoxon
Rank Sum Test to test this hypothesis. Calculate the p-value from the normal approximation.
What do you conclude?
6. [SPSS] The numbers of community hospital beds per 1000 population that are available in
each state and the District of Columbia are saved in the data set bed.sav. The values for 1980
are saved under the variable name bed80, and those for 1986 are saved under bed86. The data
set bed2.sav contains the same data in a different format: the number of beds per 1000
population for both calendar years are saved under the variable name bed, and an indicator of
year under the name year.
(a) Construct a pair of box-and-whisker plots for the numbers of community hospital beds per
1000 population in 1980 and in 1986.
(b) Use the Wilcoxon Signed Rank Test to determine whether the median difference in the
number of beds is equal to 0.
Hint: Use bed.sav
(c) Compare the median number of hospital beds in 1980 to the median number in 1986 using
the Wilcoxon Rank Sum Test.
Hint: Use bed2.sav
(d) Comment on the differences between the two Wilcoxon tests. Do you reach the same
conclusion in each case? Which was the appropriate test to use in this situation?
(e) Analyze the data using a paired t-test and an independent-samples t-test. Compare the results
you obtain to the nonparametric results from above.
7. [SPSS] A study was conducted to determine whether women who do not have health insurance
coverage are less likely to be screened for breast cancer, and whether their disease is more
advanced at the time of diagnosis. The medical records for a sample of women who were
privately insured and for a sample who were uninsured were examined. The stage of breast
cancer at diagnosis was assigned a number between 1 and 5, where 1 denotes the least advanced
disease and 5 the most advanced. The relevant observations are saved in the file insure.sav;
the stage of the disease is saved under the variable name stage, and an indicator of the group
status—which takes the value 0 for women who were uninsured and 1 for those who were
privately insured—under the name group.
(a) Could the two-sample t-test be used to analyze these data (comparing stage according to
the insurance status)? Why or why not?
(b) Test the null hypothesis that the median stage of cancer for the privately insured women is
identical to the median stage of cancer for the uninsured women.
(c) Do these data suggest that uninsured women have more advanced disease than insured
women at the time of diagnosis?