Psychological Statistics Worksheet

Advanced Psychological Statistics (Psych UA 11) Spring 2023
Data Assignment 2
For the following assignment, you will need to do analyses in R.
What you need to submit:
1. The code you used to complete your analyses
2. Your written response to each question
How should you write up the homework so we can grade it?
1. Some of you may have experience with RMarkdown (or maybe you want to learn how to use
RMarkdown!). If that’s the case, you can use RMarkdown to both run your code (in code
chunks) and write your responses (outside of those code chunks). Once you’re done, you
can knit (aka export) your RMarkdown file as a PDF and submit that.
2. The option most of you will probably use: If you’ve never heard of RMarkdown (or don’t
want to learn how to use it), it’s probably simpler to run your analyses in R and then
write up the assignment in Microsoft Word, Google Docs, or whatever platform you use.
You can submit your code by either:
a. Taking screenshots of your code and pasting the images into your text document
b. Copying the code you used in R and directly pasting it into your text document
c. Submitting both an R script file with your code and your text document
You will also need to paste the plots you created in R into your text document.
Whichever option you use, it is easiest for us to grade if you submit your write-up as a PDF.
Please help us save time grading by submitting it as a PDF (it opens directly in Brightspace as a PDF,
but Word documents don’t…). (R files are ok for your code!)
Do I need to write in full sentences? We aren’t going to grade you on your grammar, but when
asked to give your thoughts on something or to explain something in the data, please use full
sentences. You don’t have to write a lot, but full sentences make it clearer what you’re thinking.
How do I access R?
Here’s a guide on how to download R and RStudio onto your own computer.
You can also access R using NYU’s Virtual Lab or through a free account on RStudio Cloud (now
called Posit Cloud). Both are great options, especially if you’re struggling to get R onto your computer!
I’m still confused about how to submit the homework, or I am having trouble with R! Please reach
out to your teaching assistant (they are all knowledgeable about R!) or to me.
Question 1.
A researcher wants to know if there is a correlation between introspection and optimism. Using two
questionnaires, the researcher collects the data available in data2_q1.csv. This dataset has three
columns: `participant`, `introspection`, and `optimism`.
1. Compute Pearson’s r correlation for these data. (2 pts)
2. Create a graph to visualize introspection against optimism. Make sure you use an
appropriate plot to visualize your data, that the variables are on the correct axes, and that
the axes are correctly labeled. (3 pts)
3. Compute the linear model for the data. What are the slope and intercept? Write the equation
for the trendline. (5 pts)
4. Given the slope, what do you expect the trend of the data to be (e.g., “as X increases, y…”) (3
pts)
5. If “Introspection” has a value of 75, what would be the predicted value of “Optimism”? Show
your work. (3 pts)
Question 2.
You are asked to analyze data from a study that looked at the association between an
organizational self-rating (“how organized would you rate yourself?”) and social network (“how
many ‘friends’ would you say you currently have at this moment?”) for 15 people.
The data is available in APS_data2_q2.csv, and has three columns: `participant`, `org`
(Organizational Self Rating), and `friends`.
1. Run a correlation on the data. What’s Pearson’s r? (2 pts)
2. Interpret the data. What does the correlation suggest? Comment on both the strength and
direction of the relationship. (3 pts)
3. Create two new variables in the dataset that relist `org` and `friends` in rank order. You
can do this using the rank() function in R:
dataset$new_variable <- rank(…)
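A minimal sketch of the ranking step, using made-up values rather than the contents of APS_data2_q2.csv. The column names `org_rank` and `friends_rank` are illustrative choices, not names the assignment requires.

```r
# Hypothetical stand-in data: the real values live in APS_data2_q2.csv,
# which is not reproduced here.
dataset <- data.frame(
  participant = 1:5,
  org         = c(7, 3, 9, 3, 5),
  friends     = c(120, 45, 300, 80, 80)
)

# rank() assigns rank 1 to the smallest value; tied values share the
# average of the ranks they would otherwise occupy.
dataset$org_rank     <- rank(dataset$org)
dataset$friends_rank <- rank(dataset$friends)

dataset
```

Note how the two tied `org` values of 3 each receive rank 1.5 instead of ranks 1 and 2.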
Getting started with statistics in R, Part 2
Tutorial by Kelsey Moty
https://moty.shinyapps.io/APS_Lab2/#section-running-correlations-and-linear-models-in-r
2023/3/3 20:08

Running correlations and linear models in R

meow_cor_data <- … %>%
  filter(item_definition == "meow") %>%
  mutate(value = ifelse(is.na(value), 0, …

## I made a plot here for visualization
ggplot(meow_cor_data) +
  geom_jitter(aes(x = age, y = value, col…
              alpha = .5) +
  theme_classic() +
  labs(x = "Age (in months)",
       y = "Does the child know the word …
  scale_y_continuous(breaks = c(0, 1),
                     labels = c("No", "Ye…
  guides(color = "none")

# The actual cor.test
cor.test(meow_cor_data$value, meow_cor_data$age)

Pearson’s product-moment correlation
data: meow_cor_data$value and meow_cor_data$age
t = 34.401, df = 7599, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.347465 0.386372
sample estimates:
     cor
0.367079

The output has quite a few things in it, including things we haven’t talked about yet.
What does the output tell you?
1. The first thing the output does is specify the type of correlation you ran.
2. It tells you the data you ran the correlation on.
3. It provides the t-value, the degrees of freedom, and the p-value associated with the correlation test.
4. It tells you what the alternative hypothesis was.
5. It provides the values for the 95% confidence interval (lower and upper limits).
6. Finally, it tells you what the correlation actually was.

What was the value of the correlation from the test we ran above?
✗ .23
✓ .37
✗ .42
✗ .49

Now you try! I have a dataset called measurements that has two variables: weight and height.
Run a correlation test to see if there is a correlation between participants’ weight and height.
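One way the exercise could be approached is sketched below. The tutorial does not show the contents of `measurements`, so the data frame here is simulated and every number in it is an assumption. The pieces pulled out of the result correspond to the items in the output walkthrough.

```r
# Simulated stand-in for the tutorial's `measurements` dataset
# (assumption: the real values are not shown in the tutorial text).
set.seed(1)
height <- rnorm(50, mean = 170, sd = 10)
measurements <- data.frame(
  height = height,
  weight = 0.9 * height - 85 + rnorm(50, sd = 8)
)

# cor.test() runs Pearson's product-moment correlation by default.
result <- cor.test(measurements$weight, measurements$height)

result$estimate   # the correlation, r
result$statistic  # the t-value
result$parameter  # degrees of freedom (n - 2)
result$p.value    # p-value for the test that r differs from 0
result$conf.int   # 95% confidence interval (lower and upper limits)
```

Printing `result` on its own shows the same formatted summary as the tutorial output.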
Linear models
To run a linear model in R, we use the lm() function.
lm() requires two arguments:
1. the formula for the regression: y ~ x
2. the dataset you want to run the regression on
Important! The dependent variable goes to the left of the ~ and the independent variable(s) (aka the predictors) go to the right of the ~.

meow_cor_data <- … %>%
  filter(item_definition == "meow") %>%
  mutate(value = ifelse(is.na(value), 0, …

model1 <- …

            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.02755    0.02425  -1.136    0.256
age          0.03476    0.00101  34.401        …

… Pr(>|t|) column)
4. A legend for the significance codes
5. The residual standard error (again, something we often ignore)
6. The degrees of freedom
7. The R-squared values
8. An F-statistic testing for the significance of the overall model
Does this linear model suggest that a child’s age predicts whether they know the word ‘meow’ or not?
✓ yes
✗ no
✗ we can’t determine either way from this model
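To make the lm() workflow concrete, here is a small sketch on simulated data. The data frame `toy`, its columns, and the true slope and intercept used to generate it are all invented for illustration; this is not the tutorial's dataset or model. It fits `value ~ age`, reads off the intercept and slope, and uses them to predict a new value, which is the same as plugging x into the trendline equation y = intercept + slope * x.

```r
# Simulated data (assumption): value rises by about 0.5 per unit of age.
set.seed(2)
toy <- data.frame(age = rep(16:30, each = 4))
toy$value <- 2 + 0.5 * toy$age + rnorm(nrow(toy), sd = 1)

# Dependent variable on the left of ~, predictor on the right.
model <- lm(value ~ age, data = toy)
summary(model)   # coefficients, R-squared, F-statistic, etc.

# Intercept and slope of the fitted line.
coefs <- coef(model)
coefs

# Predicted value at age = 24: intercept + slope * 24.
predicted <- predict(model, newdata = data.frame(age = 24))
predicted
```

Computing `coefs[["(Intercept)"]] + coefs[["age"]] * 24` by hand gives the same number as `predict()`, which is a handy way to show your work.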
Data Assignment 1
Question 1.
a.
b.
c. The mean age is 25.73 years, while the median is 25 and the mode is 24. The three measures of
central tendency are nearly equal, suggesting that the distribution of age is roughly symmetric and may be approximately normal.
Question 2.
a.
The mean (45.3) differs from the median (40), most likely because outliers or extreme
values pull the mean away from the median.
b. SD: 27.86096
c.
Now create a random sample of 100 numbers ranging from 1 to 100 (again, using the sample()
function with replace set to TRUE).
a. Mean = 54.97, median = 58.5, SD = 27.73083
b.
Values from the sample of 100 numbers appear to be more random than those in the first sample of
10, in which I could discern a pattern. I expect their measures of central tendency to be similar because
both samples are randomly generated from the numbers 1 to 100.
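A short sketch of the two draws, for comparison. The seed is an arbitrary choice made here so the run is repeatable, which means the numbers will not match the values reported above. Both samples come from the same population (the integers 1 to 100, whose mean is 50.5 and whose standard deviation is about 28.87), which is why their summaries should land in the same neighborhood.

```r
# Arbitrary seed (assumption) so the draws can be reproduced.
set.seed(123)

# Draw with replacement from the integers 1 to 100.
small_sample <- sample(1:100, size = 10,  replace = TRUE)
large_sample <- sample(1:100, size = 100, replace = TRUE)

# Helper (hypothetical name) collecting the three summaries of interest.
summary_stats <- function(x) {
  c(mean = mean(x), median = median(x), sd = sd(x))
}

rbind(n10 = summary_stats(small_sample),
      n100 = summary_stats(large_sample))
```

Rerunning with different seeds shows that the sample of 10 bounces around much more from draw to draw than the sample of 100.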
Question 3.
a. The mean (559.53) and median (533) reaction times for the candy-present group are higher than those for the
candy-absent group (mean = 505.733, median = 523). These results suggest that the presence
of candy in the exam room had a negative impact on completion time. However, it is also possible
that participants who were given candy were more focused on the candy and less focused on the
assigned task.
b.
The mean for the candy-absent group increases substantially (new mean = 5152.133) if
we change the highest score in this group to 10 times its original value. However, the median
does not change; it is still 523.
c.
The same applies when we change the highest score in the candy-present group to a tenth of its original value. The mean
drops substantially, from 559.5333 to 517.9533, because the mean is
far more sensitive to extreme values than the median. The median changes from 533 to 510, a
smaller shift than the change in the mean.
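The sensitivity of the mean to a single extreme value can be checked with a few made-up reaction times. These seven numbers are hypothetical and are not the assignment's data.

```r
# Hypothetical reaction times in ms (not the assignment's data).
rt <- c(480, 495, 510, 523, 530, 545, 560)

# Multiply the highest score by 10.
rt_outlier <- rt
rt_outlier[which.max(rt)] <- max(rt) * 10   # 560 becomes 5600

c(mean = mean(rt),         median = median(rt))
c(mean = mean(rt_outlier), median = median(rt_outlier))
```

Inflating the largest value leaves the ordering of the scores unchanged, so the middle score (and therefore the median) stays put while the mean jumps. Shrinking the largest value to a tenth instead moves it to the bottom of the ordering, which is why the median can shift a little in that case.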
Question 4.
a.
There is a positive linear relationship, meaning that students with high scores on the pre-test are
likely to score high on the post-test as well.
b.
The correlation coefficient of 0.5373 suggests a moderate positive linear relationship between this pair of
variables.
c. I think there is room for improvement for the teacher, as the selected technique had some flaws or
limitations. First, the teacher only measured the students’ scores on the two tests,
which is not sufficient to determine whether the technique was effective. Second, the experiment does not have a
control group for determining whether the collected scores could have been influenced by other
factors, such as motivation or exam preparation.
