Assignment 4
PS 3780 Data Literacy & Visualization, Spring 2023
Due Date: Friday, February 24, 2023 at 11:59 p.m.
Please save your answers to these questions as one .pdf file (use the "Save As"
function in most word processors). Be sure to include your name, your teammate’s name
if you have one, and the assignment number. Submit the file to Carmen by the due
date.
Basics of R
CIA Factbook Data
Use the CIA World Factbook country comparison guide to download a numeric .csv
dataset. Import the dataset into R. Please answer the following questions with R and
copy the commands that you use to answer each question.
1. (.5 pt) What dataset did you download and what is the stored name of the dataset
in R?
2. (.5 pt) What is the average value of your chosen variable? What is the median
value of your chosen variable?
3. (.5 pt) Does that average value happen to be the actual value of any country?
4. (.5 pt) Does that median value happen to be the actual value of any country?
5. (.5 pt) Which country has the lowest value?
6. (.5 pt) Which countries are ranked 10th, 30th, and 50th, respectively?
7. (1 pt) Which country ranks higher on the variable that you chose, Namibia or
Botswana? (The data might be missing from your dataset, but you must at least
write down the R command that you would use to check.)
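
The kinds of commands these questions call for can be sketched on a toy data frame. This is only an illustration under stated assumptions: the placeholder countries and values are made up, and the column names name and value are assumed, so check them against your own download with names() after importing.

```r
# Toy stand-in for a Factbook export. Countries and values are made up;
# the column names `name` and `value` are assumptions.
factbook <- data.frame(
  name  = c("Alpha", "Beta", "Gamma", "Delta", "Echo"),
  value = c(10, 40, 20, 50, 30),
  stringsAsFactors = FALSE
)
# With a real download you would import instead, for example:
# factbook <- read.csv("your_file.csv")   # hypothetical file name

# Mean and median, ignoring missing values
avg <- mean(factbook$value, na.rm = TRUE)
med <- median(factbook$value, na.rm = TRUE)

# Countries whose value equals the mean or the median exactly
factbook$name[factbook$value == avg]
factbook$name[factbook$value == med]

# Country with the lowest value
lowest <- factbook$name[which.min(factbook$value)]

# Sort from highest to lowest value, then look up positions by row
ranked <- factbook[order(-factbook$value), ]
ranked$name[c(1, 3, 5)]

# Pull the rows for two specific countries to compare them
pair <- factbook[factbook$name %in% c("Namibia", "Botswana"), ]
```

In the toy data the last command returns zero rows, which is also what you would see if both countries were missing from your dataset. Whether rank 1 means the highest or the lowest value depends on the variable, so adjust the sort direction to match the Factbook's own ranking.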
Presidential Approval Advanced
Visit 538 to find data on the popularity of Joe Biden through the first term of his presidency. At the bottom of their interactive page, 538 approval, there is a link to download
the associated poll list. Import the dataset into R. Copy the commands that you use
to answer the following questions.
1. (1 pt) Describe the relative approval of President Biden using the following steps:
Using approve and disapprove, create a new variable in the dataset named net
measuring the difference of approve to disapprove (subtract the variables).
Using approve and disapprove, create a new variable in the dataset named
ratio measuring the ratio of approve to disapprove (divide the variables).
What are the averages of net and ratio?
2. (1 pt) What were the values of net and ratio (the two variables you just created)
for the polls that had the largest and smallest sample sizes?
3. (1 pt) What is the correlation between net and sample size? What is the correlation
between ratio and sample size? Are these correlations what we would expect
given the results from question 2?
That is, are net and ratio larger, smaller, or about the same for the largest
sample size compared with the smallest? Does the correlation you found (positive,
negative, or weak) suggest greater net and ratio for large or small sample sizes?
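
The steps in this section can be sketched on a small made-up data frame. This is a minimal illustration, not the expected answer: the numbers are invented, and the column name sample_size is an assumption, so confirm the actual column names in the downloaded poll list with names() before adapting it.

```r
# Made-up polls for illustration only. The column names `approve`,
# `disapprove`, and `sample_size` are assumptions about the download.
polls <- data.frame(
  approve     = c(40, 45, 42, 50),
  disapprove  = c(50, 45, 56, 40),
  sample_size = c(500, 1000, 1500, 2000)
)

# New variables: difference and ratio of approve to disapprove
polls$net   <- polls$approve - polls$disapprove
polls$ratio <- polls$approve / polls$disapprove

# Averages, ignoring missing values
mean_net   <- mean(polls$net, na.rm = TRUE)
mean_ratio <- mean(polls$ratio, na.rm = TRUE)

# Rows for the largest and smallest polls
# (which.max and which.min return only the first match if there are ties)
largest  <- polls[which.max(polls$sample_size), c("sample_size", "net", "ratio")]
smallest <- polls[which.min(polls$sample_size), c("sample_size", "net", "ratio")]

# Correlations with sample size, using only complete pairs of observations
cor_net   <- cor(polls$net, polls$sample_size, use = "complete.obs")
cor_ratio <- cor(polls$ratio, polls$sample_size, use = "complete.obs")
```

Comparing largest and smallest looks at only two polls, while cor() uses every poll, so the two can point in different directions.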