Arrays and DataFrames Lab Report

lab01
October 11, 2023
Lab 1: Arrays and DataFrames
Due Thursday, October 12th at 11:59PM
Welcome to Lab 1! This week, we’ll learn about arrays, which allow us to store sequences of data,
and DataFrames, which let us work with multiple arrays of data about the same things. These
topics are covered in BPD 7-10 in the babypandas notes. You should complete this entire lab so
that all tests pass and submit it to Gradescope by 11:59PM on the due date.
Please do not use for-loops for any questions in this lab. If you don’t know what a for-loop
is, don’t worry – we haven’t covered them yet. But if you do know what they are and are wondering
why it’s not OK to use them, it is because loops in Python are slow, and looping over arrays and
DataFrames should usually be avoided.
First, set up the imports we’ll need by running the cell below.
[1]: import numpy as np
import babypandas as bpd
import otter
grader = otter.Notebook()
1. Arrays
Computers are most useful when you can use a small amount of code to do the same action to
many different things.
For example, in the time it takes you to calculate the 18% tip on a restaurant bill, a laptop can
calculate 18% tips for every restaurant bill paid by every human on Earth that day. (That is, if
you’re pretty fast at doing arithmetic in your head!)
Arrays are how we put many values in one place so that we can operate on them as a group. For
example, if billions_of_numbers is an array of numbers, the expression
0.18 * billions_of_numbers
evaluates to a new array of numbers that’s the result of multiplying each number in
billions_of_numbers by 0.18 (18%). Arrays are not limited to numbers; we can also put all
the words in a book into an array of strings.
Concretely, an array is a collection of values of the same type, like a column in a spreadsheet
(think Google Sheets or Microsoft Excel).
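As a small illustration of operating on a whole array at once, the sketch below computes 18% tips for three bills. The name bills and its values are made up for this example; they are not part of the lab.

```python
import numpy as np

# Hypothetical restaurant bills in dollars (made-up values for illustration).
bills = np.array([20.0, 50.0, 100.0])

# Multiplying by 0.18 applies to every element and produces a new array.
# The original array, bills, is left unchanged.
tips = 0.18 * bills
print(tips)
```

One line of arithmetic handles every bill, no matter how many values the array holds.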
1.1. Making arrays
You can type in the data that goes in an array yourself, but that’s not typically how we’ll create
arrays. Normally, we create arrays by loading them from an external source, like a data file.
First, though, let’s learn how to do it the hard way. To begin, we can make a list of numbers by
putting them within square brackets and separating them by commas:
[2]: my_list = [14, -2.26, 0.15]
my_list
[2]: [14, -2.26, 0.15]
Just like int, float, and str, the list is a data type provided by Python. Lists are very flexible
and easy to work with, but they are slowwww.
As data scientists, we’ll often be working with millions or even billions of numbers. For this, we
need something faster than a list. Instead of lists, we will use arrays.
Arrays are provided by a package called NumPy (pronounced “NUM-pie” or, if you prefer to
pronounce things incorrectly, “NUM-pee”). The package is called numpy, but it’s standard to
rename it np for brevity. You can do that with:
import numpy as np
Data scientists, as well as engineers and scientists of all kinds, use numpy frequently, and you’ll see
quite a bit of it if you’re a data science major.
[3]: import numpy as np
Now, to create an array, call the function np.array with a list of numbers. Run this cell to see an
example:
[4]: np.array([14, -2.26, 0.15])
[4]: array([14.  , -2.26,  0.15])
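The output shows 14 as 14. because an array holds values of a single type: when an integer is mixed with floats, NumPy stores everything as floats. A minimal sketch of checking this with the array's dtype attribute follows; the names mixed and whole are chosen just for illustration.

```python
import numpy as np

mixed = np.array([14, -2.26, 0.15])   # an int alongside floats
whole = np.array([2, 4, 6])           # ints only

# dtype reports the single type shared by every element in the array.
print(mixed.dtype)   # a floating-point type
print(whole.dtype)   # an integer type
```

The exact type names printed can differ between systems, so treat them as a guide rather than something to memorize.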
Note that you need the square brackets here. If you were to try running the following code, Python
would yell at you (that is, raise an error) because you forgot them:
np.array(14, -2.26, 0.15)
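To see the difference safely, the sketch below wraps the bracket-free call in try/except so the error is caught instead of stopping the notebook. It assumes the failure surfaces as a TypeError, which is what NumPy raises when it cannot make sense of the extra arguments; the exact message varies by NumPy version. The names good and failed are only for illustration.

```python
import numpy as np

# With brackets: the three numbers form one list, which becomes the array.
good = np.array([14, -2.26, 0.15])

# Without brackets: NumPy receives three separate arguments and rejects them.
try:
    np.array(14, -2.26, 0.15)
    failed = False
except TypeError as err:
    failed = True
    print("TypeError:", err)
```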
Arrays themselves are also values, just like numbers and strings. That means you can assign them
names or use them as arguments to functions.
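For instance, the sketch below gives an array a name and then passes it to two functions, Python's built-in len and NumPy's np.sum. The name scores and its values are made up for illustration.

```python
import numpy as np

# Assign an array to a name (the values are made up for illustration).
scores = np.array([90, 75, 82])

# Pass the array as an argument to functions.
count = len(scores)      # number of elements in the array
total = np.sum(scores)   # sum of all the elements
print(count, total)      # prints: 3 247
```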
Question 1.1.1. Make an array containing the numbers 2, 4, and 6, in that order. Name it
even_numbers.
[5]: even_numbers = np.array([2, 4, 6])
even_numbers
[5]: array([2, 4, 6])
[6]: grader.check("q1_1_1")
[6]: q1_1_1 results: All test cases passed!
Question 1.1.2. Make an array containing the numbers 0, -1, 1, 𝜋, and 𝑒, in that order. Name it
odd_numbers.
Hint: 𝜋 and 𝑒 are available from the np module, which has already been imported. Just as you
used math.pi to get 𝜋 in the last lab, you can use np.pi to get 𝜋 as well. Do not import the math
module.
[7]: odd_numbers = np.array([0, -1, 1, np.pi, np.e])
odd_numbers
[7]: array([ 0.        , -1.        ,  1.        ,  3.14159265,  2.71828183])
[8]: grader.check("q1_1_2")
[8]: q1_1_2 results: All test cases passed!
Question 1.1.3. Make an array containing the five strings "Hello", ",", " ", "world", and "!".
(The third one is a single space inside quotes.) Name it hello_world_components.
Note: If you print hello_world_components, you’ll notice some extra information in addition to
its contents: dtype=’
