CSS 300 SXU Data Cleaning Process Lab

CSS 300 Module 2 Activity Worksheet
Use this worksheet to complete your lab activity. Submit it to the applicable assignment
submission folder when complete.
Deliverable:
– A Word document summarizing the following steps
Using the ramen-ratings.csv dataset:
1. Import the dataset.
2. Get to know the dataset by using the following code samples. Do any of these hint at
something being incorrect with the dataset? If so, explain.
df.head()
df.dtypes
df.info()
3. Get the shape of the dataset:
df.shape
4. Remove rows where at least one missing value is found, using the following code
sample:
dfNew = df.dropna(subset=["Review Number", "Brand", "Variety",
                          "Style", "Country", "Stars"])
5. Get the shape of the dataset after dropping the rows with nulls.
6. Remove duplicate records using the following code sample:
dfNew = dfNew.drop_duplicates(keep=False)
7. Get the shape of the dataset after dropping duplicates.
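
The steps above can be sketched end to end as follows. This is a minimal illustration, not the expected lab answer: the rows in SAMPLE_CSV are made up so the example runs without the real file, and the names SAMPLE_CSV, REQUIRED, and clean are hypothetical helpers introduced here. For the lab itself, the data would be loaded with pd.read_csv("ramen-ratings.csv") instead.

```python
import io

import pandas as pd

# Made-up stand-in rows (NOT the real ramen-ratings.csv), chosen so that
# each cleaning step has something to do.
SAMPLE_CSV = """Review Number,Brand,Variety,Style,Country,Stars
1,BrandA,Chicken,Cup,Japan,3.5
2,BrandB,Beef,Pack,USA,Unrated
3,BrandC,Shrimp,,Thailand,4.0
4,BrandD,Miso,Bowl,Japan,5.0
4,BrandD,Miso,Bowl,Japan,5.0
"""

# Columns checked for missing values in step 4.
REQUIRED = ["Review Number", "Brand", "Variety", "Style", "Country", "Stars"]


def clean(df):
    """Run steps 3 to 7 and record the shape after each stage."""
    shapes = {"original": df.shape}

    # Step 4: drop rows with a missing value in any required column.
    df_new = df.dropna(subset=REQUIRED)
    shapes["after_dropna"] = df_new.shape

    # Step 6: keep=False drops every copy of a duplicated row.
    df_new = df_new.drop_duplicates(keep=False)
    shapes["after_drop_duplicates"] = df_new.shape

    return df_new, shapes


# Step 1: import the dataset (here, from the in-memory sample).
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Step 2: get to know the dataset.
print(df.head())
print(df.dtypes)
df.info()

# Steps 3 to 7: clean and report the shape at each stage.
cleaned, shapes = clean(df)
print(shapes)
```

In this sample, a text entry in Stars keeps that column from being read as numeric, which is the kind of hint df.dtypes and df.info() can surface in step 2. Note that keep=False removes all copies of a duplicated record; passing keep="first" instead would retain one copy of each.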
