Analyzing Air Quality Data Using Distributed Cloud Services and Spark Paper

You may use any dataset of your choice as long as it has more than 1,000 observations with at
least 2 numeric predictors. This dataset should be relevant to your interests or research
(should you pursue a thesis).
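As a quick sanity check before committing to a dataset, the size requirement above can be tested with a short script. This is only a minimal sketch, not part of the assignment: the function name `check_dataset`, its parameters, and the column names in the usage note are hypothetical, and it assumes the data is CSV text with a header row and that a "numeric predictor" is a column in which every value parses as a number.

```python
import csv
import io


def is_number(value):
    """Return True if the value parses as a float (empty or missing values do not)."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def check_dataset(csv_text, min_observations=1000, min_numeric=2):
    """Count observations and fully numeric columns in CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    # Assume every column is numeric until a non-numeric value is seen.
    numeric = {name: True for name in columns}
    n_rows = 0
    for row in reader:
        n_rows += 1
        for name in columns:
            if numeric[name] and not is_number(row[name]):
                numeric[name] = False
    # With no data rows, no column can be called numeric.
    numeric_columns = [name for name in columns if numeric[name]] if n_rows else []
    return {
        "observations": n_rows,
        "numeric_columns": numeric_columns,
        # "More than 1,000 observations" is a strict inequality.
        "eligible": n_rows > min_observations and len(numeric_columns) >= min_numeric,
    }
```

For example, calling `check_dataset` on the text of an air quality file with hypothetical columns `station,pm25,no2` returns the row count, the list of numeric columns, and an `eligible` flag. Columns with missing readings are treated as non-numeric here, so a real dataset may need cleaning first.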
This project should have the following core components:
1) A data lake or data warehouse. Defend your choice: why are you choosing to store your
data in one over the other?
2) Connect the data lake or data warehouse to a distributed cloud service such as AWS,
Azure, or GCP.
3) Run your Spark application over those distributed services.
4) Documentation. This is the most CRUCIAL STEP. The final product should be a report
(NO LESS than 3 pages) detailing the steps you took to get to the results and what the
final results are.
5) This report should also include an explanation of your goal, the datasets used, and the
technical approach taken to reach the end result.
6) THINK OF IT THIS WAY: I should be able to follow the approaches from your report and
replicate whatever you did.

Criteria:
1- Includes data storage
2- Includes distributed cloud service
3- Includes final report
