Analyzing Air Quality Data Using Distributed Cloud Services and Spark Paper

You may use any dataset of your choice as long as it has more than 1,000 observations and at
least 2 numeric predictors. The dataset should be relevant to your interests or research (for
instance, if you are pursuing a thesis).
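Before committing to a dataset, it may help to confirm that it meets the size and predictor requirements above. The following is a minimal sketch, assuming the data is available as CSV text with a header row; the function name `check_dataset` and its parameters are hypothetical and not part of the assignment.

```python
import csv
import io


def check_dataset(csv_text, min_rows=1000, min_numeric=2):
    """Return (row_count, numeric_columns, ok) for CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    numeric = set(columns)  # columns still believed to be numeric
    seen = set()            # columns with at least one non-empty numeric value
    rows = 0
    for row in reader:
        rows += 1
        for name in list(numeric):
            value = (row.get(name) or "").strip()
            if value == "":
                continue  # treat blanks as missing, not as non-numeric
            try:
                float(value)
                seen.add(name)
            except ValueError:
                numeric.discard(name)
    numeric_cols = [c for c in columns if c in numeric and c in seen]
    # "More than 1,000 observations" is read as a strict inequality.
    ok = rows > min_rows and len(numeric_cols) >= min_numeric
    return rows, numeric_cols, ok
```

For a file on disk, the text can be read with `open(path, newline="").read()` and passed in. Whether a numeric column counts as a meaningful predictor is still a judgment call; the check only looks at whether the values parse as numbers.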
This project should have the following core components:
A data lake or data warehouse – defend your choice. Why are you choosing to store your
data in one over the other?
Connect the data lake or data warehouse to a distributed cloud service such as AWS.
Run your Spark application over those distributed services.
Documentation – This is the most CRUCIAL STEP. The final product should be a report
than 3 pages), detailing the steps you took to get to the results, and what the
final results are.
This report should also include an explanation of your goal, datasets used, and the
technical approach to get to the end result.
THINK OF IT THIS WAY: I should be able to follow the approaches from your report and
replicate whatever you did.
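The storage, cloud, and Spark components listed above could fit together roughly as follows. This is only a sketch under stated assumptions, not a required design: it assumes the data lake is an S3 bucket reachable through the `s3a://` connector, that PySpark is installed on the cluster, and that the raw CSV files have columns named `station` and `pm25`. The bucket layout, column names, and both function names are hypothetical.

```python
def lake_uri(bucket, prefix):
    """Build an s3a:// URI for a folder in the (hypothetical) data lake bucket."""
    return f"s3a://{bucket.strip('/')}/{prefix.strip('/')}/"


def summarize_air_quality(source_uri, output_uri):
    """Read raw CSV from the lake, average PM2.5 per station, write Parquet."""
    # Imported here so lake_uri can be used on a machine without Spark.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("air-quality-summary").getOrCreate()
    try:
        # Assumed schema: a "station" identifier and a numeric "pm25" column.
        readings = spark.read.csv(source_uri, header=True, inferSchema=True)
        summary = readings.groupBy("station").agg(
            F.avg("pm25").alias("mean_pm25"),
            F.count("*").alias("n_obs"),
        )
        summary.write.mode("overwrite").parquet(output_uri)
    finally:
        spark.stop()
```

On a cluster, a call such as `summarize_air_quality(lake_uri("my-bucket", "raw"), lake_uri("my-bucket", "summary"))` would typically be placed in a script and launched with `spark-submit`. Recording the exact URIs, cluster settings, and command in the report is one way to make the run replicable.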

1- Includes data storage
2- Includes distributed cloud service
3- Includes final report
