Analyzing Air Quality Data Using Distributed Cloud Services and Spark Paper

You may use any dataset of your choice as long as it has more than 1,000 observations and at
least 2 numeric predictors. The dataset should be relevant to your interests or research (should
you pursue a thesis).
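The dataset requirement above can be checked before any cloud setup. The following is a minimal sketch, assuming the data is available as a CSV file with a header row; the column names `station`, `pm25`, and `no2` and the generated sample rows are hypothetical and serve only to exercise the check.

```python
import csv
import io

MIN_ROWS = 1001          # "more than 1,000 observations"
MIN_NUMERIC_COLUMNS = 2  # "at least 2 numeric predictors"


def _is_number(text):
    """Return True if the cell parses as a float (blank or missing cells do not)."""
    try:
        float(text)
        return True
    except (TypeError, ValueError):
        return False


def check_dataset(csv_file):
    """Count rows and fully numeric columns in an open CSV file with a header row."""
    reader = csv.DictReader(csv_file)
    columns = reader.fieldnames or []
    numeric = {name: True for name in columns}
    row_count = 0
    for row in reader:
        row_count += 1
        for name in columns:
            # A single non-numeric cell disqualifies the whole column.
            if numeric[name] and not _is_number(row[name]):
                numeric[name] = False
    numeric_columns = [name for name in columns if numeric[name]] if row_count else []
    return {
        "rows": row_count,
        "numeric_columns": numeric_columns,
        "meets_requirements": row_count >= MIN_ROWS
        and len(numeric_columns) >= MIN_NUMERIC_COLUMNS,
    }


# Hypothetical in-memory sample standing in for a real air quality file.
sample = io.StringIO(
    "station,pm25,no2\n"
    + "".join(f"S{i % 5},{10 + i % 7},{20.5 + i % 3}\n" for i in range(1200))
)
report = check_dataset(sample)
print(report)
```

This check is deliberately strict: a column with even one blank or non-numeric cell is not counted as numeric, and it does not distinguish predictors from the response variable, so the result should be read as a first pass rather than a final verdict.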
This project should have the following core components:
1) A data lake or data warehouse – defend your choice. Why are you choosing to store your data in one over the other?
2) Connect the data lake or data warehouse to a distributed cloud service such as AWS, Azure, or GCP.
3) Run your Spark application over those distributed services.
4) Documentation – This is the most CRUCIAL STEP. The final product should be a report (NO LESS than 3 pages) detailing the steps you took to get to the results and what the final results are.
5) This report should also include an explanation of your goal, the datasets used, and the technical approach to get to the end result.
6) THINK OF IT THIS WAY: I should be able to follow the approaches from your report and replicate whatever you did.
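One way to support the replication requirement is to record the exact command used to launch the Spark application. The following is a minimal sketch, assuming a YARN-managed cluster and an S3-backed data lake; the script name `air_quality_job.py`, the bucket `example-air-quality-lake`, and the resource settings are hypothetical placeholders, not values prescribed by the assignment.

```python
import shlex


def build_spark_submit(app_path, input_uri, master="yarn", deploy_mode="cluster", conf=None):
    """Assemble a spark-submit argument list so the exact run can be reported."""
    command = ["spark-submit", "--master", master, "--deploy-mode", deploy_mode]
    # Sort the settings so the recorded command is identical from run to run.
    for key in sorted(conf or {}):
        command += ["--conf", f"{key}={conf[key]}"]
    # The application script comes last, followed by its own arguments.
    command += [app_path, input_uri]
    return command


# Hypothetical job script, bucket, and executor settings.
command = build_spark_submit(
    "air_quality_job.py",
    "s3a://example-air-quality-lake/raw/",
    conf={"spark.executor.memory": "4g", "spark.executor.instances": "4"},
)
print(shlex.join(command))
```

Pasting the printed command into the report, along with the cluster size and Spark version, gives a reader what they need to rerun the job. On Azure or GCP the master setting and the storage URI scheme would differ.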

Criteria:
1- Includes data storage
2- Includes distributed cloud service
3- Includes final report
