IT 445 SEU The Nature and Structure of The Iris Dataset Project

• You must submit two separate copies (one Word file and one PDF file) using the Assignment
Template on Blackboard via the allocated folder. These files must not be in compressed format.
• It is your responsibility to check and make sure that you have uploaded both the correct files.
• A mark of zero will be given if you try to bypass SafeAssign (e.g., by misspelling words, removing
spaces between words, hiding characters, using different character sets, converting text into images,
writing in languages other than English, or any other kind of manipulation).
• Email submission will not be accepted.
• You are advised to make your work clear and well-presented. This includes filling in your
information on the cover page.
• You must use this template; failure to do so will result in a mark of zero.
• You MUST show all your work, and text must not be converted into an image, unless specified
otherwise by the question.
• Late submission will result in a ZERO mark.
• The work should be your own; copying from students or other resources will result in a ZERO mark.
• Use Times New Roman font for all your answers.
Students can form groups of three and send their names to their instructors
before 5th October 2023. Otherwise, the instructors will form the groups
randomly and assign a dataset to each group.
Select one dataset from those provided in the links below.
For 28 Data Analysis Projects to Boost Your Skills [2023 Guide]:
For more free public datasets for EDA:
✓ After the dataset is selected (or assigned), analyze the data using Microsoft
Excel to discover the structure of the data and any trends, patterns, or anomalies
in it, based on your hypothesis.
✓ Perform the following six tasks.
✓ You should use visualization to aid your answers.
Your project will include two main parts:
1. The final project report must incorporate all of the following 6 tasks and be written
using the provided template. (10 marks distributed among the tasks below).
2. A presentation that illustrates your 6 tasks completed in the project. (4 marks)
Task 1: Understand and describe the nature and structure of the selected dataset.
(2 marks)

Describe the dataset. Your description should answer the following
questions: Is it reliable? How was it collected? What is its size?

Identify the features of the dataset.

Propose hypothesis/assumptions (between 2 numerical variables) to
Task 2: Check if your selected features have any of the following issues. Describe how
you conducted the tests and how you addressed the issues. Support your answers with
screenshots of the issues before and after the fixes. (1 mark)

Missing values (0.25 for the test, fix, and screenshot)

Duplicate values (0.25)

Data outliers (0.25)

Any noise or irregularities (0.25)
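The graded work for this task must be done in Microsoft Excel. As an optional cross-check only, the missing-value, duplicate, and outlier tests can be sketched in Python as below. This is a minimal sketch under stated assumptions: the sample rows are made up for illustration (they are not taken from any project dataset), the function names are hypothetical, and the 1.5 × IQR fence is one common outlier rule among several.

```python
import statistics

# Hypothetical sample rows (sepal_length, petal_length); illustrative values only.
rows = [
    (5.1, 1.4),
    (4.9, 1.4),
    (None, 1.3),   # missing sepal length
    (5.1, 1.4),    # exact duplicate of the first row
    (6.3, 4.9),
    (5.8, 4.0),
    (25.0, 4.5),   # implausible sepal length, likely a typing error
]

def find_missing(rows):
    """Return indices of rows that contain at least one missing (None) cell."""
    return [i for i, row in enumerate(rows) if any(v is None for v in row)]

def find_duplicates(rows):
    """Return indices of rows that exactly repeat an earlier row."""
    seen, dupes = set(), []
    for i, row in enumerate(rows):
        if row in seen:
            dupes.append(i)
        seen.add(row)
    return dupes

def find_outliers(values):
    """Return values lying outside the 1.5 * IQR fences."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

missing = find_missing(rows)
duplicates = find_duplicates(rows)
# Outlier detection is run on one numeric column, skipping missing cells.
sepal_lengths = [r[0] for r in rows if r[0] is not None]
outliers = find_outliers(sepal_lengths)

print("Rows with missing values:", missing)
print("Duplicate rows:", duplicates)
print("Outlying sepal lengths:", outliers)
```

How each flagged row is fixed (deleted, corrected, or imputed) is a judgment call that should be described and shown in the before and after screenshots.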
Task 3: Provide descriptive statistics for the selected features using statistical methods
to understand the dataset more and answer the following analysis questions: (2 marks)

Include any of the measures of central tendency such as the mean, median,
and mode.

Describe the spread of your data. This may include the measure of variance,
standard deviation, skewness, and kurtosis.
(You are encouraged to pose other analysis questions based on any trend you
notice in the dataset).
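The measures listed in this task can be cross-checked outside Excel with a short Python sketch such as the one below. It is illustrative only: the sample values and function names are hypothetical, and the skewness and kurtosis functions use the bias-adjusted sample formulas, which are assumed here to correspond to Excel's SKEW and KURT functions. The report itself must still be produced in Excel.

```python
import statistics

def sample_skewness(values):
    """Bias-adjusted sample skewness (needs at least 3 values)."""
    n = len(values)
    m = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return n / ((n - 1) * (n - 2)) * sum(((v - m) / s) ** 3 for v in values)

def sample_excess_kurtosis(values):
    """Bias-adjusted sample excess kurtosis (needs at least 4 values)."""
    n = len(values)
    m = statistics.mean(values)
    s = statistics.stdev(values)
    fourth = sum(((v - m) / s) ** 4 for v in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction

# Hypothetical petal lengths in cm; illustrative values only.
petal_length = [1.4, 1.4, 1.3, 1.5, 4.7, 4.5, 4.9, 6.0]

summary = {
    "mean": statistics.mean(petal_length),
    "median": statistics.median(petal_length),
    "mode": statistics.mode(petal_length),
    "variance": statistics.variance(petal_length),   # sample variance
    "std_dev": statistics.stdev(petal_length),       # sample standard deviation
    "skewness": sample_skewness(petal_length),
    "kurtosis": sample_excess_kurtosis(petal_length),
}

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

A mean that sits far from the median, or a skewness far from zero, suggests an asymmetric distribution and is worth commenting on in the report.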
Task 4: Validate the hypothesis in Task 1 by investigating the relationship between
two quantitative variables you have chosen using correlation, regression, and R-squared
with possible conclusions. (2 marks)
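The three quantities named in this task are related: for a simple linear regression with one predictor, R-squared equals the square of the Pearson correlation coefficient. The Python sketch below illustrates this as an optional cross-check of the Excel results. It is a minimal sketch under stated assumptions: the paired values are invented for illustration, the function names are hypothetical, and the line is fitted by ordinary least squares.

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return slope, intercept

def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    my = mean(y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot

# Hypothetical paired measurements in cm; illustrative values only.
petal_length = [1.4, 1.3, 1.5, 4.7, 4.5, 4.9, 6.0, 5.1]
petal_width = [0.2, 0.2, 0.2, 1.4, 1.5, 1.5, 2.5, 1.9]

r = pearson_r(petal_length, petal_width)
slope, intercept = fit_line(petal_length, petal_width)
r2 = r_squared(petal_length, petal_width, slope, intercept)

print(f"correlation r = {r:.3f}")
print(f"regression: width = {slope:.3f} * length + {intercept:.3f}")
print(f"R-squared = {r2:.3f}")
```

A strong correlation supports a hypothesis of association between the two variables but does not by itself show that one causes the other, so conclusions should be worded accordingly.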
Task 5: Show a visual representation of your analysis (hint: use the right chart/graph
for your data analysis). (1 mark)
Task 6: Build an active Dashboard that summarizes the most crucial factors (variables)
that will help in the decision-making process, and then demonstrate the effectiveness
of your selection of those factors in the decision-making process. (2 marks)
