Strategic Information System Data Analysis and Knowledge Discovery Worksheet


The midterm assessment includes two sections.

Here is what you need to complete:

Section I: Data analysis and Knowledge discovery

Steps for Section I:

  • Choose a dataset of interest (either from Kaggle or from the existing datasets offered in the course material). Your dataset should not be similar to any used in the previous assignments. Please don’t reuse the datasets you used for your assignments.
  • Create some research questions (goals).
  • Use at least seven of the following topics to carry out a good-quality data analysis and knowledge discovery task on the dataset:
      • Range Names, Lookup Functions, Index Functions, Match Functions, Using Pivot Tables and Slicers to Describe Data
      • The Data Model, Filtering Data and Removing Duplicates, Array Formulas and Functions
      • Text Functions, Date and Time Functions, Time and Time Functions
      • Summarizing Data by Using Histograms and Pareto Charts
      • Importing Data from a Text File or Document
      • Validating Data Reading
  • You are free to choose whichever of these functions suit your needs. The more functions you use, the more insights you get about a dataset.


    4. Write a report using the template we have used for the two previous assignments.

    Template for reports.docx


    5. Make a PowerPoint presentation using the template I created for you:

    INFO5810Midterm_powerpoint_presentation-1.pptx


    6. If you don’t have one already, create a free Zoom account to record your video:

    https://zoom.us/signup

    Present and record your work in Zoom. Share your screen (the PowerPoint slides) while recording your presentation. The recording should not exceed 10 minutes; 5 points will be deducted if it does. Make sure to save the link to the presentation because you need it for submission. When recording, you can enable captioning so that Zoom creates captions while you talk. There are many tutorials about recording a video while sharing your screen and captioning in Zoom. Here is one, but you can search for more if you still have questions:

    https://www.uhd.edu/computing/services-training/training/Documents/zoom_record_presentations.pdf

    What you need to submit to Canvas as your midterm assessment:

    1. Your report (using the template you used in the last two assignments).

    2. Your Excel sheets.

    3. The PowerPoint presentation (using the PowerPoint template I created for you).

    4. The link to the Zoom recording of your presentation; copy the link into the Canvas comment related to that submission.

    INFO 5810 Fall 2020
    Instructor: Rob Arao
    Mid Term Evaluation
    Outline
    1. Project summary
    2. Data
    3. Research questions (goals)
    4. Feature selection (attributes)
    5. Pre-processing
    6. Tools and functions used for data analysis/visualization
    7. Results and visualization
    8. Validation
    9. Interpretation
    10. Conclusion
    1. Project summary
    2. Data
    3. Research questions (goals)
    4. Feature selection (attributes)
    5. Pre-processing
    6. Tools and functions used for data analysis/visualization
    7. Results and visualization
    8. Validation
    9. Interpretation
    10. Conclusion
    Questions?
    Thank you
    Your full name
    email
    Affiliation
    Student Full name
    Data Analysis and Knowledge discovery
    Section 203 or 001
    Assignment #
    Title
    Student Full Name
    Instructor Full Name
    Data Analysis and Knowledge Discovery, INFO 5810, Section number (001 or 203)
    Assignment number #
    Date
    This is a template that I created according to my interviews for internships, domain knowledge,
    and experience. If you cannot cover something under any of the sections, i.e., it is not related to
    your dataset and analysis, mention that and explain why it is not useful in your case.
    1. Introduction (10 points)
    Explain what this report is all about. A few sentences are enough to give the reader a general
    idea. For example, I would write something like this: This is a report for the class Data Analysis
    and Knowledge Discovery. It is about (this will be unique for every report, as datasets vary).
    The report includes Data, Methods, Discussion, and Conclusion sections.
    1.1. Data
    The following features should be covered in this section:
    – What type of data is it? (To find the type, you can check my slides for the first meeting in Fall
    2020, which are on Canvas.)
    – What is your data about? For example, is it about business, computers, or social media? (Specify the
    domain.)
    – Where did you get your dataset (its source)?
    – What time interval does the collected data cover, i.e., the dates on which the data was collected?
    – Basic statistics about the dataset (e.g., mean, median, mode, standard deviation, missing values,
    NaN).
    – How was the data collected? (Manually? Automatically? If automatically, using what
    programming tool?)
    – Is your data unstructured? If so, why? What specifically makes it an
    unstructured dataset?
    – What are the attributes of your dataset?
    – Are the attributes normalized, i.e., is the unit the same for all of them? For example, if you have
    data about the features of different apartment buildings for a loan company, and the features of
    your dataset are the number of rooms in the apartment and the headquarter of the whole apartment,
    then you can say these two attributes do not have the same unit, and so they are not normalized.
    (You will learn more about the normalization task in the Machine Learning class.)
    – Why did you choose this dataset? (What is interesting about this dataset and its domain?)
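    For readers who want to cross-check their Excel figures, the basic statistics and the
    normalization question above can be sketched in Python. This is only an illustration under
    stated assumptions: the column of room counts is made up, the helper names describe and
    min_max_scale are hypothetical, and min-max scaling is just one common way to put attributes
    with different units on a comparable scale, not necessarily the method taught in the course.

```python
import math
import statistics


def describe(values):
    """Summarize one numeric column; None and NaN are counted as missing."""
    present = [
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "mode": statistics.mode(present),
        "stdev": statistics.stdev(present),  # sample standard deviation
    }


def min_max_scale(values):
    """Rescale numbers to the 0-1 range so attributes with different units are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical column: number of rooms per apartment, with two missing entries.
rooms = [2, 3, 3, None, 4, float("nan"), 3]
summary = describe(rooms)
scaled = min_max_scale([2, 3, 3, 4, 3])
```

    In Excel, the rough counterparts are AVERAGE, MEDIAN, MODE.SNGL, STDEV.S, and COUNTBLANK.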
    1.2. Substantive context, background, or framing issues with regard to this dataset

    Has anyone else ever worked on this dataset (on Kaggle or other websites)? If so, then give a
    reference to that work and introduce it here.
    Is there any relevant publication on this kind of dataset? If so, what is it? Explain it in a few
    words. (There might be thousands of papers out there; you only need to talk about the most
    recent ones or the most popular ones. One or two papers will be enough.)
    1.3. Objectives

    The questions that you think you can answer using this dataset. List the questions.
    Make sure that you can answer them using data analysis and modeling methods in the
    assignments.

    Be aware that in the following sections, you are going to deal with these questions, and by the
    end of this report, you are going to show me you have mapped every single question to one
    analysis, result, and method.
    2. Methods (10 points)
    2.1. Tools

    Which tool are you using? (We are using Excel and RapidMiner in this class.) These two tools
    are great for learning the fundamentals of data analysis and text mining so that you can extract
    knowledge from the data.
    JFYI, Python programming is a general-purpose, in-demand, and well-known tool in the industry that
    needs a whole separate class and semester to be taught. You might consider it in your degree
    plan, as it is offered by both the IS and CS departments, as far as I know.
    2.2. Data pre-processing

    Mention the specific steps in this part.
    Mention and explain the functions and tools that you used for data pre-processing.
    This part is a highly data-dependent task. Remember, the pre-processing of each type of dataset is
    unique. For example, text pre-processing is different from pre-processing images or digits. So explain
    specifically which one you are dealing with, and if you do not have any pre-processing, explain
    why.
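    As one illustration of what a pre-processing step can look like for tabular text data, the
    sketch below trims and lowercases text fields and then drops exact duplicate rows. It is a
    minimal example on made-up rows; the function name clean_rows and the sample data are
    hypothetical, and the steps that suit your dataset may differ.

```python
def clean_rows(rows):
    """Trim and lowercase text fields, then drop exact duplicate rows (first one is kept)."""
    seen = set()
    cleaned = []
    for row in rows:
        normalized = tuple(
            cell.strip().lower() if isinstance(cell, str) else cell
            for cell in row
        )
        if normalized not in seen:
            seen.add(normalized)
            cleaned.append(normalized)
    return cleaned


# Hypothetical raw rows: (city, product, units sold).
raw = [
    ("Dallas ", "Widget", 10),
    ("dallas", "widget", 10),
    ("Denton", "Gadget", 4),
]
cleaned = clean_rows(raw)
```

    This is roughly comparable to applying TRIM and LOWER in Excel and then using Remove
    Duplicates, although TRIM also collapses repeated spaces inside the text, which this sketch
    does not do.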
    2.3. Data Analysis
    This section is the core part that demonstrates your technical skills to an instructor or an employer who
    is interviewing you for a job.
    – What is the name of the analysis, model, or function used in this assignment? (For example,
    you may have used COUNT, a pivot table, HLOOKUP, VLOOKUP, or anything else that helped you count the
    frequency of occurrences of one attribute of your dataset.)
    – Why did you choose this method? Why do you think that this model or analysis can help you
    to find answers to your questions in section one?
    – What are the basic concepts of this method of analysis or modeling?
    – Be specific: how did you do that analysis? Using which tools? Which functions or libraries in
    that tool? For example, the tool can be Excel, and the function can be VLOOKUP.
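    To make the basic concepts behind these functions concrete, here is a small Python sketch of
    a frequency count, a grouped total, and an exact-match lookup. The order records are invented
    for illustration, and the code only mirrors the idea behind COUNTIF, a pivot table, and
    VLOOKUP; it is not a replacement for doing the analysis in Excel as the assignment requires.

```python
from collections import Counter

# Hypothetical records: (order id, region, amount).
orders = [
    (101, "North", 250.0),
    (102, "South", 120.0),
    (103, "North", 90.0),
    (104, "West", 300.0),
]

# Frequency of each region, similar to COUNTIF or a pivot table that counts rows.
region_counts = Counter(region for _, region, _ in orders)

# Total amount per region, similar to a pivot table that sums a value field.
region_totals = {}
for _, region, amount in orders:
    region_totals[region] = region_totals.get(region, 0.0) + amount

# Exact-match lookup by order id, similar to VLOOKUP with FALSE as the last argument.
amount_by_id = {order_id: amount for order_id, _, amount in orders}
looked_up = amount_by_id.get(103)  # amount for order 103
not_found = amount_by_id.get(999)  # None when the key is absent
```

    Where Excel would show #N/A for a missing key, the dictionary lookup here returns None.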
    3. Results (10 points)
    – Mention the specific results that you obtained with that analysis method.
    – Be specific in mentioning the results with respect to the objectives of the work (section one). Try to
    map each part of the result to a question that you asked in section one.
    – Provide figures/tables (either in Word format or as screenshots taken from
    your tools, Excel or RapidMiner).
    – Make sure to label and caption the figures and tables. For example, you can write: Figure 1. A
    demonstration of the percentage of the …..
    Then, in the body of the related paragraph, you can write: Figure 1 illustrates the result for
    this part.
    – Use colors for better visualizations.
    – You are welcome to use any visualization tool, such as Tableau. It is not required, though.
    4. Discussion (10 points)

    Here, you can demonstrate your critical thinking. It is the knowledge extraction section.
    Now that you have the results, what kind of information do you get from them?
    Are you able to connect that with your research questions? If so, write down how.
    Is there any limitation in your work, data, report, method, and/or any specific part of
    this project? If so, what are they, and how do you think you can improve them in
    the future?
    5. Evaluation and conclusion (10 points)

    What did you learn from this assignment?
    What was complicated about it? Which parts of the assignment?
    If you were asked to design this assignment, what would you add or delete?
    Was the assignment practical enough to help you gain some technical skills? If not, what
    method of analysis, and which tools, do you wish you had used instead?
    References

    Simply, provide references for every single resource that you use.
    I have seen some high similarity percentages in the first assignment.
    Be aware that I can track your assignments and find the exact resource that you
    are copying from, even if it is a student paper from previous years.
