Colombo Americano Text Mining Essay

In this assignment you will use Galileo to find an article you have never read on a subject that interests you. You will then use two free browser-based text mining tools to analyze the text of the article you choose. (For this assignment, choose a topic related to sports.)

Task 1. Visualizing Word Frequency. Looking at how often each word occurs can give you an overall sense of a document. You will use your article to create a word cloud that visualizes word frequency and shows how stop words affect word frequency analysis. We will use WordClouds for this.
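To make the idea of word frequency and stop words concrete, here is a minimal Python sketch. It is not how WordClouds works internally; the stop-word list, the function name word_frequencies, and the sample sentence are all made up for illustration.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list (an assumption; real tools use much longer lists).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on", "with", "that"}

def word_frequencies(text, ignore_stop_words=True):
    """Count lowercase words in text, optionally skipping stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    if ignore_stop_words:
        words = [w for w in words if w not in STOP_WORDS]
    return Counter(words)

# Hypothetical sample sentence standing in for the article text.
sample = "The coach said the team and the fans wanted a win for the team."

with_stops = word_frequencies(sample, ignore_stop_words=False)
without_stops = word_frequencies(sample, ignore_stop_words=True)

print(with_stops.most_common(3))     # stop words kept
print(without_stops.most_common(3))  # stop words ignored
```

When stop words are kept, a connector word such as "the" tends to dominate the counts; when they are ignored, content words such as "team" rise to the top. The two word clouds in Task 1 show the same contrast.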

Task 2. Taking a deeper look at word frequency and occurrence. For the second task we will use Voyant Tools to explore other visualization methods and to look at word correlations. You will add screenshots and answer the questions on the template.

Steps for finding an article:

1. Go to Galileo.

2. Enter a sports topic that interests you in the search box. Check the ‘Full text only’ box. Click Search.

3. Find a scholarly article that is 2-4 pages long and download the pdf full text.

Steps for Task 1:

1. Open the following website: WordClouds

2. Click the Wizard to get a walkthrough of how to use the software. To create a word cloud, you will upload your pdf. Click File -> New Word Cloud -> Blank Canvas. Click Word List -> Extract Words from pdf document and select the document you downloaded from Galileo. Leave all settings as they are and click apply. It will be set to ‘Ignore stop words’ and ‘Ignore word case’ by default.

3. Personalize your word cloud by changing the Theme, Shape, Gap, Font, or other options. Once you are happy with the look, take a screenshot and add it to the template in the first section for ‘with stop words removed.’

4. Stop words are small, common connector words that are frequent in English but don’t usually convey much meaning. The first word cloud ignored them; let’s see how the word cloud changes if we keep them in the results. We need to recreate the word cloud. Click Word List -> Extract Words from pdf document and select the document you downloaded from Galileo. This time, uncheck the box for ‘Ignore stop words’ and click apply. The word cloud will change (if it doesn’t, repeat the steps to make sure you didn’t use the wrong setting). Use the same personalization as you used for the first word cloud, take a screenshot, and add it to the template in the second section for ‘without stop words removed.’

5. Answer the questions in the Word document for Task 1.

Steps for Task 2:

1. Open the following website: Voyant Tools

2. Click upload and select the pdf of the article you downloaded from Galileo.

3. You will see data in five different windows and each window has a tab.

a. In the upper left window click ‘Links’

b. In the upper middle window click ‘TermsBerry’

c. In the upper right window, leave it on ‘Trends’ (or click it if it’s not there)

d. In the bottom left window, leave it on ‘Summary’ (or click it if not there)

e. In the bottom right window, click ‘Correlations’

4. Take a screenshot of the entire screen and add it to your Word document

5. Answer the questions in the Word document for Task 2.

Upload the following:

1. The Word document template with your screenshots and answered questions.

2. The pdf of the article you downloaded from Galileo.

