Weighted Centroid
Summary

This tool generates a circular graph of word distribution data from a source document and displays it in a Java applet. The source document can be supplied as a URL or uploaded from the user's files. The results can be limited by choosing how many high frequency words to include, by specifying particular words to include, or by applying TAPoR's modified Glasgow Stop Words list.

Note 1: If an HTML or XML text is submitted, the tool will strip all tags and process it as plain text.

Note 2: This tool requires JavaScript to view the Java applet.

Please click the ? buttons at the bottom right of each set of options for more information on that set.

For further information on this tool, please see the TADA Wiki's Weighted Centroid entry here. A glossary of terms is also available here.

Walkthrough

To show the top 20 high frequency words from http://tada.mcmaster.ca/wikita/pub/Main/ToolTestingTexts/GulliversTravels.txt, show their distribution over chunks of 10% of the text, exclude words on the modified Glasgow Stop Words list, and display the results as a Java applet:
  1. Source text
    1. Enter 'http://tada.mcmaster.ca/wikita/pub/Main/ToolTestingTexts/GulliversTravels.txt' in the 'URL' field.
  2. Subtext limited to
    1. Click the radio button next to '__% blocks of text' and select '10' from the drop menu.
  3. Words limited to
    1. Choose '20' from the 'Top __ high frequency words' drop menu.
    2. Check the box next to 'Exclude modified Glasgow Stop Words'.
  4. Results
    1. Choose 'Java applet' from the 'Display as' drop menu.
    2. Click the 'Submit' button to process the text.
» Source text *
  Example: http://taporware.ualberta.ca/sampleDocs/plainText.txt


Summary

This section determines the source of the document you wish the tool to process.

Fields

Source URL
To use content from a web page, enter a full web address (URL) in the field provided. Copy and paste from your browser’s address bar for best results. If the web address directs to an HTML or XML document instead of plain text, the tool will strip all tags and process it as plain text.

Local file
To upload a plain text (.txt) file from your computer, choose ‘Local file,’ click ‘Browse,’ and select the file you wish to use from your directory.
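
The tag stripping mentioned under 'Source URL' can be illustrated with a minimal Python sketch. This is not the tool's own code, which is not documented here; the class name `TagStripper` and the function `strip_tags` are hypothetical, and the sketch simply keeps every text node and discards the markup using the standard library's `html.parser`.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collects only the text nodes of an HTML or XML document."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags; the tags themselves are never stored.
        self.parts.append(data)


def strip_tags(markup):
    """Return the markup with all tags removed (hypothetical helper)."""
    parser = TagStripper()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


print(strip_tags("<p>Gulliver's <b>Travels</b></p>"))  # prints: Gulliver's Travels
```

How the real tool treats whitespace, scripts, or comments inside the markup may well differ from this sketch.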

» Subtext limited to


Summary

This section determines the size of the units to break the text into, and therefore the granularity of the graph.

Fields

Paragraphs
Choose this option to treat each paragraph of the text as a chunk.

__% block of text
Choose this option to split the text into chunks by percentage.

Chunks of ___ words
Choose this option to define chunks by the number of words included in each.
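
The three chunking options above can be sketched in Python as follows. This is only an illustration under stated assumptions: the tool's actual tokenization and rounding rules are not documented here, so the word pattern, the blank-line paragraph separator, and all function names are hypothetical.

```python
import re


def words(text):
    # Assumption: a word is a run of letters, digits or apostrophes.
    return re.findall(r"[A-Za-z0-9']+", text.lower())


def chunk_by_paragraph(text):
    # Assumption: paragraphs are separated by one or more blank lines.
    return [words(p) for p in re.split(r"\n\s*\n", text) if p.strip()]


def chunk_by_word_count(tokens, size):
    # Every chunk holds `size` words except possibly the last one.
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def chunk_by_percent(tokens, percent):
    # 10% blocks give 10 chunks, 25% blocks give 4 chunks, and so on.
    n = max(1, round(100 / percent))
    # Integer boundaries spread any remainder evenly across the chunks.
    bounds = [len(tokens) * i // n for i in range(n + 1)]
    return [tokens[a:b] for a, b in zip(bounds, bounds[1:])]
```

With a 25-word text, `chunk_by_percent(tokens, 10)` yields ten chunks of two or three words each, while `chunk_by_word_count(tokens, 10)` yields chunks of 10, 10 and 5 words. Smaller chunks mean more points on the graph and therefore finer granularity.
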
» Words limited to

     
     
                                                              
Summary

This section allows the user to limit the words included in the final graph.

Fields

Top __ high frequency words
Choose the number of high frequency words to include in the graph from the options in this drop box.

Exclude modified Glasgow Stop Words
Click this option to filter TAPoR's modified Glasgow Stop Words list out of the final results.

Words in addition to top words
Use this field to enter a custom list of words to include in the final list. Separate multiple words with commas (e.g., red, green, orange, purple).
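
The way these three fields might combine can be sketched in Python. The sketch rests on assumptions: `STOP_WORDS` is a tiny stand-in and not TAPoR's modified Glasgow Stop Words list, the function name `select_words` is hypothetical, and whether the real tool applies the stop list to the additional words is not stated here (this sketch does not).

```python
from collections import Counter

# Tiny stand-in list for illustration; NOT TAPoR's modified Glasgow list.
STOP_WORDS = {"the", "a", "of", "and", "to", "in"}


def select_words(tokens, top_n, exclude_stop_words=True, extra_words=""):
    """Pick the words to graph (hypothetical helper)."""
    counts = Counter(tokens)
    if exclude_stop_words:
        # Drop stop words before ranking so they cannot occupy top slots.
        for word in STOP_WORDS:
            counts.pop(word, None)
    selected = [word for word, _ in counts.most_common(top_n)]
    # 'Words in addition to top words' is a comma-separated list.
    for word in extra_words.split(","):
        word = word.strip().lower()
        if word and word not in selected:
            selected.append(word)
    return selected


tokens = "the king and the queen and the king of the island".split()
print(select_words(tokens, 1, extra_words="red, island"))
# prints: ['king', 'red', 'island']
```

Without the stop list, the single top word in this example would be 'the', which is why excluding stop words usually gives a more informative graph.
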
» Results
Summary

This section allows users to choose how the results will be formatted, and whether to display them in a new browser window.

Display as
This drop-down list enables users to display the results as a Java applet, an HTML table, or tab-delimited text.

Open results in new window
Check this box to display the results in a new window or browser tab. This option is selected by default. Some pop-up blockers may prevent a new window from being opened; if so, un-check the box to open the results in the same window instead.
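
To give a feel for the tab-delimited option, the Python sketch below writes one row per selected word and one column per chunk. The exact columns and layout the tool produces are not documented here, so the header labels and the function name `distribution_table` are assumptions made for illustration only.

```python
def distribution_table(chunks, selected):
    """Return word counts per chunk as tab-delimited text (hypothetical layout)."""
    header = ["word"] + [f"chunk {i + 1}" for i in range(len(chunks))]
    rows = ["\t".join(header)]
    for word in selected:
        # Count how often the word occurs in each chunk of the text.
        counts = [str(chunk.count(word)) for chunk in chunks]
        rows.append("\t".join([word] + counts))
    return "\n".join(rows)


chunks = [["king", "queen"], ["king", "king"]]
print(distribution_table(chunks, ["king", "queen"]))
```

Text in this shape can be pasted into a spreadsheet, where each tab starts a new column.
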
'*' indicates a required field.

 

 

TAPoRware Project, McMaster University,