Weighted Centroid
Summary

The Weighted Centroid Java applet displays a circular graph based on word distribution data.

The text is divided into an arbitrary number of units, which are positioned around the circumference of the circle in clockwise sequence. The more times a word appears in a particular text unit, the closer the word will be to that unit in the circle. If a word appears an equal number of times in all units, it will be located in the centre of the circle.
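
The placement rule above can be sketched as a count-weighted average of the unit positions. This is only an illustration under stated assumptions: the applet's actual formula is not documented here, and the function names, the unit-radius circle, and the choice to start at the top of the circle are all hypothetical.

```python
import math

def unit_positions(num_units):
    """Place units evenly on a circle of radius 1, clockwise, starting at the top."""
    positions = []
    for i in range(num_units):
        # Subtracting the step moves clockwise in a y-up coordinate system.
        angle = math.pi / 2 - 2 * math.pi * i / num_units
        positions.append((math.cos(angle), math.sin(angle)))
    return positions

def weighted_centroid(counts):
    """Return the (x, y) position of a word given its count in each unit."""
    total = sum(counts)
    if total == 0:
        return None  # the word never occurs, so it has no position
    points = unit_positions(len(counts))
    x = sum(c * px for c, (px, _) in zip(counts, points)) / total
    y = sum(c * py for c, (_, py) in zip(counts, points)) / total
    return (x, y)
```

With this sketch, a word counted equally in every unit lands at the centre, and a word that occurs in only one unit lands on that unit's point on the circumference.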


Walkthrough

Example: fetch plain text from http://taporware.ualberta.ca/sampleDocs/plainText.txt, divide it into blocks of 10% of the text, and limit the results to the top 20 words.
  1. Source text
    1. Enter http://taporware.ualberta.ca/sampleDocs/plainText.txt in the Source URL field.
  2. Subtext limited to
    1. Select the n% block of text option and select 10 from the drop-down list.
  3. Words limited to
    1. Select 20 in the Top n high frequency words drop-down list and check Exclude glasgow stop list.
  4. Results
    1. Select Java applet from the Display as drop-down list.
» Source text *
  Example: http://taporware.ualberta.ca/sampleDocs/plainText.txt

Summary

Determines the text source. Text can be obtained from a URL or by uploading a file.

Fields

Source URL
Text from the entered URL will be used as the data source for the analysis.

Local file
Use this field to upload a local file for analysis.

Treat XML/HTML as plain text
Enabling this option strips the tags from an HTML or XML document. In HTML documents, <p> and <br /> tags are converted to new lines (i.e. \n); in XML documents, all tags are converted to new lines.
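
As a rough illustration of the tag handling described above, the following sketch uses regular expressions. It is an assumption about the behaviour, not the tool's actual code: the function names are hypothetical, and a regular expression is not a full HTML or XML parser (it ignores comments, CDATA sections, and entities).

```python
import re

def html_to_plain(text):
    """Turn <p> and <br /> tags into new lines, then drop every other tag."""
    text = re.sub(r"<\s*(p|br)\b[^>]*>", "\n", text, flags=re.IGNORECASE)
    return re.sub(r"<[^>]+>", "", text)

def xml_to_plain(text):
    """Turn every XML tag into a new line."""
    return re.sub(r"<[^>]+>", "\n", text)
```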
» Subtext limited to

Summary

Defines the granularity of the graph, i.e. how many chunks the results will be divided into.

Fields

Paragraphs
Instructs the tool to use paragraphs as chunks. (Note: the number of paragraphs should not be more than 150.)

n% block of text
Divides the text into chunks that each equal n% of the total document size. For example, when n=1 each chunk equals 1% of the document.

Chunks of n words
Defines the number of words per chunk used by the tool.
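
The three chunking options above could be sketched as follows. This assumes that document size is measured in words and that paragraphs are separated by blank lines; neither detail is stated here, and the function names are hypothetical.

```python
import math
import re

def chunks_of_n_words(words, n):
    """Split a word list into consecutive chunks of n words (the last may be shorter)."""
    return [words[i:i + n] for i in range(0, len(words), n)]

def percent_blocks(words, percent):
    """Split a word list into chunks that each hold about percent% of the words."""
    size = max(1, math.ceil(len(words) * percent / 100))
    return chunks_of_n_words(words, size)

def paragraph_chunks(text):
    """Treat each blank-line-separated paragraph as one chunk of words."""
    return [p.split() for p in re.split(r"\n\s*\n", text) if p.strip()]
```

Under these assumptions, a 100-word text split with percent=10 yields ten chunks of ten words each, so the graph would show ten units around the circle.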
» Words limited to

Summary

Determines which words will appear in the results.

Fields

Top n high frequency words
Select how many of the highest frequency words you want to include in the results. If you select 0, you must supply your own words in the field below.

Exclude glasgow stop list
The Glasgow stop list is a list of common words that you may want to exclude from your results.

Words in addition to top words
List any additional words you want to appear in the results. Words must be separated by commas, e.g. where, when, how, why.
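
A minimal sketch of how these three fields might combine is shown below. The function name and its parameters are hypothetical, the lower-casing of words is an assumption, and the contents of the Glasgow stop list are not reproduced here, so the caller passes in whatever stop words apply.

```python
from collections import Counter

def select_words(words, top_n, stop_words=(), extra_words=""):
    """Pick the top_n most frequent non-stop words, then append any extra words."""
    stop = {w.lower() for w in stop_words}
    counts = Counter(w.lower() for w in words if w.lower() not in stop)
    selected = [w for w, _ in counts.most_common(top_n)]
    # Extra words arrive as one comma-separated string, e.g. "where, when".
    for w in extra_words.split(","):
        w = w.strip().lower()
        if w and w not in selected:
            selected.append(w)
    return selected
```

With top_n set to 0, only the comma-separated extra words are returned, which matches the requirement to supply your own words in that case.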
» Results
Summary

Allows the user to choose how the results will be formatted and whether they should be displayed in a new browser window.

Fields

Display as
Determines the format in which results will be delivered.

Open results in new window
Checking this box will display the results in a new window. This option is selected by default. Pop-up blockers may prevent the new window from being created; if that happens, deselect this option.
* indicates a required field

TAPoRware Project, McMaster University,