Find Co-occurring Words

This tool searches for two words that occur within a user-defined distance (in words, sentences, lines or paragraphs) of one another in an XML (.xml) document, which can be located at a user-specified web address or uploaded from the user’s files. If desired, the results can be narrowed to include only words found within certain tags.

Please click the ? buttons at the bottom right of each set of options for more information on that set.

For further information on this tool, please see the TADA Wiki's Co-occurrence entry. A glossary of terms is also available.


Example: to generate co-occurrences for the text of an XML document at a given web address, extract the text found between the <para> and </para> tags, limit the results to strings of text that contain both 'Microsoft' and 'CSS' within ten words of one another, and display the results as HTML:
  1. Source text
    1. Enter the document's web address into the ‘URL’ field.
  2. Subtext limited to
    1. Enter ‘para’ in the ‘Elements’ field, leaving ‘Attribute name’ and ‘Attribute value’ blank.
  3. What to find
    1. Enter ‘Microsoft’ into the ‘Primary pattern’ field.
    2. Enter ‘CSS’ into the ‘Co-pattern’ field.
  4. Context for concordance
    1. Set the ‘Context’ drop list to ‘Words’.
    2. Enter ‘10’ into the ‘Context length’ field.
  5. Results
    1. Set the ‘Display as’ field to ‘HTML’.
  6. Click the ‘Submit’ button to process the text.
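
For readers who want to see the logic behind the steps above, the following Python sketch approximates the same search offline. It is not the tool's actual implementation: the function name `find_cooccurrences`, the sample XML, and the whitespace-based word splitting are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

def find_cooccurrences(xml_text, element, primary, co_pattern, max_words):
    """Return (primary_word, co_word, distance) tuples found in one element type.

    Hypothetical helper: words are split on whitespace and distance is the
    difference between word positions. Both choices are assumptions.
    """
    root = ET.fromstring(xml_text)
    hits = []
    for node in root.iter(element):
        # Gather all text inside the element, including nested children.
        words = "".join(node.itertext()).split()
        primary_pos = [i for i, w in enumerate(words) if re.search(primary, w)]
        co_pos = [i for i, w in enumerate(words) if re.search(co_pattern, w)]
        for p in primary_pos:
            for c in co_pos:
                if p != c and abs(p - c) <= max_words:
                    hits.append((words[p], words[c], abs(p - c)))
    return hits

# Made-up sample document; only the <para> elements are searched.
sample = """<doc>
  <title>Microsoft and CSS in a title are ignored</title>
  <para>Microsoft added support for CSS in its browser.</para>
  <para>Microsoft is mentioned here without the other term.</para>
</doc>"""

results = find_cooccurrences(sample, "para", "Microsoft", "CSS", 10)
for hit in results:
    print(hit)
```

Only the first paragraph produces a hit, because it is the only `<para>` that contains both patterns within ten words of each other.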
» Source text


This section determines the source of the document you wish the tool to process. XML can be obtained either from a web address or by uploading a file.


Source URL
To use an XML document from the web, enter a full web address (URL) ending in .xml in the field provided. Copy and paste from your browser’s address bar for best results.

Local file
To upload an XML (.xml) file from your computer, choose ‘Local file,’ click ‘Browse,’ and select the file you wish to use from your directory.
» Subtext limited to

This section determines which elements to extract text from. Users can also specify the attribute name or name/value pair modifying the element.


Elements
Determines which XML elements to extract text from. Multiple tags must be separated by commas. This field cannot be left blank.

Attribute name
Use this field to specify an attribute name that modifies the element listed above. Only instances of the element with this attribute will have text extracted from them.

Attribute value
Use this field to specify an attribute value modifying the element and attribute name listed above. The attribute value must be paired with an attribute name. Only instances of the element with this attribute name/value pair will have text extracted from them.
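
The filtering described in this section can be pictured with a short Python sketch. It is only an approximation under stated assumptions: the helper `extract_text`, its parameter names, and the sample document are hypothetical and are not taken from the tool itself.

```python
import xml.etree.ElementTree as ET

def extract_text(xml_text, elements, attr_name=None, attr_value=None):
    """Collect text from the named elements, optionally filtered by attribute.

    `elements` is a comma-separated string of tag names, mirroring the
    form field. An attribute value is only checked when a name is given.
    """
    wanted = [name.strip() for name in elements.split(",")]
    root = ET.fromstring(xml_text)
    texts = []
    for node in root.iter():
        if node.tag not in wanted:
            continue
        if attr_name is not None:
            if attr_name not in node.attrib:
                continue
            if attr_value is not None and node.attrib[attr_name] != attr_value:
                continue
        texts.append("".join(node.itertext()).strip())
    return texts

# Made-up sample document for illustration.
sample = """<doc>
  <para type="intro">First paragraph.</para>
  <para type="body">Second paragraph.</para>
  <note>A note.</note>
  <para>Third paragraph.</para>
</doc>"""

all_paras = extract_text(sample, "para")
paras_and_notes = extract_text(sample, "para, note")
typed_paras = extract_text(sample, "para", attr_name="type")
intro_only = extract_text(sample, "para", attr_name="type", attr_value="intro")
```

Each call narrows the selection further: all `para` elements, then only those carrying a `type` attribute, then only those whose `type` is `intro`.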
» What to find
(use `,' as delimiter)

This section defines the primary and secondary words or patterns to search for within the source text. A pattern may be a string of characters or a regular expression. The tool will search for the primary and secondary words or patterns within a set distance of one another.


Primary pattern
Enter the first word or pattern here.

Co-pattern
Enter the second word or pattern here.
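
Because a pattern may be a regular expression and the fields accept a comma as a delimiter, one field can plausibly hold several patterns. How the tool combines them is not documented here; the Python sketch below simply assumes that a word matching any one of them counts as a match. The helper names `compile_patterns` and `matches_any` are hypothetical.

```python
import re

def compile_patterns(field_value):
    """Split a comma-delimited field into compiled regular expressions."""
    parts = [p.strip() for p in field_value.split(",") if p.strip()]
    return [re.compile(p) for p in parts]

def matches_any(word, patterns):
    """True if any pattern is found in the word (assumed 'or' semantics)."""
    return any(p.search(word) for p in patterns)

# One literal string and one regular expression in the same field.
primary = compile_patterns("Microsoft, Net[a-z]+")
words = ["Microsoft", "Netscape", "Opera", "Internet"]
matched = [w for w in words if matches_any(w, primary)]
```

Matching here is case-sensitive, so "Internet" is not picked up by `Net[a-z]+`. Note also that splitting on commas means a regular expression that itself contains a comma (for example a `{1,3}` quantifier) would be broken apart in this sketch.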
» Context for concordance


This section allows the user to define the context type and how many of that type to show on either side of each instance of the word or pattern. Users can either ignore XML elements or use them as part of the context.


Ignore elements
Choose this option to remove all XML elements from the results and just display the text within them.

Context
Context can be set to one of three options from the drop-down menu: words, lines, or sentences.

Context length
Enter the number of words, lines or sentences to include before and after the word/pattern.
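
As an illustration of what a context length means when the context type is words, the Python sketch below collects a fixed number of words on either side of each match. The function `word_context` and the sample sentence are assumptions for illustration, and splitting on whitespace is a simplification of whatever the tool does internally.

```python
import re

def word_context(text, pattern, length):
    """Return (before, match, after) for each word that matches the pattern.

    `before` and `after` hold at most `length` words each; fewer are
    returned when the match is near the start or end of the text.
    """
    words = text.split()
    rows = []
    for i, word in enumerate(words):
        if re.search(pattern, word):
            before = words[max(0, i - length):i]
            after = words[i + 1:i + 1 + length]
            rows.append((before, word, after))
    return rows

text = "Early versions of the browser from Microsoft handled CSS rules differently"
rows = word_context(text, "Microsoft", 2)
for before, match, after in rows:
    print(" ".join(before), "[" + match + "]", " ".join(after))
```

With a context length of 2, the single match is shown with two words before it and two words after it.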

Use elements
Choose this option to include XML elements in the results.

Closest Element
Click the radio button to show the XML element in which the text is found.

Surrounding Element
This option allows users to specify a particular XML element as the context for the word or pattern.
» Results

This section allows users to choose how the results will be formatted, and whether to display them in a new browser window.


Display as
This drop-down list enables users to choose from several output formats: HTML, XML text in HTML, XML tree, and Tab delimited text.
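
To give a sense of the tab-delimited option, the Python sketch below writes a few co-occurrence results as tab-separated rows. The column names, the tuple layout, and the helper `to_tab_delimited` are hypothetical; the tool's real output columns may differ.

```python
import csv
import io

def to_tab_delimited(hits):
    """Format (primary, co_word, distance) tuples as tab-delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # Header row with assumed column names.
    writer.writerow(["primary", "co-pattern", "distance"])
    writer.writerows(hits)
    return buffer.getvalue()

# Made-up results for illustration.
hits = [("Microsoft", "CSS", 4), ("Microsoft", "CSS", 9)]
output = to_tab_delimited(hits)
print(output, end="")
```

Text in this form can be pasted into a spreadsheet or read back with any tool that understands tab-separated values.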

Open results in new window
Checking this box will display the results in a new window. This option is selected by default. Some pop-up blockers may prevent a new window from being opened; if so, un-check the box to open the results in the same window instead.



TAPoRware Project, McMaster University,