
Aggregate text from different sources
Summary

This tool is used to aggregate text from multiple documents, either located at a specified web address or uploaded from the user’s files, into a single document.

Please click the ? buttons at the bottom right of each set of options for more information on that set.

For further information on this tool, please see the TADA Wiki's Aggregator entry here. A glossary of terms is also available here.

Walkthrough

To create an aggregate text from http://www.ucc.ie/celt/published/E850003-017/text001.html, http://www.ucc.ie/celt/published/E850003-104/text001.html and http://www.ucc.ie/celt/published/E726000-001.html, strip tags from the source texts and display the resulting document as plain text:
  1. Source text
    1. Enter http://www.ucc.ie/celt/published/E850003-017/text001.html, http://www.ucc.ie/celt/published/E850003-104/text001.html and http://www.ucc.ie/celt/published/E726000-001.html into the ‘URL(s)’ field, ensuring each URL is on its own line.
  2. How to handle markup
    1. Click the radio button next to 'Strip tags'.
  3. Results
    1. Choose ‘Plain text’ from the ‘Display as’ drop-down menu.
  4. Click the ‘Submit’ button to process the text.
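The walkthrough above can be approximated offline with a short script. This is only a sketch of the same idea (fetch each source, strip its tags, join the results), not the tool's actual implementation. The names `strip_tags`, `fetch` and `aggregate`, the blank-line separator and the UTF-8 decoding are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class _TextOnly(HTMLParser):
    """Collects character data and discards the tags around it."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(markup):
    # Return the text content of an HTML or XML string with all tags removed.
    parser = _TextOnly()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


def fetch(url):
    # Hypothetical helper: download one document and decode it as UTF-8.
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def aggregate(urls, fetcher=fetch, separator="\n\n"):
    # Fetch each source in order, strip its tags, and join the results.
    return separator.join(strip_tags(fetcher(url)).strip() for url in urls)
```

Passing a different `fetcher` (for example, one that reads local files) lets the same function combine uploaded documents instead of web addresses. Note that this simple parser keeps the contents of `script` and `style` elements as text, which a real tag stripper would probably drop.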
» Source text *


Summary

This section determines the source of the documents to aggregate into a single text. Documents may be plain text (.txt), HTML (.html) or XML (.xml).

Fields

Source text
Determines the texts that will be used in the aggregate.

URL(s)
Enter a full web address (URL) in the field provided for each text to aggregate, ensuring that each URL is on its own line. Copy and paste from your browser’s address bar for best results.

Local file with list of links
To upload a text file containing URLs from your computer, choose ‘Local file,’ click ‘Browse,’ and select the file you wish to use from your directory. Note: Each URL must be on its own line within the source document.

Plus
Use this option to include an additional file from your local computer in the aggregate. Click 'Browse' and select the file from your directory.
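A file of links that follows the one-URL-per-line rule can be checked before uploading. The sketch below is a hypothetical helper, not part of the tool: `read_url_list` is an invented name, and rejecting anything that is not a full `http` or `https` address is an assumption about what "full web address" means here.

```python
from urllib.parse import urlparse


def read_url_list(text):
    # Return the URLs found in `text`, one per line, skipping blank lines.
    urls = []
    for line in text.splitlines():
        candidate = line.strip()
        if not candidate:
            continue
        parsed = urlparse(candidate)
        # Assumption: a "full" address has an http(s) scheme and a host name.
        if parsed.scheme in ("http", "https") and parsed.netloc:
            urls.append(candidate)
        else:
            raise ValueError(f"not a full web address: {candidate!r}")
    return urls
```

Calling `read_url_list(open("links.txt").read())` on a local file (the file name is illustrative) would raise a `ValueError` at the first line that lacks a scheme, such as an address pasted without `http://`.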
» How to handle markup


Summary

This section determines how the tool will handle XML and HTML markup in the source documents.

Fields

Strip tags
This option removes all tagging from the source documents. It can be used with HTML and XML texts.

Smart strip selected tags to form XML Corpus
This option strips XML tags from the source documents and creates an XML Corpus. Any non-XML documents will become comments.
» Results
Summary

This section allows the user to choose the format of the aggregated text.

Fields

Display as
This drop-down list lets users choose from several output formats: HTML, XML text in HTML, XML tree, and tab-delimited text. Note: XML outputs are not available for the Find Dates tool.

Display words before and after the pattern
Check this box to show the words that appear before and after the pattern in the results. Note: This option applies to the Concordance tool only.

Open results in new window
Check this box to display the results in a new window or browser tab. This option is selected by default. Some pop-up blockers may prevent a new window from being opened; if so, un-check the box to open the results in the same window instead.
‘*’ indicates a required field.

TAPoRware Project, McMaster University,