Extract Text from XML Document
Summary

This tool displays all text found within specified elements of an XML document, which can be located at a web address or uploaded from the user’s files.

Please click the ? buttons at the bottom right of each set of options for more information on that set.

For further information on this tool, please see the TADA Wiki’s Extract from XML entry here. A glossary of terms is also available here.

Walkthrough

To extract the text found between <para> and </para> in http://www.xml.com/1999/03/ie5/first-x.xml and display it in a new HTML page:
  1. Source text
    1. Enter ‘http://www.xml.com/1999/03/ie5/first-x.xml’ in the ‘Source URL’ field.
  2. Subtext limited to
    1. Enter ‘para’ in the ‘Elements’ field.
  3. Results
    1. Select ‘HTML’ in the ‘Display as’ drop-down menu.
  4. Click the ‘Submit’ button to process the text.
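
The extraction performed in the walkthrough can be approximated offline. The following is a minimal sketch using Python's standard xml.etree.ElementTree module, not the tool's own implementation; the function name extract_element_text and the SAMPLE document are hypothetical stand-ins for the walkthrough's file.

```python
import xml.etree.ElementTree as ET


def extract_element_text(xml_string, element_name):
    """Return the text of every <element_name> element, in document order."""
    root = ET.fromstring(xml_string)
    # iter() walks the whole tree (root included); itertext() also collects
    # text held in nested child elements such as <em>.
    return ["".join(el.itertext()).strip() for el in root.iter(element_name)]


# Hypothetical sample document; the real walkthrough file is fetched by URL.
SAMPLE = """<doc>
  <title>Sample</title>
  <para>First <em>paragraph</em>.</para>
  <para>Second paragraph.</para>
</doc>"""

for text in extract_element_text(SAMPLE, "para"):
    print(text)
```

Text inside the <title> element is ignored because only ‘para’ was requested.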
» Source text *
  Example: http://taporware.ualberta.ca/sampleDocs/interact2.xml

Summary

This section determines the source of the document you wish the tool to process. XML can be obtained either from a web address or by uploading a file.

Fields

Source URL
To use an XML document from the web, enter its full web address (URL), typically ending in .xml, in the field provided. Copy and paste from your browser’s address bar for best results.

Local file
To upload an XML (.xml) file from your computer, choose ‘Local file,’ click ‘Browse,’ and select the file you wish to use from your directory.
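
As an illustration of the two source options, the sketch below loads XML either from a web address or from a local file using only the Python standard library. The helper name load_xml and the 30-second timeout are assumptions made for this example, not details of the tool.

```python
import urllib.request
import xml.etree.ElementTree as ET


def load_xml(source):
    """Parse XML from an http(s) address or a local file path; return the root."""
    if source.startswith(("http://", "https://")):
        # Remote source: the response object is file-like, so ET.parse can read it.
        with urllib.request.urlopen(source, timeout=30) as response:
            return ET.parse(response).getroot()
    # Anything else is treated as a path on the local file system.
    return ET.parse(source).getroot()


# Usage (not executed here, since it needs network access or an existing file):
#   root = load_xml("http://taporware.ualberta.ca/sampleDocs/interact2.xml")
#   root = load_xml("my_document.xml")   # hypothetical local file name
```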
» Subtext limited to
Summary

This section determines which elements to extract text from. Users can also specify the attribute name or name/value pair modifying the element.

Fields

Elements
Determines which XML elements to extract text from. Multiple tags must be separated by commas. This field cannot be left blank.

Attribute name
Use this field to specify an attribute name that modifies the element listed above. Text will be extracted only from instances of the element that carry this attribute.

Attribute value
Use this field to specify an attribute value modifying the element and attribute name listed above. The attribute value must be paired with an attribute name. Text will be extracted only from instances of the element that carry this attribute name/value pair.
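
The way these three fields combine can be sketched as follows. This is an assumption-laden illustration in Python, not the tool's code: the function extract_filtered, its parameter names, and the SAMPLE document are all hypothetical. It accepts a comma-separated element list, rejects a blank one, and requires an attribute name whenever an attribute value is given.

```python
import xml.etree.ElementTree as ET


def extract_filtered(xml_string, elements, attr_name=None, attr_value=None):
    """Extract text from the named elements, optionally filtered by attribute."""
    if attr_value is not None and attr_name is None:
        raise ValueError("an attribute value must be paired with an attribute name")
    # Split the comma-separated list and ignore stray whitespace.
    names = {name.strip() for name in elements.split(",") if name.strip()}
    if not names:
        raise ValueError("at least one element name is required")

    results = []
    for el in ET.fromstring(xml_string).iter():
        if el.tag not in names:
            continue
        if attr_name is not None:
            if attr_name not in el.attrib:
                continue  # element lacks the requested attribute
            if attr_value is not None and el.attrib[attr_name] != attr_value:
                continue  # attribute present, but with a different value
        results.append("".join(el.itertext()).strip())
    return results


# Hypothetical sample document used only for this illustration.
SAMPLE = """<play>
  <speech who="hamlet">To be, or not to be.</speech>
  <speech who="ophelia">Good my lord.</speech>
  <stage>Exit.</stage>
  <speech>Unattributed line.</speech>
</play>"""

print(extract_filtered(SAMPLE, "speech", attr_name="who", attr_value="ophelia"))
```

Supplying only attr_name="who" would keep both attributed speeches and skip the unattributed one.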
» Results
Summary

This section allows users to choose how the results will be formatted and whether to display them in a new browser window.

Fields

Display as
This drop-down list enables users to choose from several output formats: HTML, XML text in HTML, XML tree, and Tab delimited text.
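
To give a rough idea of what a tab-delimited result could look like, here is a short Python sketch. The column layout (element name, then extracted text) and the function name to_tab_delimited are assumptions for illustration; the tool's actual output layout is not documented here.

```python
import csv
import io


def to_tab_delimited(rows):
    """Serialise (element name, extracted text) pairs as tab-delimited lines."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["element", "text"])  # assumed header row
    writer.writerows(rows)
    return buffer.getvalue()


print(to_tab_delimited([("para", "First paragraph."), ("para", "Second paragraph.")]))
```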

Display words before and after the pattern
Check this box to display the words surrounding the pattern. Note: Concordance only.

Open results in new window
Checking this box will display the results in a new window. This option is selected by default. Some pop-up blockers may prevent a new window from being opened; if so, un-check the box to open the results in the same window instead.
‘*’ indicates a required field

 

 

TAPoRware Project, McMaster University,