User:Scsbot/xmlsed
xmlsed is (intended to be) a simple, general-purpose tool for parsing, analyzing, extracting data from, and modifying XML, HTML, and SGML files.
It is a work in progress and is not yet complete. As of this writing, it has only a couple of features, just those needed by "wikised", the bot script run by User:Scsbot. But for this application (its only application so far) it works just fine.
(Yes, I know, I should have used an off-the-shelf XML tool, such as XSLT or Xerces, to perform these tasks, rather than reinventing the wheel. But the off-the-shelf tools I've looked at are Just Too Complicated.)
Invocation:
xmlsed [flags] inputfile [tag]
At this stage there are only two useful option flags:
- -t: print a "table of contents" of the input file, showing the nesting structure of the tags. Each tag is also given a unique identifying number.
- -x: extract the contents of the requested tag. Tags can be identified in two ways: by their path, or by the unique identifying number listed by -t. Tag attributes can be extracted as well, using the syntax path/@tag or #uniqueid/@tag.
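To illustrate what the two flags do, here is a rough Python sketch of the same idea. It is not xmlsed's code and does not reproduce its output format. The function names (`number_tags`, `table_of_contents`, `extract`), the slash-separated path that starts at the root tag, the depth-first numbering from 1, and the sample document are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input, not taken from any real wikised run.
SAMPLE = (
    '<page id="7"><title>Sandbox</title>'
    '<revision><text lang="en">Hello</text></revision></page>'
)


def number_tags(root):
    """Return (id, depth, element) triples in document order, ids from 1."""
    out = []

    def walk(elem, depth):
        out.append((len(out) + 1, depth, elem))
        for child in elem:
            walk(child, depth + 1)

    walk(root, 0)
    return out


def table_of_contents(root):
    """Roughly what -t is described as doing: nesting plus a unique id per tag."""
    return "\n".join(
        f"{'  ' * depth}{elem.tag} #{uid}" for uid, depth, elem in number_tags(root)
    )


def extract(root, spec):
    """Roughly what -x is described as doing.

    spec is either a path such as 'page/revision/text' or an id such as '#4',
    optionally followed by '/@name' to fetch an attribute instead of the text.
    """
    target, has_attr, attr = spec.partition("/@")
    if target.startswith("#"):
        # Look the tag up by the number assigned in number_tags().
        elem = number_tags(root)[int(target[1:]) - 1][2]
    else:
        parts = target.split("/")
        if parts[0] != root.tag:
            raise KeyError(spec)
        elem = root
        for name in parts[1:]:
            # Follow the first child with this tag name at each level.
            elem = elem.find(name)
            if elem is None:
                raise KeyError(spec)
    if has_attr:
        return elem.attrib[attr]
    return (elem.text or "").strip()


root = ET.fromstring(SAMPLE)
print(table_of_contents(root))
print(extract(root, "page/revision/text"))
print(extract(root, "#4/@lang"))
```

In this sketch a path always follows the first matching child at each level, so the numeric form is the only way to reach a later sibling with the same name. Whether xmlsed resolves paths the same way is not stated here.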
Source code: ftp://ftp.eskimo.com/u/s/scs/src/xmlsed.tar.gz