XML-SED

NAME
SYNOPSIS
DESCRIPTION
OPTIONS
EDITING COMMANDS
EXIT STATUS
EXAMPLE
BUGS
AUTHORS
SEE ALSO

NAME

xml-sed - stream editor for filtering and transforming an XML file.

SYNOPSIS

xml-sed [OPTION]... SCRIPT [ [FILE] [:XPATH]... ]

DESCRIPTION

xml-sed is a stream editor for XML files similar to sed(1). It can be used to perform basic text and structural transformations on an input FILE or the standard input, with the results printed on the standard output. xml-sed only makes a single pass through the input, and is particularly suited for filtering XML documents in a pipeline.

xml-sed operates by first splitting its input into echo-leaves, which are described in xml-coreutils(7) and should each be thought of as analogous to a single line of text processed by sed(1). As each echo-leaf is read, it is placed in the pattern space, where editing commands can operate. Once all editing has taken place, the pattern space is printed in the manner of xml-echo(1), thus converting the (now modified) echo-leaf into a real XML fragment on the standard output. At this point, the next echo-leaf is read and the cycle is repeated.

If one or more XPATHs are present, editing is performed only on the selected nodes, i.e. unselected nodes cannot be modified. If no XPATH is present, all nodes are implicitly selected for editing.

You may find the following command useful initially to better understand this process:

xml-sed --unecho '' myfile

Each editing command in xml-sed consists of an (optional) address followed by a command code. If no address is present, the command operates on each echo-leaf in turn; otherwise it operates only on the echo-leaves which match the address.

The simplest address is a numerical range which applies to the input ordering of the echo-leaves. The --unecho switch shows the required ordinal numbers.
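
The matching rule for numeric addresses, as specified under EDITING COMMANDS (NUM matches one echo-leaf, NUM1,NUM2 runs from NUM1 inclusive to NUM2 exclusive, and "$" stands for infinity), can be modelled in a few lines. The Python sketch below is illustrative only and is not part of xml-sed; the function name address_matches is hypothetical.

```python
import math


def address_matches(address, leaf_number):
    """Toy model of xml-sed numeric addresses (not xml-sed code).

    address is "" (no address), "NUM", or "NUM1,NUM2";
    the token "$" stands for infinity.
    """
    def to_number(token):
        return math.inf if token.strip() == "$" else int(token)

    if not address:
        return True  # no address: the command applies to every echo-leaf
    if "," in address:
        first, last = (to_number(t) for t in address.split(",", 1))
        return first <= leaf_number < last  # NUM1 inclusive, NUM2 exclusive
    return leaf_number == to_number(address)


print(address_matches("2,5", 2))     # True: NUM1 is included
print(address_matches("2,5", 5))     # False: NUM2 is excluded
print(address_matches("4,$", 1000))  # True: "$" acts as infinity
```

Because the range is half-open, an address such as 2,5 covers three echo-leaves (2, 3 and 4), which differs from the inclusive ranges of sed(1).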

The single most useful command is the 's' command, which substitutes text within the echo-leaf. Its syntax is identical to that of the 's' command of sed(1), with a few additional flags (x and z, described under EDITING COMMANDS) that account for the added structure of an echo-leaf.

OPTIONS

--unecho

Print the pattern space as an XML comment just before the echo-leaf is output, but after all editing commands have been performed. In addition to the pattern space, the comment includes the echo-leaf number. If the node is selected by an XPATH, it is marked with a star.

EDITING COMMANDS

Each editing command is optionally preceded by an address, which can take the form NUM or NUM1,NUM2. In the first case, the command is executed when the echo-leaf number is NUM, whereas in the latter case the command is executed for all echo-leaves numbered from NUM1 (inclusive) to NUM2 (exclusive). The special symbol "$" represents the number infinity. If no address is specified, the command applies to all echo-leaves in the input. If a block is preceded by an address, then that address is used as a default for all commands within the block.

#comment

The comment extends until the end of the line.

{

Begin a block of commands. Must end with }.

}

End a block of commands.

a text

Append text after echo-leaf. Embedded newlines must be quoted with backslash.

c text

Replace echo-leaf with text. Embedded newlines must be quoted with backslash.

i text

Insert text before echo-leaf. Embedded newlines must be quoted with backslash.

b label

Branch to label, or to the end of the script if no label is provided.

: label

Define label for b and t commands.

t label

If an 's' command has successfully substituted a pattern in the pattern space, branch to label.

d

Delete the whole pattern space and start next command cycle from the beginning.

D

Delete the first echo-leaf in the pattern space and start next command cycle.

h

Copy pattern space into hold space.

H

Append pattern space to hold space.

g

Copy hold space into pattern space.

G

Append hold space to pattern space.

l

Print the current pattern space as an echo-leaf wrapped in an XML comment.

n

Read the next echo-leaf into the pattern space.

N

Append the next echo-leaf to the pattern space.

p

Print the current pattern space.

P

Print the first echo-leaf contained in the pattern space.

q

Quit xml-sed, closing all open tags.

s/regex/replacement/flags

If regex matches within the pattern space, substitute the replacement. With no flags, the match always skips the PATH section of the first echo-leaf. With the x flag, the match applies only to the PATH section of the first echo-leaf. With the z flag, the match applies to the full pattern space.

x

Swap the hold and pattern spaces.

y/source/dest/

Replace each character in source with the corresponding character in dest.
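
The interplay of the hold-space commands (h, H, g, G, x) and D is easier to follow with a small model. The Python sketch below is not xml-sed code: it assumes that each space holds a sequence of echo-leaves (which is how the wording of D and P suggests the pattern space behaves), and the class name Spaces as well as the placeholder leaf strings are hypothetical.

```python
class Spaces:
    """Toy model of the pattern and hold spaces (not xml-sed code).

    Assumption: each space is an ordered list of echo-leaves.
    """

    def __init__(self):
        self.pattern = []
        self.hold = []

    def h(self):  # copy pattern space into hold space
        self.hold = list(self.pattern)

    def H(self):  # append pattern space to hold space
        self.hold.extend(self.pattern)

    def g(self):  # copy hold space into pattern space
        self.pattern = list(self.hold)

    def G(self):  # append hold space to pattern space
        self.pattern.extend(self.hold)

    def x(self):  # swap the hold and pattern spaces
        self.pattern, self.hold = self.hold, self.pattern

    def D(self):  # delete the first echo-leaf in the pattern space
        del self.pattern[:1]


spaces = Spaces()
spaces.pattern = ["leaf-1"]  # placeholder for a first echo-leaf
spaces.h()                   # hold is now ["leaf-1"]
spaces.pattern = ["leaf-2"]  # the next cycle reads another echo-leaf
spaces.G()                   # pattern is now ["leaf-2", "leaf-1"]
print(spaces.pattern)
```

The model leaves out printing and the command cycle itself; it only tracks which echo-leaves sit in which space after each command.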

EXIT STATUS

xml-sed returns 0 on success, or 1 otherwise.

EXAMPLE

Replace a tag name:

cat notebook.xml | xml-sed 's/root/book/x'
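
The example works because the x flag restricts the match to the PATH section, so text content containing the word "root" is left alone. The Python sketch below models the three flag behaviours of the 's' command. It is not xml-sed code and rests on several assumptions: an echo-leaf is taken to look like "[PATH]text" (see xml-coreutils(7) for the real format), only the first match is replaced as in sed(1) without the g flag, and Python regular expressions stand in for whatever syntax xml-sed uses. The function name substitute is hypothetical.

```python
import re


def substitute(leaf, regex, replacement, flag=""):
    """Toy model of s/regex/replacement/flags (not xml-sed code).

    Assumption: leaf has the form "[PATH]text" and PATH contains no "]".
    """
    end = leaf.index("]") + 1
    path, text = leaf[:end], leaf[end:]
    if flag == "x":  # match only within the PATH section
        return re.sub(regex, replacement, path, count=1) + text
    if flag == "z":  # match within the full pattern space
        return re.sub(regex, replacement, leaf, count=1)
    # no flag: the PATH section is skipped
    return path + re.sub(regex, replacement, text, count=1)


leaf = "[/root/note]root cause"
print(substitute(leaf, "root", "book", "x"))  # [/book/note]root cause
print(substitute(leaf, "root", "book"))       # [/root/note]book cause
```

With the z flag a pattern may span the boundary between the PATH section and the text, which neither of the other two modes allows.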

BUGS

The current version of xml-sed doesn’t handle doctypes or processing instructions.

AUTHORS

Laird A. Breyer is the original author of this software. The source code (GPLv3 or later) for the latest version is available at the following locations:
http://www.lbreyer.com/gpl.html
http://xml-coreutils.sourceforge.net

SEE ALSO

xml-coreutils(7)