com.topologi.diffx.xml.esc
Interface XMLEscape

All Known Implementing Classes:
XMLEscapeASCII, XMLEscapeUTF8

public interface XMLEscape

An interface to escape XML character data.

This interface assumes that the values to be escaped do not originate from XML text; in other words, the input should not already contain any entities or markup. If it does, the methods in this interface should escape them as well. Thus "&amp;" would be represented as "&amp;amp;".

Also, the methods do not attempt to escape characters that cannot be escaped.

This interface is mostly concerned with producing well-formed XML and does not attempt to produce valid data.
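
The raw-text assumption can be illustrated with a minimal sketch. It is not the library's implementation: the class name RawTextEscapeSketch and its escape method are hypothetical, and only the ampersand and less-than characters are handled. Input that already looks like an entity is treated as plain characters and escaped again.

```java
// Hypothetical illustration only; not part of com.topologi.diffx.
final class RawTextEscapeSketch {

  /** Escapes '&' and '<' in text that is assumed NOT to come from XML. */
  static String escape(String raw) {
    StringBuilder out = new StringBuilder(raw.length() + 8);
    for (int i = 0; i < raw.length(); i++) {
      char c = raw.charAt(i);
      switch (c) {
        case '&': out.append("&amp;"); break;  // every ampersand is escaped, even in "&amp;"
        case '<': out.append("&lt;");  break;
        default:  out.append(c);
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(escape("&amp;"));  // prints "&amp;amp;"
    System.out.println(escape("a < b"));  // prints "a &lt; b"
  }
}
```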

Version:
0.7.7
Author:
Christophe Lauret
See Also:
Extensible Markup Language (XML) 1.0

Method Summary
 String getEncoding()
          Returns the encoding used by the implementing class.
 String toAttributeValue(char[] ch, int off, int len)
          Returns a well-formed attribute value.
 String toAttributeValue(String value)
          Returns a well-formed attribute value.
 String toElementText(char[] ch, int off, int len)
          Returns a well-formed XML literal text value.
 String toElementText(String value)
          Returns a well-formed text value for the element.
 

Method Detail

toAttributeValue

String toAttributeValue(char[] ch,
                        int off,
                        int len)
Returns a well-formed attribute value.

This method must replace any character in the specified value that is not allowed, or that falls outside the encoding range, with the corresponding numeric character reference or predefined XML general entity.

Attribute values must not contain any ampersand (#x26) or less than (#x3C) characters. This method will replace them with the corresponding named entity.

Quotes and apostrophes must also be escaped depending on which was used to delimit the attribute value. Since this method is not aware of which type of quote was used, both are escaped. Double quotes (#x22) are escaped using a named character entity. Because the end result may be HTML 4, which does not define the "apos" entity, single quotes (#x27) are escaped using a numeric character reference.

Characters in ranges (#x00-#x1F) and (#x80-#x9F) are silently ignored except for line feed (#x0A), carriage return (#x0D) and tab (#x09).

Parameters:
ch - The value that needs to be attribute-escaped.
off - The start (offset) of the characters.
len - The number of characters to escape.
Returns:
A well-formed value for the attribute.
See Also:
Extensible Markup Language (XML) 1.0 - 2.3 Common Syntactic Constructs
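
The rules above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code of XMLEscapeASCII or XMLEscapeUTF8: the class name AttributeEscapeSketch is hypothetical, the numeric form chosen for the apostrophe ("&#x27;") is an assumption, and the sketch assumes an encoding that can represent every other character, so no encoding-range check is made.

```java
// Hypothetical illustration only; not part of com.topologi.diffx.
final class AttributeEscapeSketch {

  /** Escapes len characters of ch, starting at off, for use in an attribute value. */
  static String toAttributeValue(char[] ch, int off, int len) {
    StringBuilder out = new StringBuilder(len + len / 10);
    for (int i = off; i < off + len; i++) {
      char c = ch[i];
      if (c == '&') {
        out.append("&amp;");
      } else if (c == '<') {
        out.append("&lt;");
      } else if (c == '"') {
        out.append("&quot;");            // named entity for the double quote
      } else if (c == '\'') {
        out.append("&#x27;");            // numeric reference (assumed form) for the apostrophe
      } else if (c == '\n' || c == '\r' || c == '\t') {
        out.append(c);                   // the three control characters that are kept
      } else if (c < 0x20 || (c >= 0x80 && c <= 0x9F)) {
        // other characters in #x00-#x1F and #x80-#x9F are silently dropped
      } else {
        out.append(c);
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    char[] raw = "Tom & \"Jerry\" <it's>".toCharArray();
    System.out.println(toAttributeValue(raw, 0, raw.length));
  }
}
```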

toAttributeValue

String toAttributeValue(String value)
Returns a well-formed attribute value.

Method provided for convenience, using the same specifications as toAttributeValue(char[], int, int).

This method should return null if the given value is null.

Parameters:
value - The value that needs to be attribute-escaped.
Returns:
A well-formed value for the attribute.
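
One way an implementation might honour this contract is to delegate to the char[] variant after a null check. The sketch below is an assumption about how this could be done, not the library's code; the class name StringOverloadSketch is hypothetical, and the char[] method is a simplified stand-in that handles only the ampersand and less-than characters.

```java
// Hypothetical illustration only; not part of com.topologi.diffx.
final class StringOverloadSketch {

  /** Simplified stand-in for the char[] variant: escapes '&' and '<' only. */
  static String toAttributeValue(char[] ch, int off, int len) {
    StringBuilder out = new StringBuilder(len + 8);
    for (int i = off; i < off + len; i++) {
      char c = ch[i];
      if (c == '&') out.append("&amp;");
      else if (c == '<') out.append("&lt;");
      else out.append(c);
    }
    return out.toString();
  }

  /** Convenience variant: returns null for null input, otherwise delegates. */
  static String toAttributeValue(String value) {
    if (value == null) {
      return null;
    }
    return toAttributeValue(value.toCharArray(), 0, value.length());
  }
}
```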

toElementText

String toElementText(char[] ch,
                     int off,
                     int len)
Returns a well-formed XML literal text value.

This method must replace any character in the specified text that is not allowed, or that falls outside the encoding range, with the corresponding numeric character reference or predefined XML general entity.

Literal text values must not contain any 'ampersand' (#x26) or 'less than' (#x3C) characters. This method will replace them with the corresponding named entity.

As a precaution, this method may also encode the 'greater than' (#x3E) character when it follows "]]".

Characters in ranges (#x00-#x1F) and (#x80-#x9F) are silently ignored except for line feed (#x0A), carriage return (#x0D) and tab (#x09).

Parameters:
ch - The value that needs to be text-escaped.
off - The start (offset) of the characters.
len - The number of characters to escape.
Returns:
A well-formed value for the text node.
See Also:
Extensible Markup Language (XML) 1.0 - 2.4 Character Data and Markup
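
A minimal sketch of these text rules, including the precaution for "]]" followed by a greater-than sign, might look like the following. It is an illustration under stated assumptions rather than the library's implementation: the class name ElementTextEscapeSketch is hypothetical, no encoding-range check is made, and the "]]" test only looks at what this call has already written, so it cannot see text emitted by an earlier call.

```java
// Hypothetical illustration only; not part of com.topologi.diffx.
final class ElementTextEscapeSketch {

  /** Escapes len characters of ch, starting at off, for use as element text. */
  static String toElementText(char[] ch, int off, int len) {
    StringBuilder out = new StringBuilder(len + len / 10);
    for (int i = off; i < off + len; i++) {
      char c = ch[i];
      if (c == '&') {
        out.append("&amp;");
      } else if (c == '<') {
        out.append("&lt;");
      } else if (c == '>' && endsWithTwoBrackets(out)) {
        out.append("&gt;");              // avoids emitting the "]]>" sequence
      } else if (c == '\n' || c == '\r' || c == '\t') {
        out.append(c);                   // the three control characters that are kept
      } else if (c < 0x20 || (c >= 0x80 && c <= 0x9F)) {
        // other characters in #x00-#x1F and #x80-#x9F are silently dropped
      } else {
        out.append(c);
      }
    }
    return out.toString();
  }

  /** True if the output written so far ends with "]]". */
  private static boolean endsWithTwoBrackets(StringBuilder out) {
    int n = out.length();
    return n >= 2 && out.charAt(n - 1) == ']' && out.charAt(n - 2) == ']';
  }
}
```

Checking the output buffer rather than the input means that a dropped control character between "]]" and the greater-than sign still triggers the escape.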

toElementText

String toElementText(String value)
Returns a well-formed text value for the element.

Method provided for convenience, using the same specifications as toElementText(char[], int, int).

This method should return null if the given value is null.

Parameters:
value - The value that needs to be text-escaped.
Returns:
A well-formed value for the text node.

getEncoding

String getEncoding()
Returns the encoding used by the implementing class.

Returns:
The encoding used by the implementing class.
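
The encoding matters because characters outside its range have to be written as numeric character references. The sketch below shows one way such a check could be made with the standard java.nio.charset API. It assumes, without confirmation from this page, that the returned encoding is a charset name accepted by Charset.forName; the class name EncodingRangeSketch and the method escapeUnencodable are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical illustration only; not part of com.topologi.diffx.
final class EncodingRangeSketch {

  /**
   * Replaces each code point that the named encoding cannot represent
   * with a hexadecimal numeric character reference.
   * Unpaired surrogates are not handled specially in this sketch.
   */
  static String escapeUnencodable(String text, String encoding) {
    CharsetEncoder encoder = Charset.forName(encoding).newEncoder();
    StringBuilder out = new StringBuilder(text.length());
    int i = 0;
    while (i < text.length()) {
      int cp = text.codePointAt(i);
      int n = Character.charCount(cp);           // 1 or 2 chars for this code point
      CharSequence piece = text.subSequence(i, i + n);
      if (encoder.canEncode(piece)) {
        out.append(piece);
      } else {
        out.append("&#x").append(Integer.toHexString(cp).toUpperCase()).append(';');
      }
      i += n;
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(escapeUnencodable("caf\u00e9", "US-ASCII"));  // prints "caf&#xE9;"
  }
}
```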