com.topologi.diffx.xml.esc
Class XMLEscapeWriterUTF8

java.lang.Object
  extended by com.topologi.diffx.xml.esc.XMLEscapeWriterUTF8
All Implemented Interfaces:
XMLEscapeWriter

public final class XMLEscapeWriterUTF8
extends Object
implements XMLEscapeWriter

A utility class for escaping XML data using the UTF-8 encoding.

Only characters which must be escaped are escaped since the Unicode Transformation Format should support all Unicode code points.

Escape methods in this class escape non-BMP characters for better compatibility with storage mechanisms that do not support them, for example some databases.

Version:
0.7.8
Author:
Christophe Lauret

Constructor Summary
XMLEscapeWriterUTF8(Writer writer)
          Creates a new XML escape writer using the UTF-8 encoding.
 
Method Summary
 String getEncoding()
          Returns the encoding for this writer.
 void writeAttValue(char[] ch, int off, int len)
          Writes a well-formed attribute value.
 void writeAttValue(String value)
          Default implementation calling the XMLEscapeWriter.writeAttValue(char[], int, int).
 void writeText(char c)
          Writes the character so that the text value for the element remains well-formed.
 void writeText(char[] ch, int off, int len)
          Writes a well-formed XML literal text value.
 void writeText(String value)
          Default implementation calling the XMLEscapeWriter.writeText(char[], int, int).
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface com.topologi.diffx.xml.esc.XMLEscapeWriter
getEncoding, writeAttValue, writeText
 

Constructor Detail

XMLEscapeWriterUTF8

public XMLEscapeWriterUTF8(Writer writer)
                    throws NullPointerException
Creates a new XML escape writer using the UTF-8 encoding.

Parameters:
writer - The writer to wrap.
Throws:
NullPointerException - if the writer is null.
Method Detail

writeAttValue

public void writeAttValue(char[] ch,
                          int off,
                          int len)
                   throws IOException
Description copied from interface: XMLEscapeWriter
Writes a well-formed attribute value.

This method must replace any character in the specified value with the corresponding numeric character reference or predefined XML general entity if the character is not allowed or is outside the encoding range.

Attribute values must not contain any ampersand (#x26) or less than (#x3C) characters. This method replaces them with the corresponding named entity.

Quotes and apostrophes must also be escaped, depending on which one delimits the attribute in the markup. Since this method does not know which type of quote was used, both are escaped. Double quotes (#x22) are escaped using a named character entity. Because the end result may be HTML 4, which does not define the &apos; entity, single quotes (#x27) are escaped using a numeric character reference.

Characters in ranges (#x00-#x1F) and (#x80-#x9F) are silently ignored except for line feed (#x0A), carriage return (#x0D) and tab (#x09).

Specified by:
writeAttValue in interface XMLEscapeWriter
Parameters:
ch - The value that needs to be attribute-escaped.
off - The start (offset) of the characters.
len - The number of characters to write.
Throws:
IOException - If thrown by the underlying writer.
See Also:
Extensible Markup Language (XML) 1.0 - 2.3 Common Syntactic Constructs
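
The rules above can be sketched as a small standalone routine. This is an illustration only, not the library's source: the class AttValueEscapeSketch and its escape helper are hypothetical names, the numeric form &#39; for the apostrophe is an assumption (the page only says a numeric reference is used), and the escaping of non-BMP characters mentioned in the class description is left out.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical stand-in for illustration; this is NOT XMLEscapeWriterUTF8 itself.
final class AttValueEscapeSketch {

  // Applies the documented attribute-value rules to ch[off .. off+len-1].
  static void writeAttValue(Writer out, char[] ch, int off, int len) throws IOException {
    for (int i = off; i < off + len; i++) {
      char c = ch[i];
      switch (c) {
        case '&':  out.write("&amp;");  break;
        case '<':  out.write("&lt;");   break;
        case '"':  out.write("&quot;"); break;
        case '\'': out.write("&#39;");  break; // assumed numeric form, safe for HTML 4
        case '\t':
        case '\n':
        case '\r': out.write(c); break;        // the three control characters that are kept
        default: {
          // Other characters in #x00-#x1F and #x80-#x9F are silently dropped.
          boolean control = c <= 0x1F || (c >= 0x80 && c <= 0x9F);
          if (!control) out.write(c);
        }
      }
    }
  }

  // Convenience helper (hypothetical) that escapes a whole string in memory.
  static String escape(String value) {
    StringWriter out = new StringWriter();
    try {
      writeAttValue(out, value.toCharArray(), 0, value.length());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // prints: a&lt;b &amp; &quot;c&quot; &#39;d&#39;
    System.out.println(escape("a<b & \"c\" 'd'"));
  }
}
```

Escaping both quote styles means the result can be placed inside either a double-quoted or a single-quoted attribute without further checks.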

writeText

public void writeText(char[] ch,
                      int off,
                      int len)
               throws IOException
Description copied from interface: XMLEscapeWriter
Writes a well-formed XML literal text value.

This method must replace any character in the specified text with the corresponding numeric character reference or predefined XML general entity if the character is not allowed or is outside the encoding range.

Literal text values must not contain any 'ampersand' (#x26) or 'less than' (#x3C) characters. This method replaces them with the corresponding named entity.

As a precaution, this method may also encode the 'greater than' (#x3E) character when it follows "]]".

Characters in ranges (#x00-#x1F) and (#x80-#x9F) are silently ignored except for line feed (#x0A), carriage return (#x0D) and tab (#x09).

Specified by:
writeText in interface XMLEscapeWriter
Parameters:
ch - The value that needs to be text-escaped.
off - The start (offset) of the characters.
len - The number of characters to write.
Throws:
IOException - If thrown by the underlying writer.
See Also:
Extensible Markup Language (XML) 1.0 - 2.4 Character Data and Markup
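
A minimal sketch of the text rules described above, written as a standalone routine. TextEscapeSketch and its escape helper are hypothetical names rather than part of the library, and the look-behind for "]]" only inspects the range passed in, so a "]]" written by an earlier call would not be detected here. How the real class handles that case is not stated on this page.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical stand-in for illustration; this is NOT XMLEscapeWriterUTF8 itself.
final class TextEscapeSketch {

  // Applies the documented text rules to ch[off .. off+len-1].
  static void writeText(Writer out, char[] ch, int off, int len) throws IOException {
    for (int i = off; i < off + len; i++) {
      char c = ch[i];
      if (c == '&') {
        out.write("&amp;");
      } else if (c == '<') {
        out.write("&lt;");
      } else if (c == '>') {
        // Escape '>' only when it would complete the sequence "]]>".
        boolean afterBrackets = i - off >= 2 && ch[i - 1] == ']' && ch[i - 2] == ']';
        out.write(afterBrackets ? "&gt;" : ">");
      } else if (c == '\t' || c == '\n' || c == '\r') {
        out.write(c); // the three control characters that are kept
      } else if (c <= 0x1F || (c >= 0x80 && c <= 0x9F)) {
        // Other control characters are silently dropped.
      } else {
        out.write(c);
      }
    }
  }

  // Convenience helper (hypothetical) that escapes a whole string in memory.
  static String escape(String text) {
    StringWriter out = new StringWriter();
    try {
      writeText(out, text.toCharArray(), 0, text.length());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // prints: a ]]&gt; b > c &amp; &lt;d>
    System.out.println(escape("a ]]> b > c & <d>"));
  }
}
```

Unlike attribute values, quotes and apostrophes are left untouched in text content, since they carry no special meaning there.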

writeText

public void writeText(char c)
               throws IOException
Description copied from interface: XMLEscapeWriter
Writes the character so that the text value for the element remains well-formed.

Some implementations may be unable to deal with Java characters outside the Basic Multilingual Plane (BMP). As a result, Java characters that belong to UTF-16 surrogate pairs (#xD800 - #xDFFF) may not be handled appropriately.

Unicode Transformation Format (UTF) implementations should copy the Java character verbatim.

Specified by:
writeText in interface XMLEscapeWriter
Parameters:
c - The character that needs to be text-escaped.
Throws:
IOException - If thrown by the underlying writer.
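
The surrogate caveat arises because a single Java char holds only half of a non-BMP code point, so a method that receives one char at a time cannot see the full character. The sketch below shows one way a caller working on a whole string could pair the surrogates and emit a numeric character reference. NonBmpEscapeSketch is a hypothetical name, and the hexadecimal reference form and the dropping of unpaired surrogates are assumptions, not documented behavior of this class.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of the diffx API.
final class NonBmpEscapeSketch {

  // Replaces each surrogate pair with a hexadecimal numeric character reference.
  static String escapeNonBmp(String s) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (Character.isHighSurrogate(c)
          && i + 1 < s.length()
          && Character.isLowSurrogate(s.charAt(i + 1))) {
        // A valid pair: combine both halves into one code point.
        int codePoint = Character.toCodePoint(c, s.charAt(i + 1));
        sb.append("&#x")
          .append(Integer.toHexString(codePoint).toUpperCase(Locale.ROOT))
          .append(';');
        i++; // skip the low surrogate that was just consumed
      } else if (Character.isSurrogate(c)) {
        // Unpaired surrogate: dropped here (an assumption of this sketch).
      } else {
        sb.append(c);
      }
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    String smiley = new String(Character.toChars(0x1F600));
    // prints: a&#x1F600;b
    System.out.println(escapeNonBmp("a" + smiley + "b"));
  }
}
```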

writeAttValue

public final void writeAttValue(String value)
                         throws IOException
Default implementation calling the XMLEscapeWriter.writeAttValue(char[], int, int). Writes a well-formed attribute value.

Method provided for convenience, using the same specifications as XMLEscapeWriter.writeAttValue(char[], int, int).

Specified by:
writeAttValue in interface XMLEscapeWriter
Parameters:
value - The value that needs to be attribute-escaped.
Throws:
IOException - If thrown by the underlying writer.

writeText

public final void writeText(String value)
                     throws IOException
Default implementation calling the XMLEscapeWriter.writeText(char[], int, int). Writes the text string so that the text value for the element remains well-formed.

Method provided for convenience, using the same specifications as XMLEscapeWriter.writeText(char[], int, int).

This method should do nothing if the given value is null.

Specified by:
writeText in interface XMLEscapeWriter
Parameters:
value - The text that needs to be text-escaped.
Throws:
IOException - If thrown by the underlying writer.
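
The delegation and null handling described for this convenience method can be sketched as follows. StringOverloadSketch is a hypothetical stand-in, its char[] overload escapes only '&' and '<' to keep the example short, and the actual implementation in the library may differ.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical stand-in for illustration; this is NOT XMLEscapeWriterUTF8 itself.
final class StringOverloadSketch {

  private final Writer out;

  StringOverloadSketch(Writer out) {
    // Mirrors the documented constructor contract: a null writer is rejected.
    if (out == null) throw new NullPointerException("writer is null");
    this.out = out;
  }

  // Simplified char[] overload: escapes only '&' and '<'.
  void writeText(char[] ch, int off, int len) throws IOException {
    for (int i = off; i < off + len; i++) {
      char c = ch[i];
      if (c == '&') out.write("&amp;");
      else if (c == '<') out.write("&lt;");
      else out.write(c);
    }
  }

  // Convenience overload: does nothing for null, otherwise delegates.
  void writeText(String value) throws IOException {
    if (value == null) return;
    writeText(value.toCharArray(), 0, value.length());
  }

  // Helper (hypothetical) that runs the String overload against an in-memory writer.
  static String render(String value) {
    StringWriter buffer = new StringWriter();
    try {
      new StringOverloadSketch(buffer).writeText(value);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.toString();
  }

  public static void main(String[] args) {
    System.out.println(render("1 < 2 & 3")); // prints: 1 &lt; 2 &amp; 3
    System.out.println(render(null).isEmpty()); // prints: true
  }
}
```

Keeping the escaping logic in the char[] overload means the String overload stays a thin wrapper, so both entry points follow the same rules.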

getEncoding

public final String getEncoding()
Returns the encoding used by this writer.

Specified by:
getEncoding in interface XMLEscapeWriter
Returns:
The encoding used by the implementing class.