com.fasterxml.aalto.async
Class AsyncUtfScanner

java.lang.Object
  extended by com.fasterxml.aalto.in.XmlScanner
      extended by com.fasterxml.aalto.in.ByteBasedScanner
          extended by com.fasterxml.aalto.async.AsyncByteScanner
              extended by com.fasterxml.aalto.async.AsyncUtfScanner
All Implemented Interfaces:
XmlConsts, NamespaceContext, XMLStreamConstants

public class AsyncUtfScanner
extends AsyncByteScanner

This class handles parsing of UTF-8 encoded XML streams, as well as other UTF-8 compatible (subset) encodings (specifically, Latin1 and US-ASCII).


Field Summary
protected  boolean _inDtdDeclaration
          Flag that indicates whether we are inside a declaration during parsing of internal DTD subset.
 
Fields inherited from class com.fasterxml.aalto.async.AsyncByteScanner
_currQuad, _currQuadBytes, _elemAllNsBound, _elemAttrCount, _elemAttrName, _elemAttrPtr, _elemAttrQuote, _elemNsPtr, _endOfInput, _entityValue, _inputBuffer, _nextEvent, _origBufferLen, _pendingInput, _quadCount, _state, _surroundingEvent
 
Fields inherited from class com.fasterxml.aalto.in.ByteBasedScanner
_charTypes, _inputEnd, _inputPtr, _pastBytes, _quadBuffer, _rowStartOffset, _symbols, _tmpChar, BYTE_a, BYTE_A, BYTE_AMP, BYTE_APOS, BYTE_C, BYTE_CR, BYTE_D, BYTE_EQ, BYTE_EXCL, BYTE_g, BYTE_GT, BYTE_HASH, BYTE_HYPHEN, BYTE_l, BYTE_LBRACKET, BYTE_LF, BYTE_LT, BYTE_m, BYTE_NULL, BYTE_o, BYTE_p, BYTE_P, BYTE_q, BYTE_QMARK, BYTE_QUOT, BYTE_RBRACKET, BYTE_s, BYTE_S, BYTE_SEMICOLON, BYTE_SLASH, BYTE_SPACE, BYTE_t, BYTE_T, BYTE_TAB, BYTE_u, BYTE_x
 
Fields inherited from class com.fasterxml.aalto.in.XmlScanner
_attrCollector, _attrCount, _cfgCoalescing, _cfgLazyParsing, _config, _currElem, _currNsCount, _currRow, _currToken, _defaultNs, _depth, _entityPending, _isEmptyTag, _lastNsContext, _lastNsDecl, _nameBuffer, _nsBindingCache, _nsBindingCount, _nsBindings, _nsBindMisses, _publicId, _systemId, _textBuilder, _tokenIncomplete, _tokenName, _xml11, CDATA_STR, INT_0, INT_9, INT_a, INT_A, INT_AMP, INT_APOS, INT_COLON, INT_CR, INT_EQ, INT_EXCL, INT_f, INT_F, INT_GT, INT_HYPHEN, INT_LBRACKET, INT_LF, INT_LT, INT_NULL, INT_QMARK, INT_QUOTE, INT_RBRACKET, INT_SLASH, INT_SPACE, INT_TAB, INT_z, MAX_UNICODE_CHAR, TOKEN_EOI
 
Fields inherited from interface com.fasterxml.aalto.util.XmlConsts
CHAR_CR, CHAR_LF, CHAR_NULL, CHAR_SPACE, STAX_DEFAULT_OUTPUT_ENCODING, STAX_DEFAULT_OUTPUT_VERSION, XML_DECL_KW_ENCODING, XML_DECL_KW_STANDALONE, XML_DECL_KW_VERSION, XML_SA_NO, XML_SA_YES, XML_V_10, XML_V_10_STR, XML_V_11, XML_V_11_STR, XML_V_UNKNOWN
 
Fields inherited from interface javax.xml.stream.XMLStreamConstants
ATTRIBUTE, CDATA, CHARACTERS, COMMENT, DTD, END_DOCUMENT, END_ELEMENT, ENTITY_DECLARATION, ENTITY_REFERENCE, NAMESPACE, NOTATION_DECLARATION, PROCESSING_INSTRUCTION, SPACE, START_DOCUMENT, START_ELEMENT
 
Constructor Summary
AsyncUtfScanner(ReaderConfig cfg)
           
 
Method Summary
protected  PName addPName(int hash, int[] quads, int qlen, int lastQuadBytes)
           
protected  int decodeUtf8_2(int c)
           Note: caller must guarantee enough data is available before calling the method
protected  int decodeUtf8_3(int c1)
           Note: caller must guarantee enough data is available before calling the method
protected  int decodeUtf8_3(int c1, int c2, int c3)
           
protected  int decodeUtf8_4(int c)
           
protected  int decodeUtf8_4(int c1, int c2, int c3, int c4)
           
protected  void finishCharacters()
          Called only in non-coalescing mode; parses as many characters of the current text segment as are available in the current input block.
protected  int finishCharactersCoalescing()
           
protected  boolean handleAttrValue()
           
protected  int handleCDataPending()
           
protected  int handleCommentPending()
           
protected  int handleDecEntityInCharacters(int ptr)
           
protected  boolean handleDTDInternalSubset(boolean init)
           
protected  int handleEntityInAttributeValue()
          Method called to handle entity encountered inside attribute value.
protected  int handleEntityInCharacters()
          Method called to handle entity encountered inside CHARACTERS segment, when trying to complete a non-coalescing text segment.
protected  int handleHexEntityInCharacters(int ptr)
           
protected  boolean handleNsDecl()
           
protected  int handlePIPending()
           
protected  int parseCDataContents()
           
protected  int parseCommentContents()
           
protected  int parsePIData()
           
protected  void reportInvalidInitial(int mask)
           
protected  void reportInvalidOther(int mask)
           
protected  void reportInvalidOther(int mask, int ptr)
           
protected  boolean skipCharacters()
          Method that will be called to skip all possible characters from the input buffer, but without blocking.
protected  boolean skipCoalescedText()
          Coalescing mode is not (and will not be) implemented for non-blocking parsers, so this method should never get called.
protected  void skipUtf8_2(int c)
           
protected  int startCharacters(byte b)
          Method called to initialize state for CHARACTERS event, after just a single byte has been seen.
protected  int startCharactersPending()
          This method gets called if the first character of a CHARACTERS event could not be fully read (a multi-byte character split over a buffer boundary).
 
Methods inherited from class com.fasterxml.aalto.async.AsyncByteScanner
_closeSource, decodeCharForError, decodeDecEntity, decodeGeneralEntity, decodeHexEntity, endOfInput, feedInput, finishCData, finishComment, finishDTD, finishPI, finishSpace, handleEntityStartingToken, handleNamedEntityStartingToken, handleNumericEntityStartingToken, handlePartialCR, handleStartElement, handleStartElementStart, loadMore, needMoreInput, nextFromProlog, nextFromTree, parseEntityName, parseNewEntityName, parseNewName, parsePName, skipCData, skipComment, skipPI, skipSpace, throwInternal, toString, verifyAndAppendEntityCharacter
 
Methods inherited from class com.fasterxml.aalto.in.ByteBasedScanner
_releaseBuffers, addUtfPName, getCurrentColumnNr, getCurrentLineNr, getCurrentLocation, markLF, markLF
 
Methods inherited from class com.fasterxml.aalto.in.XmlScanner
bindName, bindNs, checkImmutableBinding, close, decodeAttrBinaryValue, decodeAttrValue, decodeAttrValues, decodeElements, findAttrIndex, findOrCreateBinding, finishToken, fireSaxCharacterEvents, fireSaxCommentEvent, fireSaxEndElement, fireSaxPIEvent, fireSaxSpaceEvents, fireSaxStartElement, getAttrCollector, getAttrCount, getAttrLocalName, getAttrNsURI, getAttrPrefix, getAttrPrefixedName, getAttrQName, getAttrType, getAttrValue, getAttrValue, getConfig, getDepth, getDTDPublicId, getDTDSystemId, getEndLocation, getInputPublicId, getInputSystemId, getName, getNamespacePrefix, getNamespaceURI, getNamespaceURI, getNamespaceURI, getNonTransientNamespaceContext, getNsCount, getPrefix, getPrefixes, getQName, getStartLocation, getText, getText, getTextCharacters, getTextCharacters, getTextLength, hasEmptyStack, isAttrSpecified, isEmptyTag, isTextWhitespace, loadMoreGuaranteed, loadMoreGuaranteed, reportDoubleHyphenInComments, reportDuplicateNsDecl, reportEntityOverflow, reportEofInName, reportIllegalCDataEnd, reportIllegalNsDecl, reportIllegalNsDecl, reportInputProblem, reportInvalidNameChar, reportInvalidNsIndex, reportInvalidXmlChar, reportMissingPISpace, reportMultipleColonsInName, reportPrologProblem, reportPrologUnexpChar, reportTreeUnexpChar, reportUnboundPrefix, reportUnexpandedEntityInAttr, reportUnexpectedEndTag, resetForDecoding, skipToken, throwInvalidSpace, throwInvalidXmlChar, throwNullChar, throwUnexpectedChar, verifyXmlChar
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

_inDtdDeclaration

protected boolean _inDtdDeclaration
Flag that indicates whether we are inside a declaration during parsing of internal DTD subset.

Constructor Detail

AsyncUtfScanner

public AsyncUtfScanner(ReaderConfig cfg)
Method Detail

startCharacters

protected final int startCharacters(byte b)
                             throws XMLStreamException
Description copied from class: AsyncByteScanner
Method called to initialize state for CHARACTERS event, after just a single byte has been seen. What needs to be done next depends on whether coalescing mode is set or not: if it is not set, just a single character needs to be decoded, after which current event will be incomplete, but defined as CHARACTERS. In coalescing mode, the whole content must be read before current event can be defined. The reason for difference is that when XMLStreamReader.next() returns, no blocking can occur when calling other methods.

Specified by:
startCharacters in class AsyncByteScanner
Returns:
Event type detected; either CHARACTERS, if at least one full character was decoded (and can be returned), EVENT_INCOMPLETE if not (part of a multi-byte character split across input buffer boundary)
Throws:
XMLStreamException
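
The return contract above can be illustrated with a minimal sketch. It is not Aalto's implementation: the class name, the method names and the value chosen for EVENT_INCOMPLETE are assumptions made for illustration. The sketch classifies the UTF-8 lead byte and reports whether the whole character is already present in the current buffer; it does not reject overlong lead bytes such as 0xC0.

```java
final class Utf8LeadByteSketch {
    // Hypothetical stand-ins for the event codes named in the text above.
    static final int CHARACTERS = 4;          // same value as XMLStreamConstants.CHARACTERS
    static final int EVENT_INCOMPLETE = 257;  // placeholder value, assumed for this sketch

    /** Length of the UTF-8 sequence introduced by lead byte b, or -1 if b cannot start one. */
    static int sequenceLength(byte b) {
        int c = b & 0xFF;
        if (c < 0x80) return 1;            // 0xxxxxxx: single-byte (ASCII)
        if ((c & 0xE0) == 0xC0) return 2;  // 110xxxxx
        if ((c & 0xF0) == 0xE0) return 3;  // 1110xxxx
        if ((c & 0xF8) == 0xF0) return 4;  // 11110xxx
        return -1;                         // continuation byte or invalid lead byte
    }

    /** Reports CHARACTERS if the character starting at ptr is fully buffered, else EVENT_INCOMPLETE. */
    static int startCharacters(byte[] buf, int ptr, int end) {
        int needed = sequenceLength(buf[ptr]);
        if (needed < 0) {
            throw new IllegalArgumentException("Invalid UTF-8 lead byte");
        }
        return (end - ptr >= needed) ? CHARACTERS : EVENT_INCOMPLETE;
    }
}
```

With the three bytes of U+20AC in the buffer the sketch reports CHARACTERS; with only the first two it reports EVENT_INCOMPLETE.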

startCharactersPending

protected int startCharactersPending()
                              throws XMLStreamException
Description copied from class: AsyncByteScanner
This method gets called if the first character of a CHARACTERS event could not be fully read (a multi-byte character split over a buffer boundary). If so, there is some pending data to be handled.

Specified by:
startCharactersPending in class AsyncByteScanner
Throws:
XMLStreamException
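
The following sketch shows one way pending data for a split multi-byte character could be carried from one input block to the next. It is an illustration only, not the scanner's actual state handling: the class, its fields and its methods are hypothetical.

```java
final class PendingUtf8Sketch {
    private int partial;    // code point bits accumulated so far
    private int remaining;  // continuation bytes still expected; 0 means nothing is pending

    /** Consumes one input block and appends every completed character to out. */
    void feed(byte[] buf, int off, int len, StringBuilder out) {
        for (int i = off, end = off + len; i < end; i++) {
            int c = buf[i] & 0xFF;
            if (remaining > 0) {
                // Resume the character left unfinished by the previous block.
                if ((c & 0xC0) != 0x80) {
                    throw new IllegalArgumentException("Expected a UTF-8 continuation byte");
                }
                partial = (partial << 6) | (c & 0x3F);
                if (--remaining == 0) {
                    out.appendCodePoint(partial);
                }
            } else if (c < 0x80) {
                out.append((char) c);
            } else if ((c & 0xE0) == 0xC0) {
                partial = c & 0x1F; remaining = 1;
            } else if ((c & 0xF0) == 0xE0) {
                partial = c & 0x0F; remaining = 2;
            } else if ((c & 0xF8) == 0xF0) {
                partial = c & 0x07; remaining = 3;
            } else {
                throw new IllegalArgumentException("Invalid UTF-8 lead byte");
            }
        }
    }

    /** True if a block ended in the middle of a multi-byte character. */
    boolean hasPending() {
        return remaining > 0;
    }
}
```

Feeding the first byte of a three-byte character leaves hasPending() true and produces no output; feeding the remaining two bytes completes the character.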

finishCharacters

protected final void finishCharacters()
                               throws XMLStreamException
Called only in non-coalescing mode; parses as many characters of the current text segment as are available in the current input block.

Specified by:
finishCharacters in class AsyncByteScanner
Throws:
XMLStreamException

finishCharactersCoalescing

protected final int finishCharactersCoalescing()
                                        throws XMLStreamException
Specified by:
finishCharactersCoalescing in class AsyncByteScanner
Throws:
XMLStreamException

handleEntityInCharacters

protected int handleEntityInCharacters()
                                throws XMLStreamException
Method called to handle entity encountered inside CHARACTERS segment, when trying to complete a non-coalescing text segment.

NOTE: unlike with generic parsing of named entities, where trailing semicolon needs to be left in place, here we should just process it right away.

Returns:
Expanded (character) entity, if positive number; 0 if incomplete.
Throws:
XMLStreamException
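
As a rough illustration of the "positive value, or 0 if incomplete" convention, the sketch below decodes a numeric character reference and consumes the trailing semicolon immediately. It is a simplified stand-in, not the scanner's code: the class and method are hypothetical, named entities are not covered, and no partial state is kept, so a caller would have to retry once more input arrives.

```java
final class CharEntitySketch {
    /**
     * Decodes a numeric character reference; ptr points just past "&#".
     * Returns the code point (always positive) once ';' is reached,
     * or 0 if the buffer ends before the reference is complete.
     */
    static int decodeNumericEntity(byte[] buf, int ptr, int end) {
        if (ptr >= end) {
            return 0;
        }
        int radix = 10;
        if (buf[ptr] == 'x') {  // "&#x...;" is hexadecimal
            radix = 16;
            ptr++;
        }
        int value = 0;
        boolean sawDigit = false;
        while (ptr < end) {
            int b = buf[ptr++] & 0xFF;
            if (b == ';') {
                if (!sawDigit || value == 0) {
                    throw new IllegalArgumentException("Empty or null character reference");
                }
                return value;
            }
            int digit = Character.digit(b, radix);
            if (digit < 0) {
                throw new IllegalArgumentException("Invalid digit in character reference");
            }
            value = value * radix + digit;
            if (value > 0x10FFFF) {
                throw new IllegalArgumentException("Character reference out of Unicode range");
            }
            sawDigit = true;
        }
        return 0;  // ran out of input before ';'
    }
}
```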

handleDecEntityInCharacters

protected int handleDecEntityInCharacters(int ptr)
                                   throws XMLStreamException
Throws:
XMLStreamException

handleHexEntityInCharacters

protected int handleHexEntityInCharacters(int ptr)
                                   throws XMLStreamException
Throws:
XMLStreamException

skipCharacters

protected boolean skipCharacters()
                          throws XMLStreamException
Method that will be called to skip all possible characters from the input buffer, but without blocking. Partial characters are not to be handled (no pending input is to be added).

Specified by:
skipCharacters in class AsyncByteScanner
Returns:
True if skipping ended at an unexpanded entity; false otherwise
Throws:
XMLStreamException

skipCoalescedText

protected boolean skipCoalescedText()
                             throws XMLStreamException
Coalescing mode is not (and will not be) implemented for non-blocking parsers, so this method should never get called.

Specified by:
skipCoalescedText in class XmlScanner
Returns:
True, if an unexpanded entity was encountered (and is now pending)
Throws:
XMLStreamException

handleAttrValue

protected boolean handleAttrValue()
                           throws XMLStreamException
Specified by:
handleAttrValue in class AsyncByteScanner
Returns:
True, if the whole value was read; false if only part (due to buffer ending)
Throws:
XMLStreamException

handleEntityInAttributeValue

protected int handleEntityInAttributeValue()
                                    throws XMLStreamException
Method called to handle entity encountered inside attribute value.

Returns:
Value of expanded character entity, if processed (which must be 1 or above); 0 for general entity, or -1 for "not enough input"
Throws:
XMLStreamException

handleNsDecl

protected boolean handleNsDecl()
                        throws XMLStreamException
Specified by:
handleNsDecl in class AsyncByteScanner
Throws:
XMLStreamException

handleDTDInternalSubset

protected final boolean handleDTDInternalSubset(boolean init)
                                         throws XMLStreamException
Specified by:
handleDTDInternalSubset in class AsyncByteScanner
Parameters:
init - Whether this is the first call (and state needs to be initialized) or not
Returns:
True if parsing was completed; false if not.
Throws:
XMLStreamException

parseCommentContents

protected final int parseCommentContents()
                                  throws XMLStreamException
Specified by:
parseCommentContents in class AsyncByteScanner
Throws:
XMLStreamException

handleCommentPending

protected final int handleCommentPending()
                                  throws XMLStreamException
Returns:
EVENT_INCOMPLETE, if there's not enough input to handle pending char, COMMENT, if we handled complete "-->" end marker, or 0 to indicate something else was successfully handled.
Throws:
XMLStreamException
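
The idea of a pending end marker can be sketched as a small state machine that remembers how many hyphens were seen when an input block ran out. This is a simplified illustration, not the scanner's implementation: the class, its members and the EVENT_INCOMPLETE value are assumptions, content is assumed to be ASCII, and the three-way result described above is collapsed into two outcomes.

```java
final class CommentEndSketch {
    static final int COMMENT = 5;             // same value as XMLStreamConstants.COMMENT
    static final int EVENT_INCOMPLETE = 257;  // placeholder value, assumed for this sketch

    private int pendingHyphens;  // 0, 1 ("-" seen) or 2 ("--" seen)
    private final StringBuilder text = new StringBuilder();

    /** Consumes one input block; returns COMMENT once "-->" is complete, else EVENT_INCOMPLETE. */
    int feed(byte[] buf, int off, int len) {
        for (int i = off, end = off + len; i < end; i++) {
            char c = (char) (buf[i] & 0xFF);  // ASCII-only content assumed
            if (pendingHyphens == 2) {
                if (c == '>') {
                    pendingHyphens = 0;
                    return COMMENT;
                }
                throw new IllegalStateException("\"--\" is not allowed inside a comment");
            }
            if (c == '-') {
                pendingHyphens++;  // may be the start of the end marker
                continue;
            }
            if (pendingHyphens == 1) {
                text.append('-');  // the lone hyphen was ordinary content after all
                pendingHyphens = 0;
            }
            text.append(c);
        }
        return EVENT_INCOMPLETE;  // block exhausted; any hyphens seen stay pending
    }

    String text() {
        return text.toString();
    }
}
```

Because the hyphen count survives between calls, the end marker is recognized even when "-", "-" and ">" arrive in different input blocks.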

parseCDataContents

protected final int parseCDataContents()
                                throws XMLStreamException
Specified by:
parseCDataContents in class AsyncByteScanner
Throws:
XMLStreamException

handleCDataPending

protected final int handleCDataPending()
                                throws XMLStreamException
Returns:
EVENT_INCOMPLETE, if there's not enough input to handle pending char, CDATA, if we handled complete "]]>" end marker, or 0 to indicate something else was successfully handled.
Throws:
XMLStreamException

parsePIData

protected final int parsePIData()
                         throws XMLStreamException
Specified by:
parsePIData in class AsyncByteScanner
Throws:
XMLStreamException

handlePIPending

protected final int handlePIPending()
                             throws XMLStreamException
Returns:
EVENT_INCOMPLETE, if there's not enough input to handle pending char, PROCESSING_INSTRUCTION, if we handled complete "?>" end marker, or 0 to indicate something else was successfully handled.
Throws:
XMLStreamException

decodeUtf8_2

protected final int decodeUtf8_2(int c)
                          throws XMLStreamException

Note: caller must guarantee enough data is available before calling the method

Throws:
XMLStreamException
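
For reference, two-byte UTF-8 decoding combines five payload bits from the lead byte with six from the continuation byte. The sketch below is a generic illustration with hypothetical names, not the scanner's code. Like the method above, it relies on the caller to have verified that the second byte is available.

```java
final class Utf8TwoByteSketch {
    /**
     * Combines lead byte c (110xxxxx) with the continuation byte at buf[ptr] (10xxxxxx).
     * The caller must have verified that ptr is within the buffered input.
     */
    static int decode2(int c, byte[] buf, int ptr) {
        int d = buf[ptr] & 0xFF;
        if ((d & 0xC0) != 0x80) {
            throw new IllegalArgumentException("Expected a UTF-8 continuation byte");
        }
        return ((c & 0x1F) << 6) | (d & 0x3F);
    }
}
```

For example, the byte pair 0xC3 0xA9 decodes to U+00E9.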

skipUtf8_2

protected final void skipUtf8_2(int c)
                         throws XMLStreamException
Throws:
XMLStreamException

decodeUtf8_3

protected final int decodeUtf8_3(int c1)
                          throws XMLStreamException

Note: caller must guarantee enough data is available before calling the method

Throws:
XMLStreamException

decodeUtf8_3

protected final int decodeUtf8_3(int c1,
                                 int c2,
                                 int c3)
                          throws XMLStreamException
Throws:
XMLStreamException

decodeUtf8_4

protected final int decodeUtf8_4(int c)
                          throws XMLStreamException
Throws:
XMLStreamException

decodeUtf8_4

protected final int decodeUtf8_4(int c1,
                                 int c2,
                                 int c3,
                                 int c4)
                          throws XMLStreamException
Returns:
Character value minus 0x10000, so that the caller can readily expand it into the actual surrogate pair
Throws:
XMLStreamException
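
The reason for returning the value minus 0x10000 is that the result splits directly into a surrogate pair: the upper 10 bits go into the high surrogate and the lower 10 bits into the low surrogate. A minimal sketch of that arithmetic follows; the class and method names are hypothetical, and validation of the continuation bytes is omitted.

```java
final class SurrogateSketch {
    /** Decodes four UTF-8 bytes and returns the code point minus 0x10000. */
    static int decode4(int c1, int c2, int c3, int c4) {
        int cp = ((c1 & 0x07) << 18)
               | ((c2 & 0x3F) << 12)
               | ((c3 & 0x3F) << 6)
               |  (c4 & 0x3F);
        return cp - 0x10000;
    }

    /** Expands a (code point - 0x10000) value into a UTF-16 surrogate pair. */
    static char[] toSurrogates(int offset) {
        return new char[] {
            (char) (0xD800 | (offset >> 10)),   // high surrogate: upper 10 bits
            (char) (0xDC00 | (offset & 0x3FF))  // low surrogate: lower 10 bits
        };
    }
}
```

For the byte sequence 0xF0 0x9F 0x98 0x80 (U+1F600), decode4 returns 0xF600 and toSurrogates yields the pair 0xD83D, 0xDE00, which matches Character.toChars(0x1F600).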

addPName

protected final PName addPName(int hash,
                               int[] quads,
                               int qlen,
                               int lastQuadBytes)
                        throws XMLStreamException
Specified by:
addPName in class AsyncByteScanner
Throws:
XMLStreamException

reportInvalidInitial

protected void reportInvalidInitial(int mask)
                             throws XMLStreamException
Overrides:
reportInvalidInitial in class ByteBasedScanner
Throws:
XMLStreamException

reportInvalidOther

protected void reportInvalidOther(int mask)
                           throws XMLStreamException
Overrides:
reportInvalidOther in class ByteBasedScanner
Throws:
XMLStreamException

reportInvalidOther

protected void reportInvalidOther(int mask,
                                  int ptr)
                           throws XMLStreamException
Throws:
XMLStreamException


Copyright © 2012 Fasterxml.com. All Rights Reserved.