java.lang.Object
com.fasterxml.aalto.in.XmlScanner
com.fasterxml.aalto.in.ByteBasedScanner
com.fasterxml.aalto.async.AsyncByteScanner
public abstract class AsyncByteScanner extends ByteBasedScanner
Base class for asynchronous (non-blocking) XML scanners. Because the asynchronous approach is inherently complex, a character-based variant does not make much sense, so only byte-based input is supported.
Field Summary | |
---|---|
protected int | _currQuad: Bytes parsed for the current, incomplete quad. |
protected int | _currQuadBytes: Number of bytes pending/buffered, stored in _currQuad. |
protected boolean | _elemAllNsBound |
protected boolean | _elemAttrCount |
protected PName | _elemAttrName |
protected int | _elemAttrPtr: Pointer to the next character of the attribute value currently being parsed, within the attribute value buffer. |
protected byte | _elemAttrQuote |
protected int | _elemNsPtr: Pointer to the next character of the namespace URI currently being parsed for the current namespace declaration. |
protected boolean | _endOfInput: Flag that is set when the calling application indicates that there will be no more input to parse. |
protected int | _entityValue: Entity value accumulated so far. |
protected byte[] | _inputBuffer: This buffer is actually provided by the caller. |
protected int | _nextEvent: Due to the asynchronous nature of parsing, we may know what event we are trying to parse even if it is not yet complete. |
protected int | _origBufferLen: In addition to the current buffer pointer and the end pointer, we also need to know the number of bytes originally contained. |
protected int | _pendingInput: There are some multi-byte combinations that must be handled as a unit: CR+LF linefeeds, multi-byte UTF-8 characters, and multi-character end markers for comments and PIs. |
protected int | _quadCount: Number of complete quads parsed for current name (quads themselves are stored in ByteBasedScanner._quadBuffer). |
protected int | _state: In addition to the event type, additional state information is needed. |
protected int | _surroundingEvent: For token/state combinations that are 'shared' between events (or embedded in them), this is where the surrounding event state is retained. |
Fields inherited from class com.fasterxml.aalto.in.ByteBasedScanner |
---|
_charTypes, _inputEnd, _inputPtr, _pastBytes, _quadBuffer, _rowStartOffset, _symbols, _tmpChar, BYTE_a, BYTE_A, BYTE_AMP, BYTE_APOS, BYTE_C, BYTE_CR, BYTE_D, BYTE_EQ, BYTE_EXCL, BYTE_g, BYTE_GT, BYTE_HASH, BYTE_HYPHEN, BYTE_l, BYTE_LBRACKET, BYTE_LF, BYTE_LT, BYTE_m, BYTE_NULL, BYTE_o, BYTE_p, BYTE_P, BYTE_q, BYTE_QMARK, BYTE_QUOT, BYTE_RBRACKET, BYTE_s, BYTE_S, BYTE_SEMICOLON, BYTE_SLASH, BYTE_SPACE, BYTE_t, BYTE_T, BYTE_TAB, BYTE_u, BYTE_x |
Fields inherited from class com.fasterxml.aalto.in.XmlScanner |
---|
_attrCollector, _attrCount, _cfgCoalescing, _cfgLazyParsing, _config, _currElem, _currNsCount, _currRow, _currToken, _defaultNs, _depth, _entityPending, _isEmptyTag, _lastNsContext, _lastNsDecl, _nameBuffer, _nsBindingCache, _nsBindingCount, _nsBindings, _nsBindMisses, _publicId, _systemId, _textBuilder, _tokenIncomplete, _tokenName, _xml11, CDATA_STR, INT_0, INT_9, INT_a, INT_A, INT_AMP, INT_APOS, INT_COLON, INT_CR, INT_EQ, INT_EXCL, INT_f, INT_F, INT_GT, INT_HYPHEN, INT_LBRACKET, INT_LF, INT_LT, INT_NULL, INT_QMARK, INT_QUOTE, INT_RBRACKET, INT_SLASH, INT_SPACE, INT_TAB, INT_z, MAX_UNICODE_CHAR, TOKEN_EOI |
Fields inherited from interface com.fasterxml.aalto.util.XmlConsts |
---|
CHAR_CR, CHAR_LF, CHAR_NULL, CHAR_SPACE, STAX_DEFAULT_OUTPUT_ENCODING, STAX_DEFAULT_OUTPUT_VERSION, XML_DECL_KW_ENCODING, XML_DECL_KW_STANDALONE, XML_DECL_KW_VERSION, XML_SA_NO, XML_SA_YES, XML_V_10, XML_V_10_STR, XML_V_11, XML_V_11_STR, XML_V_UNKNOWN |
Fields inherited from interface javax.xml.stream.XMLStreamConstants |
---|
ATTRIBUTE, CDATA, CHARACTERS, COMMENT, DTD, END_DOCUMENT, END_ELEMENT, ENTITY_DECLARATION, ENTITY_REFERENCE, NAMESPACE, NOTATION_DECLARATION, PROCESSING_INSTRUCTION, SPACE, START_DOCUMENT, START_ELEMENT |
Constructor Summary | |
---|---|
AsyncByteScanner(ReaderConfig cfg) |
Method Summary | |
---|---|
protected void | _closeSource(): Since the async scanner has no access to whatever supplies the content, there is no input source in the same sense as with a blocking scanner, and there is nothing to close. |
protected abstract PName | addPName(int hash, int[] quads, int qlen, int lastQuadBytes) |
protected int | decodeCharForError(byte b): Method called when encountering a byte that cannot be part of a valid character in the current context. |
protected boolean | decodeDecEntity() |
protected int | decodeGeneralEntity(PName entityName): Method that verifies that the given named entity is followed by a semicolon (meaning the next byte must be available for reading) and, if so, whether it is one of the pre-defined general entities. |
protected boolean | decodeHexEntity() |
void | endOfInput() |
void | feedInput(byte[] buf, int start, int len) |
protected void | finishCData() |
protected abstract void | finishCharacters() |
protected abstract int | finishCharactersCoalescing() |
protected void | finishComment() |
protected void | finishDTD(boolean copyContents) |
protected void | finishPI() |
protected void | finishSpace() |
protected abstract boolean | handleAttrValue() |
protected abstract boolean | handleDTDInternalSubset(boolean init) |
protected int | handleEntityStartingToken(): Method called when a new token (within tree) starts with an entity. |
protected int | handleNamedEntityStartingToken(): Method called when we see an entity that is starting a new token, and part of its name has been decoded (but not all). |
protected abstract boolean | handleNsDecl() |
protected int | handleNumericEntityStartingToken(): Method called to handle cases where we find something other than a character entity (or one of 4 pre-defined general entities that act like character entities). |
protected boolean | handlePartialCR(): Method called when there is a pending \r (from past buffer), and we need to see |
protected int | handleStartElement() |
protected int | handleStartElementStart(byte b): Method called when '<' and (what appears to be) a name start character have been seen. |
protected boolean | loadMore() |
boolean | needMoreInput() |
int | nextFromProlog(boolean isProlog) |
int | nextFromTree() |
protected abstract int | parseCDataContents() |
protected abstract int | parseCommentContents() |
protected PName | parseEntityName() |
protected PName | parseNewEntityName(byte b) |
protected PName | parseNewName(byte b) |
protected abstract int | parsePIData() |
protected PName | parsePName(): This method can (for now?) be shared between all ASCII-based encodings, since it only does coarse validity checking; real checks are done in a different method. |
protected void | skipCData() |
protected abstract boolean | skipCharacters() |
protected void | skipComment() |
protected void | skipPI() |
protected void | skipSpace() |
protected abstract int | startCharacters(byte b): Method called to initialize state for CHARACTERS event, after just a single byte has been seen. |
protected abstract int | startCharactersPending(): This method gets called if the first character of a CHARACTERS event could not be fully read (multi-byte, split over buffer boundary). |
protected int | throwInternal() |
String | toString() |
protected void | verifyAndAppendEntityCharacter(int charFromEntity): Method called to verify validity of given character (from entity) and append it to the text buffer. |
Methods inherited from class com.fasterxml.aalto.in.ByteBasedScanner |
---|
_releaseBuffers, addUtfPName, getCurrentColumnNr, getCurrentLineNr, getCurrentLocation, markLF, markLF, reportInvalidInitial, reportInvalidOther |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
protected byte[] _inputBuffer
protected int _origBufferLen
protected int _nextEvent
protected int _state
protected int _surroundingEvent
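protected int _pendingInput
There are some multi-byte combinations that must be handled as a unit: CR+LF linefeeds, multi-byte UTF-8 characters, and multi-character end markers for comments and PIs. When such a combination is only partially available, this int stores the pending byte(s) in little-endian order (that is, the first pending byte is at 0x000000FF, the second [if any] at 0x0000FF00, and the third at 0x00FF0000). This can be (and is) used to figure out the actual number of bytes pending, for multi-byte (UTF-8) character decoding.
Note: a value of 0 is assumed to mean that there is no pending data. Thus, if a 0 byte needs to be stored as pending, it has to be masked.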
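The little-endian layout described for _pendingInput can be sketched as follows. This is a minimal illustration under stated assumptions, not Aalto's implementation: the class PendingBytes and its methods push, count and byteAt are hypothetical names, and the sketch relies on every stored byte being non-zero (true for CR and for all bytes of a multi-byte UTF-8 sequence), which is the same limitation the note about masking 0 refers to.

```java
/** Hypothetical helper (not part of Aalto) showing the byte layout described for _pendingInput. */
final class PendingBytes {
    private PendingBytes() {}

    /** Stores one more pending byte above those already held; the layout has room for three. */
    static int push(int pending, byte b) {
        int count = count(pending);
        if (count >= 3) {
            throw new IllegalStateException("this sketch holds at most 3 pending bytes");
        }
        // First byte lands at 0x000000FF, second at 0x0000FF00, third at 0x00FF0000.
        return pending | ((b & 0xFF) << (8 * count));
    }

    /** Number of pending bytes, derived from the highest occupied position (assumes non-zero bytes). */
    static int count(int pending) {
        if (pending == 0) return 0;
        if ((pending & 0x00FF0000) != 0) return 3;
        if ((pending & 0x0000FF00) != 0) return 2;
        return 1;
    }

    /** Returns the pending byte at the given position (0 = first seen) as an unsigned value. */
    static int byteAt(int pending, int index) {
        return (pending >>> (8 * index)) & 0xFF;
    }

    public static void main(String[] args) {
        // First two bytes of the UTF-8 encoding of U+20AC (E2 82 AC), third byte not yet available.
        int pending = push(0, (byte) 0xE2);
        pending = push(pending, (byte) 0x82);
        System.out.println(Integer.toHexString(pending) + " holds " + count(pending) + " byte(s)");
    }
}
```

Because the count is recovered from the value itself, no separate counter is needed, at the cost of being unable to represent a pending zero byte without masking.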
protected boolean _endOfInput
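protected int _quadCount
Number of complete quads parsed for current name (quads themselves are stored in ByteBasedScanner._quadBuffer).
protected int _currQuad
Bytes parsed for the current, incomplete quad.
protected int _currQuadBytes
Number of bytes pending/buffered, stored in _currQuad.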
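The interplay of _quadCount, _currQuad and _currQuadBytes can be pictured with a small sketch. It assumes that a quad packs four name bytes into one int with earlier bytes in the higher positions; the documentation only says that a quad holds buffered bytes and that complete quads go to a quad buffer, so the byte order is an assumption. The class QuadAccumulator and its accessors are hypothetical names, not Aalto's code.

```java
import java.util.Arrays;

/** Hypothetical sketch (not part of Aalto) of collecting name bytes into 4-byte quads. */
final class QuadAccumulator {
    private int[] quadBuffer = new int[8]; // complete quads
    private int quadCount;                 // number of complete quads
    private int currQuad;                  // bytes of the current, incomplete quad
    private int currQuadBytes;             // how many bytes currQuad holds (0..3)

    /** Adds one name byte; state survives between calls, so input may arrive in pieces. */
    void addByte(byte b) {
        // Assumed order: shift earlier bytes up, put the new byte in the lowest position.
        currQuad = (currQuad << 8) | (b & 0xFF);
        if (++currQuadBytes == 4) {
            if (quadCount == quadBuffer.length) {
                quadBuffer = Arrays.copyOf(quadBuffer, quadBuffer.length * 2);
            }
            quadBuffer[quadCount++] = currQuad;
            currQuad = 0;
            currQuadBytes = 0;
        }
    }

    int quadCount() { return quadCount; }
    int quadAt(int index) { return quadBuffer[index]; }
    int currQuad() { return currQuad; }
    int currQuadBytes() { return currQuadBytes; }

    public static void main(String[] args) {
        QuadAccumulator acc = new QuadAccumulator();
        for (byte b : new byte[] {'t', 'i', 't', 'l', 'e'}) {
            acc.addByte(b);
        }
        System.out.println(acc.quadCount() + " complete quad(s), "
                + acc.currQuadBytes() + " byte(s) in the current quad");
    }
}
```

Keeping the partial quad and its byte count in fields is what lets name parsing stop at the end of one input buffer and resume with the next.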
protected int _entityValue
protected boolean _elemAllNsBound
protected boolean _elemAttrCount
protected byte _elemAttrQuote
protected PName _elemAttrName
protected int _elemAttrPtr
protected int _elemNsPtr
Constructor Detail |
---|
public AsyncByteScanner(ReaderConfig cfg)
Method Detail |
---|
public String toString()
Overrides: toString in class Object
protected abstract int parseCommentContents() throws XMLStreamException
Throws: XMLStreamException
protected abstract int parseCDataContents() throws XMLStreamException
Throws: XMLStreamException
protected abstract int parsePIData() throws XMLStreamException
Throws: XMLStreamException
protected abstract int startCharactersPending() throws XMLStreamException
Throws: XMLStreamException
protected abstract int finishCharactersCoalescing() throws XMLStreamException
Throws: XMLStreamException
public final boolean needMoreInput()
public void feedInput(byte[] buf, int start, int len) throws XMLStreamException
Throws: XMLStreamException
public void endOfInput()
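The three methods above form the caller-facing side of non-blocking input: the caller hands over a buffer, the scanner consumes it, and the caller eventually signals that no more input will come. The sketch below shows that handshake with a toy class. It is not Aalto's implementation; the class ToyAsyncFeeder, the next() method, the NEED_MORE and END constants, and the rule that feeding is rejected while unconsumed input remains are all assumptions made for illustration.

```java
/** Hypothetical toy (not part of Aalto) that mimics a feedInput / needMoreInput / endOfInput handshake. */
final class ToyAsyncFeeder {
    static final int NEED_MORE = -1; // assumed marker: out of data, more may still arrive
    static final int END = -2;       // assumed marker: out of data and endOfInput() was called

    private byte[] inputBuffer = new byte[0]; // provided by the caller, not copied
    private int inputPtr;
    private int inputEnd;
    private boolean endOfInput;

    /** True when all fed bytes are consumed and the caller has not declared end of input. */
    boolean needMoreInput() {
        return inputPtr >= inputEnd && !endOfInput;
    }

    /** Hands the next chunk over; this toy rejects a new chunk while the old one is unread. */
    void feedInput(byte[] buf, int start, int len) {
        if (inputPtr < inputEnd) {
            throw new IllegalStateException("previous input not yet consumed");
        }
        if (endOfInput) {
            throw new IllegalStateException("endOfInput() already called");
        }
        inputBuffer = buf;
        inputPtr = start;
        inputEnd = start + len;
    }

    void endOfInput() {
        endOfInput = true;
    }

    /** Returns the next byte (0..255), or NEED_MORE / END instead of blocking. */
    int next() {
        if (inputPtr < inputEnd) {
            return inputBuffer[inputPtr++] & 0xFF;
        }
        return endOfInput ? END : NEED_MORE;
    }

    public static void main(String[] args) {
        ToyAsyncFeeder feeder = new ToyAsyncFeeder();
        feeder.feedInput(new byte[] {'<', 'a', '/', '>'}, 0, 4);
        while (!feeder.needMoreInput()) {
            System.out.print((char) feeder.next());
        }
        System.out.println();
        feeder.endOfInput();
    }
}
```

The point of the design is that a read never waits: running out of bytes is reported as a value, and the caller decides when to supply the next chunk.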
protected void _closeSource() throws IOException
Overrides: _closeSource in class ByteBasedScanner
Throws: IOException
public final int nextFromProlog(boolean isProlog) throws XMLStreamException
Overrides: nextFromProlog in class XmlScanner
Throws: XMLStreamException
public int nextFromTree() throws XMLStreamException
Overrides: nextFromTree in class XmlScanner
Throws: XMLStreamException
protected abstract boolean handleDTDInternalSubset(boolean init) throws XMLStreamException
Parameters: init - Whether this is the first call (and state needs to be initialized) or not
Throws: XMLStreamException
protected abstract int startCharacters(byte b) throws XMLStreamException
Method called to initialize state for CHARACTERS event, after just a single byte has been seen. Once XMLStreamReader.next() returns, no blocking can occur when calling other methods.
Throws: XMLStreamException
protected int handleEntityStartingToken() throws XMLStreamException
Throws: XMLStreamException
protected int handleNamedEntityStartingToken() throws XMLStreamException
Throws: XMLStreamException
protected int handleNumericEntityStartingToken() throws XMLStreamException
Throws: XMLStreamException
protected final boolean decodeHexEntity() throws XMLStreamException
Returns: true if the entity was decoded and its value stored in _entityValue; false otherwise
Throws: XMLStreamException
protected final boolean decodeDecEntity() throws XMLStreamException
Returns: true if the entity was decoded and its value stored in _entityValue; false otherwise
Throws: XMLStreamException
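Since _entityValue is documented as the entity value accumulated so far, numeric entity decoding can evidently pause when a buffer runs out and resume later. The sketch below shows one way such resumable decoding could work for the hexadecimal case. It is an illustration only: the class HexEntityDecoder, its decode method and its return convention (index past the semicolon, or -1 when the chunk ends first) are hypothetical and do not mirror the signatures of decodeHexEntity().

```java
/** Hypothetical sketch (not part of Aalto) of hex character-entity decoding that can resume across chunks. */
final class HexEntityDecoder {
    private int entityValue; // value accumulated so far; survives between calls

    /**
     * Consumes hex digits from buf[start..end), assuming the leading "&#x" was already handled.
     * Returns the index just past ';' once the entity is complete, or -1 if the chunk
     * ended first, in which case the caller retries with the next chunk.
     */
    int decode(byte[] buf, int start, int end) {
        for (int i = start; i < end; i++) {
            int c = buf[i] & 0xFF;
            if (c == ';') {
                return i + 1;
            }
            int digit = Character.digit((char) c, 16);
            if (digit < 0) {
                throw new IllegalArgumentException("not a hex digit: 0x" + Integer.toHexString(c));
            }
            entityValue = (entityValue << 4) | digit;
            if (entityValue > 0x10FFFF) { // beyond the Unicode range; also keeps the int from overflowing
                throw new IllegalArgumentException("value beyond the Unicode range");
            }
        }
        return -1;
    }

    int value() {
        return entityValue;
    }

    public static void main(String[] args) {
        HexEntityDecoder decoder = new HexEntityDecoder();
        // "&#x20AC;" with the digits split over two chunks: "20" and "AC;".
        int first = decoder.decode(new byte[] {'2', '0'}, 0, 2);
        int second = decoder.decode(new byte[] {'A', 'C', ';'}, 0, 3);
        System.out.println(first + " " + second + " 0x" + Integer.toHexString(decoder.value()));
    }
}
```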
protected final int decodeGeneralEntity(PName entityName) throws XMLStreamException
Throws: XMLStreamException
protected int handleStartElementStart(byte b) throws XMLStreamException
Throws: XMLStreamException
protected int handleStartElement() throws XMLStreamException
Throws: XMLStreamException
protected abstract boolean handleAttrValue() throws XMLStreamException
Throws: XMLStreamException
protected abstract boolean handleNsDecl() throws XMLStreamException
Throws: XMLStreamException
protected abstract void finishCharacters() throws XMLStreamException
Overrides: finishCharacters in class XmlScanner
Throws: XMLStreamException
protected void finishCData() throws XMLStreamException
Overrides: finishCData in class XmlScanner
Throws: XMLStreamException
protected void finishComment() throws XMLStreamException
Overrides: finishComment in class XmlScanner
Throws: XMLStreamException
protected void finishDTD(boolean copyContents) throws XMLStreamException
Overrides: finishDTD in class XmlScanner
Throws: XMLStreamException
protected void finishPI() throws XMLStreamException
Overrides: finishPI in class XmlScanner
Throws: XMLStreamException
protected void finishSpace() throws XMLStreamException
Overrides: finishSpace in class XmlScanner
Throws: XMLStreamException
protected abstract boolean skipCharacters() throws XMLStreamException
Overrides: skipCharacters in class XmlScanner
Throws: XMLStreamException
protected void skipCData() throws XMLStreamException
Overrides: skipCData in class XmlScanner
Throws: XMLStreamException
protected void skipComment() throws XMLStreamException
Overrides: skipComment in class XmlScanner
Throws: XMLStreamException
protected void skipPI() throws XMLStreamException
Overrides: skipPI in class XmlScanner
Throws: XMLStreamException
protected void skipSpace() throws XMLStreamException
Overrides: skipSpace in class XmlScanner
Throws: XMLStreamException
protected boolean loadMore() throws XMLStreamException
Overrides: loadMore in class XmlScanner
Throws: XMLStreamException
protected PName parseNewName(byte b) throws XMLStreamException
Throws: XMLStreamException
protected PName parsePName() throws XMLStreamException
Some notes about the assumptions the implementation makes:
Throws: XMLStreamException
protected final PName parseNewEntityName(byte b) throws XMLStreamException
Throws: XMLStreamException
protected final PName parseEntityName() throws XMLStreamException
Throws: XMLStreamException
protected abstract PName addPName(int hash, int[] quads, int qlen, int lastQuadBytes) throws XMLStreamException
Overrides: addPName in class ByteBasedScanner
Throws: XMLStreamException
protected void verifyAndAppendEntityCharacter(int charFromEntity) throws XMLStreamException
Throws: XMLStreamException
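The summary for verifyAndAppendEntityCharacter says it verifies a character that came from an entity and appends it to the text buffer. A minimal sketch of such a check follows, assuming XML 1.0 rules (the Char production of the XML 1.0 specification) and a StringBuilder as the text buffer. The class EntityCharAppender and its methods are hypothetical, the real method may differ (for example for XML 1.1 documents), and the value 0x10FFFF given to MAX_UNICODE_CHAR here is simply the highest Unicode code point.

```java
/** Hypothetical sketch (not part of Aalto) of validating an entity-supplied character and appending it. */
final class EntityCharAppender {
    static final int MAX_UNICODE_CHAR = 0x10FFFF; // highest Unicode code point

    private EntityCharAppender() {}

    /** XML 1.0 Char: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]. */
    static boolean isXml10Char(int c) {
        if (c < 0x20) {
            return c == 0x9 || c == 0xA || c == 0xD;
        }
        if (c <= 0xD7FF) {
            return true;
        }
        if (c < 0xE000) {
            return false; // surrogate code points are not characters on their own
        }
        if (c <= 0xFFFD) {
            return true;
        }
        return c >= 0x10000 && c <= MAX_UNICODE_CHAR;
    }

    /** Appends the character, as a surrogate pair when it lies above U+FFFF. */
    static void verifyAndAppend(StringBuilder text, int charFromEntity) {
        if (!isXml10Char(charFromEntity)) {
            throw new IllegalArgumentException(
                    "illegal XML character 0x" + Integer.toHexString(charFromEntity));
        }
        text.appendCodePoint(charFromEntity);
    }

    public static void main(String[] args) {
        StringBuilder text = new StringBuilder();
        verifyAndAppend(text, 0x20AC);  // one UTF-16 unit
        verifyAndAppend(text, 0x1F600); // two UTF-16 units
        System.out.println(text.length() + " UTF-16 units appended");
    }
}
```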
protected final boolean handlePartialCR()
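A pending \r left over from a past buffer is one of the multi-byte combinations that must be handled as a unit: until the next byte arrives, it is unknown whether the \r stands alone or starts a CR+LF pair. The sketch below shows one plausible way to carry that state across buffers while counting rows. It is not Aalto's code; the class LinefeedTracker, its pendingCR flag and its row counter are hypothetical, and what handlePartialCR() actually does beyond its summary is not stated in this page.

```java
/** Hypothetical sketch (not part of Aalto) of tracking a CR+LF pair split across two input chunks. */
final class LinefeedTracker {
    private boolean pendingCR; // a '\r' was the last byte of the previous chunk
    private int rows;          // linefeeds seen so far; CR, LF and CR+LF each count once

    void feed(byte[] buf, int start, int len) {
        int i = start;
        int end = start + len;
        if (pendingCR && i < end) {
            pendingCR = false;
            if (buf[i] == '\n') {
                i++; // this LF completes the CR+LF pair that was already counted
            }
        }
        while (i < end) {
            byte b = buf[i++];
            if (b == '\n') {
                rows++;
            } else if (b == '\r') {
                rows++;
                if (i == end) {
                    pendingCR = true; // cannot tell yet whether an LF follows
                } else if (buf[i] == '\n') {
                    i++; // CR+LF within one chunk counts as a single linefeed
                }
            }
        }
    }

    int rows() {
        return rows;
    }

    public static void main(String[] args) {
        LinefeedTracker tracker = new LinefeedTracker();
        tracker.feed(new byte[] {'a', '\r'}, 0, 2);       // chunk ends in the middle of CR+LF
        tracker.feed(new byte[] {'\n', 'b', '\n'}, 0, 3); // LF completes the pair, then one more LF
        System.out.println(tracker.rows() + " row(s) counted");
    }
}
```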
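protected int decodeCharForError(byte b) throws XMLStreamException
Description copied from class: ByteBasedScanner
Method called when encountering a byte that cannot be part of a valid character in the current context.
Overrides: decodeCharForError in class ByteBasedScanner
Throws: XMLStreamException
protected int throwInternal()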