com.fasterxml.aalto.in
Class ByteBasedScanner

java.lang.Object
  extended by com.fasterxml.aalto.in.XmlScanner
      extended by com.fasterxml.aalto.in.ByteBasedScanner
All Implemented Interfaces:
XmlConsts, NamespaceContext, XMLStreamConstants
Direct Known Subclasses:
AsyncByteScanner, StreamScanner

public abstract class ByteBasedScanner
extends XmlScanner

Intermediate base class used by different byte-backed scanners. Specifically, it serves as the base for both blocking (stream) and non-blocking (async) byte-based scanners, as opposed to Reader-backed, i.e. character-based, scanners.


Field Summary
protected  XmlCharTypes _charTypes
          This is a simple container object that is used to access the decoding tables for characters.
protected  int _inputEnd
          Pointer to the first byte after the end of valid content.
protected  int _inputPtr
          Pointer to the next unread byte in the input buffer.
protected  int _pastBytes
          Number of bytes that were read and processed before the contents of the current buffer; used for calculating absolute offsets.
protected  int[] _quadBuffer
          This buffer is used for name parsing.
protected  int _rowStartOffset
          Offset used to calculate the column value given current input buffer pointer.
protected  ByteBasedPNameTable _symbols
          For now, symbol table contains prefixed names.
protected  int _tmpChar
          Storage location for a single character that cannot be easily pushed back (for example, multi-byte char; or char entity expansion).
protected static byte BYTE_a
           
protected static byte BYTE_A
           
protected static byte BYTE_AMP
           
protected static byte BYTE_APOS
           
protected static byte BYTE_C
           
protected static byte BYTE_CR
           
protected static byte BYTE_D
           
protected static byte BYTE_EQ
           
protected static byte BYTE_EXCL
           
protected static byte BYTE_g
           
protected static byte BYTE_GT
           
protected static byte BYTE_HASH
           
protected static byte BYTE_HYPHEN
           
protected static byte BYTE_l
           
protected static byte BYTE_LBRACKET
           
protected static byte BYTE_LF
           
protected static byte BYTE_LT
           
protected static byte BYTE_m
           
protected static byte BYTE_NULL
           
protected static byte BYTE_o
           
protected static byte BYTE_p
           
protected static byte BYTE_P
           
protected static byte BYTE_q
           
protected static byte BYTE_QMARK
           
protected static byte BYTE_QUOT
           
protected static byte BYTE_RBRACKET
           
protected static byte BYTE_s
           
protected static byte BYTE_S
           
protected static byte BYTE_SEMICOLON
           
protected static byte BYTE_SLASH
           
protected static byte BYTE_SPACE
           
protected static byte BYTE_t
           
protected static byte BYTE_T
           
protected static byte BYTE_TAB
           
protected static byte BYTE_u
           
protected static byte BYTE_x
           
 
Fields inherited from class com.fasterxml.aalto.in.XmlScanner
_attrCollector, _attrCount, _cfgCoalescing, _cfgLazyParsing, _config, _currElem, _currNsCount, _currRow, _currToken, _defaultNs, _depth, _entityPending, _isEmptyTag, _lastNsContext, _lastNsDecl, _nameBuffer, _nsBindingCache, _nsBindingCount, _nsBindings, _nsBindMisses, _publicId, _systemId, _textBuilder, _tokenIncomplete, _tokenName, _xml11, CDATA_STR, INT_0, INT_9, INT_a, INT_A, INT_AMP, INT_APOS, INT_COLON, INT_CR, INT_EQ, INT_EXCL, INT_f, INT_F, INT_GT, INT_HYPHEN, INT_LBRACKET, INT_LF, INT_LT, INT_NULL, INT_QMARK, INT_QUOTE, INT_RBRACKET, INT_SLASH, INT_SPACE, INT_TAB, INT_z, MAX_UNICODE_CHAR, TOKEN_EOI
 
Fields inherited from interface com.fasterxml.aalto.util.XmlConsts
CHAR_CR, CHAR_LF, CHAR_NULL, CHAR_SPACE, STAX_DEFAULT_OUTPUT_ENCODING, STAX_DEFAULT_OUTPUT_VERSION, XML_DECL_KW_ENCODING, XML_DECL_KW_STANDALONE, XML_DECL_KW_VERSION, XML_SA_NO, XML_SA_YES, XML_V_10, XML_V_10_STR, XML_V_11, XML_V_11_STR, XML_V_UNKNOWN
 
Fields inherited from interface javax.xml.stream.XMLStreamConstants
ATTRIBUTE, CDATA, CHARACTERS, COMMENT, DTD, END_DOCUMENT, END_ELEMENT, ENTITY_DECLARATION, ENTITY_REFERENCE, NAMESPACE, NOTATION_DECLARATION, PROCESSING_INSTRUCTION, SPACE, START_DOCUMENT, START_ELEMENT
 
Constructor Summary
protected ByteBasedScanner(ReaderConfig cfg)
           
 
Method Summary
protected abstract  void _closeSource()
           
protected  void _releaseBuffers()
           
protected abstract  PName addPName(int hash, int[] quads, int qlen, int lastQuadBytes)
           
protected  PName addUtfPName(XmlCharTypes charTypes, int hash, int[] quads, int qlen, int lastQuadBytes)
          Conceptually, this method really does NOT belong here.
protected abstract  int decodeCharForError(byte b)
          Method called when a byte is encountered that cannot be part of a valid character in the current context.
 int getCurrentColumnNr()
           
 int getCurrentLineNr()
           
 org.codehaus.stax2.XMLStreamLocation2 getCurrentLocation()
           
protected  void markLF()
           
protected  void markLF(int offset)
           
protected  void reportInvalidInitial(int mask)
           
protected  void reportInvalidOther(int mask)
           
 
Methods inherited from class com.fasterxml.aalto.in.XmlScanner
bindName, bindNs, checkImmutableBinding, close, decodeAttrBinaryValue, decodeAttrValue, decodeAttrValues, decodeElements, findAttrIndex, findOrCreateBinding, finishCData, finishCharacters, finishComment, finishDTD, finishPI, finishSpace, finishToken, fireSaxCharacterEvents, fireSaxCommentEvent, fireSaxEndElement, fireSaxPIEvent, fireSaxSpaceEvents, fireSaxStartElement, getAttrCollector, getAttrCount, getAttrLocalName, getAttrNsURI, getAttrPrefix, getAttrPrefixedName, getAttrQName, getAttrType, getAttrValue, getAttrValue, getConfig, getDepth, getDTDPublicId, getDTDSystemId, getEndLocation, getInputPublicId, getInputSystemId, getName, getNamespacePrefix, getNamespaceURI, getNamespaceURI, getNamespaceURI, getNonTransientNamespaceContext, getNsCount, getPrefix, getPrefixes, getQName, getStartLocation, getText, getText, getTextCharacters, getTextCharacters, getTextLength, hasEmptyStack, isAttrSpecified, isEmptyTag, isTextWhitespace, loadMore, loadMoreGuaranteed, loadMoreGuaranteed, nextFromProlog, nextFromTree, reportDoubleHyphenInComments, reportDuplicateNsDecl, reportEntityOverflow, reportEofInName, reportIllegalCDataEnd, reportIllegalNsDecl, reportIllegalNsDecl, reportInputProblem, reportInvalidNameChar, reportInvalidNsIndex, reportInvalidXmlChar, reportMissingPISpace, reportMultipleColonsInName, reportPrologProblem, reportPrologUnexpChar, reportTreeUnexpChar, reportUnboundPrefix, reportUnexpandedEntityInAttr, reportUnexpectedEndTag, resetForDecoding, skipCData, skipCharacters, skipCoalescedText, skipComment, skipPI, skipSpace, skipToken, throwInvalidSpace, throwInvalidXmlChar, throwNullChar, throwUnexpectedChar, verifyXmlChar
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

BYTE_NULL

protected static final byte BYTE_NULL
See Also:
Constant Field Values

BYTE_SPACE

protected static final byte BYTE_SPACE
See Also:
Constant Field Values

BYTE_LF

protected static final byte BYTE_LF
See Also:
Constant Field Values

BYTE_CR

protected static final byte BYTE_CR
See Also:
Constant Field Values

BYTE_TAB

protected static final byte BYTE_TAB
See Also:
Constant Field Values

BYTE_LT

protected static final byte BYTE_LT
See Also:
Constant Field Values

BYTE_GT

protected static final byte BYTE_GT
See Also:
Constant Field Values

BYTE_AMP

protected static final byte BYTE_AMP
See Also:
Constant Field Values

BYTE_HASH

protected static final byte BYTE_HASH
See Also:
Constant Field Values

BYTE_EXCL

protected static final byte BYTE_EXCL
See Also:
Constant Field Values

BYTE_HYPHEN

protected static final byte BYTE_HYPHEN
See Also:
Constant Field Values

BYTE_QMARK

protected static final byte BYTE_QMARK
See Also:
Constant Field Values

BYTE_SLASH

protected static final byte BYTE_SLASH
See Also:
Constant Field Values

BYTE_EQ

protected static final byte BYTE_EQ
See Also:
Constant Field Values

BYTE_QUOT

protected static final byte BYTE_QUOT
See Also:
Constant Field Values

BYTE_APOS

protected static final byte BYTE_APOS
See Also:
Constant Field Values

BYTE_LBRACKET

protected static final byte BYTE_LBRACKET
See Also:
Constant Field Values

BYTE_RBRACKET

protected static final byte BYTE_RBRACKET
See Also:
Constant Field Values

BYTE_SEMICOLON

protected static final byte BYTE_SEMICOLON
See Also:
Constant Field Values

BYTE_a

protected static final byte BYTE_a
See Also:
Constant Field Values

BYTE_g

protected static final byte BYTE_g
See Also:
Constant Field Values

BYTE_l

protected static final byte BYTE_l
See Also:
Constant Field Values

BYTE_m

protected static final byte BYTE_m
See Also:
Constant Field Values

BYTE_o

protected static final byte BYTE_o
See Also:
Constant Field Values

BYTE_p

protected static final byte BYTE_p
See Also:
Constant Field Values

BYTE_q

protected static final byte BYTE_q
See Also:
Constant Field Values

BYTE_s

protected static final byte BYTE_s
See Also:
Constant Field Values

BYTE_t

protected static final byte BYTE_t
See Also:
Constant Field Values

BYTE_u

protected static final byte BYTE_u
See Also:
Constant Field Values

BYTE_x

protected static final byte BYTE_x
See Also:
Constant Field Values

BYTE_A

protected static final byte BYTE_A
See Also:
Constant Field Values

BYTE_C

protected static final byte BYTE_C
See Also:
Constant Field Values

BYTE_D

protected static final byte BYTE_D
See Also:
Constant Field Values

BYTE_P

protected static final byte BYTE_P
See Also:
Constant Field Values

BYTE_S

protected static final byte BYTE_S
See Also:
Constant Field Values

BYTE_T

protected static final byte BYTE_T
See Also:
Constant Field Values

_inputPtr

protected int _inputPtr
Pointer to the next unread byte in the input buffer.


_inputEnd

protected int _inputEnd
Pointer to the first byte after the end of valid content. This may point beyond the end of the physical buffer array.


_quadBuffer

protected int[] _quadBuffer
This buffer is used for name parsing. It is expanded if and as needed; 32 ints can hold a name of up to 128 ASCII characters (four bytes per int).
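
The 32-ints-for-128-characters arithmetic follows from packing four name bytes into each int. The sketch below illustrates that idea only; the class QuadPackingSketch and its methods are hypothetical, and the big-endian packing order (earliest byte in the highest-order position) is an assumption rather than something this page states.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of the Aalto API.
final class QuadPackingSketch {
    /** Packs name bytes into ints, four bytes per int (assumed big-endian order). */
    static int[] pack(byte[] name) {
        int[] quads = new int[(name.length + 3) / 4];
        for (int i = 0; i < name.length; i++) {
            // Shift earlier bytes up and append the new one as an unsigned value.
            quads[i >> 2] = (quads[i >> 2] << 8) | (name[i] & 0xFF);
        }
        return quads;
    }

    /** Number of bytes stored in the last quad: 1 to 4, or 0 for an empty name. */
    static int lastQuadBytes(int byteLength) {
        if (byteLength == 0) {
            return 0;
        }
        int rem = byteLength & 3;
        return (rem == 0) ? 4 : rem;
    }

    public static void main(String[] args) {
        byte[] name = "xml:lang".getBytes(StandardCharsets.US_ASCII);
        int[] quads = pack(name);
        System.out.println(quads.length + " quads; last quad holds "
                + lastQuadBytes(name.length) + " bytes");
    }
}
```

The qlen and lastQuadBytes parameters of addPName and addUtfPName plausibly correspond to the quad count and the byte count of the final quad, but that reading is an inference from the parameter names.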


_symbols

protected final ByteBasedPNameTable _symbols
For now, the symbol table contains prefixed names. In the future, they may possibly be split into prefixes and local names.


_charTypes

protected final XmlCharTypes _charTypes
This is a simple container object that is used to access the decoding tables for characters. Indirection is needed since we actually support multiple UTF-8 compatible encodings, not just UTF-8 itself.
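
To illustrate why an indirection through per-encoding tables is useful, here is a minimal sketch. The class CharTypeTablesSketch, its constants, and its factory methods are all hypothetical and do not reflect the actual layout of XmlCharTypes; the sketch only assumes that each encoding supplies its own 256-entry classification table and that ISO-8859-1 is one of the supported ASCII-compatible encodings.

```java
// Hypothetical illustration; not the real XmlCharTypes layout.
final class CharTypeTablesSketch {
    static final int CT_SINGLE = 0;      // complete character in one byte
    static final int CT_INVALID = 1;     // cannot start a character
    static final int CT_MULTIBYTE_2 = 2; // lead byte of a 2-byte sequence
    static final int CT_MULTIBYTE_3 = 3; // lead byte of a 3-byte sequence
    static final int CT_MULTIBYTE_4 = 4; // lead byte of a 4-byte sequence

    private final int[] types = new int[256]; // all entries default to CT_SINGLE

    private CharTypeTablesSketch() { }

    /** Table for UTF-8: bytes at or above 0x80 are lead bytes or invalid as initial bytes. */
    static CharTypeTablesSketch forUtf8() {
        CharTypeTablesSketch t = new CharTypeTablesSketch();
        for (int b = 0x80; b < 0x100; b++) {
            int type;
            if (b >= 0xC2 && b <= 0xDF) {
                type = CT_MULTIBYTE_2;
            } else if (b >= 0xE0 && b <= 0xEF) {
                type = CT_MULTIBYTE_3;
            } else if (b >= 0xF0 && b <= 0xF4) {
                type = CT_MULTIBYTE_4;
            } else {
                type = CT_INVALID; // continuation bytes and unused lead bytes
            }
            t.types[b] = type;
        }
        return t;
    }

    /** Table for ISO-8859-1: every byte is a complete character. */
    static CharTypeTablesSketch forLatin1() {
        return new CharTypeTablesSketch();
    }

    /** Looks up the type of a byte, treating it as unsigned. */
    int typeOf(byte b) {
        return types[b & 0xFF];
    }
}
```

With tables like these, the same scanning loop can serve several encodings: only the table reference changes, not the code that consults it.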


_pastBytes

protected int _pastBytes
Number of bytes that were read and processed before the contents of the current buffer; used for calculating absolute offsets.
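
The absolute-offset calculation can be sketched as follows. AbsoluteOffsetSketch is a hypothetical stand-in, not Aalto code; it assumes that the count of past bytes grows by the length of the old buffer's valid content each time a new buffer is loaded, and that the absolute offset is that count plus the current buffer pointer.

```java
// Hypothetical illustration; not part of the Aalto API.
final class AbsoluteOffsetSketch {
    private int pastBytes; // bytes consumed in earlier buffers (role of _pastBytes)
    private int inputPtr;  // next unread index in the current buffer (role of _inputPtr)
    private int inputEnd;  // one past the last valid byte (role of _inputEnd)

    /** Simulates replacing the current buffer with a new one holding 'length' valid bytes. */
    void loadBuffer(int length) {
        pastBytes += inputEnd; // the old buffer's content now lies in the past
        inputPtr = 0;
        inputEnd = length;
    }

    /** Simulates reading n bytes from the current buffer. */
    void consume(int n) {
        if (n < 0 || inputPtr + n > inputEnd) {
            throw new IllegalArgumentException("cannot consume " + n + " bytes");
        }
        inputPtr += n;
    }

    /** Offset of the next unread byte, counted from the start of the whole input. */
    long absoluteOffset() {
        return (long) pastBytes + inputPtr;
    }
}
```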


_rowStartOffset

protected int _rowStartOffset
Offset used to calculate the column value given current input buffer pointer. May be negative, if the first character of the row was contained within an earlier buffer.
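
A minimal sketch of how a row-start offset can yield the column, and why it can go negative. ColumnTrackingSketch is hypothetical; it assumes a zero-based column computed as the buffer pointer minus the row-start offset, and that loading a new buffer shifts the row-start offset down by the old buffer's content length.

```java
// Hypothetical illustration; not part of the Aalto API.
final class ColumnTrackingSketch {
    private int inputPtr;       // next unread index in the current buffer
    private int inputEnd;       // one past the last valid byte
    private int rowStartOffset; // buffer index at which the current row began
    private int row;            // zero-based row counter

    /** Simulates replacing the current buffer with a new one holding 'length' valid bytes. */
    void loadBuffer(int length) {
        // Re-express the row start in the new buffer's coordinates; this goes
        // negative when the row began in a buffer that has been discarded.
        rowStartOffset -= inputEnd;
        inputPtr = 0;
        inputEnd = length;
    }

    /** Simulates reading n bytes from the current buffer. */
    void consume(int n) {
        inputPtr += n;
    }

    /** Call right after consuming a linefeed: the next row starts at the current pointer. */
    void markLF() {
        rowStartOffset = inputPtr;
        row++;
    }

    int column() {
        return inputPtr - rowStartOffset;
    }

    int row() {
        return row;
    }

    int rowStartOffset() {
        return rowStartOffset;
    }
}
```

For example, if a linefeed ends at index 6 of an 8-byte buffer and the buffer is then replaced, the row-start offset becomes -2, so after three more bytes the column is 3 - (-2) = 5.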


_tmpChar

protected int _tmpChar
Storage location for a single character that cannot be easily pushed back (for example, multi-byte char; or char entity expansion). Negative if from entity expansion; positive if a single character.
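
The sign convention can be sketched as below. PendingCharSketch is hypothetical; storing the negated code point for entity expansions and using 0 to mean "nothing pending" are assumptions made for illustration. The distinction matters because a character produced by an entity (for example, '<' from &#60;) must be treated as text, not as markup.

```java
// Hypothetical illustration; not part of the Aalto API.
final class PendingCharSketch {
    private int tmpChar; // 0 means nothing is pending (assumption)

    /** Stores a character that came from a character entity expansion. */
    void storeFromEntity(int codePoint) {
        tmpChar = -codePoint;
    }

    /** Stores a character that was decoded directly from the input bytes. */
    void storeDecoded(int codePoint) {
        tmpChar = codePoint;
    }

    boolean hasPending() {
        return tmpChar != 0;
    }

    boolean pendingIsFromEntity() {
        return tmpChar < 0;
    }

    /** Returns the pending code point and clears the slot. */
    int takePending() {
        int c = (tmpChar < 0) ? -tmpChar : tmpChar;
        tmpChar = 0;
        return c;
    }
}
```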

Constructor Detail

ByteBasedScanner

protected ByteBasedScanner(ReaderConfig cfg)
Method Detail

_releaseBuffers

protected void _releaseBuffers()
Overrides:
_releaseBuffers in class XmlScanner

_closeSource

protected abstract void _closeSource()
                              throws IOException
Specified by:
_closeSource in class XmlScanner
Throws:
IOException

getCurrentLocation

public org.codehaus.stax2.XMLStreamLocation2 getCurrentLocation()
Specified by:
getCurrentLocation in class XmlScanner
Returns:
Current input location

getCurrentLineNr

public int getCurrentLineNr()
Specified by:
getCurrentLineNr in class XmlScanner

getCurrentColumnNr

public int getCurrentColumnNr()
Specified by:
getCurrentColumnNr in class XmlScanner

markLF

protected final void markLF(int offset)

markLF

protected final void markLF()

decodeCharForError

protected abstract int decodeCharForError(byte b)
                                   throws XMLStreamException
Method called when a byte is encountered that cannot be part of a valid character in the current context. Should return the actual decoded character for error reporting purposes.

Throws:
XMLStreamException
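
What "the actual decoded character" might mean for two encodings can be sketched as follows. DecodeForErrorSketch and both of its methods are hypothetical: the real method takes only the offending byte and would read any further bytes from the scanner's own buffer, whereas this sketch passes the buffer explicitly so that it stays self-contained.

```java
// Hypothetical illustration; not part of the Aalto API.
final class DecodeForErrorSketch {
    /** Single-byte encoding: widen the byte without sign extension. */
    static int decodeLatin1(byte b) {
        return b & 0xFF;
    }

    /**
     * UTF-8: decodes the sequence that starts with 'lead', taking continuation
     * bytes from buf starting at ptr. Falls back to the unsigned lead byte
     * value when the sequence is incomplete or malformed.
     */
    static int decodeUtf8(byte lead, byte[] buf, int ptr) {
        int raw = lead & 0xFF;
        int c;
        int needed;
        if (raw < 0x80) {
            return raw;
        } else if ((raw & 0xE0) == 0xC0) {
            needed = 1;
            c = raw & 0x1F;
        } else if ((raw & 0xF0) == 0xE0) {
            needed = 2;
            c = raw & 0x0F;
        } else if ((raw & 0xF8) == 0xF0) {
            needed = 3;
            c = raw & 0x07;
        } else {
            return raw; // stray continuation byte or invalid lead byte
        }
        if (ptr < 0 || ptr + needed > buf.length) {
            return raw; // not enough bytes available
        }
        for (int i = 0; i < needed; i++) {
            int d = buf[ptr + i] & 0xFF;
            if ((d & 0xC0) != 0x80) {
                return raw; // not a continuation byte
            }
            c = (c << 6) | (d & 0x3F);
        }
        return c;
    }
}
```

The mask with 0xFF matters in both cases: Java bytes are signed, so widening 0xE9 without the mask would yield a negative int rather than 233.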

addPName

protected abstract PName addPName(int hash,
                                  int[] quads,
                                  int qlen,
                                  int lastQuadBytes)
                           throws XMLStreamException
Throws:
XMLStreamException

addUtfPName

protected final PName addUtfPName(XmlCharTypes charTypes,
                                  int hash,
                                  int[] quads,
                                  int qlen,
                                  int lastQuadBytes)
                           throws XMLStreamException
Conceptually, this method really does NOT belong here. However, it is currently quite hard to refactor, so it will have to stay here until a better place is found.

Throws:
XMLStreamException

reportInvalidInitial

protected void reportInvalidInitial(int mask)
                             throws XMLStreamException
Throws:
XMLStreamException

reportInvalidOther

protected void reportInvalidOther(int mask)
                           throws XMLStreamException
Throws:
XMLStreamException


Copyright © 2012 Fasterxml.com. All Rights Reserved.