com.fasterxml.aalto.in
Class StreamScanner

java.lang.Object
  extended by com.fasterxml.aalto.in.XmlScanner
      extended by com.fasterxml.aalto.in.ByteBasedScanner
          extended by com.fasterxml.aalto.in.StreamScanner
All Implemented Interfaces:
XmlConsts, NamespaceContext, XMLStreamConstants
Direct Known Subclasses:
Utf8Scanner

public abstract class StreamScanner
extends ByteBasedScanner

Base class for byte-stream-based scanners (generally one per supported encoding).


Field Summary
protected  InputStream _in
          Underlying InputStream to use for reading content.
protected  byte[] _inputBuffer
           
protected  int _inputEnd
           
protected  int _inputPtr
           
 
Fields inherited from class com.fasterxml.aalto.in.ByteBasedScanner
_charTypes, _pastBytes, _quadBuffer, _rowStartOffset, _symbols, _tmpChar, BYTE_a, BYTE_A, BYTE_AMP, BYTE_APOS, BYTE_C, BYTE_CR, BYTE_D, BYTE_EQ, BYTE_EXCL, BYTE_g, BYTE_GT, BYTE_HASH, BYTE_HYPHEN, BYTE_l, BYTE_LBRACKET, BYTE_LF, BYTE_LT, BYTE_m, BYTE_NULL, BYTE_o, BYTE_p, BYTE_P, BYTE_q, BYTE_QMARK, BYTE_QUOT, BYTE_RBRACKET, BYTE_s, BYTE_S, BYTE_SEMICOLON, BYTE_SLASH, BYTE_SPACE, BYTE_t, BYTE_T, BYTE_TAB, BYTE_u, BYTE_x
 
Fields inherited from class com.fasterxml.aalto.in.XmlScanner
_attrCollector, _attrCount, _cfgCoalescing, _cfgLazyParsing, _config, _currElem, _currNsCount, _currRow, _currToken, _defaultNs, _depth, _entityPending, _isEmptyTag, _lastNsContext, _lastNsDecl, _nameBuffer, _nsBindingCache, _nsBindingCount, _nsBindings, _nsBindMisses, _publicId, _systemId, _textBuilder, _tokenIncomplete, _tokenName, _xml11, CDATA_STR, INT_0, INT_9, INT_a, INT_A, INT_AMP, INT_APOS, INT_COLON, INT_CR, INT_EQ, INT_EXCL, INT_f, INT_F, INT_GT, INT_HYPHEN, INT_LBRACKET, INT_LF, INT_LT, INT_NULL, INT_QMARK, INT_QUOTE, INT_RBRACKET, INT_SLASH, INT_SPACE, INT_TAB, INT_z, MAX_UNICODE_CHAR, TOKEN_EOI
 
Fields inherited from interface com.fasterxml.aalto.util.XmlConsts
CHAR_CR, CHAR_LF, CHAR_NULL, CHAR_SPACE, STAX_DEFAULT_OUTPUT_ENCODING, STAX_DEFAULT_OUTPUT_VERSION, XML_DECL_KW_ENCODING, XML_DECL_KW_STANDALONE, XML_DECL_KW_VERSION, XML_SA_NO, XML_SA_YES, XML_V_10, XML_V_10_STR, XML_V_11, XML_V_11_STR, XML_V_UNKNOWN
 
Fields inherited from interface javax.xml.stream.XMLStreamConstants
ATTRIBUTE, CDATA, CHARACTERS, COMMENT, DTD, END_DOCUMENT, END_ELEMENT, ENTITY_DECLARATION, ENTITY_REFERENCE, NAMESPACE, NOTATION_DECLARATION, PROCESSING_INSTRUCTION, SPACE, START_DOCUMENT, START_ELEMENT
 
Constructor Summary
StreamScanner(ReaderConfig cfg, InputStream in, byte[] buffer, int ptr, int last)
           
 
Method Summary
protected  void _closeSource()
           
protected  void _releaseBuffers()
           
protected  int checkInTreeIndentation(int c)
           Note: consecutive white space is only considered indentation if the following token looks like a tag (start/end).
protected  int checkPrologIndentation(int c)
           
protected  int handleCharEntity()
           
protected  int handleEndElement()
          Note that this method can currently be shared by all ASCII-based encodings, and at least by UTF-8 and ISO-Latin1.
protected abstract  int handleEntityInText(boolean inAttr)
           
protected abstract  int handleStartElement(byte b)
          Parsing of start element requires parsing of the element name (and attribute names), and is thus encoding-specific.
protected  boolean loadAndRetain(int nrOfChars)
           
protected  boolean loadMore()
           
protected  byte loadOne()
           
protected  byte loadOne(int type)
           
protected  byte nextByte()
           
protected  byte nextByte(int tt)
           
 int nextFromProlog(boolean isProlog)
           
 int nextFromTree()
           
protected  PName parsePName(byte b)
          This method can (for now?) be shared by all ASCII-based encodings, since it only does coarse validity checking; the real checks are done in a different method.
protected  PName parsePNameLong(int q, int[] quads)
           
protected  PName parsePNameMedium(int i2, int q1)
           
protected  PName parsePNameSlow(byte b)
           
protected abstract  String parsePublicId(byte quoteChar)
           
protected abstract  String parseSystemId(byte quoteChar)
           
protected  byte skipInternalWs(boolean reqd, String msg)
           
 
Methods inherited from class com.fasterxml.aalto.in.ByteBasedScanner
addPName, addUtfPName, decodeCharForError, getCurrentColumnNr, getCurrentLineNr, getCurrentLocation, markLF, markLF, reportInvalidInitial, reportInvalidOther
 
Methods inherited from class com.fasterxml.aalto.in.XmlScanner
bindName, bindNs, checkImmutableBinding, close, decodeAttrBinaryValue, decodeAttrValue, decodeAttrValues, decodeElements, findAttrIndex, findOrCreateBinding, finishCData, finishCharacters, finishComment, finishDTD, finishPI, finishSpace, finishToken, fireSaxCharacterEvents, fireSaxCommentEvent, fireSaxEndElement, fireSaxPIEvent, fireSaxSpaceEvents, fireSaxStartElement, getAttrCollector, getAttrCount, getAttrLocalName, getAttrNsURI, getAttrPrefix, getAttrPrefixedName, getAttrQName, getAttrType, getAttrValue, getAttrValue, getConfig, getDepth, getDTDPublicId, getDTDSystemId, getEndLocation, getInputPublicId, getInputSystemId, getName, getNamespacePrefix, getNamespaceURI, getNamespaceURI, getNamespaceURI, getNonTransientNamespaceContext, getNsCount, getPrefix, getPrefixes, getQName, getStartLocation, getText, getText, getTextCharacters, getTextCharacters, getTextLength, hasEmptyStack, isAttrSpecified, isEmptyTag, isTextWhitespace, loadMoreGuaranteed, loadMoreGuaranteed, reportDoubleHyphenInComments, reportDuplicateNsDecl, reportEntityOverflow, reportEofInName, reportIllegalCDataEnd, reportIllegalNsDecl, reportIllegalNsDecl, reportInputProblem, reportInvalidNameChar, reportInvalidNsIndex, reportInvalidXmlChar, reportMissingPISpace, reportMultipleColonsInName, reportPrologProblem, reportPrologUnexpChar, reportTreeUnexpChar, reportUnboundPrefix, reportUnexpandedEntityInAttr, reportUnexpectedEndTag, resetForDecoding, skipCData, skipCharacters, skipCoalescedText, skipComment, skipPI, skipSpace, skipToken, throwInvalidSpace, throwInvalidXmlChar, throwNullChar, throwUnexpectedChar, verifyXmlChar
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

_in

protected InputStream _in
Underlying InputStream to use for reading content.


_inputBuffer

protected byte[] _inputBuffer

_inputPtr

protected int _inputPtr

_inputEnd

protected int _inputEnd
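The four fields above are not described further here. A minimal sketch of how such a buffer window is commonly used, assuming that _inputPtr is the index of the next unread byte and _inputEnd is one past the last valid byte (an assumption, not confirmed by this page). BufferWindowSketch and its members are hypothetical names and are not part of Aalto:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for the buffer fields; not Aalto code. */
final class BufferWindowSketch {
    private final InputStream in;      // plays the role of _in
    private final byte[] inputBuffer;  // plays the role of _inputBuffer
    private int inputPtr;              // assumed: index of the next unread byte
    private int inputEnd;              // assumed: one past the last valid byte

    BufferWindowSketch(InputStream in, int bufferSize) {
        this.in = in;
        this.inputBuffer = new byte[bufferSize]; // bufferSize must be > 0
    }

    /** Refills the buffer from the stream; returns false at end of input. */
    boolean loadMore() throws IOException {
        int count = in.read(inputBuffer, 0, inputBuffer.length);
        if (count < 1) { // -1 signals end of stream
            return false;
        }
        inputPtr = 0;
        inputEnd = count;
        return true;
    }

    /** Returns the next byte as 0..255, or -1 at end of input. */
    int nextByte() throws IOException {
        if (inputPtr >= inputEnd && !loadMore()) {
            return -1;
        }
        return inputBuffer[inputPtr++] & 0xFF;
    }

    /** Drains the whole input through the window and counts the bytes seen. */
    static int countBytes(byte[] data, int bufferSize) {
        BufferWindowSketch s =
            new BufferWindowSketch(new ByteArrayInputStream(data), bufferSize);
        try {
            int n = 0;
            while (s.nextByte() >= 0) {
                n++;
            }
            return n;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a buffer smaller than the input, countBytes forces several refills, which is the situation loadMore() exists to handle.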
Constructor Detail

StreamScanner

public StreamScanner(ReaderConfig cfg,
                     InputStream in,
                     byte[] buffer,
                     int ptr,
                     int last)
Method Detail

_releaseBuffers

protected void _releaseBuffers()
Overrides:
_releaseBuffers in class ByteBasedScanner

_closeSource

protected void _closeSource()
                     throws IOException
Specified by:
_closeSource in class ByteBasedScanner
Throws:
IOException

handleEntityInText

protected abstract int handleEntityInText(boolean inAttr)
                                   throws XMLStreamException
Throws:
XMLStreamException

parsePublicId

protected abstract String parsePublicId(byte quoteChar)
                                 throws XMLStreamException
Throws:
XMLStreamException

parseSystemId

protected abstract String parseSystemId(byte quoteChar)
                                 throws XMLStreamException
Throws:
XMLStreamException

nextFromProlog

public final int nextFromProlog(boolean isProlog)
                         throws XMLStreamException
Specified by:
nextFromProlog in class XmlScanner
Throws:
XMLStreamException

nextFromTree

public final int nextFromTree()
                       throws XMLStreamException
Specified by:
nextFromTree in class XmlScanner
Throws:
XMLStreamException

handleCharEntity

protected final int handleCharEntity()
                              throws XMLStreamException
Returns:
Code point for the entity that expands to a valid XML content character.
Throws:
XMLStreamException
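To illustrate the return value, here is a minimal sketch of decoding a numeric character reference such as &#65; or &#x41; into a code point and checking it against the XML 1.0 character ranges. It works on a String rather than on the scanner's byte buffer, and CharEntitySketch, decodeCharEntity and isXml10Char are hypothetical names, not Aalto API; the real method reports problems as XMLStreamException rather than IllegalArgumentException:

```java
/** Hypothetical sketch of numeric character reference decoding; not Aalto code. */
final class CharEntitySketch {

    /** XML 1.0 Char production: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]. */
    static boolean isXml10Char(int c) {
        return c == 0x9 || c == 0xA || c == 0xD
            || (c >= 0x20 && c <= 0xD7FF)
            || (c >= 0xE000 && c <= 0xFFFD)
            || (c >= 0x10000 && c <= 0x10FFFF);
    }

    /** Value of an ASCII digit in the given radix (10 or 16), or -1 if invalid. */
    private static int digitValue(char ch, boolean hex) {
        if (ch >= '0' && ch <= '9') return ch - '0';
        if (hex && ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
        if (hex && ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
        return -1;
    }

    /**
     * Decodes the text between "&#" and ";", for example "65" or "x41".
     * Throws IllegalArgumentException for malformed or disallowed references.
     */
    static int decodeCharEntity(String body) {
        boolean hex = body.startsWith("x");
        String digits = hex ? body.substring(1) : body;
        if (digits.isEmpty()) {
            throw new IllegalArgumentException("Empty character reference");
        }
        int radix = hex ? 16 : 10;
        int value = 0;
        for (int i = 0; i < digits.length(); i++) {
            int d = digitValue(digits.charAt(i), hex);
            if (d < 0) {
                throw new IllegalArgumentException("Invalid digit in character reference");
            }
            value = value * radix + d;
            if (value > 0x10FFFF) { // also guards against int overflow
                throw new IllegalArgumentException("Character reference out of range");
            }
        }
        if (!isXml10Char(value)) {
            throw new IllegalArgumentException("Not a valid XML 1.0 character");
        }
        return value;
    }
}
```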

handleStartElement

protected abstract int handleStartElement(byte b)
                                   throws XMLStreamException
Parsing of start element requires parsing of the element name (and attribute names), and is thus encoding-specific.

Throws:
XMLStreamException

handleEndElement

protected final int handleEndElement()
                              throws XMLStreamException
Note that this method can currently be shared by all ASCII-based encodings, and at least by UTF-8 and ISO-Latin1. Because the exact bytes that need to be matched are already known, there is no danger of accepting invalid encodings. For now, the method therefore stays in this base class.

Throws:
XMLStreamException
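The reasoning for handleEndElement can be sketched as a plain byte comparison: if the bytes of the open element's name are already known, the end tag can be checked without decoding any characters, regardless of which ASCII-compatible encoding produced them. EndElementMatchSketch and matchName are hypothetical names, and the handling of trailing white space and the closing '>' is left out:

```java
/** Hypothetical sketch of encoding-agnostic end-tag name matching; not Aalto code. */
final class EndElementMatchSketch {

    /**
     * Compares the raw input bytes at offset with the already-known bytes
     * of the expected element name.
     *
     * @return index just past the name on a match, or -1 on mismatch
     *         or if the input is too short
     */
    static int matchName(byte[] input, int offset, byte[] expectedName) {
        if (offset < 0 || offset + expectedName.length > input.length) {
            return -1;
        }
        for (int i = 0; i < expectedName.length; i++) {
            if (input[offset + i] != expectedName[i]) {
                return -1;
            }
        }
        return offset + expectedName.length;
    }
}
```

The same routine works for a name encoded in UTF-8 or in ISO-Latin1, provided the expected bytes and the input came from the same document.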

parsePName

protected final PName parsePName(byte b)
                          throws XMLStreamException
This method can (for now?) be shared by all ASCII-based encodings, since it only does coarse validity checking; the real checks are done in a different method.

Some notes about the assumptions the implementation makes:

Throws:
XMLStreamException
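The parameter names of the parsePName variants (q, q1, quads) and the inherited _quadBuffer field suggest that name bytes are packed four at a time into int values before lookup. This page does not confirm that, so the following is only a sketch of the packing idea under that assumption. QuadPackingSketch and packQuads are hypothetical names:

```java
/** Hypothetical sketch of packing name bytes into 32-bit "quads"; not Aalto code. */
final class QuadPackingSketch {

    /**
     * Packs up to four bytes per int, most significant byte first.
     * A trailing partial quad holds only the remaining bytes in its low bits.
     */
    static int[] packQuads(byte[] name) {
        int[] quads = new int[(name.length + 3) / 4];
        for (int i = 0; i < name.length; i++) {
            int slot = i >> 2; // four bytes per quad
            quads[slot] = (quads[slot] << 8) | (name[i] & 0xFF);
        }
        return quads;
    }
}
```

Comparing or hashing a handful of ints is cheaper than working byte by byte, which is one plausible reason for such a layout.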

parsePNameMedium

protected PName parsePNameMedium(int i2,
                                 int q1)
                          throws XMLStreamException
Throws:
XMLStreamException

parsePNameLong

protected final PName parsePNameLong(int q,
                                     int[] quads)
                              throws XMLStreamException
Throws:
XMLStreamException

parsePNameSlow

protected final PName parsePNameSlow(byte b)
                              throws XMLStreamException
Throws:
XMLStreamException

skipInternalWs

protected byte skipInternalWs(boolean reqd,
                              String msg)
                       throws XMLStreamException
Returns:
First byte following skipped white space
Throws:
XMLStreamException

checkInTreeIndentation

protected final int checkInTreeIndentation(int c)
                                    throws XMLStreamException

Note: consecutive white space is only considered indentation if the following token looks like a tag (start/end). This is so that if a CDATA section follows, it can be coalesced in coalescing mode. Although we could check whether coalescing mode is enabled, doing so should seldom have a significant effect either way, and skipping the check removes one possible source of problems in coalescing mode.

Returns:
-1, if indentation was handled; offset in the output buffer, if not
Throws:
XMLStreamException
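The rule in the note above can be sketched as a lookahead: after a line feed, skip a run of identical space or tab bytes and treat the run as indentation only if what follows starts a start or end tag. The sketch assumes that "looks like a tag" means a '<' not followed by '!', which would exclude CDATA sections and comments; that detail is an assumption. IndentationSketch and looksLikeIndentation are hypothetical names, and the sketch returns a boolean rather than the -1/offset convention of the real method:

```java
/** Hypothetical sketch of the indentation lookahead; not Aalto code. */
final class IndentationSketch {

    /**
     * @param buf   input bytes
     * @param lfPos index of a line feed in buf
     * @return true if the line feed and following white space look like
     *         indentation in front of a start or end tag
     */
    static boolean looksLikeIndentation(byte[] buf, int lfPos) {
        if (lfPos < 0 || lfPos >= buf.length || buf[lfPos] != '\n') {
            return false;
        }
        int i = lfPos + 1;
        // Skip a run of one repeated white space byte (all spaces or all tabs).
        if (i < buf.length && (buf[i] == ' ' || buf[i] == '\t')) {
            byte ws = buf[i];
            while (i < buf.length && buf[i] == ws) {
                i++;
            }
        }
        // Need '<' plus one more byte to tell a tag from "<!" constructs.
        if (i + 1 >= buf.length || buf[i] != '<') {
            return false;
        }
        return buf[i + 1] != '!'; // assumed: "<!" starts CDATA or a comment
    }
}
```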

checkPrologIndentation

protected final int checkPrologIndentation(int c)
                                    throws XMLStreamException
Returns:
-1, if indentation was handled; offset in the output buffer, if not
Throws:
XMLStreamException

loadMore

protected final boolean loadMore()
                          throws XMLStreamException
Specified by:
loadMore in class XmlScanner
Throws:
XMLStreamException

nextByte

protected final byte nextByte(int tt)
                       throws XMLStreamException
Throws:
XMLStreamException

nextByte

protected final byte nextByte()
                       throws XMLStreamException
Throws:
XMLStreamException

loadOne

protected final byte loadOne()
                      throws XMLStreamException
Throws:
XMLStreamException

loadOne

protected final byte loadOne(int type)
                      throws XMLStreamException
Throws:
XMLStreamException

loadAndRetain

protected final boolean loadAndRetain(int nrOfChars)
                               throws XMLStreamException
Throws:
XMLStreamException


Copyright © 2012 Fasterxml.com. All Rights Reserved.