java.lang.Object com.reverseXSL.parser.Parser
public final class Parser
Provides the methods to translate input character streams into XML documents. The parsing is based on DEF files.
Please refer to the MS-Word documentation 'ReverseXSL DEF file specs.doc' for a complete description of the Definition objects and file syntaxes handled by this parser.
See also Definition.
Design Note: Given the simplicity of a parsing environment (which simply comprises a DEF file), no ParserFactory is associated with the Parser itself. Simply instantiate a parser via the constructor:
myParser = new Parser(def, maxFatal, maxExceptions);
and then call it as often as desired, repeating the same transformation each time on a new message, as in:
myParser.parse(msgID, dataIn, startLineNb);
Note that the parse() method itself returns a count of recorded exceptions. Additional methods are used to inspect the results and to render the output, which in the present version is only available as an XML-formatted document (additional output formats could be added in future releases). A Parser instance is a stateful object whose state is reset at the start of every new parse() call.
The present class provides a fairly low-level API for reverse XSL transformations. Please consider the TransformerFactory and Transformer objects for improved productivity.
Nested Class Summary | | |
---|---|---|
class | Parser.ExceptionListIterator | This inner class subclasses a ListIterator in order to support methods more specific to the handling of the exception list recorded by the parser. |
Constructor Summary | |
---|---|
Parser() | Required, but of little use as such. |
Parser(Definition msgDef, int maxFatEx, int maxEx) | Initialises a new Parser object with a reference Definition and exception handling parameters. |
Parser(Definition msgDef, int maxFatEx, int maxEx, int maxMisMatch) | Variant of Parser(Definition, int, int) that allows setting the max number of successive segment/element matching failures after which the parser will attempt to 'backtrack'. |
Method Summary | | |
---|---|---|
void | adjustExceptionsLineOffsets(int adjustment) | Adjusts the line offsets of all recorded exceptions by adding the given adjustment value to each of them (relevant whenever the input message is a set of lines). |
Parser.ExceptionListIterator | exceptionIterator() | Provides a ListIterator over the ArrayList of recorded exceptions (stored in the parser state after a parse(String, String, int) call). |
int | extractCompositeValue(java.lang.StringBuffer sb, java.util.regex.Pattern ptrn) | Returns the concatenated value of all capturing groups in a complex pattern applied to a string (the pattern can match one or more times). |
int | extractCompositeValue(java.lang.StringBuffer sb, java.util.regex.Pattern ptrn, java.lang.String sep) | Returns the concatenated value of all capturing groups in a complex pattern applied to a string (the pattern can match one or more times), with a separator between the group values. |
java.io.StringWriter | getXML(boolean withRAW, boolean indent) | Provides an XML rendering of the tagged message resulting from parsing. |
int | parse(java.lang.String msgID, java.io.LineNumberReader dataIn, int startLineNb) | Parses an input character stream using a LineNumberReader. |
int | parse(java.lang.String msgID, java.lang.String dataIn, int startLineNb) | Variant parse method starting from a string and falling back onto the other parse(String, LineNumberReader, int) method whenever the system discovers that the cut function at the Message level is CUT-ON-NL. |
void | removeNonRepeatableNilOptionalElements(boolean tf) | Controls the removal of NIL optional elements from the XML output; must be called before parsing itself. |
void | setBaseNamespace(java.lang.String bns) | Sets the base XML namespace for every subsequent parse(String, LineNumberReader, int) invocation followed by getXML(boolean, boolean). |
java.lang.String | toString() | Dumps an overview of the parser state into a text string. |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public Parser()
Required, but of little use as such. This constructor was made public only so that various utility methods can be invoked, notably from RegexCheck and Definition.
See also Parser(Definition, int, int), Parser(Definition, int, int, int)
public Parser(Definition msgDef, int maxFatEx, int maxEx)
Initialises a new Parser object with a reference Definition and exception handling parameters. The parse(String, LineNumberReader, int) method can then be invoked repeatedly on different input message data.
The maximum number of successive missed element matches before backtracking is 3 by default. See Parser(Definition, int, int, int).
Parameters:
msgDef - the message Definition object to use for parsing
maxFatEx - the max number of fatal exceptions that will be recorded before being thrown
maxEx - the max number of all kinds of exceptions (including fatal ones) that will be recorded before being thrown
See also Definition
public Parser(Definition msgDef, int maxFatEx, int maxEx, int maxMisMatch)
Variant of Parser(Definition, int, int) that allows setting the max number of successive segment/element matching failures after which the parser will attempt to 'backtrack'.
Backtracking means that the Parser will give up on the current input message element (i.e. skip the data and leave it untagged as RAW input data), jump back (i.e. 'backtrack') to the last unmatched definition, and attempt to resume parsing from there.
Parameters:
msgDef - the message Definition object to use for parsing
maxFatEx - the max number of fatal exceptions that will be recorded before being thrown
maxEx - the max number of all kinds of exceptions (including fatal ones) that will be recorded before being thrown
maxMisMatch - the maximum number of successive missed element matches after which the parser will attempt to resume parsing by skipping input data and backtracking into the definition
See also Definition
Method Detail |
---|
public void adjustExceptionsLineOffsets(int adjustment)
Adjusts the line offsets of all recorded exceptions by adding the given adjustment value to each of them (relevant whenever the input message is a set of lines).
This method makes it possible to parse only the text body of an input message (e.g. without the message's header lines) and then report the line offsets of any parsing errors relative to the very beginning of the message, header lines included. If an input interchange contains several messages, this facility helps to parse each message in turn while reporting offsets relative to the whole interchange.
Release note: a future release is planned that will clean up and normalize well-known EDI formats like EDIFACT and X12 before parsing, and treat segment offsets like line offsets.
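The offset adjustment described above can be sketched as follows. This is a minimal illustration only, not the library's implementation: RecordedError, lineOffset and adjustLineOffsets are hypothetical names, and the three stripped header lines are an assumed scenario.

```java
import java.util.ArrayList;
import java.util.List;

final class LineOffsetSketch {
    // Hypothetical stand-in for a recorded parser exception; not the library's class.
    static final class RecordedError {
        final String message;
        int lineOffset; // line number relative to the text that was actually parsed

        RecordedError(String message, int lineOffset) {
            this.message = message;
            this.lineOffset = lineOffset;
        }
    }

    // Adds the same adjustment to the line offset of every recorded error.
    static void adjustLineOffsets(List<RecordedError> errors, int adjustment) {
        for (RecordedError e : errors) {
            e.lineOffset += adjustment;
        }
    }

    public static void main(String[] args) {
        List<RecordedError> errors = new ArrayList<>();
        errors.add(new RecordedError("missing mandatory element", 2));
        errors.add(new RecordedError("unexpected segment", 7));
        // Assumption: 3 header lines were stripped before the body was parsed.
        adjustLineOffsets(errors, 3);
        for (RecordedError e : errors) {
            System.out.println(e.lineOffset + ": " + e.message);
        }
    }
}
```

With the real Parser, the equivalent step would presumably be a single call to adjustExceptionsLineOffsets(headerLineCount) after parsing the message body.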
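Parameters:
adjustment - the value to add to every recorded line offset
public Parser.ExceptionListIterator exceptionIterator()
Provides a ListIterator over the ArrayList of recorded exceptions (stored in the parser state after a parse(String, String, int) call).
Note that the last exception that possibly caused the MaxFatal or MaxAllExceptions counts to be exceeded (and thrown) is also recorded.
Returns:
a list iterator over the exceptions recorded in the Parser state
public int extractCompositeValue(java.lang.StringBuffer sb, java.util.regex.Pattern ptrn)
Returns the concatenated value of all capturing groups in a complex pattern applied to a string (the pattern can match one or more times).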
NOTE: This method was made public only so that it can be invoked by the RegexCheck tool.
Parameters:
sb - string buffer containing the original string and returned with the extracted result
ptrn - pattern of reference with capturing groups
public int extractCompositeValue(java.lang.StringBuffer sb, java.util.regex.Pattern ptrn, java.lang.String sep)
Returns the concatenated value of all capturing groups in a complex pattern applied to a string (the pattern can match one or more times).
NOTE: This method was made public only so that it can be invoked by the RegexCheck tool.
Parameters:
sb - string buffer containing the original string and returned with the extracted result
ptrn - pattern of reference with capturing groups
sep - separator between multiple capturing group values in the concatenated resulting string
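The general technique of concatenating capturing groups across repeated matches can be sketched with java.util.regex alone. This is an illustration under assumptions, not the Parser's actual code: the class and method names are hypothetical, and both the handling of non-participating groups and the meaning of the returned int (here, the number of group values concatenated) are guesses, since the documentation does not specify them.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CompositeValueSketch {
    // Replaces the content of sb with the values of all capturing groups,
    // over every match of ptrn, joined by sep.
    // Assumption: the return value is the number of group values concatenated.
    static int extractComposite(StringBuffer sb, Pattern ptrn, String sep) {
        Matcher m = ptrn.matcher(sb.toString());
        StringBuilder out = new StringBuilder();
        int count = 0;
        while (m.find()) {
            for (int g = 1; g <= m.groupCount(); g++) {
                String value = m.group(g);
                if (value == null) {
                    continue; // group did not participate in this match
                }
                if (count > 0) {
                    out.append(sep);
                }
                out.append(value);
                count++;
            }
        }
        sb.setLength(0);
        sb.append(out);
        return count;
    }

    public static void main(String[] args) {
        StringBuffer sb = new StringBuffer("AB12/CD34");
        int n = extractComposite(sb, Pattern.compile("([A-Z]+)(\\d+)"), "-");
        System.out.println(n + " " + sb); // prints "4 AB-12-CD-34"
    }
}
```

The string buffer serves as both input and output here, mirroring the description of the sb parameter above.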
public java.io.StringWriter getXML(boolean withRAW, boolean indent) throws javax.xml.parsers.ParserConfigurationException, javax.xml.parsers.FactoryConfigurationError, javax.xml.transform.TransformerFactoryConfigurationError, javax.xml.transform.TransformerException
Provides an XML rendering of the tagged message resulting from parsing, i.e. after a parse(String, LineNumberReader, int) method call.
Data and Marks elements whose names start with the special character @ are promoted to attributes of the parent element.
Parameters:
withRAW - whether to generate RAW elements, i.e. elements that are either untagged or explicitly tagged as 'RAW'
indent - asks for indentation (only line breaks on elements, as true indentation does not work!)
Returns:
a StringWriter
Throws:
javax.xml.parsers.FactoryConfigurationError
javax.xml.parsers.ParserConfigurationException
javax.xml.transform.TransformerFactoryConfigurationError
javax.xml.transform.TransformerException
public int parse(java.lang.String msgID, java.io.LineNumberReader dataIn, int startLineNb) throws java.io.IOException, ParserException
Parses an input character stream using a LineNumberReader. This implementation is able to trace line offsets in Parser Exceptions whenever the MSG level cut-function is actually CUT-ON-NL.
The parsing is successful when no exceptions are thrown and the returned number of recorded exceptions is 0.
After parsing, the XML document can be generated using getXML(boolean, boolean).
Parameters:
msgID - a message ID (will be recorded in exceptions and traced)
dataIn - the line number reader, possibly reset(), so that readLine() will get the very first characters
startLineNb - the line number to assume after the first dataIn.readLine()
Throws:
java.io.IOException
ParserException
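To illustrate what "the line number to assume after the first readLine()" means in terms of the standard java.io.LineNumberReader, here is a small stdlib-only sketch. It wraps a String in a reader and seeds the counter so that the first line read is reported as startLineNb. Whether the Parser seeds the reader this way internally is an assumption; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class LineNumberSketch {
    // Returns the line number reported right after the first readLine(),
    // once the counter has been seeded with startLineNb - 1.
    static int lineAfterFirstRead(String message, int startLineNb) {
        try (LineNumberReader reader = new LineNumberReader(new StringReader(message))) {
            reader.setLineNumber(startLineNb - 1); // the next line read becomes startLineNb
            reader.readLine();
            return reader.getLineNumber();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the body starts on line 5 of the original message.
        System.out.println(lineAfterFirstRead("UNH+1\nBGM+2\n", 5)); // prints 5
    }
}
```

A caller holding the message as a String could build the LineNumberReader in the same way before passing it to parse(String, LineNumberReader, int).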
public int parse(java.lang.String msgID, java.lang.String dataIn, int startLineNb) throws java.io.IOException, ParserException
Variant parse method starting from a string and falling back onto the other parse(String, LineNumberReader, int) method whenever the system discovers that the cut function at the Message level is CUT-ON-NL.
Parameters:
msgID - a message ID
dataIn - input string data message
startLineNb - starting line number (e.g. from the original message that also possibly contained a header/envelope)
Throws:
java.io.IOException
ParserException
public void removeNonRepeatableNilOptionalElements(boolean tf)
This method must be called before parsing itself (i.e. before parse(String, LineNumberReader, int)) and, if set to TRUE, causes the removal of all data elements with a NIL value that are optional or conditional elements, whose matching definition indicates that the element is non-repeatable (i.e. ACC 1), and whose minimum size requirement is >0.
This function is quite useful on messages based on the principle of positional data elements within 'segments' (e.g. EDIFACT, TRADACOMS, X12, etc.). Indeed, most positions (think 'slots') in such segments are occupied by optional/conditional data elements, all unique and distinguished by their relative position in the 'segment'. Every unoccupied position will yield a corresponding NIL data element in XML, which can be suppressed from the XML output if this method is set to TRUE.
NIL data elements are suppressed only if they have a min/max size specification (of the kind [1..15]) with a minimum of at least 1. Obviously, if 0 is an acceptable size, there is no reason to suppress the element. Moreover, the element must be non-repeatable; otherwise there is a risk of eating up first and intermediate elements, causing undesirable rank shifts.
The default value is false.
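The suppression conditions listed above can be restated as a single predicate. This is only a sketch of the rule as documented, not the Parser's code: the method and parameter names are hypothetical, and "ACC 1" is read here as "the element is not repeatable".

```java
final class NilSuppressionSketch {
    // Returns true when a data element would be left out of the XML output:
    // the flag is on, the value is NIL, the element is optional or conditional,
    // it is non-repeatable, and its minimum size is at least 1.
    static boolean isSuppressed(boolean flagEnabled, boolean isNil,
                                boolean mandatory, boolean repeatable, int minSize) {
        return flagEnabled && isNil && !mandatory && !repeatable && minSize >= 1;
    }

    public static void main(String[] args) {
        // Optional, non-repeatable element of size [1..15] with a NIL value.
        System.out.println(isSuppressed(true, true, false, false, 1));  // prints true
        // Same element, but an empty value is acceptable (size [0..15]).
        System.out.println(isSuppressed(true, true, false, false, 0));  // prints false
        // Repeatable elements are kept, to avoid rank shifts.
        System.out.println(isSuppressed(true, true, false, true, 1));   // prints false
    }
}
```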
Parameters:
tf - new value for the flag
public final void setBaseNamespace(java.lang.String bns)
Sets the base XML namespace for every subsequent parse(String, LineNumberReader, int) invocation followed by getXML(boolean, boolean).
This namespace is not reset between parse(...) calls.
The default namespace is "http://www.reverseXSL.com/FreeParser". Calling this method with a null or empty argument resets the namespace to the default (as if setBaseNamespace() had never been invoked). Note that the namespace can be set either via the Java API or via the SET BASENAMESPACE statement in DEF files. If both are used, the API takes precedence.
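One possible reading of the precedence rule is sketched below. It is an interpretation, not the library's code: the names are hypothetical, and the sketch assumes that a null or empty API value lets a SET BASENAMESPACE value from the DEF file apply before falling back to the documented default.

```java
final class NamespaceSketch {
    // Default namespace as stated in the documentation.
    static final String DEFAULT_NS = "http://www.reverseXSL.com/FreeParser";

    // Picks the namespace to use: API value first, then the DEF file value,
    // then the default. Null and empty strings count as "not set".
    static String effectiveNamespace(String apiValue, String defFileValue) {
        if (apiValue != null && !apiValue.isEmpty()) {
            return apiValue;
        }
        if (defFileValue != null && !defFileValue.isEmpty()) {
            return defFileValue;
        }
        return DEFAULT_NS;
    }

    public static void main(String[] args) {
        System.out.println(effectiveNamespace("http://www.reverseXSL.com/Cargo", null));
        System.out.println(effectiveNamespace("", null)); // falls back to the default
    }
}
```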
Parameters:
bns - the namespace that applies to this parser instance, e.g. "http://www.reverseXSL.com/Cargo"
public java.lang.String toString()
Dumps an overview of the parser state into a text string.
Overrides:
toString in class java.lang.Object