page "pyPEG – the Parser Engine", "counter-reset: chapter 2;" {
|
|
h1 id=pengine> Parser Engine
|
|
|
|
h2 id=parser > Class Parser
|
|
|
|
p >>
|
|
Offers parsing and composing capabilities. Implements an intrinsic
|
|
∫Packrat parser∫.
|
|
>>
|
|
|
|
p >>
|
|
ƒpyPEG uses memoization as speed enhancement. Create a
|
|
`a href="#parser" code > Parser` instance to have a reset cache memory.
|
|
Usually this is recommended if you're parsing another text – the cache
|
|
memory will not provide wrong results but a reset will save memory
|
|
consumption. If you're altering the grammar then clearing the cache
|
|
memory for the respective things is required for having correct parsing
|
|
results. Please use the
|
|
`a href="#parser_clear_memory" code > clear_memory()` method in that
|
|
case.
|
|
>>
|
|
|
|
h3 id=parser_vars > Instance variables
|
|
|
|
p >>
|
|
The instance variables are representing the parser's state.
|
|
>>
|
|
|
|
glossary {
|
|
term "whitespace"
|
|
>>
|
|
Regular expression to scan whitespace; default: «re.compile(r"(?m)\s+")».
|
|
Set to «None» to disable automatic «whitespace» removing.
|
|
>>
|
|
term "comment"
|
|
>>
|
|
«grammar» to parse comments; default: «None».
|
|
If a «grammar» is set here, comments will be removed from the
|
|
source text automatically.
|
|
>>
|
|
term "last_error"
|
|
> after parsing, «SyntaxError» which ended parsing
|
|
term "indent"
|
|
> string to use to indent while composing; default: four spaces
|
|
term "indention_level"
|
|
> level to indent to; default: «0»
|
|
term "text"
|
|
> original text to parse; set for decorated syntax errors
|
|
term "filename"
|
|
> filename where text is origin from
|
|
term "autoblank"
|
|
> add blanks while composing if grammar would possibly be violated otherwise; default: True
|
|
term "keep_feeble_things"
|
|
>>
|
|
keep otherwise cropped things like comments and whitespace; these
|
|
things are being put into the «feeble_things» attribute
|
|
>>
|
|
}
|
|
|
|
h3 id=parser_init > Method __init__()
|
|
|
|
h4 > Synopsis
|
|
p > «__init__(self)»
|
|
|
|
p > Initialize instance variables to their defaults.
|
|
|
|
h3 id=parser_clear_memory > Method clear_memory()
|
|
|
|
h4 > Synopsis
|
|
p > «clear_memory(self, thing=None)»
|
|
|
|
p > Clear cache memory for packrat parsing.
|
|
|
|
p >>
|
|
This method clears the cache memory for «thing». If «None» is given
|
|
as «thing», it clears the cache completely.
|
|
>>
|
|
|
|
h4 > Arguments
|
|
|
|
glossary {
|
|
term "thing" > thing for which cache memory is cleared; default: «None»
|
|
}
|
|
|
|
h3 id=parser_parse > Method parse()
|
|
|
|
h4 > Synopsis
|
|
p > «parse(self, text, thing, filename=None)»
|
|
|
|
p >>
|
|
(Partially) parse «text» following «thing» as grammar and return the
|
|
resulting things.
|
|
>>
|
|
|
|
p >>
|
|
This method parses as far as possible. It does not raise a
|
|
«SyntaxError» if the source «text» does not parse completely. It
|
|
returns a «SyntaxError» object as «result» part of the return value if
|
|
the beginning of the source «text» does not comply with grammar
|
|
«thing».
|
|
>>
|
|
|
|
h4 > Arguments
|
|
|
|
glossary {
|
|
term "text" > text to parse
|
|
term "thing" > grammar for things to parse
|
|
term "filename" > filename where text is origin from
|
|
}
|
|
|
|
h4 > Returns
|
|
|
|
p > Returns «(text, result)» with:
|
|
|
|
glossary {
|
|
term "text" > unparsed text
|
|
term "result" > generated objects
|
|
}
|
|
|
|
h4 > Raises
|
|
|
|
glossary {
|
|
term "ValueError"
|
|
> if input does not match types
|
|
term "TypeError"
|
|
> if output classes have wrong syntax for their respective «__init__(self, ...)»
|
|
term "GrammarTypeError"
|
|
> if grammar contains an object of unkown type
|
|
term "GrammarValueError"
|
|
> if grammar contains an illegal cardinality value
|
|
}
|
|
|
|
p > Example:
|
|
|
|
Code
|
|
||
|
|
>>> from pypeg2 import Parser, csl, word
|
|
>>> ◊p = Parser()◊
|
|
>>> ◊p.parse("hello, world!", csl(word))◊
|
|
('!', ['hello', 'world'])
|
|
||
|
|
|
|
|
|
h3 id=parser_compose > Method compose()
|
|
|
|
h4 > Synopsis
|
|
p > «compose(self, thing, grammar=None)»
|
|
|
|
p >>
|
|
Compose text using «thing» with «grammar». If «thing.compose()»
|
|
exists, execute it, otherwise use «grammar» to compose.
|
|
>>
|
|
|
|
h4 > Arguments
|
|
|
|
glossary {
|
|
term "thing" > «thing» containing other things with «grammar»
|
|
term "grammar" > «grammar» to use for composing «thing»; default: «type(thing).grammar»
|
|
}
|
|
|
|
h4 > Returns
|
|
|
|
p > Composed text
|
|
|
|
h4 > Raises
|
|
|
|
glossary {
|
|
term "ValueError" > if «thing» does not match «grammar»
|
|
term "GrammarTypeError" > if «grammar» contains an object of unkown type
|
|
term "GrammarValueError" > if «grammar» contains an illegal cardinality value
|
|
}
|
|
|
|
p > Example:
|
|
|
|
Code
|
|
||
|
|
>>> from pypeg2 import Parser, csl, word
|
|
>>> ◊p = Parser()◊
|
|
>>> ◊p.compose(['hello', 'world'], csl(word))◊
|
|
'hello, world'
|
|
||
|
|
|
|
h3 id=gen_syntax_error > Method generate_syntax_error()
|
|
|
|
h4 > Synopsis
|
|
p > «generate_syntax_error(self, msg, pos)»
|
|
|
|
p > Generate a syntax error construct.
|
|
|
|
glossary {
|
|
term "msg" > string with error message
|
|
term "pos" > «(lineNo, charInText)» with positioning information
|
|
}
|
|
|
|
h4 > Returns
|
|
p > Instance of «SyntaxError» with error text
|
|
|
|
h2 id=convenience > Convenience functions
|
|
|
|
h3 id=parse > Function parse()
|
|
|
|
h4 > Synopsis
|
|
pre
|
|
||
|
|
parse(text, thing, filename=None, whitespace=whitespace,
|
|
comment=None, keep_feeble_things=False)
|
|
||
|
|
|
|
p >>
|
|
Parse text following «thing» as grammar and return the resulting things or
|
|
raise an error.
|
|
>>
|
|
|
|
h4 > Arguments
|
|
|
|
glossary {
|
|
term "text"
|
|
> «text» to parse
|
|
term "thing"
|
|
> «grammar» for things to parse
|
|
term "filename"
|
|
> «filename» where «text» is origin from
|
|
term "whitespace"
|
|
> regular expression to skip «whitespace»; default: «re.compile(r"(?m)\s+")»
|
|
term "comment"
|
|
> «grammar» to parse comments; default: «None»
|
|
term "keep_feeble_things"
|
|
>>
|
|
keep otherwise cropped things like comments and whitespace; these
|
|
things are being put into the «feeble_things» attribute; default:
|
|
«False»
|
|
>>
|
|
}
|
|
|
|
h4 > Returns
|
|
p > generated things
|
|
|
|
h4 > Raises
|
|
|
|
glossary {
|
|
term "SyntaxError" > if «text» does not match the «grammar» in «thing»
|
|
term "ValueError" > if input does not match types
|
|
term "TypeError" > if output classes have wrong syntax for «__init__()»
|
|
term "GrammarTypeError"
|
|
> if «grammar» contains an object of unkown type
|
|
term "GrammarValueError"
|
|
> if «grammar» contains an illegal cardinality value
|
|
}
|
|
|
|
p > Example:
|
|
|
|
Code
|
|
||
|
|
>>> from pypeg2 import parse, csl, word
|
|
>>> ◊parse("hello, world", csl(word))◊
|
|
['hello', 'world']
|
|
||
|
|
|
|
h3 id=compose > Function compose()
|
|
|
|
h4 > Synopsis
|
|
p > «compose(thing, grammar=None, indent=" ", autoblank=True)»
|
|
|
|
p > Compose text using «thing» with «grammar».
|
|
|
|
h4 > Arguments
|
|
|
|
glossary {
|
|
term "thing" > «thing» containing other things with «grammar»
|
|
term "grammar" > «grammar» to use to compose thing; default: «thing.grammar»
|
|
term "indent" > string to use to indent while composing; default: four spaces
|
|
term "autoblank"
|
|
> add blanks if grammar would possibly be violated otherwise; default: True
|
|
}
|
|
|
|
h4 > Returns
|
|
|
|
p > composed text
|
|
|
|
h4 > Raises
|
|
|
|
glossary {
|
|
term "ValueError" > if input does not match «grammar»
|
|
term "GrammarTypeError"
|
|
> if «grammar» contains an object of unkown type
|
|
term "GrammarValueError"
|
|
> if «grammar» contains an illegal cardinality value
|
|
}
|
|
|
|
p > Example:
|
|
|
|
Code
|
|
||
|
|
>>> from pypeg2 import compose, csl, word
|
|
>>> ◊compose(['hello', 'world'], csl(word))◊
|
|
'hello, world'
|
|
||
|
|
|
|
h3 id=attributes > Function attributes()
|
|
|
|
h4 > Synopsis
|
|
p > «attributes(grammar, invisible=False)»
|
|
|
|
p > Iterates all attributes of a «grammar».
|
|
|
|
p >>
|
|
This function can be used to iterate through all attributes which
|
|
will be generated for the top level object of the «grammar». If
|
|
invisible is «False» omit attributes whose names are starting with
|
|
an underscore «_».
|
|
>>
|
|
|
|
p > Example:
|
|
|
|
Code
|
|
||
|
|
>>> from pypeg2 import attr, name, attributes, word, restline
|
|
>>> class Me:
|
|
... grammar = name(), attr("typing", word), restline
|
|
...
|
|
>>> for a in ◊attributes(Me.grammar)◊: print(a.name)
|
|
...
|
|
name
|
|
typing
|
|
>>>
|
|
||
|
|
|
|
h3 id=howmany > Function how_many()
|
|
|
|
h4 > Synopsis
|
|
p > «how_many(grammar)»
|
|
|
|
p > Determines the possibly parsed objects of grammar.
|
|
|
|
p >>
|
|
This function is meant to check if the results of a grammar
|
|
can be stored in a single object or a collection will be needed.
|
|
>>
|
|
|
|
h4 > Returns
|
|
|
|
glossary {
|
|
term "0" > if there will be no objects
|
|
term "1" > if there will be a maximum of one object
|
|
term "2" > if there can be more than one object
|
|
}
|
|
|
|
h4 > Raises
|
|
|
|
glossary {
|
|
term "GrammarTypeError"
|
|
> if «grammar» contains an object of unkown type
|
|
term "GrammarValueError"
|
|
> if «grammar» contains an illegal cardinality value
|
|
}
|
|
|
|
p > Example:
|
|
|
|
Code
|
|
||
|
|
>>> from pypeg2 import how_many, word, csl
|
|
>>> ◊how_many("some")◊
|
|
0
|
|
>>> ◊how_many(word)◊
|
|
1
|
|
>>> ◊how_many(csl(word))◊
|
|
2
|
|
||
|
|
|
|
h2 id=errors > Exceptions
|
|
|
|
h3 id=gerror > GrammarError
|
|
|
|
p >>
|
|
Base class for all errors ƒpyPEG delivers.
|
|
>>
|
|
|
|
h3 id=getype > GrammarTypeError
|
|
|
|
p >>
|
|
A grammar contains an object of a type which cannot be parsed,
|
|
for example an instance of an unknown class or of a basic type
|
|
like «float». It can be caused by an «int» at the wrong place, too.
|
|
>>
|
|
|
|
h3 id=gevalue > GrammarValueError
|
|
|
|
p >>
|
|
A grammar contains an object with an illegal value, for example
|
|
an undefined cardinality.
|
|
>>
|
|
|
|
div id="bottom" {
|
|
"Want to download? Go to the "
|
|
a "#top", "^Top^"; " and look to the right ;-)"
|
|
}
|
|
}
|