|
|
page "pyPEG – Grammar Elements", "counter-reset: chapter 1;" {
|
|
|
h1 id=gelements > Grammar Elements
|
|
|
|
|
|
p >>
|
|
|
ƒCaveat: pyPEG 2.x is written for Python 3. That means, it accepts
|
|
|
Unicode strings only. You can use it with Python 2.7 by writing
|
|
|
«u'string'» instead of «'string'» or with the following import (you
|
|
|
don't need that for Python 3):
|
|
|
>>
|
|
|
|
|
|
Code | from __future__ import unicode_literals
|
|
|
|
|
|
p >>
|
|
|
The samples in this documentation are written for Python 3, too. To
|
|
|
execute them with Python 2.7, you'll need this import:
|
|
|
>>
|
|
|
|
|
|
Code | from __future__ import print_function
|
|
|
|
|
|
p >>
|
|
|
pyPEG 2.x supports new-style classes only.
|
|
|
>>
|
|
|
|
|
|
h2 id=basic > Basic Grammar Elements
|
|
|
|
|
|
h3 id=literals > str instances and Literal
|
|
|
|
|
|
h4 > Parsing
|
|
|
|
|
|
p >>
|
|
|
A «str» instance as well as an instance of «pypeg2.Literal» is parsed
|
|
|
in the source text as a
|
|
|
`w "Terminal_and_nonterminal_symbols" > Terminal Symbol`.
|
|
|
It is removed and no result is put into the ∫Abstract syntax tree∫.
|
|
|
If it does not exist at the correct position in the source text,
|
|
|
a «SyntaxError» is raised.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> class Key(str):
|
|
|
... grammar = name(), ◊"="◊, restline, endl
|
|
|
...
|
|
|
>>> k = parse("this=something", Key)
|
|
|
>>> k.name
|
|
|
Symbol('this')
|
|
|
>>> k
|
|
|
'something'
|
|
|
||
|
|
|
|
|
|
h4 > Composing
|
|
|
|
|
|
p >>
|
|
|
«str» instances and «pypeg2.Literal» instances are being output
|
|
|
literally.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> class Key(str):
|
|
|
... grammar = name(), ◊"="◊, restline, endl
|
|
|
...
|
|
|
>>> k = Key("a value")
|
|
|
>>> k.name = Symbol("give me")
|
|
|
>>> compose(k)
|
|
|
'give me◊=◊a value\\n'
|
|
|
||
|
|
|
|
|
|
h3 id=regex > Regular Expressions
|
|
|
|
|
|
h4 > Parsing
|
|
|
|
|
|
p >>
|
|
|
ƒpyPEG uses Python's «re» module. You can use
|
|
|
πre.html#re-objects Python Regular Expression Objectsπ purely, or use
|
|
|
the «pypeg2.RegEx» encapsulation. Regular Expressions are parsed as
|
|
|
`w "Terminal_and_nonterminal_symbols" > Terminal Symbols`. The matching
|
|
|
result is put into the AST. If no match can be achieved, a
|
|
|
«SyntaxError» is raised.
|
|
|
>>
|
|
|
|
|
|
p >>
|
|
|
ƒpyPEG predefines different RegEx objects:
|
|
|
>>
|
|
|
|
|
|
glossary {
|
|
|
term 'word = re.compile(r"\w+")'
|
|
|
> Regular expression for scanning a word.
|
|
|
term 'restline = re.compile(r".*")'
|
|
|
> Regular expression for rest of line.
|
|
|
term 'whitespace = re.compile("(?m)\s+")'
|
|
|
> Regular expression for scanning whitespace.
|
|
|
term 'comment_sh = re.compile(r"\#.*")'
|
|
|
> Shell script style comment.
|
|
|
term 'comment_cpp = re.compile(r"//.*")'
|
|
|
> C++ style comment.
|
|
|
term 'comment_c = re.compile(r"(?m)/\*.*?\*/")'
|
|
|
> C style comment without nesting.
|
|
|
term 'comment_pas = re.compile(r"(?m)\(\*.*?\*\)")'
|
|
|
> Pascal style comment without nesting.
|
|
|
}
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> class Key(str):
|
|
|
... grammar = name(), "=", ◊restline◊, endl
|
|
|
...
|
|
|
>>> k = parse("this=something", Key)
|
|
|
>>> k.name
|
|
|
Symbol('this')
|
|
|
>>> k
|
|
|
◊'something'◊
|
|
|
||
|
|
|
|
|
|
h4 > Composing
|
|
|
|
|
|
p >>
|
|
|
For «RegEx» objects their corresponding value in the AST will be
|
|
|
output. If this value does not match the «RegEx» a «ValueError» is raised.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> class Key(str):
|
|
|
... grammar = name(), "=", ◊restline◊, endl
|
|
|
...
|
|
|
>>> k = Key(◊"a value"◊)
|
|
|
>>> k.name = Symbol("give me")
|
|
|
>>> compose(k)
|
|
|
'give me=◊a value\\n◊'
|
|
|
||
|
|
|
|
|
|
h3 id=tuple > tuple instances and Concat
|
|
|
|
|
|
h4 > Parsing
|
|
|
|
|
|
p >>
|
|
|
A «tuple» or an instance of «pypeg2.Concat» specifies, that different
|
|
|
things have to be parsed one after another. If not all of them parse in
|
|
|
their sequence, a «SyntaxError» is raised.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> class Key(str):
|
|
|
... grammar = name()◊, ◊"="◊, ◊restline◊, ◊endl
|
|
|
...
|
|
|
>>> k = parse("this=something", Key)
|
|
|
>>> k.name
|
|
|
Symbol('this')
|
|
|
>>> k
|
|
|
'something'
|
|
|
||
|
|
|
|
|
|
p >>
|
|
|
In a «tuple» there may be integers preceding another thing in the
|
|
|
«tuple». These integers represent a cardinality. For example, to parse
|
|
|
three times a «word», you can have as a «grammar»:
|
|
|
>>
|
|
|
|
|
|
Code | grammar = word, word, word
|
|
|
|
|
|
p > or:
|
|
|
|
|
|
Code | grammar = 3, word
|
|
|
|
|
|
p > which is equivalent. There are special cardinality values:
|
|
|
|
|
|
glossary {
|
|
|
term "-2, thing"
|
|
|
> «some(thing)»; this represents the plus cardinality, +
|
|
|
term "-1, thing"
|
|
|
> «maybe_some(thing)»; this represents the asterisk cardinality, *
|
|
|
term "0, thing"
|
|
|
> «optional(thing)»; this represents the question mark cardinality, ?
|
|
|
}
|
|
|
|
|
|
p >>
|
|
|
The special cardinality values can be generated with the
|
|
|
¬#some Cardinality Functions¬. Other negative values are reserved
|
|
|
and may not be used.
|
|
|
>>
|
|
|
|
|
|
h4 > Composing
|
|
|
|
|
|
p >>
|
|
|
For «tuple» instances and instances of «pypeg2.Concat» all attributes of
|
|
|
the corresponding thing (and elements of the corresponding collection
|
|
|
if that applies) in the AST will be composed and the result is
|
|
|
concatenated.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> class Key(str):
|
|
|
... grammar = name()◊, ◊"="◊, ◊restline◊, ◊endl
|
|
|
...
|
|
|
>>> k = Key("a value")
|
|
|
>>> k.name = Symbol("give me")
|
|
|
>>> compose(k)
|
|
|
◊'give me=a value\\n'◊
|
|
|
||
|
|
|
|
|
|
h3 id=lists > list instances
|
|
|
|
|
|
h4 > Parsing
|
|
|
|
|
|
p >>
|
|
|
A «list» instance which is not derived from «pypeg2.Concat» represents
|
|
|
different options. They're tested in their sequence. The first option
|
|
|
which parses is chosen, the others are not tested any more. If none
|
|
|
matches, a «SyntaxError» is raised.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> number = re.compile(r"\d+")
|
|
|
>>> parse("hello", ◊[number, word]◊)
|
|
|
'hello'
|
|
|
||
|
|
|
|
|
|
h4 > Composing
|
|
|
|
|
|
p >>
|
|
|
The elements of the «list» are tried out in their sequence, if one of
|
|
|
them can be composed. If none can a «ValueError» is raised.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> letters = re.compile(r"[a-zA-Z]")
|
|
|
>>> number = re.compile(r"\d+")
|
|
|
>>> compose(23, ◊[letters, number]◊)
|
|
|
'23'
|
|
|
||
|
|
|
|
|
|
h3 id=none > Constant None
|
|
|
|
|
|
p >>
|
|
|
«None» parses to nothing. And it composes to nothing. It represents
|
|
|
the no-operation value.
|
|
|
>>
|
|
|
|
|
|
h2 id=goclasses > Grammar Element Classes
|
|
|
|
|
|
h3 id=symbol > Class Symbol
|
|
|
|
|
|
h4 > Class definition
|
|
|
p > «Symbol(str)»
|
|
|
|
|
|
p > Used to scan a «Symbol».
|
|
|
|
|
|
p >>
|
|
|
If you're putting a «Symbol» somewhere in your «grammar», then
|
|
|
«Symbol.regex» is used to scan while parsing. The result will be a
|
|
|
«Symbol» instance. Optionally it is possible to check that a «Symbol»
|
|
|
instance will not be identical to any «Keyword» instance. This can be
|
|
|
helpful if the source language forbids that.
|
|
|
>>
|
|
|
|
|
|
p >>
|
|
|
A class which is derived from «Symbol» can have an «Enum» as its
|
|
|
«grammar» only. Other values for its «grammar» are forbidden and will
|
|
|
raise a «TypeError». If such an «Enum» is specified, each parsed value
|
|
|
will be checked if being a member of this «Enum» additionally to the
|
|
|
«RegEx» matching.
|
|
|
>>
|
|
|
|
|
|
h4 > Class variables
|
|
|
|
|
|
glossary {
|
|
|
term "regex"
|
|
|
> regular expression to scan, default «re.compile(r"\w+")»
|
|
|
term "check_keywords"
|
|
|
> flag if a «Symbol» has to be checked for not being a «Keyword»; default: «False»
|
|
|
}
|
|
|
|
|
|
h4 > Instance variables
|
|
|
|
|
|
glossary
|
|
|
term "name" > name of the «Keyword» as «str» instance
|
|
|
|
|
|
h4 > Method «__init__(self, name, namespace=None)»
|
|
|
|
|
|
p > Construct a «Symbol» with that «name» in «namespace».
|
|
|
|
|
|
h5 > Raises:
|
|
|
|
|
|
glossary {
|
|
|
term "ValueError"
|
|
|
> if «check_keywords» is «True» and value is identical to a «Keyword»
|
|
|
term "TypeError"
|
|
|
> if «namespace» is given and not an instance of «Namespace»
|
|
|
}
|
|
|
|
|
|
h4 > Parsing
|
|
|
|
|
|
p >>
|
|
|
Parsing a «Symbol» is done by scanning with «Symbol.regex». In our
|
|
|
example we're using the «name()» function, which is often used to parse
|
|
|
a «Symbol». «name()» equals to «attr("name", Symbol)».
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> ◊Symbol.regex = re.compile(r"[\w\s]+")◊
|
|
|
>>> class Key(str):
|
|
|
... grammar = ◊name()◊, "=", restline, endl
|
|
|
...
|
|
|
>>> k = parse("this one=foo bar", Key)
|
|
|
>>> k.name
|
|
|
◊Symbol('this one')◊
|
|
|
>>> k
|
|
|
'foo bar'
|
|
|
||
|
|
|
|
|
|
h4 > Composing
|
|
|
|
|
|
p > Composing a «Symbol» is done by converting it to text.
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> k.name = ◊Symbol("that one")◊
|
|
|
>>> compose(k)
|
|
|
'◊that one◊=foo bar'
|
|
|
||
|
|
|
|
|
|
h3 id=keyword > Class Keyword
|
|
|
|
|
|
h4 > Class definition
|
|
|
p > «Keyword(Symbol)»
|
|
|
|
|
|
p > Used to access the keyword table.
|
|
|
|
|
|
p >>
|
|
|
The «Keyword» class is meant to be instanciated for each «Keyword» of
|
|
|
the source language. The class holds the keyword table as a «Namespace»
|
|
|
instance. There is the abbreviation «K» for «Keyword». The latter is
|
|
|
useful for instancing keywords.
|
|
|
>>
|
|
|
|
|
|
h4 > Class variables
|
|
|
|
|
|
glossary {
|
|
|
term "regex" > regular expression to scan; default «re.compile(r"\w+")»
|
|
|
term "table" > «Namespace» with keyword table
|
|
|
}
|
|
|
|
|
|
h4 > Instance variables
|
|
|
|
|
|
glossary
|
|
|
term "name" > name of the «Keyword» as «str» instance
|
|
|
|
|
|
h4 > Method «__init__(self, keyword)»
|
|
|
|
|
|
p > Adds «keyword» to the keyword table.
|
|
|
|
|
|
h4 > Parsing
|
|
|
|
|
|
p >>
|
|
|
When a «Keyword» instance is parsed, it is removed and nothing is put
|
|
|
into the resulting AST. When a «Keyword» class is parsed, an
|
|
|
instance is created and put into the AST.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> class ◊Type(Keyword)◊:
|
|
|
... grammar = ◊Enum( K("int"), K("long") )◊
|
|
|
...
|
|
|
>>> k = parse("long", ◊Type◊)
|
|
|
>>> k.name
|
|
|
'long'
|
|
|
||
|
|
|
|
|
|
h4 > Composing
|
|
|
|
|
|
p >>
|
|
|
When a «Keyword» instance is in a «grammar», it is converted into a
|
|
|
«str» instance, and the resulting text is added to the result. When a
|
|
|
«Keyword» class is in the «grammar», the correspoding instance in the
|
|
|
AST is converted into a «str» instance and added to the result.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> k = ◊K("do")◊
|
|
|
>>> compose(k)
|
|
|
'do'
|
|
|
||
|
|
|
|
|
|
h3 id=list > Class List
|
|
|
|
|
|
h4 > Class definition
|
|
|
p > «List(list)»
|
|
|
|
|
|
p > A List of things.
|
|
|
|
|
|
p >>
|
|
|
A «List» is a collection for parsed things. It can be used as a base class
|
|
|
for collections in the «grammar». If a «List» class has no class
|
|
|
variable «grammar», «grammar = csl(Symbol)» is assumed.
|
|
|
>>
|
|
|
|
|
|
h4 > Method «__init__(self, L=[], **kwargs)»
|
|
|
|
|
|
p >>
|
|
|
Construct a List, and construct its attributes from keyword
|
|
|
arguments.
|
|
|
>>
|
|
|
|
|
|
h4 > Parsing
|
|
|
|
|
|
p >>
|
|
|
A «List» is parsed by following its «grammar». If a «List» is parsed,
|
|
|
then all things which are parsed and which are not attributes are
|
|
|
appended to the «List».
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> class Instruction(str): pass
|
|
|
...
|
|
|
>>> class ◊Block(List)◊:
|
|
|
... grammar = "{", maybe_some(Instruction), "}"
|
|
|
...
|
|
|
>>> b = parse("{ ◊hello world◊ }", ◊Block◊)
|
|
|
>>> b◊[0]◊
|
|
|
'hello'
|
|
|
>>> b◊[1]◊
|
|
|
'world'
|
|
|
>>>
|
|
|
||
|
|
|
|
|
|
h4 > Composing
|
|
|
|
|
|
p >>
|
|
|
If a «List» is composed, then its grammar is followed and composed.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> class Instruction(str): pass
|
|
|
...
|
|
|
>>> class ◊Block(List)◊:
|
|
|
... grammar = "{", blank, csl(Instruction), blank, "}"
|
|
|
...
|
|
|
>>> b = Block()
|
|
|
>>> b.◊append(Instruction("hello"))◊
|
|
|
>>> b.◊append(Instruction("world"))◊
|
|
|
>>> compose(b)
|
|
|
'{ hello, world }'
|
|
|
||
|
|
|
|
|
|
h3 id=namespace > Class Namespace
|
|
|
|
|
|
h4 > Class definition
|
|
|
p > «Namespace(_UserDict)»
|
|
|
|
|
|
p > A dictionary of things, indexed by their name.
|
|
|
|
|
|
p >>
|
|
|
A Namespace holds an «OrderedDict» mapping the «name» attributes of the
|
|
|
collected things to their respective representation instance. Unnamed
|
|
|
things cannot be collected with a «Namespace».
|
|
|
>>
|
|
|
|
|
|
h4 > Method «__init__(self, *args, **kwargs)»
|
|
|
|
|
|
p >>
|
|
|
Initialize an OrderedDict containing the data of the Namespace.
|
|
|
Arguments are put into the Namespace, keyword arguments give the
|
|
|
attributes of the Namespace.
|
|
|
>>
|
|
|
|
|
|
h4 > Parsing
|
|
|
|
|
|
p >>
|
|
|
A «Namespace» is parsed by following its «grammar». If a «Namespace» is
|
|
|
parsed, then all things which are parsed and which are not attributes
|
|
|
are appended to the «Namespace» and indexed by their «name»
|
|
|
attribute.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> Symbol.regex = re.compile(r"[\w\s]+")
|
|
|
>>> class Key(str):
|
|
|
... grammar = ◊name()◊, "=", restline, endl
|
|
|
...
|
|
|
>>> class Section(◊Namespace◊):
|
|
|
... grammar = "[", ◊name()◊, "]", endl, maybe_some(Key)
|
|
|
...
|
|
|
>>> class IniFile(◊Namespace◊):
|
|
|
... grammar = some(Section)
|
|
|
...
|
|
|
>>> ini_file_text = """[Number 1]
|
|
|
... this=something
|
|
|
... that=something else
|
|
|
... [Number 2]
|
|
|
... once=anything
|
|
|
... twice=goes
|
|
|
... """
|
|
|
>>> ini_file = parse(ini_file_text, IniFile)
|
|
|
>>> ini_file◊["Number 2"]["once"]◊
|
|
|
'anything'
|
|
|
||
|
|
|
|
|
|
h4 > Composing
|
|
|
|
|
|
p >>
|
|
|
If a «Namespace» is composed, then its grammar is followed and
|
|
|
composed.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> ini_file◊["Number 1"]["that"]◊ = Key("new one")
|
|
|
>>> ini_file◊["Number 3"]◊ = Section()
|
|
|
>>> print(◊compose(ini_file)◊)
|
|
|
[Number 1]
|
|
|
this=something
|
|
|
that=new one
|
|
|
[Number 2]
|
|
|
once=anything
|
|
|
twice=goes
|
|
|
[Number 3]
|
|
|
||
|
|
|
|
|
|
h3 id=enum > Class Enum
|
|
|
|
|
|
h4 > Class definition
|
|
|
p > «Enum(Namespace)»
|
|
|
|
|
|
p >>
|
|
|
A Namespace which is treated as an Enum. Enums can only contain
|
|
|
«Keyword» or «Symbol» instances. An «Enum» cannot be modified after
|
|
|
creation. An «Enum» is allowed as the grammar of a «Symbol» only.
|
|
|
>>
|
|
|
|
|
|
h4 > Method «__init__(self, *things)»
|
|
|
|
|
|
p > Construct an «Enum» using a «tuple» of things.
|
|
|
|
|
|
h4 > Parsing
|
|
|
|
|
|
p >>
|
|
|
An «Enum» is parsed as a selection for possible values for a «Symbol».
|
|
|
If a value is parsed which is not member of the «Enum», a «SyntaxError»
|
|
|
is raised.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> class Type(Keyword):
|
|
|
... grammar = ◊Enum( K("int"), K("long") )◊
|
|
|
...
|
|
|
>>> parse("int", Type)
|
|
|
Type('int')
|
|
|
>>> parse("string", Type)
|
|
|
Traceback (most recent call last):
|
|
|
File "<stdin>", line 1, in <module>
|
|
|
File "pypeg2/__init__.py", line 382, in parse
|
|
|
t, r = parser.parse(text, thing)
|
|
|
File "pypeg2/__init__.py", line 469, in parse
|
|
|
raise r
|
|
|
File "<string>", line 1
|
|
|
string
|
|
|
^
|
|
|
SyntaxError: 'string' is not a member of Enum([Keyword('int'),
|
|
|
Keyword('long')])
|
|
|
>>>
|
|
|
||
|
|
|
|
|
|
h4 > Composing
|
|
|
|
|
|
p >>
|
|
|
When a «Symbol» is composed which has an «Enum» as its grammar, the
|
|
|
composed value is checked if it is a member of the «Enum». If not, a
|
|
|
«ValueError» is raised.
|
|
|
>>
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> class Type(Keyword):
|
|
|
... grammar = ◊Enum( K("int"), K("long") )◊
|
|
|
...
|
|
|
>>> t = Type("int")
|
|
|
>>> compose(t)
|
|
|
'int'
|
|
|
>>> t = Type("string")
|
|
|
>>> compose(t)
|
|
|
Traceback (most recent call last):
|
|
|
File "<stdin>", line 1, in <module>
|
|
|
File "pypeg2/__init__.py", line 403, in compose
|
|
|
return parser.compose(thing, grammar)
|
|
|
File "pypeg2/__init__.py", line 819, in compose
|
|
|
raise ValueError(repr(thing) + " is not in " + repr(grammar))
|
|
|
ValueError: Type('string') is not in Enum([Keyword('int'),
|
|
|
Keyword('long')])
|
|
|
||
|
|
|
|
|
|
h2 id=ggfunc > Grammar generator functions
|
|
|
|
|
|
p >>
|
|
|
Grammar generator function generate a piece of a «grammar». They're
|
|
|
meant to be used in a «grammar» directly.
|
|
|
>>
|
|
|
|
|
|
h3 id=some > Function some()
|
|
|
|
|
|
h4 > Synopsis
|
|
|
p > «some(*thing)»
|
|
|
|
|
|
p >>
|
|
|
At least one occurrence of thing, + operator. Inserts «-2» as
|
|
|
cardinality before thing.
|
|
|
>>
|
|
|
|
|
|
h4 > Parsing
|
|
|
|
|
|
p >>
|
|
|
Parsing «some()» parses at least one occurence of «thing», or as many
|
|
|
as there are. If there aren't things then a «SyntaxError» is generated.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> w = parse("hello world", ◊some(word)◊)
|
|
|
>>> w
|
|
|
['hello', 'world']
|
|
|
>>> w = parse("", ◊some(word)◊)
|
|
|
Traceback (most recent call last):
|
|
|
File "<stdin>", line 1, in <module>
|
|
|
File "pypeg2/__init__.py", line 390, in parse
|
|
|
t, r = parser.parse(text, thing)
|
|
|
File "pypeg2/__init__.py", line 477, in parse
|
|
|
raise r
|
|
|
File "<string>", line 1
|
|
|
|
|
|
^
|
|
|
SyntaxError: expecting match on \w+
|
|
|
||
|
|
|
|
|
|
h4 > Composing
|
|
|
|
|
|
p >>
|
|
|
Composing «some()» composes as many things as there are, but at least
|
|
|
one. If there is no matching thing, a «ValueError» is raised.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> class Words(List):
|
|
|
... grammar = ◊some(word, blank)◊
|
|
|
...
|
|
|
>>> compose(Words("hello", "world"))
|
|
|
'hello world '
|
|
|
>>> compose(Words())
|
|
|
Traceback (most recent call last):
|
|
|
File "<stdin>", line 1, in <module>
|
|
|
File "pypeg2/__init__.py", line 414, in compose
|
|
|
return parser.compose(thing, grammar)
|
|
|
File "pypeg2/__init__.py", line 931, in compose
|
|
|
result = compose_tuple(thing, thing[:], grammar)
|
|
|
File "pypeg2/__init__.py", line 886, in compose_tuple
|
|
|
raise ValueError("not enough things to compose")
|
|
|
ValueError: not enough things to compose
|
|
|
>>>
|
|
|
||
|
|
|
|
|
|
h3 id=maybesome > Function maybe_some()
|
|
|
|
|
|
h4 > Synopsis
|
|
|
p > «maybe_some(*thing)»
|
|
|
|
|
|
p >>
|
|
|
No thing or some of them, * operator. Inserts «-1» as cardinality
|
|
|
before thing.
|
|
|
>>
|
|
|
|
|
|
h4 > Parsing
|
|
|
|
|
|
p >>
|
|
|
Parsing «maybe_some()» parses all occurrences of «thing». If there
|
|
|
aren't things then the result is empty.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> parse("hello world", ◊maybe_some(word)◊)
|
|
|
['hello', 'world']
|
|
|
>>> parse("", ◊maybe_some(word)◊)
|
|
|
[]
|
|
|
||
|
|
|
|
|
|
h4 > Composing
|
|
|
|
|
|
p > Composing «maybe_some()» composes as many things as there are.
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> class Words(List):
|
|
|
... grammar = ◊maybe_some(word, blank)◊
|
|
|
...
|
|
|
>>> compose(Words("hello", "world"))
|
|
|
'hello world '
|
|
|
>>> compose(Words())
|
|
|
''
|
|
|
||
|
|
|
|
|
|
h3 id=optional > Function optional()
|
|
|
|
|
|
h4 > Synopsis
|
|
|
p > «optional(*thing)»
|
|
|
|
|
|
p > Thing or no thing, ? operator. Inserts «0» as cardinality before thing.
|
|
|
|
|
|
h4 > Parsing
|
|
|
|
|
|
p >>
|
|
|
Parsing «optional()» parses one occurrence of «thing». If there
|
|
|
aren't things then the result is empty.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> parse("hello", ◊optional(word)◊)
|
|
|
['hello']
|
|
|
>>> parse("", ◊optional(word)◊)
|
|
|
[]
|
|
|
>>> number = re.compile("[-+]?\d+")
|
|
|
>>> parse("-23 world", (◊optional(word)◊, number, word))
|
|
|
['-23', 'world']
|
|
|
||
|
|
|
|
|
|
h4 > Composing
|
|
|
|
|
|
p > Composing «optional()» composes one thing if there is any.
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> class OptionalWord(str):
|
|
|
... grammar = ◊optional(word)◊
|
|
|
...
|
|
|
>>> compose(OptionalWord("hello"))
|
|
|
'hello'
|
|
|
>>> compose(OptionalWord())
|
|
|
''
|
|
|
||
|
|
|
|
|
|
h3 id=csl > Function csl()
|
|
|
|
|
|
h4 > Synopsis
|
|
|
|
|
|
h5 > Python 3.x:
|
|
|
p > «csl(*thing, separator=",")»
|
|
|
|
|
|
h5 > Python 2.7:
|
|
|
p > «csl(*thing)»
|
|
|
|
|
|
p > Generate a grammar for a simple comma separated list.
|
|
|
|
|
|
p >>
|
|
|
«csl(Something)» generates
|
|
|
«Something, maybe_some(",", blank, Something)»
|
|
|
>>
|
|
|
|
|
|
h3 id=attr > Function attr()
|
|
|
|
|
|
h4 > Synopsis
|
|
|
p > «attr(name, thing=word, subtype=None)»
|
|
|
|
|
|
p >>
|
|
|
Generate an «Attribute» with that «name», referencing the «thing». An
|
|
|
«Attribute» is a «namedtuple("Attribute", ("name", "thing"))».
|
|
|
>>
|
|
|
|
|
|
h4 > Instance variables
|
|
|
|
|
|
glossary
|
|
|
term "Class" > reference to «Attribute» class generated by «namedtuple()»
|
|
|
|
|
|
h4 > Parsing
|
|
|
|
|
|
p >>
|
|
|
An «Attribute» is parsed following its grammar in «thing». The result
|
|
|
is not put into another thing directly; instead the result is added as
|
|
|
an attribute to containing thing.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> class Type(Keyword):
|
|
|
... grammar = Enum( K("int"), K("long") )
|
|
|
...
|
|
|
>>> class Parameter:
|
|
|
... grammar = ◊attr("typing", Type)◊, blank, name()
|
|
|
...
|
|
|
>>> p = parse("int a", Parameter)
|
|
|
>>> ◊p.typing◊
|
|
|
Type('int')
|
|
|
||
|
|
|
|
|
|
h4 > Composing
|
|
|
|
|
|
p > An «Attribute» is cmposed following its grammar in «thing».
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> p = Parameter()
|
|
|
>>> ◊p.typing◊ = K("int")
|
|
|
>>> p.name = "x"
|
|
|
>>> compose(p)
|
|
|
'int x'
|
|
|
||
|
|
|
|
|
|
h3 id=flag > Function flag()
|
|
|
|
|
|
h4 > Synopsis
|
|
|
p > «flag(name, thing=None)»
|
|
|
|
|
|
p >>
|
|
|
Generate an «Attribute» with that «name» which is valued «True» or
|
|
|
«False». If no «thing» is given, «Keyword(name)» is assumed.
|
|
|
>>
|
|
|
|
|
|
h4 > Parsing
|
|
|
|
|
|
p >>
|
|
|
A «flag» is usually a «Keyword» which can be there or not. If it is
|
|
|
there, the resulting value is «True». If it is not there, the resulting
|
|
|
value is «False».
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> class BoolLiteral(Symbol):
|
|
|
... grammar = Enum( K("True"), K("False") )
|
|
|
...
|
|
|
>>> class Fact:
|
|
|
... grammar = name(), K("is"), ◊flag("negated", K("not"))◊, \\
|
|
|
... attr("value", BoolLiteral)
|
|
|
...
|
|
|
>>> f1 = parse("a is not True", Fact)
|
|
|
>>> f2 = parse("b is False", Fact)
|
|
|
>>> f1.name
|
|
|
Symbol('a')
|
|
|
>>> f1.value
|
|
|
BoolLiteral('True')
|
|
|
>>> ◊f1.negated◊
|
|
|
True
|
|
|
>>> ◊f2.negated◊
|
|
|
False
|
|
|
||
|
|
|
|
|
|
h4 > Composing
|
|
|
|
|
|
p >>
|
|
|
If the «flag» is «True» compose the grammar. If the «flag» is «False»
|
|
|
don't compose anything.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> class ValidSign:
|
|
|
... grammar = ◊flag("invalid", K("not"))◊, blank, "valid"
|
|
|
...
|
|
|
>>> v = ValidSign()
|
|
|
>>> ◊v.invalid = True◊
|
|
|
>>> compose(v)
|
|
|
'◊not◊ valid'
|
|
|
||
|
|
|
|
|
|
h3 id=name > Function name()
|
|
|
|
|
|
h4 > Synopsis
|
|
|
p > «name()»
|
|
|
|
|
|
p >>
|
|
|
Generate a grammar for a Symbol with a name. This is a shortcut for
|
|
|
«attr("name", Symbol)».
|
|
|
>>
|
|
|
|
|
|
h3 id=ignore > Function ignore()
|
|
|
|
|
|
h4 > Synopsis
|
|
|
p > «ignore(*grammar)»
|
|
|
|
|
|
p > Ignore what matches to the grammar.
|
|
|
|
|
|
h4 > Parsing
|
|
|
|
|
|
p >>
|
|
|
Parse what's to be ignored. The result is added to an attribute
|
|
|
named «"_ignore" + str(i)» with i as a serial number.
|
|
|
>>
|
|
|
|
|
|
h4 > Composing
|
|
|
|
|
|
p >>
|
|
|
Compose the result as with any «attr()».
|
|
|
>>
|
|
|
|
|
|
h3 id=indent > Function indent()
|
|
|
|
|
|
h4 > Synopsis
|
|
|
p > «indent(*thing)»
|
|
|
|
|
|
p >>
|
|
|
Indent thing by one level.
|
|
|
>>
|
|
|
|
|
|
h4 > Parsing
|
|
|
|
|
|
p >>
|
|
|
The «indent» function has no meaning while parsing. The parameters are
|
|
|
parsed as if they would be in a «tuple».
|
|
|
>>
|
|
|
|
|
|
h4 > Composing
|
|
|
|
|
|
p >>
|
|
|
While composing the «indent» function increases the level of indention.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> class Instruction(str):
|
|
|
... grammar = word, ";", endl
|
|
|
...
|
|
|
>>> class Block(List):
|
|
|
... grammar = "{", endl, maybe_some(◊indent(Instruction)◊), "}"
|
|
|
...
|
|
|
>>> print(compose(Block(Instruction("first"), \\
|
|
|
... Instruction("second"))))
|
|
|
{
|
|
|
◊ first;◊
|
|
|
◊ second;◊
|
|
|
}
|
|
|
||
|
|
|
|
|
|
h3 id=contiguous > Function contiguous()
|
|
|
|
|
|
h4 > Synopsis
|
|
|
p > «contiguous(*thing)»
|
|
|
|
|
|
p >>
|
|
|
Temporary disable automated whitespace removing while parsing «thing».
|
|
|
>>
|
|
|
|
|
|
h4 > Parsing
|
|
|
|
|
|
p >>
|
|
|
While parsing whitespace removing is disabled. That means, if
|
|
|
whitespace is not part of the grammar, it will lead to a «SyntaxError»
|
|
|
if whitespace will be found between the parsed objects.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
class Path(List):
|
|
|
grammar = flag("relative", "."), maybe_some(Symbol, ".")
|
|
|
|
|
|
class Reference(GrammarElement):
|
|
|
grammar = ◊contiguous(◊attr("path", Path), name()◊)◊
|
|
|
||
|
|
|
|
|
|
h4 > Composing
|
|
|
|
|
|
p >>
|
|
|
While composing the «contiguous» function has no effect.
|
|
|
>>
|
|
|
|
|
|
h3 id=separated > Function separated()
|
|
|
|
|
|
h4 > Synopsis
|
|
|
p > «separated(*thing)»
|
|
|
|
|
|
p >>
|
|
|
Temporary enable automated whitespace removing while parsing «thing».
|
|
|
Whitespace removing is enabled by default. This function is for
|
|
|
temporary enabling whitespace removing after it was disabled with the
|
|
|
«contiguous» function.
|
|
|
>>
|
|
|
|
|
|
h4 > Parsing
|
|
|
|
|
|
p >>
|
|
|
While parsing whitespace removing is enabled again. That means, if
|
|
|
whitespace is not part of the grammar, it will be omitted if whitespace
|
|
|
will be found between parsed objects.
|
|
|
>>
|
|
|
|
|
|
h4 > Composing
|
|
|
|
|
|
p >>
|
|
|
While composing the «separated» function has no effect.
|
|
|
>>
|
|
|
|
|
|
h3 id=omit > Function omit()
|
|
|
|
|
|
h4 > Synopsis
|
|
|
p > «omit(*thing)»
|
|
|
|
|
|
p >>
|
|
|
Omit what matches the grammar. This function cuts out «thing» and
|
|
|
throws it away.
|
|
|
>>
|
|
|
|
|
|
h4 > Parsing
|
|
|
|
|
|
p >>
|
|
|
While parsing «omit()» cuts out what matches the grammar «thing» and
|
|
|
throws it away.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> p = parse("hello", omit(Symbol))
|
|
|
>>> print(p)
|
|
|
None
|
|
|
>>> _
|
|
|
||
|
|
|
|
|
|
h4 > Composing
|
|
|
|
|
|
p >>
|
|
|
While composing «omit()» does not compose text for what matches the
|
|
|
grammar «thing».
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code
|
|
|
||
|
|
|
>>> compose(Symbol('hello'), omit(Symbol))
|
|
|
''
|
|
|
>>> _
|
|
|
||
|
|
|
|
|
|
h2 id=callbacks > Callback functions
|
|
|
|
|
|
p >>
|
|
|
Callback functions are called while composing only. They're ignored
|
|
|
while parsing.
|
|
|
>>
|
|
|
|
|
|
h3 id=blank > Callback function blank()
|
|
|
|
|
|
h4 > Synopsis
|
|
|
p > «blank(thing, parser)»
|
|
|
|
|
|
p > Space marker for composing text.
|
|
|
|
|
|
p > «blank» is outputting a space character (ASCII 32) when called.
|
|
|
|
|
|
h3 id=endl > Callback function endl()
|
|
|
|
|
|
h4 > Synopsis
|
|
|
p > «endl(thing, parser)»
|
|
|
|
|
|
p > End of line marker for composing text.
|
|
|
|
|
|
p >>
|
|
|
«endl» is outputting a linefeed charater (ASCII 10) when called. The
|
|
|
indention system reacts when reading «endl» while composing.
|
|
|
>>
|
|
|
|
|
|
h3 id=udcf > User defined callback functions
|
|
|
|
|
|
h4 > Synopsis
|
|
|
p > «callback_function(thing, parser)»
|
|
|
|
|
|
p >>
|
|
|
Arbitrary callback functions can be defined and put into the «grammar».
|
|
|
They will be called while composing.
|
|
|
>>
|
|
|
|
|
|
p > Example:
|
|
|
|
|
|
Code {
|
|
|
""">>> class Instruction(str):
|
|
|
... ◊def heading(self, parser):◊
|
|
|
... ◊ return "/* on level " + str(parser.indention_level) \\\\◊
|
|
|
... ◊ + " */", endl◊
|
|
|
... grammar = ◊heading◊, word, ";", endl
|
|
|
...
|
|
|
>>> print(compose(Instruction("do_this")))
|
|
|
◊/* on level 0 */◊
|
|
|
do_this;
|
|
|
"""
|
|
|
}
|
|
|
|
|
|
h2 id=common > Common class methods for grammar elements
|
|
|
|
|
|
p >>
|
|
|
If a method of the following is present in a grammar element, it will
|
|
|
override the standard behaviour.
|
|
|
>>
|
|
|
|
|
|
h3 id=override_parse > parse() class method of a grammar element
|
|
|
|
|
|
h4 > Synopsis
|
|
|
p > «parse(cls, parser, text, pos)»
|
|
|
|
|
|
p >>
|
|
|
Overwrites the parsing behaviour. If present, this class method is
|
|
|
called at each place the grammar references the grammar element instead
|
|
|
of automatic parsing.
|
|
|
>>
|
|
|
|
|
|
glossary {
|
|
|
term "cls" > class object of the grammar element
|
|
|
term "parser" > parser object which is calling
|
|
|
term "text" > text to be parsed
|
|
|
term "pos" > «(lineNo, charInText)» with positioning information
|
|
|
}
|
|
|
|
|
|
h3 id=override_compose > compose() method of a grammar element
|
|
|
|
|
|
h4 > Synopsis
|
|
|
p > «compose(cls, parser)»
|
|
|
|
|
|
p >>
|
|
|
Overwrites the composing behaviour. If present, this class method is
|
|
|
called at each place the grammar references the grammar element instead
|
|
|
of automatic composing.
|
|
|
>>
|
|
|
|
|
|
glossary {
|
|
|
term "cls" > class object of the grammar element
|
|
|
term "parser" > parser object which is calling
|
|
|
}
|
|
|
|
|
|
div id="bottom" {
|
|
|
"Want to download? Go to the "
|
|
|
a "#top", "^Top^"; " and look to the right ;-)"
|
|
|
}
|
|
|
}
|