plainbox.impl.xparsers – parsers for various plainbox formats
This module contains parsers for several formats that plainbox has to deal with. They are not real parsers (most of the time the formats can be handled with simple regular expressions) but rather simple top-down parsing snippets spread across a few classes.
What is interesting, though, is the set of classes, their relationships and their attributes, as knowing them helps when working with the code.
Node and Visitor
The basic class for everything parsed is Node. It contains two attributes, Node.lineno and Node.col_offset (mimicking the Python AST), and a similar, but not identical, visitor mechanism. The precise way in which the visitor class operates is documented on Visitor. In general, application code can freely explore the AST but cannot modify it, as everything is strictly read-only.
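The visitor mechanism can be illustrated with a minimal, self-contained sketch. It does not import plainbox: SketchNode, SketchVisitor and the way children are discovered are hypothetical stand-ins. Only the visit_<ClassName>_node method naming and the generic_visit fallback are taken from the Visitor documentation on this page.

```python
class SketchNode:
    """Hypothetical stand-in for Node: a position plus arbitrary fields."""

    def __init__(self, lineno=0, col_offset=0, **fields):
        self.lineno = lineno
        self.col_offset = col_offset
        self.fields = fields

    def visit(self, visitor):
        # Delegate to the visitor, which picks a method based on our class.
        return visitor.visit(self)


class SketchVisitor:
    """Hypothetical stand-in for Visitor using name-based dispatch."""

    def visit(self, node):
        # Look for visit_<ClassName>_node, fall back to generic_visit.
        name = "visit_{}_node".format(type(node).__name__)
        method = getattr(self, name, self.generic_visit)
        return method(node)

    def generic_visit(self, node):
        # Walk every field and descend into nodes and lists of nodes.
        for value in node.fields.values():
            children = value if isinstance(value, list) else [value]
            for child in children:
                if isinstance(child, SketchNode):
                    self.visit(child)


class Text(SketchNode):
    pass


class Group(SketchNode):
    pass


class CollectText(SketchVisitor):
    """Collect the text of every Text node, in traversal order."""

    def __init__(self):
        self.seen = []

    def visit_Text_node(self, node):
        self.seen.append(node.fields["text"])
        return self.generic_visit(node)


collector = CollectText()
Group(items=[Text(text="foo"), Text(text="bar")]).visit(collector)
print(collector.seen)
```

Because the visitor only reads from the nodes, this style fits a read-only AST: all state that changes during traversal lives on the visitor object.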
Regular expressions
We have to deal with regular expressions in many places, so there’s a dedicated AST node for handling them. The root class is Re, but it’s just a base for the three concrete sub-classes ReErr, ReFixed and RePattern. ReErr is an error wrapper (used when the regular expression is incorrect and doesn’t work); the other two, which also share a common base class ReOk, can be used to do text matching. Since other parts of the code already contain optimizations for regular expressions that are just a plain string comparison, there is a special class to highlight that fact (ReFixed).
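The three-way split between fixed strings, real patterns and broken patterns can be sketched with the standard re module alone. This is an illustration, not the plainbox implementation: the classify function and the META_CHARS set are assumptions made for the example, and plainbox may decide what counts as a plain string differently.

```python
import re

# Assumption: any of these characters makes the text "not a plain string".
# plainbox may use a different test; this set is only illustrative.
META_CHARS = set(".^$*+?{}[]\\|()")


def classify(text):
    """Return ("fixed", text), ("pattern", compiled) or ("error", exc)."""
    if not META_CHARS.intersection(text):
        # No metacharacters: a plain string comparison is enough.
        return ("fixed", text)
    try:
        return ("pattern", re.compile(text))
    except re.error as exc:
        # Broken patterns are reported as data, not raised.
        return ("error", exc)


for sample in ("text", "pa[tT]ern", "+"):
    kind, _ = classify(sample)
    print(sample, "->", kind)
```

Returning the error as a value mirrors the role of ReErr: the caller can inspect the problem later instead of handling an exception at parse time.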
White Lists
White lists are a poor man’s test plan which describes a list of regular expressions with optional comments. The root class is WhiteList, whose WhiteList.entries attribute contains a sequence of entries, each either a Comment or an instance of a subclass of Re.
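A rough, self-contained sketch of how whitelist text might be split into pattern and comment entries is shown here. It is inferred from the doctests in the WhiteList.parse documentation on this page and is not the real parser: Entry and split_whitelist are hypothetical names, and treating the first '#' on a line as the start of a comment is an assumption.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    """Hypothetical flat record standing in for Comment and Re nodes."""
    kind: str        # "pattern" or "comment"
    text: str        # raw text, whitespace preserved
    lineno: int      # 1-based line number
    col_offset: int  # 0-based column offset


def split_whitelist(text: str) -> List[Entry]:
    entries = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Assumption: everything from the first '#' onwards is a comment.
        hash_at = line.find("#")
        data = line if hash_at == -1 else line[:hash_at]
        # Whitespace-only data carries no meaning and is skipped,
        # but meaningful data keeps its surrounding whitespace.
        if data.strip():
            entries.append(Entry("pattern", data, lineno, 0))
        if hash_at != -1:
            entries.append(Entry("comment", line[hash_at:], lineno, hash_at))
    return entries


for entry in split_whitelist("line 1\n\nfoo # pick foo\n"):
    print(entry)
```

Keeping the raw text untouched (for example 'foo ' with its trailing space) matches the idea that the AST records what was written, while any later interpretation of the pattern happens elsewhere.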
-
class
plainbox.impl.xparsers.
Comment
(*args, **kwargs)[source]¶ Bases:
plainbox.impl.xparsers.Node
node representing single comment
-
as_dict
() → dict¶ Return the data in this POD as a dictionary.
Note
UNSET values are not added to the dictionary.
-
as_tuple
() → tuple¶ Return the data in this POD as a tuple.
Order of elements in the tuple corresponds to the order of field declarations.
-
col_offset
¶ Column offset (0-based)
- Side effects of assign filters:
- type-checked (value must be of type int)
- not negative
- constant (read-only after initialization)
-
comment
¶ comment text, including any comment markers
- Side effects of assign filters:
- type-checked (value must be of type str)
- constant (read-only after initialization)
-
enumerate_entries
() → 'Generator[node]'¶
-
field_list
= [<Field name:'lineno'>, <Field name:'col_offset'>, <Field name:'comment'>]¶
-
lineno
¶ Line number (1-based)
- Side effects of assign filters:
- type-checked (value must be of type int)
- not negative
- constant (read-only after initialization)
-
visit
(visitor: 'Visitor')¶ Visit all of the sub-nodes reachable from this node
Parameters: visitor – Visitor object that gets to explore this and all the other nodes Returns: The return value of the visitor’s Visitor.visit()
method, if any. The default visitor doesn’t return anything.
-
-
class
plainbox.impl.xparsers.
Node
(*args, **kwargs)[source]¶ Bases:
plainbox.impl.pod.POD
base node type
-
as_dict
() → dict¶ Return the data in this POD as a dictionary.
Note
UNSET values are not added to the dictionary.
-
as_tuple
() → tuple¶ Return the data in this POD as a tuple.
Order of elements in the tuple corresponds to the order of field declarations.
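The relationship between as_dict, as_tuple and the field list can be sketched without plainbox. SketchPOD and the UNSET sentinel below are hypothetical stand-ins written for this example. Whether the real as_tuple includes UNSET values is an assumption here, since the documentation only states that as_dict omits them.

```python
class _Unset:
    """Hypothetical sentinel marking a field that was never assigned."""

    def __repr__(self):
        return "UNSET"


UNSET = _Unset()


class SketchPOD:
    """Hypothetical plain-old-data class with an ordered field list."""

    field_list = ["lineno", "col_offset"]

    def __init__(self, **kwargs):
        for name in self.field_list:
            setattr(self, name, kwargs.get(name, UNSET))

    def as_tuple(self):
        # One element per field, in declaration order (UNSET included,
        # which is an assumption of this sketch).
        return tuple(getattr(self, name) for name in self.field_list)

    def as_dict(self):
        # Same data keyed by field name, with UNSET values left out.
        return {
            name: getattr(self, name)
            for name in self.field_list
            if getattr(self, name) is not UNSET
        }


pod = SketchPOD(lineno=3)
print(pod.as_tuple())
print(pod.as_dict())
```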
-
col_offset
¶ Column offset (0-based)
- Side effects of assign filters:
- type-checked (value must be of type int)
- not negative
- constant (read-only after initialization)
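The three assign filters listed above (type-checked, not negative, constant) can be approximated with an ordinary Python descriptor. This is only a sketch of the observable behaviour: CheckedField and Position are hypothetical names, and the exception types chosen here are assumptions rather than what plainbox raises.

```python
class CheckedField:
    """Hypothetical descriptor: int-only, non-negative, assign-once."""

    def __set_name__(self, owner, name):
        self.name = name
        self.slot = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return getattr(instance, self.slot)

    def __set__(self, instance, value):
        # constant: refuse a second assignment
        if hasattr(instance, self.slot):
            raise AttributeError("{} is read-only".format(self.name))
        # type-checked: value must be an int
        if not isinstance(value, int):
            raise TypeError("{} must be of type int".format(self.name))
        # not negative
        if value < 0:
            raise ValueError("{} must not be negative".format(self.name))
        setattr(instance, self.slot, value)


class Position:
    lineno = CheckedField()
    col_offset = CheckedField()

    def __init__(self, lineno, col_offset):
        self.lineno = lineno
        self.col_offset = col_offset


pos = Position(lineno=1, col_offset=0)
try:
    pos.col_offset = 5
except AttributeError as exc:
    print("rejected:", exc)
```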
-
field_list
= [<Field name:'lineno'>, <Field name:'col_offset'>]¶
-
lineno
¶ Line number (1-based)
- Side effects of assign filters:
- type-checked (value must be of type int)
- not negative
- constant (read-only after initialization)
-
visit
(visitor: 'Visitor')[source]¶ Visit all of the sub-nodes reachable from this node
Parameters: visitor – Visitor object that gets to explore this and all the other nodes Returns: The return value of the visitor’s Visitor.visit()
method, if any. The default visitor doesn’t return anything.
-
-
class
plainbox.impl.xparsers.
Re
(*args, **kwargs)[source]¶ Bases:
plainbox.impl.xparsers.Node
node representing a regular expression
-
as_dict
() → dict¶ Return the data in this POD as a dictionary.
Note
UNSET values are not added to the dictionary.
-
as_tuple
() → tuple¶ Return the data in this POD as a tuple.
Order of elements in the tuple corresponds to the order of field declarations.
-
col_offset
¶ Column offset (0-based)
- Side effects of assign filters:
- type-checked (value must be of type int)
- not negative
- constant (read-only after initialization)
-
enumerate_entries
() → 'Generator[node]'¶
-
field_list
= [<Field name:'lineno'>, <Field name:'col_offset'>, <Field name:'text'>]¶
-
lineno
¶ Line number (1-based)
- Side effects of assign filters:
- type-checked (value must be of type int)
- not negative
- constant (read-only after initialization)
-
static
parse
(text: str, lineno: int=0, col_offset: int=0) → 'Re'[source]¶ Parse a bit of text and return a concrete subclass of Re
Parameters: text – The text to parse
Returns: If text is a correct regular expression then an instance of ReOk is returned. In practice exactly one of ReFixed or RePattern may be returned. If text is incorrect then an instance of ReErr is returned.
Examples:
>>> Re.parse("text")
ReFixed(text='text')
>>> Re.parse("pa[tT]ern")
RePattern(text='pa[tT]ern', re=re.compile('pa[tT]ern'))
>>> from sre_constants import error
>>> Re.parse("+")
ReErr(text='+', exc=error('nothing to repeat',))
-
text
¶ Text of the regular expression (perhaps invalid)
- Side effects of assign filters:
- type-checked (value must be of type str)
- constant (read-only after initialization)
-
visit
(visitor: 'Visitor')¶ Visit all of the sub-nodes reachable from this node
Parameters: visitor – Visitor object that gets to explore this and all the other nodes Returns: The return value of the visitor’s Visitor.visit()
method, if any. The default visitor doesn’t return anything.
-
-
class
plainbox.impl.xparsers.
ReErr
(*args, **kwargs)[source]¶ Bases:
plainbox.impl.xparsers.Re
node representing an incorrect regular expression
-
as_dict
() → dict¶ Return the data in this POD as a dictionary.
Note
UNSET values are not added to the dictionary.
-
as_tuple
() → tuple¶ Return the data in this POD as a tuple.
Order of elements in the tuple corresponds to the order of field declarations.
-
col_offset
¶ Column offset (0-based)
- Side effects of assign filters:
- type-checked (value must be of type int)
- not negative
- constant (read-only after initialization)
-
enumerate_entries
() → 'Generator[node]'¶
-
exc
¶ exception describing the problem
- Side effects of assign filters:
- type-checked (value must be of type Exception)
- constant (read-only after initialization)
-
field_list
= [<Field name:'lineno'>, <Field name:'col_offset'>, <Field name:'text'>, <Field name:'exc'>]¶
-
lineno
¶ Line number (1-based)
- Side effects of assign filters:
- type-checked (value must be of type int)
- not negative
- constant (read-only after initialization)
-
parse
(text: str, lineno: int=0, col_offset: int=0) → 'Re'¶ Parse a bit of text and return a concrete subclass of Re
Parameters: text – The text to parse
Returns: If text is a correct regular expression then an instance of ReOk is returned. In practice exactly one of ReFixed or RePattern may be returned. If text is incorrect then an instance of ReErr is returned.
Examples:
>>> Re.parse("text")
ReFixed(text='text')
>>> Re.parse("pa[tT]ern")
RePattern(text='pa[tT]ern', re=re.compile('pa[tT]ern'))
>>> from sre_constants import error
>>> Re.parse("+")
ReErr(text='+', exc=error('nothing to repeat',))
-
text
¶ Text of the regular expression (perhaps invalid)
- Side effects of assign filters:
- type-checked (value must be of type str)
- constant (read-only after initialization)
-
visit
(visitor: 'Visitor')¶ Visit all of the sub-nodes reachable from this node
Parameters: visitor – Visitor object that gets to explore this and all the other nodes Returns: The return value of the visitor’s Visitor.visit()
method, if any. The default visitor doesn’t return anything.
-
-
class
plainbox.impl.xparsers.
ReFixed
(*args, **kwargs)[source]¶ Bases:
plainbox.impl.xparsers.ReOk
node representing a trivial regular expression (fixed string)
-
as_dict
() → dict¶ Return the data in this POD as a dictionary.
Note
UNSET values are not added to the dictionary.
-
as_tuple
() → tuple¶ Return the data in this POD as a tuple.
Order of elements in the tuple corresponds to the order of field declarations.
-
col_offset
¶ Column offset (0-based)
- Side effects of assign filters:
- type-checked (value must be of type int)
- not negative
- constant (read-only after initialization)
-
enumerate_entries
() → 'Generator[node]'¶
-
field_list
= [<Field name:'lineno'>, <Field name:'col_offset'>, <Field name:'text'>]¶
-
lineno
¶ Line number (1-based)
- Side effects of assign filters:
- type-checked (value must be of type int)
- not negative
- constant (read-only after initialization)
-
parse
(text: str, lineno: int=0, col_offset: int=0) → 'Re'¶ Parse a bit of text and return a concrete subclass of Re
Parameters: text – The text to parse
Returns: If text is a correct regular expression then an instance of ReOk is returned. In practice exactly one of ReFixed or RePattern may be returned. If text is incorrect then an instance of ReErr is returned.
Examples:
>>> Re.parse("text")
ReFixed(text='text')
>>> Re.parse("pa[tT]ern")
RePattern(text='pa[tT]ern', re=re.compile('pa[tT]ern'))
>>> from sre_constants import error
>>> Re.parse("+")
ReErr(text='+', exc=error('nothing to repeat',))
-
text
¶ Text of the regular expression (perhaps invalid)
- Side effects of assign filters:
- type-checked (value must be of type str)
- constant (read-only after initialization)
-
visit
(visitor: 'Visitor')¶ Visit all of the sub-nodes reachable from this node
Parameters: visitor – Visitor object that gets to explore this and all the other nodes Returns: The return value of the visitor’s Visitor.visit()
method, if any. The default visitor doesn’t return anything.
-
-
class
plainbox.impl.xparsers.
ReOk
(*args, **kwargs)[source]¶ Bases:
plainbox.impl.xparsers.Re
node representing a correct regular expression
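Since ReOk nodes exist to do text matching, the difference between the fixed-string fast path and a full pattern match can be sketched as follows. make_matcher is a hypothetical helper, not part of plainbox. The metacharacter set used to detect plain strings and the choice of whole-string matching via re.fullmatch are both assumptions of this sketch.

```python
import re


def make_matcher(text):
    """Return a callable that tests a candidate string against text."""
    # Assumption: text without these metacharacters is a plain string.
    if not set(".^$*+?{}[]\\|()").intersection(text):
        # Fast path: a fixed string needs only an equality check.
        return lambda candidate: candidate == text
    compiled = re.compile(text)
    # Assumption: the whole candidate must match, hence fullmatch().
    return lambda candidate: compiled.fullmatch(candidate) is not None


print(make_matcher("foo")("foo"))
print(make_matcher("foo")("f"))
print(make_matcher("[fF]oo")("Foo"))
```

The equality check avoids the regular expression engine entirely, which is the kind of optimization the ReFixed class is meant to make visible.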
-
as_dict
() → dict¶ Return the data in this POD as a dictionary.
Note
UNSET values are not added to the dictionary.
-
as_tuple
() → tuple¶ Return the data in this POD as a tuple.
Order of elements in the tuple corresponds to the order of field declarations.
-
col_offset
¶ Column offset (0-based)
- Side effects of assign filters:
- type-checked (value must be of type int)
- not negative
- constant (read-only after initialization)
-
enumerate_entries
() → 'Generator[node]'¶
-
field_list
= [<Field name:'lineno'>, <Field name:'col_offset'>, <Field name:'text'>]¶
-
lineno
¶ Line number (1-based)
- Side effects of assign filters:
- type-checked (value must be of type int)
- not negative
- constant (read-only after initialization)
-
match
(text: str) → bool[source]¶ Check if the given text matches the expression
This method is provided by all of the subclasses of ReOk; sometimes the implementation is faster than a naive regular expression match.
>>> Re.parse("foo").match("foo")
True
>>> Re.parse("foo").match("f")
False
>>> Re.parse("[fF]oo").match("foo")
True
>>> Re.parse("[fF]oo").match("Foo")
True
-
parse
(text: str, lineno: int=0, col_offset: int=0) → 'Re'¶ Parse a bit of text and return a concrete subclass of Re
Parameters: text – The text to parse
Returns: If text is a correct regular expression then an instance of ReOk is returned. In practice exactly one of ReFixed or RePattern may be returned. If text is incorrect then an instance of ReErr is returned.
Examples:
>>> Re.parse("text")
ReFixed(text='text')
>>> Re.parse("pa[tT]ern")
RePattern(text='pa[tT]ern', re=re.compile('pa[tT]ern'))
>>> from sre_constants import error
>>> Re.parse("+")
ReErr(text='+', exc=error('nothing to repeat',))
-
text
¶ Text of the regular expression (perhaps invalid)
- Side effects of assign filters:
- type-checked (value must be of type str)
- constant (read-only after initialization)
-
visit
(visitor: 'Visitor')¶ Visit all of the sub-nodes reachable from this node
Parameters: visitor – Visitor object that gets to explore this and all the other nodes Returns: The return value of the visitor’s Visitor.visit()
method, if any. The default visitor doesn’t return anything.
-
-
class
plainbox.impl.xparsers.
RePattern
(*args, **kwargs)[source]¶ Bases:
plainbox.impl.xparsers.ReOk
node representing a regular expression pattern
-
as_dict
() → dict¶ Return the data in this POD as a dictionary.
Note
UNSET values are not added to the dictionary.
-
as_tuple
() → tuple¶ Return the data in this POD as a tuple.
Order of elements in the tuple corresponds to the order of field declarations.
-
col_offset
¶ Column offset (0-based)
- Side effects of assign filters:
- type-checked (value must be of type int)
- not negative
- constant (read-only after initialization)
-
enumerate_entries
() → 'Generator[node]'¶
-
field_list
= [<Field name:'lineno'>, <Field name:'col_offset'>, <Field name:'text'>, <Field name:'re'>]¶
-
lineno
¶ Line number (1-based)
- Side effects of assign filters:
- type-checked (value must be of type int)
- not negative
- constant (read-only after initialization)
-
parse
(text: str, lineno: int=0, col_offset: int=0) → 'Re'¶ Parse a bit of text and return a concrete subclass of Re
Parameters: text – The text to parse
Returns: If text is a correct regular expression then an instance of ReOk is returned. In practice exactly one of ReFixed or RePattern may be returned. If text is incorrect then an instance of ReErr is returned.
Examples:
>>> Re.parse("text")
ReFixed(text='text')
>>> Re.parse("pa[tT]ern")
RePattern(text='pa[tT]ern', re=re.compile('pa[tT]ern'))
>>> from sre_constants import error
>>> Re.parse("+")
ReErr(text='+', exc=error('nothing to repeat',))
-
re
¶ regular expression object
- Side effects of assign filters:
- type-checked (value must be of type SRE_Pattern)
- constant (read-only after initialization)
-
text
¶ Text of the regular expression (perhaps invalid)
- Side effects of assign filters:
- type-checked (value must be of type str)
- constant (read-only after initialization)
-
visit
(visitor: 'Visitor')¶ Visit all of the sub-nodes reachable from this node
Parameters: visitor – Visitor object that gets to explore this and all the other nodes Returns: The return value of the visitor’s Visitor.visit()
method, if any. The default visitor doesn’t return anything.
-
-
class
plainbox.impl.xparsers.
Visitor
[source]¶ Bases:
object
Class assisting in traversing Node trees.
This class can be used to explore the AST of any of the plainbox-parsed text formats. The way to use this class is to create a custom sub-class of the Visitor class and to define methods that correspond to the class of node one is interested in.
Example:
>>> class Text(Node):
...     text = F("text", str)
>>> class Group(Node):
...     items = F("items", list)
>>> class demo_visitor(Visitor):
...     def visit_Text_node(self, node: Text):
...         print("visiting text node: {}".format(node.text))
...         return self.generic_visit(node)
...     def visit_Group_node(self, node: Group):
...         print("visiting list node")
...         return self.generic_visit(node)
>>> Group(items=[
...     Text(text="foo"), Text(text="bar")
... ]).visit(demo_visitor())
visiting list node
visiting text node: foo
visiting text node: bar
-
class
plainbox.impl.xparsers.
WhiteList
(*args, **kwargs)[source]¶ Bases:
plainbox.impl.xparsers.Node
node representing a whole plainbox whitelist
-
as_dict
() → dict¶ Return the data in this POD as a dictionary.
Note
UNSET values are not added to the dictionary.
-
as_tuple
() → tuple¶ Return the data in this POD as a tuple.
Order of elements in the tuple corresponds to the order of field declarations.
-
col_offset
¶ Column offset (0-based)
- Side effects of assign filters:
- type-checked (value must be of type int)
- not negative
- constant (read-only after initialization)
-
entries
¶ a list of comments and patterns
- Side effects of assign filters:
- type-checked (value must be of type list)
- type-checked sequence (items must be of type Node)
- constant (read-only after initialization)
-
enumerate_entries
() → 'Generator[node]'¶
-
field_list
= [<Field name:'lineno'>, <Field name:'col_offset'>, <Field name:'entries'>]¶
-
lineno
¶ Line number (1-based)
- Side effects of assign filters:
- type-checked (value must be of type int)
- not negative
- constant (read-only after initialization)
-
static
parse
(text: str, lineno: int=1, col_offset: int=0) → 'WhiteList'[source]¶ Parse a plainbox whitelist
An empty string is still a valid (though empty) whitelist:
>>> WhiteList.parse("")
WhiteList(entries=[])
White space is irrelevant and gets ignored if it’s not of any semantic value. Since whitespace was never a part of the de-facto allowed pattern syntax, one cannot create a job with " ".
>>> WhiteList.parse(" ")
WhiteList(entries=[])
As soon as there’s something interesting, though, it starts to have meaning. Note that we differentiate the raw text ' a ' from the pattern object it represents, '^namespace::a$', but at this time, when we parse the text, this contextual, semantic information is not available and is not a part of the AST.
>>> WhiteList.parse(" data ")
WhiteList(entries=[ReFixed(text=' data ')])
Data gets separated into line-based records. Any number of lines may exist in a single whitelist.
>>> WhiteList.parse("line")
WhiteList(entries=[ReFixed(text='line')])
>>> WhiteList.parse("line 1\nline 2\n")
WhiteList(entries=[ReFixed(text='line 1'), ReFixed(text='line 2')])
Empty lines are just ignored. You can re-create them by observing the lack of continuity in the values of the lineno field.
>>> WhiteList.parse("line 1\n\nline 3\n")
WhiteList(entries=[ReFixed(text='line 1'), ReFixed(text='line 3')])
Data can be mixed with comments. Note that col_offset is finally non-zero here, as the comment starts on the fourth character into the line:
>>> WhiteList.parse("foo # pick foo")
...
WhiteList(entries=[ReFixed(text='foo '), Comment(comment='# pick foo')])
Comments can also exist without any data:
>>> WhiteList.parse("# this is a comment")
WhiteList(entries=[Comment(comment='# this is a comment')])
Lastly, there are no exceptions at this stage. Broken patterns are represented as such, but no exceptions are ever raised:
>>> WhiteList.parse("[]")
...
WhiteList(entries=[ReErr(text='[]', exc=error('un...',))])
-
visit
(visitor: 'Visitor')¶ Visit all of the sub-nodes reachable from this node
Parameters: visitor – Visitor object that gets to explore this and all the other nodes Returns: The return value of the visitor’s Visitor.visit()
method, if any. The default visitor doesn’t return anything.
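The WhiteList.parse documentation notes that ignored empty lines can be re-created from gaps in the lineno values of the entries. A small sketch of that idea follows. missing_lines is a hypothetical helper, and the sample line numbers are assumed values for the entries of "line 1\n\nline 3\n", not output captured from plainbox.

```python
def missing_lines(linenos):
    """Return line numbers skipped between consecutive entries."""
    gaps = []
    for prev, cur in zip(linenos, linenos[1:]):
        # Any jump larger than one means lines without entries in between.
        gaps.extend(range(prev + 1, cur))
    return gaps


# Assumed lineno values for the two entries of "line 1\n\nline 3\n".
print(missing_lines([1, 3]))
```

Entries that share a line, such as a pattern followed by a comment, produce no gap, because the range between equal line numbers is empty.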
-