module

regexr.string

Regular expressions for humans

Predefined patterns:

START = Raw("^", entire=True)
START_OF_STRING = Raw(r"\A", entire=True)
END = Raw("$", entire=True)
END_OF_STRING = Raw(r"\Z", entire=True)
NUMBER = DIGIT = Raw(r"\d", entire=True)
NUMBERS = DIGITS = Raw(r"\d+", entire=True)
MAYBE_NUMBERS = MAYBE_DIGITS = Raw(r"\d*", entire=True)
NON_NUMBER = NON_DIGIT = Raw(r"\D", entire=True)
WORD = Raw(r"\w", entire=True)
WORDS = Raw(r"\w+", entire=True)
MAYBE_WORDS = Raw(r"\w*", entire=True)
NON_WORD = Raw(r"\W", entire=True)
WORD_BOUNDARY = Raw(r"\b", entire=True)
NON_WORD_BOUNDARY = Raw(r"\B", entire=True)
WHITESPACE = Raw(r"\s", entire=True)
WHITESPACES = Raw(r"\s+", entire=True)
MAYBE_WHITESPACES = Raw(r"\s*", entire=True)
NON_WHITESPACE = Raw(r"\S", entire=True)
SPACE = Raw(" ", entire=True)
SPACES = Raw(" +", entire=True)
MAYBE_SPACES = Raw(" *", entire=True)
TAB = Raw(r"\t", entire=True)
DOT = Raw(r"\.", entire=True)
ANYCHAR = Raw(".", entire=True)
ANYCHARS = Raw(".+", entire=True)
MAYBE_ANYCHARS = Raw(".*", entire=True)
LETTER = Raw("[a-zA-Z]", entire=True)
LETTERS = Raw("[a-zA-Z]+", entire=True)
MAYBE_LETTERS = Raw("[a-zA-Z]*", entire=True)
LOWERCASE = Raw("[a-z]", entire=True)
LOWERCASES = Raw("[a-z]+", entire=True)
MAYBE_LOWERCASES = Raw("[a-z]*", entire=True)
UPPERCASE = Raw("[A-Z]", entire=True)
UPPERCASES = Raw("[A-Z]+", entire=True)
MAYBE_UPPERCASES = Raw("[A-Z]*", entire=True)
ALNUM = Raw("[a-zA-Z0-9]", entire=True)
ALNUMS = Raw("[a-zA-Z0-9]+", entire=True)
MAYBE_ALNUMS = Raw("[a-zA-Z0-9]*", entire=True)
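The constants above wrap ordinary regular-expression fragments. As a rough illustration using only the standard `re` module, the sketch below applies a few of those fragments directly. The `FRAGMENTS` dictionary and the sample strings are hypothetical and not part of regexr.

```python
import re

# Hypothetical stand-in table: a few of the raw fragments listed above,
# keyed by constant name. This is not regexr's own data structure.
FRAGMENTS = {
    "DIGITS": r"\d+",
    "DOT": r"\.",
    "LETTERS": "[a-zA-Z]+",
    "ALNUMS": "[a-zA-Z0-9]+",
    "START": "^",
    "START_OF_STRING": r"\A",
}

text = "v2.10 beta"
digits = re.findall(FRAGMENTS["DIGITS"], text)    # runs of digits
letters = re.findall(FRAGMENTS["LETTERS"], text)  # runs of ASCII letters
dots = re.findall(FRAGMENTS["DOT"], text)         # literal dots only
alnums = re.findall(FRAGMENTS["ALNUMS"], text)    # runs of letters and digits

# With re.MULTILINE, "^" matches at every line start,
# while "\A" matches only at the very start of the string.
lines = "one\ntwo"
per_line = re.findall(FRAGMENTS["START"] + r"\w+", lines, re.MULTILINE)
whole = re.findall(FRAGMENTS["START_OF_STRING"] + r"\w+", lines, re.MULTILINE)

print(digits, letters, dots, alnums, per_line, whole)
```

The last two results show the practical difference between START and START_OF_STRING: the first finds a word on each line, the second only at the beginning of the input.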
Classes
class

regexr.string.Segment(*args, capture=False, flags=None, deflags=None)

Segments of a regular expression

ClassVars: NONCAPTURING_WRAPPING: Whether to wrap the segment in brackets when capture is False. Some segments are already self-delimiting: (abc)+, for example, is a complete group, so it does not confuse the parser when combined with other segments, as in (abc)+d, and needs no extra brackets. Other segments do need them: a|b|c followed by d would become a|b|cd, where d attaches only to c. In that case (?:a|b|c)d is required when the segment is not captured.
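The wrapping problem described above can be checked with the standard `re` module alone. This is a minimal sketch of the regex behavior, not of regexr's internals; the variable names are illustrative.

```python
import re

# Without a group, the trailing "d" binds only to the last alternative "c".
unwrapped = re.compile(r"a|b|cd")

# With a non-capturing group, "d" must follow whichever alternative matched.
wrapped = re.compile(r"(?:a|b|c)d")

# A quantified group such as (abc)+ is already self-delimiting,
# so appending "d" needs no extra brackets.
quantified = re.compile(r"(abc)+d")

for candidate in ("a", "ad", "cd"):
    print(
        candidate,
        bool(unwrapped.fullmatch(candidate)),
        bool(wrapped.fullmatch(candidate)),
    )
print(bool(quantified.fullmatch("abcabcd")))
```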

Parameters
  • capture (bool | str, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
  • args Other segments to be wrapped by this one.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.

class

regexr.string.CharClass(*args, capture=False, flags=None, deflags=None)

Used to indicate a set of characters wrapped by []

Parameters
  • capture (bool | str, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.

class

regexr.string.OneOfChars(*args, capture=False, flags=None, deflags=None)

Positive character set [...]

Parameters
  • capture (bool | str, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.

class

regexr.string.NoneOfChars(*args, capture=False, flags=None, deflags=None)

Negative character set [^...]

Parameters
  • capture (bool | str, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.
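To make the difference between the positive set [...] and the negative set [^...] concrete, here is a short sketch that uses the standard `re` module directly. The sample word is arbitrary.

```python
import re

word = "regexr"

# Positive set: match any single character listed inside the brackets.
vowels = re.findall(r"[aeiou]", word)

# Negative set: match any single character NOT listed inside the brackets.
non_vowels = re.findall(r"[^aeiou]", word)

print(vowels, non_vowels)
```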

class

regexr.string.Look(*args, capture=False, flags=None, deflags=None)

Look ahead or behind

Parameters
  • capture (bool | str, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.

class

regexr.string.LookAhead(*args, capture=False, flags=None, deflags=None)

Look ahead (?=...)

Parameters
  • capture (bool | str, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.

class

regexr.string.LookBehind(*args, capture=False, flags=None, deflags=None)

Look behind (?<=...)

Parameters
  • capture (bool | str, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.

class

regexr.string.LookAheadNot(*args, capture=False, flags=None, deflags=None)

Look ahead not (?!...)

Parameters
  • capture (bool | str, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.

class

regexr.string.LookBehindNot(*args, capture=False, flags=None, deflags=None)

Look behind not (?<!...)

Parameters
  • capture (bool | str, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.
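The four look-around forms covered by LookAhead, LookBehind, LookAheadNot and LookBehindNot can be compared side by side. The sketch below writes the underlying syntax by hand and runs it with the standard `re` module; the sample text and variable names are illustrative only.

```python
import re

text = "price: $42, cost: 17 USD"

# (?<=...) look behind: digits that are preceded by "$".
after_dollar = re.findall(r"(?<=\$)\d+", text)

# (?=...) look ahead: digits that are followed by " USD".
before_usd = re.findall(r"\d+(?= USD)", text)

# (?<!...) look behind not: whole numbers NOT preceded by "$".
# The \b keeps the match from starting in the middle of "42".
not_after_dollar = re.findall(r"(?<!\$)\b\d+", text)

# (?!...) look ahead not: whole numbers NOT followed by " USD".
not_before_usd = re.findall(r"\b\d+\b(?! USD)", text)

print(after_dollar, before_usd, not_after_dollar, not_before_usd)
```

Look-arounds consume no characters, so each result contains only the digits and never the "$" or " USD" context.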

abstract class

regexr.string.Quantifier(*args, lazy=False, capture=False, flags=None, deflags=None)

Quantifier +, *, ?, {m} or {m,n}

Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.

class

regexr.string.ZeroOrMore(*args, lazy=False, capture=False, flags=None, deflags=None)

* zero or more times

Parameters
  • capture (bool, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.

class

regexr.string.OneOrMore(*args, lazy=False, capture=False, flags=None, deflags=None)

+ one or more times

Parameters
  • capture (bool, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.

class

regexr.string.Maybe(*args, lazy=False, capture=False, flags=None, deflags=None)

? zero or one time

Parameters
  • capture (bool, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.

class

regexr.string.Repeat(*args, m, n=None, lazy=False, capture=False, flags=None, deflags=None)

Match from m to n repetitions {m,n} or {m,}

Parameters
  • capture (bool, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.

class

regexr.string.RepeatExact(*args, m, lazy=False, capture=False, flags=None, deflags=None)

Match exactly m repetitions {m}

Parameters
  • capture (bool, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.
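The quantifier classes above (ZeroOrMore, OneOrMore, Maybe, Repeat and RepeatExact) correspond to the suffixes *, +, ?, {m,n} or {m,}, and {m}. As a quick reference, the sketch below tests each suffix with the standard `re` module. The patterns are written by hand, and the helper `accepts` is hypothetical.

```python
import re


def accepts(pattern: str, candidate: str) -> bool:
    """Hypothetical helper: True if the whole candidate matches the pattern."""
    return re.fullmatch(pattern, candidate) is not None


checks = {
    "ab* accepts 'a'": accepts(r"ab*", "a"),          # * allows zero b's
    "ab+ accepts 'a'": accepts(r"ab+", "a"),          # + needs at least one b
    "ab? accepts 'abb'": accepts(r"ab?", "abb"),      # ? allows at most one b
    "ab{2,3} accepts 'abbb'": accepts(r"ab{2,3}", "abbb"),
    "ab{2,} accepts 'abbbbb'": accepts(r"ab{2,}", "abbbbb"),
    "ab{2} accepts 'abbb'": accepts(r"ab{2}", "abbb"),  # exactly two b's
}

for label, result in checks.items():
    print(f"{label}: {result}")
```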

class

regexr.string.Lazy(*args, capture=False, flags=None, deflags=None)

Non-greedy modifier +?, *?, ??, {m,}? or {m,n}?

Parameters
  • capture (bool, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.
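The effect of the non-greedy modifier is easiest to see next to its greedy counterpart. The following sketch uses the standard `re` module with hand-written patterns; the sample strings are arbitrary.

```python
import re

tags = "<a><b>"

# Greedy: .+ grabs as much as possible, so both tags end up in one match.
greedy = re.findall(r"<.+>", tags)

# Lazy: .+? grabs as little as possible, so each tag is matched separately.
lazy = re.findall(r"<.+?>", tags)

# The same applies to bounded repetition: {2,4} prefers four digits,
# {2,4}? prefers two.
greedy_digits = re.match(r"\d{2,4}", "12345").group()
lazy_digits = re.match(r"\d{2,4}?", "12345").group()

print(greedy, lazy, greedy_digits, lazy_digits)
```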

class

regexr.string.Flag(*args)

Flag (?aiLmsux)

Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.

class

regexr.string.InlineFlag(*args, capture=False, flags=None, deflags=None)

Inline flag (?aiLmsux-imsx:...)

Parameters
  • capture (bool | str, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.
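Flag and InlineFlag differ in scope: (?aiLmsux) applies to the whole pattern, while (?aiLmsux-imsx:...) applies only to the enclosed part and can also switch flags off. The sketch below shows that difference for the `i` flag with the standard `re` module. How regexr maps its flags and deflags parameters onto this syntax is not shown here and should not be inferred from it.

```python
import re

# (?i) at the start switches on IGNORECASE for the whole pattern.
whole_pattern = re.compile(r"(?i)abc")

# (?i:...) limits IGNORECASE to the enclosed part.
scoped_on = re.compile(r"(?i:a)bc")

# (?-i:...) switches IGNORECASE off for the enclosed part only,
# even though the pattern as a whole is compiled with re.IGNORECASE.
scoped_off = re.compile(r"(?-i:a)bc", re.IGNORECASE)

print(bool(whole_pattern.fullmatch("ABC")))
print(bool(scoped_on.fullmatch("Abc")), bool(scoped_on.fullmatch("ABC")))
print(bool(scoped_off.fullmatch("aBC")), bool(scoped_off.fullmatch("ABC")))
```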

class

regexr.string.Raw(*args, capture=False, entire=False, flags=None, deflags=None)

Raw strings without escaping

Parameters
  • capture (bool, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.
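"Without escaping" implies a contrast between text treated as a literal and text treated as regex syntax. The sketch below shows only that contrast, using `re.escape` from the standard library. Whether regexr escapes plain strings in exactly this way is an assumption, not something this page states.

```python
import re

text = "a.b"

# Escaped: the dot is turned into a literal dot.
escaped = re.escape(text)

# Raw: the string is used as-is, so the dot stays a wildcard.
raw = text

print(escaped)
print(bool(re.fullmatch(escaped, "axb")))  # literal dot does not match "x"
print(bool(re.fullmatch(raw, "axb")))      # wildcard dot matches "x"
```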

class

regexr.string.Or(*args, capture=False, flags=None, deflags=None)

| connected segments

Parameters
  • capture (bool | str, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.

class

regexr.string.Capture(*args, name=None, capture=None, flags=None, deflags=None)

Capture a match (...)

Parameters
  • capture (bool | str, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.

class

regexr.string.NonCapture(*args, flags=None, deflags=None)

Non-capturing grouping (?:...)

Parameters
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.
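Or, Capture and NonCapture map onto alternation and the three grouping forms: an unnamed group (...), a named group (?P<name>...), and a non-capturing group (?:...). The sketch below combines them in one hand-written pattern and inspects the result with the standard `re` module. The group name `unit` and the sample input are illustrative.

```python
import re

pattern = re.compile(
    r"(\d+)"              # unnamed capture: becomes group 1
    r"-(?P<unit>kg|lb)"   # named capture around an alternation: group "unit"
    r"-(?:x|y)"           # non-capturing group: matched but not recorded
)

match = pattern.fullmatch("12-kg-x")

print(match.groups())       # positional view of the captured groups
print(match.group("unit"))  # lookup by name
print(match.groupdict())    # only named groups appear here
```

The non-capturing part still has to match, but it leaves no entry in `groups()`, which keeps group numbering stable when alternatives are added only for grouping.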

class

regexr.string.Concat(*args, capture=False, flags=None, deflags=None)

Concatenate segments

Parameters
  • capture (bool | str, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.

class

regexr.string.Conditional(id_or_name, yes, no=None, capture=False, flags=None, deflags=None)

(?(...)yes|no) conditional pattern

Parameters
  • capture (bool, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.
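A conditional pattern picks the yes branch when the referenced group took part in the match and the no branch otherwise. The sketch below writes the (?(...)yes|no) syntax by hand and runs it with the standard `re` module. It references the group once by number and once by name, mirroring the id_or_name argument. The patterns and inputs are illustrative.

```python
import re

# By number: if group 1 (an opening "<") matched, require a closing ">";
# otherwise require the end of the string.
by_id = re.compile(r"(<)?\w+(?(1)>|$)")

# By name: if the group "q" (an opening quote) matched, require a closing quote.
# The "no" branch is omitted, so nothing extra is required without the quote.
by_name = re.compile(r'(?P<q>")?\w+(?(q)")')

for candidate in ("<tag>", "tag", "<tag", "tag>"):
    print(candidate, bool(by_id.fullmatch(candidate)))

for candidate in ('"hi"', "hi", '"hi', 'hi"'):
    print(candidate, bool(by_name.fullmatch(candidate)))
```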

class

regexr.string.Captured(id_or_name, capture=False, flags=None, deflags=None)

(?P=name) captured group or \1, \2, ...

Parameters
  • capture (bool, optional) The name of the capture, False to disable capturing and True to capture without name.
  • flags (int | str | Sequence[int | str], optional) The flags to be used when compiling this segment.
  • deflags (int | str | Sequence[int | str], optional) Remove the flags from re.compile() while compiling this segment.
Methods
  • __str__() (str) String representation of this segment, depending on capture
  • pretty(indent, level) (str) Pretty print this segment, depending on capture
method

pretty(indent, level) → str

Pretty print this segment, depending on capture

Parameters
  • indent (str) The indent string.
  • level (int) The indent level.
method

__str__()

String representation of this segment, depending on capture

Returns (str)

The final string representation of this segment.
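A back-reference matches the same text that an earlier group captured, either by number (\1, \2, ...) or by name ((?P=name)). Here is a small sketch of both forms with the standard `re` module. The group name `quote` and the sample strings are illustrative.

```python
import re

# Numbered back-reference: a word character followed by the same character.
doubled = [m.group(0) for m in re.finditer(r"(\w)\1", "aabbcd")]

# Named back-reference: the closing quote must be the same kind as the opening one.
quoted_pattern = re.compile(r"""(?P<quote>['"]).*?(?P=quote)""")
quoted = [m.group(0) for m in quoted_pattern.finditer("""say "hi" and 'bye'""")]

# Mismatched quotes are rejected because (?P=quote) repeats the captured text.
mismatched = quoted_pattern.search("""say "hi' """)

print(doubled, quoted, mismatched)
```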

class

regexr.string.Regexr(*segments)

Bases
str

The entry point of the package for composing a regular expression

It is actually a subclass of str, but with an extra method compile, which compiles the regular expression and returns a re.Pattern object.

Parameters
  • *segments (Segment or str) The segments of the regular expression. When composing the regular expression, the segments are concatenated.
Methods
  • __repr__() (str) String representation of this regular expression
  • compile(flags) (Pattern) Compile the regular expression and return a re.Pattern object
  • pretty(indent) (str) Pretty print the regular expression
method

compile(flags=0) → Pattern

Compile the regular expression and return a re.Pattern object

See also re.compile()

Parameters
  • flags (int, optional) The flags to be used when compiling the regular expression.
method

pretty(indent=' ') → str

Pretty print the regular expression

method

__repr__() → str

String representation of this regular expression
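The design described for Regexr, a str subclass whose segments are concatenated and which adds a compile method, can be sketched in a few lines. `MiniRegexr` below is a hypothetical stand-in written for illustration. It is not regexr's implementation, and it joins plain strings as-is without any of the escaping or wrapping logic the real classes may apply.

```python
import re


class MiniRegexr(str):
    """Hypothetical stand-in: a str subclass with a compile() helper."""

    def __new__(cls, *segments):
        # Concatenate the string form of every segment into one pattern.
        return super().__new__(cls, "".join(str(s) for s in segments))

    def compile(self, flags=0):
        # Delegate to re.compile(); the result is an ordinary re.Pattern.
        return re.compile(str(self), flags)


pattern = MiniRegexr(r"\A", r"\d+", "-", r"\d+", r"\Z")

print(isinstance(pattern, str))               # usable anywhere a str is expected
print(pattern == r"\A\d+-\d+\Z")              # segments were concatenated in order
print(bool(pattern.compile().match("12-34")))

# Flags are passed straight through to re.compile().
insensitive = MiniRegexr("abc").compile(re.IGNORECASE)
print(bool(insensitive.match("ABC")))
```

Subclassing str means the composed pattern can be passed directly to any function that accepts a pattern string, while compile() remains available as a convenience.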