module

pipen_report.preprocess

Provides preprocess

Attributes
  • TAG_ATTR_RE Compiled with the flags re.VERBOSE | re.DOTALL
    def _preprocess_slash_h(source: str, index: int, page: int, kind: str, text: str | None = None) -> Tuple[str, Mapping[str, Any]]:
    Preprocess headings (h1 or h2 tags) by adding anchor links.
    Add an anchor link after the tag and produce the toc dict.
    For example, if the source is <h1>Title 1</h1>, the output will be <h1>Title 1</h1><a id="prt-h1-1-title-1" class="pipen-report-toc-anchor"> </a>
    Args:
      text: The string repr of the tag (e.g. <h1>Title 1</h1>)
      index: The index of this kind of heading in the document
      page: Which page we are on
      kind: h1 or h2
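The anchor format in the example above can be reproduced with a short sketch. This is not the library's implementation: the names `slugify` and `add_heading_anchor`, and the keys of the returned toc entry, are assumptions made for illustration. Only the anchor id pattern (`prt-<kind>-<index>-<slug>`) and the class name come from the docstring.

```python
import re
from typing import Any, Mapping, Tuple


def slugify(text: str) -> str:
    """Lowercase the text and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", text.lower()))


def add_heading_anchor(
    source: str, index: int, page: int, kind: str
) -> Tuple[str, Mapping[str, Any]]:
    """Append a TOC anchor after a heading tag and build a toc entry.

    Hypothetical helper: the toc entry keys below are assumed, not
    taken from pipen_report.
    """
    # Strip the tags to recover the visible heading text
    text = re.sub(r"<[^>]+>", "", source).strip()
    anchor_id = f"prt-{kind}-{index}-{slugify(text)}"
    anchor = f'<a id="{anchor_id}" class="pipen-report-toc-anchor"> </a>'
    toc_entry = {"slug": anchor_id, "text": text, "page": page, "kind": kind}
    return source + anchor, toc_entry


out, toc = add_heading_anchor("<h1>Title 1</h1>", index=1, page=0, kind="h1")
print(out)
```

With the input `<h1>Title 1</h1>`, the sketch yields the same anchor id as the docstring example, `prt-h1-1-title-1`.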
Functions
  • preprocess(text, basedir, toc_switch, paging, relpath_tags) (list of str, list of ) Preprocess the rendered report and return the toc dict
function

pipen_report.preprocess.preprocess(text, basedir, toc_switch, paging, relpath_tags=None)

Preprocess the rendered report and return the toc dict

This approach is not only faster than using an XML/HTML parsing library but also more compatible with JSX, as most Python XML/HTML parsers cannot handle JSX.

We use h1 and h2 tags to form TOCs. h1 and h2 tags have to be at the top level, which means you should not wrap them in any container in your Svelte report template.

The h1 tag should be the first tag in the document after </script>. Otherwise, the non-h1 tags that precede it will appear on all pages and their relative paths won't be parsed.
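To illustrate why content placed before the first h1 ends up on every page, here is a minimal sketch of paging by top-level h1 tags. It is an assumption about the general technique, not the code used by `preprocess`; `H1_RE` and `split_pages` are hypothetical names.

```python
import re
from typing import List

# Matches a whole <h1>...</h1> element; DOTALL lets the title span lines
H1_RE = re.compile(r"<h1\b[^>]*>.*?</h1>", re.DOTALL)


def split_pages(text: str, per_page: int) -> List[str]:
    """Split rendered text into pages of `per_page` h1 sections each."""
    starts = [m.start() for m in H1_RE.finditer(text)]
    if not starts:
        return [text]
    # Everything before the first h1 (e.g. the <script> block) is shared
    preamble = text[: starts[0]]
    bounds = starts + [len(text)]
    sections = [text[bounds[i]: bounds[i + 1]] for i in range(len(starts))]
    return [
        preamble + "".join(sections[i: i + per_page])
        for i in range(0, len(sections), per_page)
    ]


report = "<script>x</script>\n<h1>A</h1><p>a</p><h1>B</h1><p>b</p><h1>C</h1>"
pages = split_pages(report, per_page=2)
```

In this sketch the preamble is copied verbatim onto each page, so any stray tag between </script> and the first h1 would be repeated on all of them.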

Parameters
  • text (str) The rendered report
  • basedir (Path) The base directory
  • toc_switch (bool) Whether to render a TOC
  • paging (bool or int) Number of h1's in a page; False to disable
Returns (list of str, list of )

The preprocessed text and the toc dict