GTRefiner.BuildingTools.Visitors package¶
Submodules¶
GTRefiner.BuildingTools.Visitors.Colorer module¶
- class GTRefiner.BuildingTools.Visitors.Colorer.Colorer(color_table: Optional[ColorTable] = None)¶
Bases:
Visitor
Default implementation of Colorer. It sets the colors of the vector objects and of the different layers of the pixel ground truth. Using a color table in JSON format, a client can assign individual colors to the various vector objects of the VectorGT or layers of the PixelGT. The Colorer is responsible for reading this information and assigning it to the target elements. It supports three strategies: paint every TextElement of a layout class the same color, alternate colors between the TextElements of a layout class, or give each TextElement a different color. In principle, any other coloring strategy is conceivable; the colors in the color table must be selected accordingly. If specific groups should have a certain color (or sequence of colors), a new Colorer can be written to introduce the desired rules. :param color_table: the colors of the color table define the colors of the different page elements (PageElement), layouts (Layout) and regions (Region). :type color_table: ColorTable
- visit_page(page: Page)¶
Set the colors of the ground truth information of the page. Supported strategies are alternating, all the same, all different, or any other combination of colors given in the color table. This Colorer simply iterates over the colors given. :param page: page to be colored according to the color table of this Colorer instance :type page: Page
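The iteration over the color table can be illustrated with a minimal sketch. It is not the Colorer implementation: `assign_colors`, the string element names and the plain list of RGB tuples are hypothetical stand-ins for the page elements and the ColorTable.

```python
from itertools import cycle
from typing import Dict, List, Tuple

Color = Tuple[int, int, int]


def assign_colors(elements: List[str], palette: List[Color]) -> Dict[str, Color]:
    """Pair each element with the next palette color, wrapping around.

    Hypothetical helper: the real Colorer works on Page objects and a ColorTable.
    """
    if not palette:
        raise ValueError("palette must contain at least one color")
    return dict(zip(elements, cycle(palette)))


lines = ["line_1", "line_2", "line_3", "line_4"]
red, green = (255, 0, 0), (0, 255, 0)

all_same = assign_colors(lines, [red])            # one color: every line is red
alternating = assign_colors(lines, [red, green])  # two colors: red, green, red, green
```

With one color per element in the palette, the same loop yields the "all different" strategy, so the choice of strategy reduces to the choice of colors.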
GTRefiner.BuildingTools.Visitors.Cropper module¶
- class GTRefiner.BuildingTools.Visitors.Cropper.Cropper(target_dim: ImageDimension)¶
Bases:
Visitor
The Cropping Visitor crops the image to a desired dimension. It is possible to specify whether the page is left- or right-bound. The current use of the cropping function is to get rid of useless edge pixels, but it could also be used for illustration purposes or targeted cropping of text regions. The behavior is inherited through the Croppable interface.
GTRefiner.BuildingTools.Visitors.Grouper module¶
- class GTRefiner.BuildingTools.Visitors.Grouper.BlockGrouper(layout_class: LayoutClasses)¶
Bases:
Grouper
Groups the elements of the given Layout into different blocks depending on their minimal x-coordinate. Makes use of np.histogram to assign the elements to bins. :param layout_class: Layout-class to be grouped. :type layout_class: LayoutClasses
- group(region: TextRegion, bins: int = 6) TextRegion ¶
Detect clusters based on the x value of page elements. :param region: region to be sorted. :type region: TextRegion :param bins: number of x-oriented bins, defaults to 6 :type bins: int :return: at most the given number of bins (see the NumPy documentation) :rtype: List[PageElement]
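A minimal sketch of histogram-based binning on minimal x-coordinates, assuming the elements are reduced to plain integers. The helper `group_by_min_x` is hypothetical; only `np.histogram` and `np.digitize` are actual NumPy calls.

```python
from typing import List

import numpy as np


def group_by_min_x(min_xs: List[int], bins: int = 6) -> List[List[int]]:
    """Group x-coordinates by the equal-width bins that np.histogram computes.

    Hypothetical helper: the real BlockGrouper operates on a TextRegion.
    """
    # np.histogram returns the counts and the bins + 1 bin edges.
    _, edges = np.histogram(min_xs, bins=bins)
    # Digitizing against the interior edges yields a bin index in 0..bins-1.
    bin_ids = np.digitize(min_xs, edges[1:-1])
    groups: List[List[int]] = [[] for _ in range(bins)]
    for x, bin_id in zip(min_xs, bin_ids):
        groups[bin_id].append(x)
    # Empty bins are dropped, so fewer than `bins` groups may be returned.
    return [group for group in groups if group]


blocks = group_by_min_x([0, 5, 8, 305, 310, 600])
# blocks == [[0, 5, 8], [305, 310], [600]]
```

Because the bins have equal width over the observed range, a single outlier can stretch the bins and merge otherwise distinct blocks; the number of bins is the only tuning knob in this sketch.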
- class GTRefiner.BuildingTools.Visitors.Grouper.Grouper¶
Bases:
Visitor
The Grouping tool works at the level of text regions. It creates, divides and combines the text elements into logical (sub)groups depending on the algorithm. Currently, the Grouper module supports clustering text elements based on their smallest x-coordinate (BlockGrouper) and subdividing them into blocks of close, adjacent text elements (ThresholdGrouper).
- abstract group(region: TextRegion) TextRegion ¶
Groups all layouts (Layout) within a region (TextRegion). :param region: Region to be (re-)grouped :type region: TextRegion :return: grouped Region :rtype: TextRegion
- class GTRefiner.BuildingTools.Visitors.Grouper.ThresholdGrouper(x_threshold: int, y_threshold: int, layout_class: LayoutClasses)¶
Bases:
Grouper
Threshold grouper splits regions if their elements are too far apart (only if both the x and the y threshold are exceeded). :param x_threshold: x threshold in pixels used to determine whether an element should be split off. :type x_threshold: int :param y_threshold: y threshold in pixels used to determine whether an element should be split off. :type y_threshold: int :param layout_class: Layout-class to be grouped. :type layout_class: LayoutClasses
- group(region: TextRegion) TextRegion ¶
Splits regions if their elements are too far apart (only if both the x and the y threshold are exceeded). Warning: Make sure the regions are sorted in reading order, either ascending or descending. :param region: region to be grouped :type region: TextRegion
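The splitting rule can be sketched as follows. This is an illustration under assumptions, not the ThresholdGrouper code: elements are reduced to `(x, y)` tuples, distances are measured between consecutive elements, and `split_by_threshold` is a hypothetical name.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def split_by_threshold(
    points: List[Point], x_threshold: int, y_threshold: int
) -> List[List[Point]]:
    """Start a new group when BOTH the x and the y distance exceed their threshold.

    Assumes `points` is already sorted in reading order (see the warning above).
    """
    groups: List[List[Point]] = []
    for point in points:
        if groups:
            prev_x, prev_y = groups[-1][-1]
            too_far = (
                abs(point[0] - prev_x) > x_threshold
                and abs(point[1] - prev_y) > y_threshold
            )
        else:
            too_far = True  # the first element always opens a group
        if too_far:
            groups.append([point])
        else:
            groups[-1].append(point)
    return groups


groups = split_by_threshold([(0, 0), (5, 10), (300, 400), (305, 410)], 50, 50)
# groups == [[(0, 0), (5, 10)], [(300, 400), (305, 410)]]
```

Since each element is compared only with its predecessor, unsorted input produces arbitrary splits, which is why the sort order matters.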
GTRefiner.BuildingTools.Visitors.IllustratorVisitor module¶
- class GTRefiner.BuildingTools.Visitors.IllustratorVisitor.Illustrator(background: PIL.Image = None, color_table: ColorTable = None, vis_table: VisibilityTable = None, outline: Tuple = None)¶
Bases:
Visitor
Illustrator is used to visualize processing steps. :param background: image on which the vector GT is drawn, if a background is desired :type background: Image :param color_table: color table to use instead of the quick-and-dirty default defined in this class :type color_table: ColorTable :param vis_table: visibility table to use instead of the quick-and-dirty default defined in this class :type vis_table: VisibilityTable :param outline: outline color; if None is given, no outline is drawn. :type outline: tuple
- color_table = <GTRefiner.GTRepresentation.Table.ColorTable object>¶
- comment_color = [(255, 20, 255)]¶
- decoration_color = [(20, 255, 255)]¶
- main_text_color = [(10, 20, 255)]¶
- vis_table = <GTRefiner.GTRepresentation.Table.VisibilityTable object>¶
GTRefiner.BuildingTools.Visitors.Layerer module¶
- class GTRefiner.BuildingTools.Visitors.Layerer.Layerer¶
Bases:
Visitor
The Layerer Visitor is used to combine the two ground truths (vector gt and pixel-based gt). It paints the desired vector objects of the vector GT onto the layers of the pixel-level GT and combines them to form an RGB image. In doing so, it overlays the vector GT as a binary image on top of the combined layers of the pixel GT, keeping only pixels that are visible in both the vector GT and the pixel-level GT.
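The "visible in both" rule amounts to a logical AND of two binary masks. The sketch below assumes the vector GT has already been rasterized to a boolean array and that each pixel-GT layer is a boolean array of the same shape; `combine_masks` is a hypothetical helper, and the final coloring into an RGB image is left out.

```python
import numpy as np


def combine_masks(vector_mask: np.ndarray, pixel_layers: list) -> np.ndarray:
    """Keep only pixels set in the vector mask AND in at least one pixel layer."""
    # Merge all pixel-GT layers into one binary mask.
    pixel_mask = np.logical_or.reduce(pixel_layers)
    # Overlay the vector mask: a pixel survives only if both masks agree.
    return np.logical_and(vector_mask, pixel_mask)


vector_mask = np.array([[1, 1, 0], [0, 1, 0]], dtype=bool)
layer_a = np.array([[1, 0, 0], [0, 0, 0]], dtype=bool)
layer_b = np.array([[0, 0, 1], [0, 1, 0]], dtype=bool)

combined = combine_masks(vector_mask, [layer_a, layer_b])
# combined.tolist() == [[True, False, False], [False, True, False]]
```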
GTRefiner.BuildingTools.Visitors.Resizer module¶
- class GTRefiner.BuildingTools.Visitors.Resizer.Resizer(target_dim: ImageDimension)¶
Bases:
Visitor
Resize a page (and all its ground-truth information, including the original image) to a target dimension. The default implementation scales the PixelGT in four steps. First, all relevant text pixels are set to visible (True) and all others to invisible (False). In the next step, the image is blurred using a Gaussian filter, which turns the ground truth image into a grayscale image. Finally, the blurred image is bicubically interpolated and binarized again (according to Otsu, Niblack or Sauvola). Blurring thickens the text elements: the more blurring is applied, the more the text elements merge into each other. :param target_dim: Target dimension :type target_dim: ImageDimension
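The blur, interpolate and binarize sequence can be sketched with Pillow. This is an assumption-laden illustration rather than the Resizer code: `resize_mask`, `blur_radius` and `threshold` are hypothetical, and a fixed global threshold stands in for the Otsu, Niblack or Sauvola binarization named above.

```python
from PIL import Image, ImageFilter


def resize_mask(mask, target_size, blur_radius=1.0, threshold=128):
    """Resize a binary mask: blur, bicubic interpolation, then re-binarize.

    NOTE: the fixed threshold is a simplification, not Otsu/Niblack/Sauvola.
    """
    gray = mask.convert("L")  # visible pixels are 255, invisible pixels are 0
    blurred = gray.filter(ImageFilter.GaussianBlur(radius=blur_radius))
    resized = blurred.resize(target_size, resample=Image.BICUBIC)
    # Map every gray value back to 0 or 255.
    return resized.point(lambda value: 255 if value >= threshold else 0)


# A 40x40 mask with a 20x20 visible square in the middle, scaled to 20x20.
mask = Image.new("L", (40, 40), 0)
mask.paste(255, (10, 10, 30, 30))
small = resize_mask(mask, (20, 20))
```

Raising `blur_radius` in this sketch widens the gray transition zone around each stroke, which is the thickening and merging effect described above.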
GTRefiner.BuildingTools.Visitors.Sorter module¶
- class GTRefiner.BuildingTools.Visitors.Sorter.Sorter¶
Bases:
Visitor
Sort a given container of objects. The text elements of the vectorized ground truth of the DIVA-HisDB are not consistently sorted, which is why the Sorter should always be used if the order of the text elements matters. This is implemented by having the Layout and TextRegion classes both override the __lt__() method of the Python object. They thus provide an interface for efficient sorting of text elements and regions with Python's built-in sorting algorithms. The Sorter tool can be used to invoke, extend, and modify this behavior as desired. The Sorter goes hand in hand with the Grouper tool (see module Grouper) and the alternating Colorer (see module Colorer).
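A minimal sketch of sorting through `__lt__()`. The `Line` class and its top-to-bottom, left-to-right ordering are hypothetical; the comparison actually used by Layout and TextRegion is not documented here.

```python
from dataclasses import dataclass


@dataclass
class Line:
    """Hypothetical stand-in for a text element with a top-left anchor."""

    min_x: int
    min_y: int

    def __lt__(self, other: "Line") -> bool:
        # Assumed reading order: top to bottom first, then left to right.
        return (self.min_y, self.min_x) < (other.min_y, other.min_x)


lines = [Line(50, 300), Line(10, 100), Line(5, 300)]
ordered = sorted(lines)  # sorted() only needs __lt__ to compare elements
# ordered == [Line(10, 100), Line(5, 300), Line(50, 300)]
```

Defining `__lt__()` is enough for `sorted()` and `list.sort()`, so no separate key function has to be passed around.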
GTRefiner.BuildingTools.Visitors.TextLineDecorator module¶
- class GTRefiner.BuildingTools.Visitors.TextLineDecorator.AscenderDescenderDecorator(x_height: int)¶
Bases:
TextLineDecorator
- Parameters:
x_height (int) – Based on this value and the baseline provided by the TextLine element, the ascenders, descenders and x-height (rectangles) are calculated.
- class GTRefiner.BuildingTools.Visitors.TextLineDecorator.HeadAndTailDecorator¶
Bases:
TextLineDecorator
“Example of another Decorator Class”
- class GTRefiner.BuildingTools.Visitors.TextLineDecorator.HistogramDecorator¶
Bases:
TextLineDecorator
“Example of another Decorator Class”
GTRefiner.BuildingTools.Visitors.VisibilityVisitor module¶
- class GTRefiner.BuildingTools.Visitors.VisibilityVisitor.VisibilityVisitor(vis_table: Optional[VisibilityTable] = None)¶
Bases:
Visitor
Based on a visibility table, set all elements in the vector ground truth (VectorGT) and all layers of the pixel-level ground truth (PixelLevelGT) to the specified boolean value. Analogous to the Colorer, the VisibilityVisitor reads a visibility table that defines whether a layout class should be visible or not. If the user decides that only individual text regions or text elements are of interest, a new visitor can be written that implements the desired functionality. :param vis_table: visibility table. :type vis_table: VisibilityTable
- visit_page(page: Page)¶
Based on a visibility table, set all elements in the vector ground truth (VectorGT) and all layers of the pixel-level ground truth (PixelLevelGT) to the specified boolean value. :param page: Page whose elements are set visible according to the visibility table provided within the Page, or according to a custom visibility table provided by the instance (at instantiation). :type page: Page
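Applying a visibility table can be sketched as a lookup per layout class. The `Element` class, the plain dictionary and `apply_visibility` are hypothetical stand-ins for the page elements, the VisibilityTable and the visitor method.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Element:
    """Hypothetical stand-in for a page element or a pixel-GT layer."""

    layout_class: str
    visible: bool = True


def apply_visibility(elements: List[Element], vis_table: Dict[str, bool]) -> None:
    """Set each element's visibility from the table entry of its layout class."""
    for element in elements:
        # Assumption: classes missing from the table keep their current value.
        element.visible = vis_table.get(element.layout_class, element.visible)


elements = [Element("main_text"), Element("comment"), Element("decoration")]
apply_visibility(elements, {"main_text": True, "comment": False})
# Only the comment is hidden; the decoration keeps its default visibility.
```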