datamodules.utils package


datamodules.utils.dataset_predict module

class DatasetPredict(image_path_list: List[str], image_dims: ImageDimensions, image_transform=None, target_transform=None, twin_transform=None)[source]

Bases: Dataset

Dataset class for prediction on the test set. It takes a folder of images and creates predictions for these images.

Parameters:

  • image_path_list (List[str]) – list of image paths

  • image_dims (ImageDimensions) – image dimensions

  • image_transform (Callable) – image transformation

  • target_transform (Callable) – target transformation

  • twin_transform (Callable) – twin transformation

static expend_glob_path_list(glob_path_list: List[str]) List[Path][source]

Expands the glob path list to a list of paths.

Parameters:

glob_path_list (List[str]) – list of glob paths

Returns:

list of paths

Return type:

List[Path]
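
As an illustration only, glob expansion of this kind can be sketched with the standard library. The helper name `expand_glob_paths` and the sorted ordering are assumptions made for this example; the package's own implementation may differ.

```python
import glob
import tempfile
from pathlib import Path
from typing import List


def expand_glob_paths(glob_path_list: List[str]) -> List[Path]:
    """Hypothetical stand-in: expand each glob pattern and collect the matches."""
    expanded: List[Path] = []
    for pattern in glob_path_list:
        # glob.glob returns matches in arbitrary order, so sort for stable output
        expanded.extend(Path(match) for match in sorted(glob.glob(pattern)))
    return expanded


# Demo on a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.png", "b.png", "notes.txt"):
        (Path(tmp) / name).touch()
    found = expand_glob_paths([str(Path(tmp) / "*.png")])
    names = [path.name for path in found]
```

Only the two `.png` files match the pattern; `notes.txt` is left out.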


datamodules.utils.exceptions module

exception PathMissingDirinSplitDir[source]

Bases: Exception

Raised when a path is missing a directory in a split directory

exception PathMissingSplitDir[source]

Bases: Exception

Raised when a path is missing a split directory

exception PathNone[source]

Bases: Exception

Raised when a path is None

exception PathNotDir[source]

Bases: Exception

Raised when a path is not a directory

datamodules.utils.functional module

argmax_onehot(tensor: Tensor)[source]

Returns the argmax of a one-hot encoded tensor.

Parameters:

tensor (torch.Tensor) – The one-hot encoded tensor

Returns:

The argmax of the one-hot encoded tensor

Return type:


datamodules.utils.image_analytics module

compute_mean_std(file_names: Union[ndarray, List[Path]], inmem: bool = False, workers: int = 8) Tuple[float, float][source]

Computes the mean and std of all images present in the target folder.

Parameters:

  • file_names (Union[np.ndarray[str], List[Path]]) – List of the file names of the images

  • inmem (bool) – Specifies whether it should be computed in an online or offline fashion.

  • workers (int) – Number of workers to use for the mean/std computation

Returns:

mean and std

Return type:

Tuple[float, float]
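
The difference between an in-memory and an online computation can be sketched in plain Python. This is not the package's implementation: images are modelled as flat lists of pixel values for a single channel, the function names are hypothetical, and which variant `inmem=True` selects is an assumption (presumably the in-memory one).

```python
import math
from typing import Iterable, List, Tuple


def mean_std_in_memory(images: List[List[float]]) -> Tuple[float, float]:
    """In-memory variant: gather every pixel value first, then reduce once."""
    pixels = [value for image in images for value in image]
    mean = sum(pixels) / len(pixels)
    variance = sum((value - mean) ** 2 for value in pixels) / len(pixels)
    return mean, math.sqrt(variance)


def mean_std_online(images: Iterable[List[float]]) -> Tuple[float, float]:
    """Online variant: keep running sums so only one image is held at a time."""
    count, total, total_sq = 0, 0.0, 0.0
    for image in images:
        count += len(image)
        total += sum(image)
        total_sq += sum(value * value for value in image)
    mean = total / count
    # E[x^2] - E[x]^2, clamped at zero to absorb rounding error
    variance = max(total_sq / count - mean * mean, 0.0)
    return mean, math.sqrt(variance)


images = [[0.0, 0.5], [1.0, 0.5]]
offline = mean_std_in_memory(images)
online = mean_std_online(iter(images))
```

Both variants yield the same population statistics; the online one trades a second pass over the data for constant memory.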

datamodules.utils.misc module

class ImageDimensions(width: int, height: int)[source]

Bases: object

Dataclass to store the dimensions of an image

Parameters:

  • width (int) – Width of the image

  • height (int) – Height of the image
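
A minimal local stand-in with the two documented fields could look as follows. Whether the real class is frozen or carries extra methods is not stated here, so the sketch assumes a plain dataclass.

```python
from dataclasses import dataclass


@dataclass
class ImageDimensions:
    """Local stand-in mirroring the documented fields (not the package class)."""
    width: int
    height: int


dims = ImageDimensions(width=960, height=1344)
# Dataclasses compare field by field
same = dims == ImageDimensions(960, 1344)
```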

height: int

width: int

check_missing_analytics(analytics_path_gt: Path, expected_keys_gt: List[str]) Tuple[Dict[str, Any], bool][source]

Checks whether the analytics file for the ground truth is missing and whether it is complete. If it is present, it is loaded and the contained keys are checked for completeness.

Parameters:

  • analytics_path_gt (Path) – Path where the analytics file should be

  • expected_keys_gt (List[str]) – List of expected keys in the analytics file

Returns:

Tuple of the loaded analytics and a boolean indicating if the analytics file is missing

Return type:

Tuple[Dict[str, Any], bool]
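
The described check can be approximated with the standard `json` module. The helper `load_and_check` is hypothetical, and returning an empty dict for an absent file is an assumption of this sketch.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List, Tuple


def load_and_check(analytics_path: Path, expected_keys: List[str]) -> Tuple[Dict[str, Any], bool]:
    """Hypothetical stand-in: returns (analytics, missing_flag)."""
    if not analytics_path.exists():
        # Nothing on disk yet, so the analytics would have to be computed
        return {}, True
    with analytics_path.open() as handle:
        analytics = json.load(handle)
    # Treat the file as missing when any expected key is absent
    incomplete = any(key not in analytics for key in expected_keys)
    return analytics, incomplete


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "analytics.json"
    absent = load_and_check(path, ["mean", "std"])
    path.write_text(json.dumps({"mean": 0.5}))
    partial = load_and_check(path, ["mean", "std"])
    path.write_text(json.dumps({"mean": 0.5, "std": 0.1}))
    complete = load_and_check(path, ["mean", "std"])
```

In this sketch an incomplete file is still loaded, but it is flagged the same way as a missing one.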

find_new_filename(filename: str, current_list: List[str]) str[source]

Finds a new filename that is not in the current list. If the filename is not in the list, it is returned unchanged. If the filename is in the list, a number is appended to the filename until it is unique.

Parameters:

  • filename (str) – Filename to check

  • current_list (List[str]) – List of filenames to check against

Returns:

New filename that is not in the current list

Return type:

str
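
One way to realise this renaming scheme is sketched below. The helper name `unique_filename`, the `_<n>` separator, and the choice to place the number before the extension are all assumptions; the package may format the number differently.

```python
from pathlib import Path
from typing import List


def unique_filename(filename: str, current_list: List[str]) -> str:
    """Hypothetical stand-in: add a counter to the stem until the name is unused."""
    path = Path(filename)
    candidate, counter = filename, 0
    while candidate in current_list:
        # The "_<n>" separator is an assumption made for this sketch
        candidate = f"{path.stem}_{counter}{path.suffix}"
        counter += 1
    return candidate


untouched = unique_filename("a.png", [])
first_clash = unique_filename("a.png", ["a.png"])
second_clash = unique_filename("a.png", ["a.png", "a_0.png"])
```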


get_image_dims(data_gt_path_list) ImageDimensions[source]

Returns the image dimensions of the first image in the list.

Parameters:

data_gt_path_list (List[Path]) – List of image paths

Returns:

Image dimensions of the first image in the list

Return type:

ImageDimensions


get_output_file_list(image_path_list: List[Path]) List[str][source]

Creates a list of output filenames from a list of image paths. If there are duplicate filenames, the duplicates are renamed to be unique.

Parameters:

image_path_list (List[Path]) – List of image paths

Returns:

List of output filenames

Return type:

List[str]
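
The duplicate handling can be illustrated with a short sketch. It assumes that the output name is the file stem and that duplicates receive a `_<n>` suffix; neither detail is confirmed by this page, and `output_names` is a hypothetical helper.

```python
from pathlib import Path
from typing import List


def output_names(image_path_list: List[Path]) -> List[str]:
    """Hypothetical stand-in: one output name per input path, duplicates renamed."""
    names: List[str] = []
    for path in image_path_list:
        candidate, counter = path.stem, 0
        while candidate in names:
            # The "_<n>" suffix scheme is an assumption made for this sketch
            candidate = f"{path.stem}_{counter}"
            counter += 1
        names.append(candidate)
    return names


paths = [Path("set_a/page1.png"), Path("set_b/page1.png"), Path("set_b/page2.png")]
result = output_names(paths)
```

The two `page1.png` files live in different folders, so only their output names collide and the second one is renamed.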


pil_loader_gif(path: Path) PIL.Image[source]

Loads a GIF image using PIL.

Parameters:

path (Path) – Path to the image

Returns:

Image loaded in palette mode

Return type:

PIL.Image


save_json(analytics: Dict, analytics_path: Path)[source]

Saves the analytics dict to a JSON file.

Parameters:

  • analytics (Dict) – The analytics dict that should be saved

  • analytics_path (Path) – Path to the JSON file
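
A comparable helper built on the standard `json` module might look like this. The name `write_json`, the indentation setting, and the creation of missing parent folders are assumptions of the sketch, not documented behavior.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict


def write_json(analytics: Dict, analytics_path: Path) -> None:
    """Hypothetical stand-in: serialize the dict to the given path."""
    # Creating missing parent folders is an assumption, not documented behavior
    analytics_path.parent.mkdir(parents=True, exist_ok=True)
    with analytics_path.open("w") as handle:
        json.dump(analytics, handle, indent=2)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "analytics" / "stats.json"
    write_json({"mean": [0.5, 0.4, 0.3]}, target)
    restored = json.loads(target.read_text())
```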

selection_validation(files_in_data_root: List[Path], selection: Union[int, List[str], ListConfig], full_page: bool) Union[int, List[str], ListConfig][source]

Validates the selection parameter for the segmentation dataset. If selection is an integer, it is checked whether it lies within the range of the number of files. If selection is a list, it is checked whether all elements are in the list of files. If selection is None, it is returned as is.

Parameters:

  • files_in_data_root (List[Path]) – List of files in the data root directory

  • selection (Union[int, List[str], ListConfig]) – Selection parameter

  • full_page (bool) – If True, the selection parameter is used to select a page. If False, the selection parameter is used to select a subdirectory.

Returns:

Validated selection parameter

Return type:

Union[int, List[str], ListConfig]
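
The three cases can be sketched as follows. This simplified example ignores `full_page` and `ListConfig`, raises `ValueError` on failure, accepts integers from 0 up to the number of files, and compares list entries against file stems; all of these are assumptions, and `validate_selection` is a hypothetical name.

```python
from pathlib import Path
from typing import List, Optional, Union


def validate_selection(files: List[Path], selection: Optional[Union[int, List[str]]]):
    """Hypothetical stand-in for the checks described above (full_page is ignored)."""
    if selection is None:
        return None
    if isinstance(selection, int):
        # Assumed valid range: 0 up to and including the number of files
        if not 0 <= selection <= len(files):
            raise ValueError(f"selection {selection} is outside 0..{len(files)}")
        return selection
    # Assumption: list entries are compared against file names without extension
    known = {path.stem for path in files}
    unknown = [name for name in selection if name not in known]
    if unknown:
        raise ValueError(f"unknown entries in selection: {unknown}")
    return selection


files = [Path("data/page_01.png"), Path("data/page_02.png")]
by_count = validate_selection(files, 2)
by_name = validate_selection(files, ["page_02"])

try:
    validate_selection(files, ["page_99"])
    rejected = False
except ValueError:
    rejected = True
```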

validate_path_for_segmentation(data_dir: str, data_folder_name: str, gt_folder_name: str, split_name: Union[str, List[str]]) Path[source]

Checks if the data_dir folder has the following structure:

    ├── train_folder_name
    │   ├── data_folder_name
    │   └── gt_folder_name
    ├── val_folder_name
    │   ├── data_folder_name
    │   └── gt_folder_name
    └── test_folder_name
        ├── data_folder_name
        └── gt_folder_name

Parameters:

  • data_dir (str) – Path to the root dir of the dataset

  • data_folder_name (str) – Name of the data folder

  • gt_folder_name (str) – Name of the gt folder

  • split_name (Union[str, List[str]]) – Name of the split folder (train/val/test), or a list of such names

Returns:

Path to the data_dir

Return type:

Path
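
A structure check of this kind can be sketched with `pathlib`. The exception classes below are local stand-ins named after those in datamodules.utils.exceptions, and the mapping of each failure to an exception, like the helper name `check_layout`, is an assumption of this example.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


# Local stand-ins named after the exceptions in datamodules.utils.exceptions
class PathNone(Exception):
    pass


class PathNotDir(Exception):
    pass


class PathMissingSplitDir(Exception):
    pass


class PathMissingDirinSplitDir(Exception):
    pass


def check_layout(data_dir: Optional[str], data_folder_name: str,
                 gt_folder_name: str, split_names: List[str]) -> Path:
    """Hypothetical stand-in: verify <root>/<split>/<data|gt> exists for each split."""
    if data_dir is None:
        raise PathNone("data_dir is None")
    root = Path(data_dir)
    if not root.is_dir():
        raise PathNotDir(f"{root} is not a directory")
    for split in split_names:
        split_dir = root / split
        if not split_dir.is_dir():
            raise PathMissingSplitDir(f"missing split folder {split_dir}")
        for sub in (data_folder_name, gt_folder_name):
            if not (split_dir / sub).is_dir():
                raise PathMissingDirinSplitDir(f"missing folder {split_dir / sub}")
    return root


with tempfile.TemporaryDirectory() as tmp:
    for split in ("train", "val", "test"):
        for sub in ("data", "gt"):
            (Path(tmp) / split / sub).mkdir(parents=True)
    layout_ok = check_layout(tmp, "data", "gt", ["train", "val", "test"]) == Path(tmp)

    # Remove one ground truth folder to trigger the error path
    (Path(tmp) / "val" / "gt").rmdir()
    try:
        check_layout(tmp, "data", "gt", ["train", "val", "test"])
        error_name = None
    except PathMissingDirinSplitDir as error:
        error_name = type(error).__name__
```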


datamodules.utils.output_tools module

merge_patches(patch: ndarray, coordinates: Tuple[int, int], full_output: ndarray) ndarray[source]

Merges the patch into the full output image. Overlapping values are resolved by taking the max.

Parameters:

  • patch (np.ndarray) – numpy matrix of size [#classes x crop_size x crop_size], a patch from the larger image

  • coordinates (Tuple[int, int]) – tuple of ints, the top-left coordinates of the patch within the larger image for all patches in a batch

  • full_output (np.ndarray) – numpy matrix of size [#C x H x W], the output image at full size

Returns:

full_output: numpy matrix of size [#C x Htot x Wtot], the output image at full size with the patch inserted

Return type:

ndarray
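
The max-merge rule can be shown on a tiny array. The sketch below is not the package's implementation: it assumes the coordinates are given as (x, y), that the full output starts out filled with zeros, and that all scores are non-negative. `merge_patch_max` is a hypothetical name.

```python
import numpy as np


def merge_patch_max(patch: np.ndarray, coordinates, full_output: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in: paste a [C x h x w] patch, keeping the max on overlap."""
    x, y = coordinates  # assumed order: (x, y) of the patch's top-left corner
    _, height, width = patch.shape
    region = full_output[:, y:y + height, x:x + width]
    # Write through the view so full_output is updated in place
    region[...] = np.maximum(region, patch)
    return full_output


canvas = np.zeros((1, 2, 3))
merge_patch_max(np.full((1, 2, 2), 0.4), (0, 0), canvas)
merge_patch_max(np.full((1, 2, 2), 0.7), (1, 0), canvas)
```

The two patches overlap in the middle column, where the larger value 0.7 wins over 0.4.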


datamodules.utils.single_transforms module

class MorphoBuilding(first_filter_size: Tuple[int, int], second_filter_size: Tuple[int, int], border_cut_horizontal: Optional[int] = None, border_cut_vertical: Optional[int] = None)[source]

Bases: object

Applies the idea of morphological operators to build the GT, based on the paper "Historical document image analysis using controlled data for pre-training". It takes a PIL.Image, extracts the blue color channel, and binarizes it with the Otsu method. The border of this image is then cut away, and a closing followed by an opening operation is applied twice to create two binary images. These images are then used as the red and green channels of a new image whose blue channel contains zeros.

Parameters:

  • first_filter_size (Tuple[int, int]) – The size of the first filter

  • second_filter_size (Tuple[int, int]) – The size of the second filter

  • border_cut_horizontal (int) – Pixels to remove at the top and bottom

  • border_cut_vertical (int) – Pixels to remove on the left and right

class OneHotToPixelLabelling[source]

Bases: object

Transforms a one-hot encoded tensor to a pixel labelling tensor.

Parameters:

tensor (torch.Tensor) – The one-hot encoded tensor

Returns:

The pixel labelling tensor

Return type:


class RightAngleRotation(angle_list=None)[source]

Bases: object

Rotates the input tensor by a random angle from the list of angles. If this is used in a GT generation context, the class is accessible via .target_class.

Parameters:

angle_list (List[int]) – The list of angles to choose from

class TilesBuilding(rows: int, cols: int, fixed_positions: int = 0, width_center_crop: int = 840, height_center_crop: int = 1200)[source]

Bases: object

Applies the idea of an embedded jigsaw puzzle to an image. The image is divided into a grid of tiles, and the tiles are then shuffled. The number of rows and columns of the grid, the number of fixed positions, and the size of the center crop can be set. The number of fixed positions must be less than the number of tiles - 1 (and therefore less than the number of tiles). If this is used in a GT generation context, the class is accessible via .target_class.

Parameters:

  • rows (int) – The number of rows of the grid

  • cols (int) – The number of columns of the grid

  • fixed_positions (int) – The number of fixed positions in the grid

  • width_center_crop (int) – The width of the center crop

  • height_center_crop (int) – The height of the center crop

datamodules.utils.twin_transforms module

class ToTensorSlidingWindowCrop(crop_size: int)[source]

Bases: object

Crops the data and ground truth image at the specified coordinates to the specified size and converts them to tensors.

Parameters:

crop_size (int) – Size of the crop.

class TwinCompose(transforms: List[Callable])[source]

Bases: object

Composes several transforms and applies them to both the codex image and the ground truth.

Parameters:

transforms (List[Callable]) – List of transforms to compose.
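
The composition pattern can be sketched without any imaging library. Here each transform is assumed to take and return an (image, ground truth) pair, plain lists stand in for images, and `ComposePair` is a hypothetical name rather than the package class.

```python
from typing import Callable, List, Tuple


class ComposePair:
    """Hypothetical stand-in: chains transforms that each take and return (img, gt)."""

    def __init__(self, transforms: List[Callable]):
        self.transforms = transforms

    def __call__(self, img, gt) -> Tuple:
        for transform in self.transforms:
            img, gt = transform(img, gt)
        return img, gt


def flip(img, gt):
    # Reverse both sequences so image and ground truth stay aligned
    return img[::-1], gt[::-1]


def crop_first_two(img, gt):
    # Keep the same leading slice of both inputs
    return img[:2], gt[:2]


pipeline = ComposePair([flip, crop_first_two])
img_out, gt_out = pipeline([1, 2, 3], ["a", "b", "c"])
```

Because every step receives both inputs, geometric changes stay in sync between the image and its ground truth.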

class TwinImageToTensor[source]

Bases: object

Converts a PIL Image or numpy.ndarray (H x W x C) in the range [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0].

Parameters:

  • img (PIL Image or numpy.ndarray) – Image to be converted to tensor.

  • gt (PIL Image or numpy.ndarray) – Ground truth image to be converted to tensor.

Returns:

Converted image and ground truth.

Return type:

Tuple[Tensor, Tensor]

class TwinRandomCrop(crop_size: int)[source]

Bases: object

Crops the given PIL Images at the same random location.

Parameters:

crop_size (int) – Desired output size of the crop.

get_params(img_size: Tuple[int, int]) Tuple[int, int, int, int][source]

Gets the parameters for a random crop.

Parameters:

img_size (Tuple[int, int]) – Image size (h, w)

Returns:

params (i, j, h, w) to be passed to crop for a random crop.

Return type:

Tuple[int, int, int, int]
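
Sampling such parameters can be sketched with the standard `random` module. The helper `random_crop_params` is hypothetical; it takes `crop_size` as an explicit argument and raises `ValueError` for oversized crops, both of which are assumptions of this example.

```python
import random
from typing import Tuple


def random_crop_params(img_size: Tuple[int, int], crop_size: int) -> Tuple[int, int, int, int]:
    """Hypothetical stand-in: pick a top-left corner so the crop fits inside the image."""
    height, width = img_size
    if crop_size > height or crop_size > width:
        raise ValueError("crop_size must not exceed the image size")
    i = random.randint(0, height - crop_size)  # top row, bounds inclusive
    j = random.randint(0, width - crop_size)   # left column, bounds inclusive
    return i, j, crop_size, crop_size


i, j, h, w = random_crop_params((100, 80), crop_size=32)
```

Applying the same (i, j, h, w) to both the image and the ground truth keeps the two crops aligned.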

datamodules.utils.wrapper_transforms module

class OnlyImage(transform: Callable)[source]

Bases: object

Wrapper around a single-parameter transform. The transform is applied only to the image.

Parameters:

transform (Callable) – Transformation to apply to the codex image

class OnlyTarget(transform: Callable)[source]

Bases: object

Wrapper around a single-parameter transform. The transform is applied only to the target.

Parameters:

transform (Callable) – Transformation to apply to the ground truth image
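
Both wrappers follow the same pattern, which can be sketched as below. The classes `ImageOnly` and `TargetOnly` are hypothetical stand-ins, strings take the place of images, and the (image, ground truth) call signature is an assumption.

```python
from typing import Callable


class ImageOnly:
    """Hypothetical stand-in: apply the transform to the image, pass the target through."""

    def __init__(self, transform: Callable):
        self.transform = transform

    def __call__(self, img, gt):
        return self.transform(img), gt


class TargetOnly:
    """Hypothetical stand-in: apply the transform to the target, pass the image through."""

    def __init__(self, transform: Callable):
        self.transform = transform

    def __call__(self, img, gt):
        return img, self.transform(gt)


image_result = ImageOnly(str.upper)("page", "mask")
target_result = TargetOnly(str.upper)("page", "mask")
```

Wrapping a single-input transform this way lets it sit in a pipeline whose steps all take an image and a ground truth.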
