datamodules.utils package

Submodules

datamodules.utils.dataset_predict module

class DatasetPredict(image_path_list: List[str], image_dims: ImageDimensions, image_transform=None, target_transform=None, twin_transform=None)[source]

Bases: Dataset

Dataset class for the prediction of the test set. It takes a folder of images and creates the prediction of these images.

Parameters:

image_path_list (List[str]) – list of image paths
image_dims (ImageDimensions) – image dimensions
image_transform (Callable) – image transformation
target_transform (Callable) – target transformation
twin_transform (Callable) – twin transformation

static expend_glob_path_list(glob_path_list: List[str]) → List[Path][source]

Expands the glob path list to a list of paths.

Parameters: glob_path_list (List[str]) – list of glob paths
Returns: list of paths
Return type: List[Path]
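
As an illustration of the expansion described above, the following is a minimal sketch and not the package's actual implementation. The name expand_glob_paths and the sorting of matches are assumptions made for this example.

```python
import glob
from pathlib import Path
from typing import List


def expand_glob_paths(glob_path_list: List[str]) -> List[Path]:
    """Expand every glob pattern and collect the matches as Path objects.

    Hypothetical helper: the sorted order is an assumption of this sketch.
    """
    expanded: List[Path] = []
    for pattern in glob_path_list:
        # glob.glob returns matches in arbitrary order, so sort them
        # to make the result reproducible across runs.
        expanded.extend(Path(match) for match in sorted(glob.glob(pattern)))
    return expanded
```

A call such as expand_glob_paths(["images/*.png"]) would return one Path per matching file, or an empty list if nothing matches.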

datamodules.utils.exceptions module

exception PathMissingDirinSplitDir[source]

Bases: Exception

Raised when a path is missing a directory in a split directory

exception PathMissingSplitDir[source]

Bases: Exception

Raised when a path is missing a split directory

exception PathNone[source]

Bases: Exception

Raised when a path is None

exception PathNotDir[source]

Bases: Exception

Raised when a path is not a directory

datamodules.utils.functional module

argmax_onehot(tensor: Tensor)[source]

Returns the argmax of a one-hot encoded tensor.

Parameters: tensor (torch.Tensor) – The one-hot encoded tensor
Returns: The argmax of the one-hot encoded tensor
Return type: torch.Tensor

datamodules.utils.image_analytics module

compute_mean_std(file_names: Union[ndarray, List[Path]], inmem: bool = False, workers: int = 8) → Tuple[float, float][source]

Computes mean and std of all images present at target folder.

Parameters:

file_names (Union[np.ndarray[str], List[Path]]) – List of the file names of the images
inmem (bool) – Specifies whether the statistics should be computed in an online or offline fashion.
workers (int) – Number of workers to use for the mean/std computation

Returns:

mean and std

Return type:

Tuple[float, float]
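
To illustrate what an online computation can look like, here is a minimal sketch that accumulates running sums chunk by chunk instead of holding all pixels in memory. It is not the package's implementation: running_mean_std is a hypothetical name, the chunks stand in for per-image pixel values, and the population standard deviation is assumed.

```python
import math
from typing import Iterable, Sequence, Tuple


def running_mean_std(chunks: Iterable[Sequence[float]]) -> Tuple[float, float]:
    """Compute mean and population std from chunks of values in one pass."""
    count = 0
    total = 0.0
    total_sq = 0.0
    for chunk in chunks:
        # Only three scalars are kept between chunks.
        count += len(chunk)
        total += sum(chunk)
        total_sq += sum(v * v for v in chunk)
    if count == 0:
        raise ValueError("no values provided")
    mean = total / count
    # Clamp at zero to guard against tiny negative values from rounding.
    variance = max(total_sq / count - mean * mean, 0.0)
    return mean, math.sqrt(variance)
```

An offline variant would load all values first and compute both statistics directly, trading memory for simplicity.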

datamodules.utils.misc module

class ImageDimensions(width: int, height: int)[source]

Bases: object

Dataclass to store the dimensions of an image

Parameters:

width (int) – Width of the image
height (int) – Height of the image

height: int

width: int

check_missing_analytics(analytics_path_gt: Path, expected_keys_gt: List[str]) → Tuple[Dict[str, Any], bool][source]

Checks whether the analytics file for the ground truth is missing and whether it is complete. If it is present, it is loaded and its keys are checked for completeness.

Parameters:

analytics_path_gt (Path) – Path where the analytics file should be
expected_keys_gt (List[str]) – List of expected keys in the analytics file

Returns:

Tuple of the loaded analytics and a boolean indicating if the analytics file is missing

Return type:

Tuple[Dict[str, Any], bool]
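
The check described above can be sketched as follows. This is an illustrative version under stated assumptions, not the package's code: the name load_analytics_if_complete is hypothetical, the file is assumed to be JSON, and an empty dict is returned when the file does not exist.

```python
import json
from pathlib import Path
from typing import Any, Dict, List, Tuple


def load_analytics_if_complete(
    path: Path, expected_keys: List[str]
) -> Tuple[Dict[str, Any], bool]:
    """Return (analytics, missing).

    missing is True when the file is absent or lacks an expected key.
    """
    if not path.exists():
        # Assumption: an absent file yields an empty dict.
        return {}, True
    with path.open("r") as f:
        analytics = json.load(f)
    missing = any(key not in analytics for key in expected_keys)
    return analytics, missing
```

A caller would typically recompute and save the analytics whenever the returned flag is True.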

find_new_filename(filename: str, current_list: List[str]) → str[source]

Finds a new filename that is not in the current list. If the filename is not in the list, it is returned. If the filename is in the list, a number is appended to the filename until it is unique.

Parameters:

filename (str) – Filename to check
current_list (List[str]) – List of filenames to check against

Returns:

New filename that is not in the current list

Return type:

str
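
A minimal sketch of this renaming strategy is shown below. The helper name unique_filename and the suffix format (an underscore and a counter placed before the file extension) are assumptions; the actual function may number or format duplicates differently.

```python
from pathlib import Path
from typing import List


def unique_filename(filename: str, current_list: List[str]) -> str:
    """Return filename unchanged if unused, else append a counter until unique."""
    if filename not in current_list:
        return filename
    path = Path(filename)
    counter = 0
    candidate = filename
    while candidate in current_list:
        # Assumed format: "<stem>_<counter><suffix>", e.g. "page_0.png".
        candidate = f"{path.stem}_{counter}{path.suffix}"
        counter += 1
    return candidate
```

Applying such a helper while iterating over a list of image paths, and appending each result to the list of names already taken, yields a list of output filenames without duplicates.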

get_image_dims(data_gt_path_list) → ImageDimensions[source]

Returns the image dimensions of the first image in the list.

Parameters: data_gt_path_list (List[Path]) – List of image paths
Returns: Image dimensions of the first image in the list
Return type: ImageDimensions

get_output_file_list(image_path_list: List[Path]) → List[str][source]

Creates a list of output filenames from a list of image paths. If there are duplicate filenames, the duplicates are renamed to be unique.

Parameters: image_path_list (List[Path]) – List of image paths
Returns: List of output filenames
Return type: List[str]

pil_loader_gif(path: Path) → Image[source]

Loads a gif image using PIL.

Parameters: path (Path) – Path to the image
Returns: Image loaded in palette mode
Return type: Image

save_json(analytics: Dict, analytics_path: Path)[source]

Saves the analytics dict to a json file.

Parameters:

analytics (Dict) – The analytics dict that should be saved
analytics_path (Path) – Path to the json file

selection_validation(files_in_data_root: List[Path], selection: Union[int, List[str], ListConfig], full_page: bool) → Union[int, List[str], ListConfig][source]

Validates the selection parameter for the segmentation dataset. If selection is an integer, the function checks that it lies within the range given by the number of files. If selection is a list, it checks that all elements are in the list of files. If selection is None, it is returned unchanged.

Parameters:

files_in_data_root (List[Path]) – List of files in the data root directory
selection (Union[int, List[str], ListConfig]) – Selection parameter
full_page (bool) – If True, the selection parameter is used to select a page. If False, the selection parameter is used to select a subdirectory.

Returns:

Validated selection parameter

Return type:

Union[int, List[str], ListConfig]
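
The three cases above can be sketched as follows. This is a simplified illustration, not the package's implementation: validate_selection is a hypothetical name, the full_page flag and ListConfig are left out, list entries are assumed to be compared against file stems, and ValueError is an assumed error type.

```python
from pathlib import Path
from typing import List, Optional, Union


def validate_selection(
    files: List[Path], selection: Optional[Union[int, List[str]]]
) -> Optional[Union[int, List[str]]]:
    """Validate an int or list selection against the available files."""
    if selection is None:
        return None
    if isinstance(selection, int):
        # Assumed range: 0 <= selection <= number of files.
        if selection < 0 or selection > len(files):
            raise ValueError(
                f"selection {selection} exceeds the {len(files)} available files"
            )
        return selection
    # Assumption: entries name files by their stem (name without extension).
    known = {f.stem for f in files}
    unknown = [name for name in selection if name not in known]
    if unknown:
        raise ValueError(f"selection contains unknown files: {unknown}")
    return selection
```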

validate_path_for_segmentation(data_dir: str, data_folder_name: str, gt_folder_name: str, split_name: Union[str, List[str]]) → Path[source]

Checks if the data_dir folder has the following structure:

data_dir
    ├── train_folder_name
    │   ├── data_folder_name
    │   └── gt_folder_name
    ├── val_folder_name
    │   ├── data_folder_name
    │   └── gt_folder_name
    └── test_folder_name
        ├── data_folder_name
        └── gt_folder_name

Parameters:

data_dir (str) – Path to the root dir of the dataset
data_folder_name (str) – Name of the data folder
gt_folder_name (str) – Name of the gt folder
split_name (Union[str, List[str]]) – Name or list of names of the split folder (train/val/test)

Returns:

Path to the data_dir

Return type:

Path
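
The structure check can be sketched roughly as below. This is an illustration under assumptions, not the package's code: check_split_layout is a hypothetical name, the exception classes are local stand-ins named after those in datamodules.utils.exceptions, and the mapping of each failure to a specific exception is a guess based on their descriptions.

```python
from pathlib import Path
from typing import List, Optional, Union


# Local stand-ins for the exceptions documented in datamodules.utils.exceptions.
class PathNone(Exception): ...
class PathNotDir(Exception): ...
class PathMissingSplitDir(Exception): ...
class PathMissingDirinSplitDir(Exception): ...


def check_split_layout(
    data_dir: Optional[str],
    data_folder_name: str,
    gt_folder_name: str,
    split_name: Union[str, List[str]],
) -> Path:
    """Verify that each split folder contains the data and gt folders."""
    if data_dir is None:
        raise PathNone("data_dir is None")
    root = Path(data_dir)
    if not root.is_dir():
        raise PathNotDir(f"{root} is not a directory")
    split_names = [split_name] if isinstance(split_name, str) else list(split_name)
    for split in split_names:
        split_dir = root / split
        if not split_dir.is_dir():
            raise PathMissingSplitDir(f"missing split folder: {split_dir}")
        for sub in (data_folder_name, gt_folder_name):
            if not (split_dir / sub).is_dir():
                raise PathMissingDirinSplitDir(f"missing folder: {split_dir / sub}")
    return root
```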

datamodules.utils.output_tools module

merge_patches(patch: ndarray, coordinates: Tuple[int, int], full_output: ndarray) → ndarray[source]

Merges the patch into the full output image. Overlapping values are resolved by taking the maximum.

Parameters:

patch (np.ndarray) – NumPy array of size [#classes x crop_size x crop_size], a patch from the larger image
coordinates (Tuple[int, int]) – Tuple of ints giving the top-left coordinates of the patch within the larger image
full_output (np.ndarray) – NumPy array of size [#C x H x W], the output image at full size

Returns:

full_output: numpy matrix [#C x Htot x Wtot] output image at full size with patch inserted

Return type:

np.ndarray
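
The max-based merge can be illustrated with the following sketch. It is not the package's implementation: merge_patch_max is a hypothetical name, the coordinates are assumed to be ordered as (x, y), and the patch is assumed to lie fully inside the output image.

```python
from typing import Tuple

import numpy as np


def merge_patch_max(
    patch: np.ndarray, coordinates: Tuple[int, int], full_output: np.ndarray
) -> np.ndarray:
    """Insert a [C x h x w] patch into a [C x H x W] image, keeping the max on overlap."""
    x, y = coordinates  # assumed order: column (x), then row (y) of the top-left corner
    _, patch_h, patch_w = patch.shape
    region = full_output[:, y:y + patch_h, x:x + patch_w]
    # Element-wise maximum resolves values where patches overlap.
    full_output[:, y:y + patch_h, x:x + patch_w] = np.maximum(region, patch)
    return full_output
```

Because the maximum is used, the order in which overlapping patches are merged does not change the result.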

datamodules.utils.single_transforms module

class MorphoBuilding(first_filter_size: Tuple[int, int], second_filter_size: Tuple[int, int], border_cut_horizontal: Optional[int] = None, border_cut_vertical: Optional[int] = None)[source]

Bases: object

Applies the idea of morphological operators to build the GT, based on the paper Historical document image analysis using controlled data for pre-training. It takes a PIL.Image, extracts the blue color channel, and binarizes it with the Otsu method. The border of this image is cut away, and a closing followed by an opening operation is applied twice to create two binary images. These images are then used as the red and green channels of a new image whose blue channel contains zeros.

Parameters:

first_filter_size (Tuple[int, int]) – The size of the first filter
second_filter_size (Tuple[int, int]) – The size of the second filter
border_cut_horizontal (int) – Pixels to remove at the top and bottom
border_cut_vertical (int) – Pixels to remove on the left and right

class OneHotToPixelLabelling[source]

Bases: object

Transforms a one-hot encoded tensor to a pixel labelling tensor.

Parameters: tensor (torch.Tensor) – The one-hot encoded tensor
Returns: The pixel labelling tensor
Return type: torch.Tensor

class RightAngleRotation(angle_list=None)[source]

Bases: object

Rotates the input tensor by an angle chosen at random from the list of angles. When used for ground-truth generation, the class is accessible via .target_class.

Parameters: angle_list (List[int]) – The list of angles to choose from

class TilesBuilding(rows: int, cols: int, fixed_positions: int = 0, width_center_crop: int = 840, height_center_crop: int = 1200)[source]

Bases: object

Applies the idea of an embedded jigsaw puzzle to an image. The image is divided into a grid of tiles and the tiles are then shuffled. The number of rows and columns of the grid, the number of fixed positions, and the size of the center crop can be set. The number of fixed positions must be less than the number of tiles - 1. When used for ground-truth generation, the class is accessible via .target_class.

Parameters:

rows (int) – The number of rows of the grid
cols (int) – The number of columns of the grid
fixed_positions (int) – The number of fixed positions in the grid
width_center_crop (int) – The width of the center crop
height_center_crop (int) – The height of the center crop

datamodules.utils.twin_transforms module

class ToTensorSlidingWindowCrop(crop_size: int)[source]

Bases: object

Crop the data and ground truth image at the specified coordinates to the specified size and convert them to a tensor.

Parameters: crop_size (int) – Size of the crop.

class TwinCompose(transforms: List[Callable])[source]

Bases: object

Composes several transforms together and applies them to both the codex image and the ground truth.

Parameters: transforms (List[Callable]) – List of transforms to compose.

class TwinImageToTensor[source]

Bases: object

Convert a PIL Image or numpy.ndarray to tensor. Converts a PIL Image or numpy.ndarray (H x W x C) in the range [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0].

Parameters:

img (PIL Image or numpy.ndarray) – Image to be converted to tensor.
gt (PIL Image or numpy.ndarray) – Ground truth image to be converted to tensor.

Returns:

Converted image and ground truth.

Return type:

Tuple[Tensor, Tensor]

class TwinRandomCrop(crop_size: int)[source]

Bases: object

Crop the given PIL Images at the same random location.

Parameters: crop_size (int) – Desired output size of the crop.

get_params(img_size: Tuple[int, int]) → Tuple[int, int, int, int][source]

Get the parameters for a random crop.

Parameters: img_size (Tuple[int, int]) – Image size (h, w)
Returns: params (i, j, h, w) to be passed to crop for random crop.
Return type: Tuple[int, int, int, int]
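
A sketch of how such crop parameters can be drawn is given below. It is illustrative only: random_crop_params is a hypothetical free function that takes crop_size as an argument (the documented method reads it from the instance), and the ValueError for oversized crops is an assumption.

```python
import random
from typing import Tuple


def random_crop_params(
    img_size: Tuple[int, int], crop_size: int
) -> Tuple[int, int, int, int]:
    """Return (i, j, h, w): top row, left column, and size of a square crop."""
    h, w = img_size
    if crop_size > h or crop_size > w:
        raise ValueError("crop_size is larger than the image")
    # random.randint is inclusive on both ends, so the crop always fits.
    i = random.randint(0, h - crop_size)
    j = random.randint(0, w - crop_size)
    return i, j, crop_size, crop_size
```

Using the same (i, j, h, w) tuple for both the image and the ground truth keeps the two crops aligned.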

datamodules.utils.wrapper_transforms module

class OnlyImage(transform: Callable)[source]

Bases: object

Wrapper around a single-parameter transform. The transform is applied only to the image.

Parameters: transform (Callable) – Transformation to apply to the codex image

class OnlyTarget(transform: Callable)[source]

Bases: object

Wrapper around a single-parameter transform. The transform is applied only to the target.

Parameters: transform (Callable) – Transformation to apply to the ground truth image
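
To show how such wrappers can fit into a twin pipeline, here is a minimal sketch. The classes are stand-ins with a Sketch suffix, not the package's classes, and the calling convention (each twin transform takes an image and a target and returns both) is an assumption of this example.

```python
from typing import Any, Callable, List, Tuple


class OnlyImageSketch:
    """Apply a single-argument transform to the image and pass the target through."""

    def __init__(self, transform: Callable[[Any], Any]):
        self.transform = transform

    def __call__(self, image: Any, target: Any) -> Tuple[Any, Any]:
        return self.transform(image), target


class OnlyTargetSketch:
    """Apply a single-argument transform to the target and pass the image through."""

    def __init__(self, transform: Callable[[Any], Any]):
        self.transform = transform

    def __call__(self, image: Any, target: Any) -> Tuple[Any, Any]:
        return image, self.transform(target)


class TwinComposeSketch:
    """Chain twin transforms, feeding each output pair into the next transform."""

    def __init__(self, transforms: List[Callable[[Any, Any], Tuple[Any, Any]]]):
        self.transforms = transforms

    def __call__(self, image: Any, target: Any) -> Tuple[Any, Any]:
        for transform in self.transforms:
            image, target = transform(image, target)
        return image, target
```

With plain numbers standing in for images, TwinComposeSketch([OnlyImageSketch(lambda x: x * 2), OnlyTargetSketch(lambda y: y + 1)])(3, 10) doubles only the first value and increments only the second.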

Module contents