ml.tasks.datasets.transforms

Defines image and tensor transforms for datasets: cropping, resizing, padding, and normalization.

ml.tasks.datasets.transforms.square_crop(img: Image) -> Image

Crops an image to a square.

Parameters:

img – The input image

Returns:

The cropped image, with height and width equal.
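As a rough illustration of the geometry, the sketch below computes a square crop box from an image size. It assumes the crop is centered, which the description above does not state, and center_square_box is a hypothetical helper, not part of this module.

```python
def center_square_box(width: int, height: int) -> tuple[int, int, int, int]:
    """Returns (left, top, right, bottom) for a square crop.

    Assumption: the crop is centered; the real transform may differ.
    """
    side = min(width, height)  # The square side is the shorter dimension
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


box = center_square_box(640, 480)
print(box)  # (80, 0, 560, 480)
```

The resulting box has equal height and width, matching the return value documented above.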

ml.tasks.datasets.transforms.square_resize_crop(img: Image, size: int, interpolation: InterpolationMode = InterpolationMode.NEAREST) -> Image

Resizes an image to a square and then crops it.

Parameters:
  • img – The input image

  • size – The size of the square

  • interpolation – The interpolation mode to use

Returns:

The cropped image
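One plausible reading of "resize, then crop" is sketched below: scale the image so its shorter side equals size, then take a centered size x size crop. Both the scaling rule and the centering are assumptions, and resize_then_crop_dims is a hypothetical helper that only computes dimensions; it does not touch pixel data.

```python
def resize_then_crop_dims(
    width: int, height: int, size: int
) -> tuple[tuple[int, int], tuple[int, int, int, int]]:
    """Returns the resized (width, height) and the crop box.

    Assumption: the shorter side is scaled to `size` and the crop is centered.
    """
    short = min(width, height)
    # Scale both sides by size / short, never going below the target size
    new_w = max(size, round(width * size / short))
    new_h = max(size, round(height * size / short))
    left = (new_w - size) // 2
    top = (new_h - size) // 2
    return (new_w, new_h), (left, top, left + size, top + size)


resized, crop_box = resize_then_crop_dims(640, 480, 224)
```

For a 640x480 input and size 224, this yields a 299x224 intermediate image and a 224x224 crop.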

ml.tasks.datasets.transforms.upper_left_crop(img: Image, height: int, width: int) -> Image

Crops an image from the upper left corner.

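This is useful because it preserves the camera intrinsics for an image: the pixel coordinates of the retained region are unchanged, so the principal point does not shift.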

Parameters:
  • img – The input image

  • height – The height of the crop

  • width – The width of the crop

Returns:

The cropped image
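The intrinsics claim can be made concrete with a small sketch. Under a pinhole camera model, a crop whose top-left corner sits at (left, top) moves the principal point to (cx - left, cy - top), while the focal lengths are unaffected. principal_point_after_crop is a hypothetical helper, not part of this module.

```python
def principal_point_after_crop(
    cx: float, cy: float, left: int, top: int
) -> tuple[float, float]:
    """Principal point expressed in the cropped image's pixel coordinates."""
    return cx - left, cy - top


cx, cy = 320.0, 240.0  # Illustrative principal point for a 640x480 image

# Upper-left crop: the crop origin is (0, 0), so nothing moves
upper_left = principal_point_after_crop(cx, cy, left=0, top=0)

# Centered 480x480 crop of the same image: the origin moves 80 px right
centered = principal_point_after_crop(cx, cy, left=80, top=0)
```

With an upper-left crop the original intrinsics can be reused as they are; with any other crop offset they would need to be adjusted.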

ml.tasks.datasets.transforms.normalize(t: Tensor, *, mean: tuple[float, float, float] = (0.48145466, 0.4578275, 0.40821073), std: tuple[float, float, float] = (0.26862954, 0.26130258, 0.27577711)) -> Tensor

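Normalizes an image tensor. The default mean and std are the normalization constants used by OpenAI CLIP preprocessing, not the commonly quoted ImageNet values of (0.485, 0.456, 0.406) and (0.229, 0.224, 0.225).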

This can be paired with denormalize() to convert an image tensor to a normalized tensor for processing by a model.

Parameters:
  • t – The input tensor

  • mean – The mean to subtract

  • std – The standard deviation to divide by

Returns:

The normalized tensor

ml.tasks.datasets.transforms.denormalize(t: Tensor, *, mean: tuple[float, float, float] = (0.48145466, 0.4578275, 0.40821073), std: tuple[float, float, float] = (0.26862954, 0.26130258, 0.27577711)) -> Tensor

Denormalizes a tensor.

This can be paired with normalize() to convert a normalized tensor back to the original image for viewing by humans.

Parameters:
  • t – The input tensor

  • mean – The mean that was subtracted during normalization (added back here)

  • std – The standard deviation that was divided by during normalization (multiplied back here)

Returns:

The denormalized tensor
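The pairing of normalize() and denormalize() can be sketched on a single RGB pixel using plain Python. This is a dependency-free illustration of the per-channel arithmetic only; the library functions operate on tensors, and normalize_pixel and denormalize_pixel are hypothetical names.

```python
# Default constants copied from the signatures above
MEAN = (0.48145466, 0.4578275, 0.40821073)
STD = (0.26862954, 0.26130258, 0.27577711)


def normalize_pixel(rgb, mean=MEAN, std=STD):
    """Subtracts the mean and divides by the std, per channel."""
    return tuple((c - m) / s for c, m, s in zip(rgb, mean, std))


def denormalize_pixel(rgb, mean=MEAN, std=STD):
    """Inverts normalize_pixel: multiplies by the std and adds the mean."""
    return tuple(c * s + m for c, m, s in zip(rgb, mean, std))


pixel = (0.5, 0.25, 1.0)
restored = denormalize_pixel(normalize_pixel(pixel))
```

Up to floating-point rounding, restored equals pixel, which is what allows a normalized model input to be converted back for viewing.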

ml.tasks.datasets.transforms.random_square_crop(img: Image) -> Image

Randomly crops an image to a square.

Parameters:

img – The input image

Returns:

The cropped image

ml.tasks.datasets.transforms.random_square_crop_multi(imgs: list[Image]) -> list[Image]

Randomly crops a list of images to the same size.

Parameters:

imgs – The list of images to crop

Returns:

The cropped images
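A minimal sketch of how a shared random crop might work: sample one square box and reuse it for every image, so that paired images (for example, a frame and its mask) stay aligned. This assumes all images have the same size and that the same box is applied to each, neither of which is confirmed above; sample_square_box is a hypothetical helper.

```python
import random


def sample_square_box(
    width: int, height: int, rng: random.Random
) -> tuple[int, int, int, int]:
    """Samples a random (left, top, right, bottom) square crop box."""
    side = min(width, height)
    # randint is inclusive on both ends, so the box always fits in the image
    left = rng.randint(0, width - side)
    top = rng.randint(0, height - side)
    return left, top, left + side, top + side


rng = random.Random(0)  # Seeded only to make the sketch reproducible
sizes = [(640, 480), (640, 480)]  # Assumed: every image has the same size

box = sample_square_box(*sizes[0], rng)
boxes = [box for _ in sizes]  # The same box is reused for each image
```

Sampling once and reusing the box is what distinguishes this from calling a random crop independently on each image.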

ml.tasks.datasets.transforms.make_size(img: Image, ref_size: tuple[int, int]) -> Image

Converts an image to a specific size, zero-padding any dimension that is too small.

Parameters:
  • img – The input image

  • ref_size – The reference size, as (width, height)

Returns:

The resized image
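One way to picture the zero-padding is sketched below: the image is scaled to fit inside the reference size while keeping its aspect ratio, and the leftover pixels along the other dimension are filled with zeros. The aspect-preserving resize is an assumption about the behavior, and fit_and_pad_dims is a hypothetical helper that only computes sizes.

```python
def fit_and_pad_dims(
    size: tuple[int, int], ref_size: tuple[int, int]
) -> tuple[tuple[int, int], tuple[int, int]]:
    """Returns the resized (width, height) and the (width, height) padding.

    Assumption: the resize preserves aspect ratio and padding fills the rest.
    """
    (w, h), (ref_w, ref_h) = size, ref_size
    # Compare aspect ratios h / w and ref_h / ref_w without float division
    if h * ref_w < ref_h * w:
        # Image is relatively wider than the reference: width fills, pad height
        new_w, new_h = ref_w, (h * ref_w) // w
    else:
        # Image is relatively taller (or equal): height fills, pad width
        new_w, new_h = (w * ref_h) // h, ref_h
    return (new_w, new_h), (ref_w - new_w, ref_h - new_h)


resized, padding = fit_and_pad_dims((640, 480), (256, 256))
```

Here a 640x480 image fits into 256x256 as 256x192, leaving 64 rows to be zero-padded.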

ml.tasks.datasets.transforms.make_same_size(img: Image, ref_img: Image) -> Image

Converts an image to the same size as a reference image.

Parameters:
  • img – The input image

  • ref_img – The reference image

Returns:

The input image resized to the same size as the reference image, zero-padding dimensions which are too small

class ml.tasks.datasets.transforms.SquareResizeCrop(size: int, interpolation: InterpolationMode = InterpolationMode.NEAREST)

Bases: Module

Resizes and crops an image to a square with the target shape.

Generally, SquareCrop followed by a resize should be preferred when using bilinear resize, because interpolating the smaller, already-cropped image is faster. However, a nearest-neighbor resize of the full image followed by a crop of the resized image can sometimes be faster.

Initializes the square resize crop.

Parameters:
  • size – The square height and width to resize to

  • interpolation – The interpolation type to use when resizing

forward(img: Image) -> Image

Resizes and crops the input image to a square, as described for the class.

Note

Call the module instance (for example, transform(img)) rather than calling forward() directly. Calling the instance runs any registered hooks, whereas calling forward() directly silently skips them.

class ml.tasks.datasets.transforms.UpperLeftCrop(height: int, width: int)

Bases: Module

Crops an image from the upper-left corner to preserve the camera intrinsics.

Initializes the upper left crop.

Parameters:
  • height – The max height of the cropped image

  • width – The max width of the cropped image

forward(img: Image) -> Image

Crops the input image from the upper-left corner, as described for the class.

Note

Call the module instance (for example, transform(img)) rather than calling forward() directly. Calling the instance runs any registered hooks, whereas calling forward() directly silently skips them.