ml.loggers.multi
Defines a general logger for munging logged values to an expected format.
This logger handles munging, rate limiting, and multiplexing logged values to each of the implemented child loggers. It is the logging interface that is exposed to the task and model.
- ml.loggers.multi.standardize_text(text: str, max_line_length: int | None = None, remove_non_ascii: bool = False) list[str] [source]
Standardizes a text string to a list of lines.
- Parameters:
text – The text to standardize
max_line_length – If set, truncate lines to this length
remove_non_ascii – Remove non-ASCII characters if present
- Returns:
The standardized text lines
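The parameters above suggest a simple pipeline: optionally drop non-ASCII characters, split into lines, then truncate each line. The following is a minimal sketch of that behaviour in plain Python, not the library's implementation; the name `standardize_text_sketch` and the whitespace stripping are assumptions made for illustration.

```python
from typing import List, Optional


def standardize_text_sketch(
    text: str,
    max_line_length: Optional[int] = None,
    remove_non_ascii: bool = False,
) -> List[str]:
    """Hypothetical stand-in that mirrors the documented parameters."""
    if remove_non_ascii:
        # Keep only code points in the 7-bit ASCII range.
        text = "".join(ch for ch in text if ord(ch) < 128)
    # Assumption: surrounding whitespace is stripped from each line.
    lines = [line.strip() for line in text.strip().splitlines()]
    if max_line_length is not None:
        # Truncate, as the parameter description states.
        lines = [line[:max_line_length] for line in lines]
    return lines


lines = standardize_text_sketch(
    "héllo world\nsecond line", max_line_length=8, remove_non_ascii=True
)
print(lines)
```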
- ml.loggers.multi.get_audio_channel(audio: Tensor, channel_select_mode: Literal['first', 'last', 'mean']) Tensor [source]
For stereo audio, selects a single channel.
- Parameters:
audio – The audio tensor to select a channel from, with shape (C, L)
channel_select_mode – The channel selection mode
- Returns:
The selected audio channel
- Raises:
ValueError – If the audio shape is invalid
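The three selection modes can be illustrated without any tensor library. The sketch below uses nested lists of shape (C, L) in place of a `Tensor`; the function name `select_channel` and the flat-list return value are assumptions, and the real function may return a different shape.

```python
from typing import List, Literal


def select_channel(
    audio: List[List[float]],
    mode: Literal["first", "last", "mean"],
) -> List[float]:
    """Hypothetical list-based version of channel selection for (C, L) audio."""
    if not audio or any(len(ch) != len(audio[0]) for ch in audio):
        raise ValueError("Expected audio with shape (C, L)")
    if mode == "first":
        return list(audio[0])
    if mode == "last":
        return list(audio[-1])
    if mode == "mean":
        # Average across channels, sample by sample.
        return [sum(samples) / len(audio) for samples in zip(*audio)]
    raise ValueError(f"Unknown channel selection mode: {mode}")


stereo = [[0.0, 1.0, 2.0], [2.0, 3.0, 4.0]]
print(select_channel(stereo, "mean"))
```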
- ml.loggers.multi.make_human_viewable_resolution(image: Tensor, interpolation: InterpolationMode = InterpolationMode.BILINEAR, trg_res: tuple[int, int] = (250, 250)) Tensor [source]
Resizes image to human-viewable resolution.
- Parameters:
image – The image to resize, with shape (C, H, W)
interpolation – Interpolation mode to use for image resizing
trg_res – The target image resolution; the image will be reshaped to have approximately the same area as an image with this resolution
- Returns:
The resized image
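The phrase "approximately the same area" can be made concrete with a small calculation: scale both sides by the square root of the area ratio so the aspect ratio is preserved. This is a sketch of one way to compute the output size, assuming simple rounding; `human_viewable_size` is a hypothetical helper, and the library may round or clamp differently.

```python
import math
from typing import Tuple


def human_viewable_size(
    height: int,
    width: int,
    trg_res: Tuple[int, int] = (250, 250),
) -> Tuple[int, int]:
    """Returns an (H, W) with roughly the target area and the input aspect ratio."""
    trg_h, trg_w = trg_res
    # Scaling both sides by sqrt(area ratio) scales the area by the ratio itself.
    scale = math.sqrt((trg_h * trg_w) / (height * width))
    return max(1, round(height * scale)), max(1, round(width * scale))


print(human_viewable_size(1000, 500))
```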
- ml.loggers.multi.standardize_image(image: Tensor, *, log_key: str | None = None, normalize: bool = True, keep_resolution: bool = False) Tensor [source]
Converts an arbitrary image to shape (C, H, W).
- Parameters:
image – The image tensor to log
log_key – An optional logging key to use in the exception message
normalize – Normalize images to (0, 1)
keep_resolution – If set, preserve original image resolution, otherwise change image resolution to human-viewable
- Returns:
The normalized image, with shape (C, H, W)
- Raises:
ValueError – If the image shape is invalid
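To make the shape handling concrete, the sketch below maps an input shape to its (C, H, W) equivalent. It accepts the layouts that `log_image` documents, namely (C, H, W), (H, W, C) and (H, W) with 1 or 3 channels. The helper `to_chw_shape` and its channel-detection heuristic are assumptions, not the library's logic.

```python
from typing import Optional, Tuple


def to_chw_shape(
    shape: Tuple[int, ...],
    log_key: Optional[str] = None,
) -> Tuple[int, int, int]:
    """Hypothetical helper: returns the (C, H, W) shape for a supported layout."""
    if len(shape) == 2:
        # (H, W) grayscale: add a leading channel dimension.
        h, w = shape
        return (1, h, w)
    if len(shape) == 3:
        if shape[0] in (1, 3):
            # Already channels-first.
            return (shape[0], shape[1], shape[2])
        if shape[2] in (1, 3):
            # Channels-last: move the channel dimension to the front.
            return (shape[2], shape[0], shape[1])
    raise ValueError(f"Invalid image shape {shape} for key {log_key!r}")


print(to_chw_shape((32, 48, 3)))
```

Note that a shape such as (3, 3, 3) is ambiguous; this sketch treats it as channels-first.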
- ml.loggers.multi.standardize_images(images: Tensor, labels: LabelT, *, max_images: int | None = None, log_key: str | None = None, normalize: bool = True, keep_resolution: bool = False) tuple[torch.Tensor, LabelT] [source]
Converts an arbitrary set of images to shape (B, C, H, W).
- Parameters:
images – The image tensor to log
labels – The labels for the images
max_images – Maximum number of images to select
log_key – An optional logging key to use in the exception message
normalize – Normalize images to (0, 1)
keep_resolution – If set, preserve original image resolution, otherwise change image resolution to human-viewable
- Returns:
The normalized images, with shape (B, C, H, W), and their labels
- Raises:
ValueError – If the image shape is invalid
- ml.loggers.multi.audio_warning_ticker() IntervalTicker [source]
- ml.loggers.multi.standardize_audio(audio: Tensor, *, log_key: str | None = None) Tensor [source]
Converts an arbitrary audio tensor to shape (C, T).
- Parameters:
audio – The audio tensor to log
log_key – An optional logging key to use in the exception message
- Returns:
The standardized audio tensor, with shape (C, T)
- Raises:
ValueError – If the audio shape is invalid
- ml.loggers.multi.standardize_audios(audios: Tensor, *, log_key: str | None = None, max_audios: int | None = None) Tensor [source]
Converts an arbitrary audio tensor to shape (B, C, T).
- Parameters:
audios – The audio tensor to log
log_key – An optional logging key to use in the exception message
max_audios – Maximum number of audios to select
- Returns:
The standardized audio tensor, with shape (B, C, T)
- Raises:
ValueError – If the audio shape is invalid
- ml.loggers.multi.separate_with_padding(audio: Tensor, sep_frames: int) Tensor [source]
Converts a (B, C, T) waveform to (C, B * (T + sep_frames) - sep_frames).
- Parameters:
audio – The audio tensor to separate
sep_frames – Number of frames to insert between each audio tensor
- Returns:
The separated audio tensor
- Raises:
ValueError – If the audio shape is invalid
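The output length follows from joining B clips of T frames with a gap between each adjacent pair: B * T frames of audio plus (B - 1) * sep_frames of padding, which equals B * (T + sep_frames) - sep_frames. The sketch below reproduces that layout with nested lists; filling the gaps with zeros is an assumption, and `separate_with_padding_sketch` is a hypothetical name.

```python
from typing import List


def separate_with_padding_sketch(
    audio: List[List[List[float]]],
    sep_frames: int,
) -> List[List[float]]:
    """Joins (B, C, T) clips along time, with sep_frames of padding between clips."""
    if not audio or not audio[0]:
        raise ValueError("Expected audio with shape (B, C, T)")
    num_channels = len(audio[0])
    out: List[List[float]] = [[] for _ in range(num_channels)]
    for i, clip in enumerate(audio):
        if len(clip) != num_channels:
            raise ValueError("All clips must have the same number of channels")
        for c, samples in enumerate(clip):
            if i > 0:
                # Assumption: the gap is filled with silence (zeros).
                out[c].extend([0.0] * sep_frames)
            out[c].extend(samples)
    return out


batch = [[[1.0] * 4, [2.0] * 4] for _ in range(3)]  # B=3, C=2, T=4
joined = separate_with_padding_sketch(batch, sep_frames=2)
print(len(joined), len(joined[0]))
```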
- ml.loggers.multi.standardize_video(video: Tensor, *, log_key: str | None = None, normalize: bool = True) Tensor [source]
Converts an arbitrary video to shape (T, C, H, W).
- Parameters:
video – The video tensor to log
log_key – An optional logging key to use in the exception message
normalize – Normalize the video to (0, 1)
- Returns:
The normalized video, with shape (T, C, H, W)
- Raises:
ValueError – If the video shape is invalid
- ml.loggers.multi.standardize_videos(videos: Tensor, *, max_videos: int | None = None, log_key: str | None = None, normalize: bool = True) Tensor [source]
Converts an arbitrary set of videos to shape (B, T, C, H, W).
- Parameters:
videos – The videos tensor to log
max_videos – Maximum number of videos to select
log_key – An optional logging key to use in the exception message
normalize – Normalize the videos to (0, 1)
- Returns:
The normalized videos, with shape (B, T, C, H, W)
- Raises:
ValueError – If the video shape is invalid
- ml.loggers.multi.image_with_text(image: Tensor, text: list[str], max_num_lines: int | None = None, line_spacing: int = 4, centered: bool = True) Tensor [source]
Adds a text label to an image.
- Parameters:
image – The image to label, with shape (C, H, W)
text – The text label for the image
max_num_lines – The number of lines of spacing to add to the bottom of the image
line_spacing – The spacing between adjacent lines
centered – If set, center the text labels, otherwise align to the left
- Returns:
The image with a text label
- ml.loggers.multi.normalize_video_fps(video: Tensor | list[torch.Tensor], fps: int | None, length: float | None, stack_dim: int = 0, target_fps: int = 12) Tensor [source]
Normalizes a video to have a particular FPS.
- Parameters:
video – The video to normalize, with shape (T, C, H, W)
fps – The desired frames per second
length – The desired video length, in seconds, at the target FPS
target_fps – The target frames per second for the logger
stack_dim – Which dimension to stack along, for lists
- Returns:
The normalized video
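One common way to change a clip's frame rate is to pick evenly spaced frame indices from the input. The sketch below shows that idea only; it assumes the input frame rate is known and called `source_fps`, and `resample_frame_indices` is a hypothetical helper rather than the library's algorithm.

```python
from typing import List


def resample_frame_indices(
    num_frames: int,
    source_fps: float,
    target_fps: int = 12,
) -> List[int]:
    """Picks evenly spaced frame indices so playback at target_fps keeps the duration."""
    num_out = max(1, round(num_frames * target_fps / source_fps))
    if num_out == 1:
        return [0]
    # Spread the output frames across the full input range, endpoints included.
    step = (num_frames - 1) / (num_out - 1)
    return [round(i * step) for i in range(num_out)]


# A 2-second clip at 24 FPS resampled to 12 FPS keeps 24 of its 48 frames.
indices = resample_frame_indices(48, source_fps=24)
print(len(indices), indices[0], indices[-1])
```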
- ml.loggers.multi.standardize_point_cloud(value: Tensor, max_points: int, *, log_key: str | None) Tensor [source]
- ml.loggers.multi.make_square_image_or_video(images_or_videos: Tensor, *, sep: int = 0, squareness_weight: float = 1.0, emptiness_weight: float = 1.0) Tensor [source]
Makes a square image by concatenating all the child images.
This does a simple ternary search to minimize a squareness penalty and an emptiness penalty (i.e., the resulting image should be mostly filled in and also approximately square).
- Parameters:
images_or_videos – The images tensor, with shape (B, C, H, W) or (B, T, C, H, W)
sep – Some optional padding around the images
squareness_weight – Weight for number of non-square pixels in penalty
emptiness_weight – Weight for number of empty pixels in penalty
- Returns:
The square image, with shape (C, H’, W’) or (T, C, H’, W’)
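The trade-off between the two penalties can be sketched by scoring each candidate grid layout. For simplicity the example below uses a brute-force search over the number of columns in place of the ternary search mentioned above. The penalty definitions (pixels outside the largest inscribed square, and pixels in unused grid cells) and the name `choose_grid` are assumptions.

```python
import math
from typing import Optional, Tuple


def choose_grid(
    num_images: int,
    height: int,
    width: int,
    squareness_weight: float = 1.0,
    emptiness_weight: float = 1.0,
) -> Tuple[int, int]:
    """Returns the (rows, cols) layout with the lowest combined penalty."""
    best: Optional[Tuple[float, int, int]] = None
    for cols in range(1, num_images + 1):
        rows = math.ceil(num_images / cols)
        grid_h, grid_w = rows * height, cols * width
        # Pixels that fall outside the largest square fitting in the grid.
        non_square = grid_h * grid_w - min(grid_h, grid_w) ** 2
        # Pixels belonging to grid cells with no image in them.
        empty = (rows * cols - num_images) * height * width
        penalty = squareness_weight * non_square + emptiness_weight * empty
        if best is None or penalty < best[0]:
            best = (penalty, rows, cols)
    assert best is not None, "num_images must be at least 1"
    return best[1], best[2]


print(choose_grid(9, 32, 32))
print(choose_grid(4, 10, 40))
```

Wide images favour a taller grid: four 10 x 40 images stacked in a single column already form a 40 x 40 square.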
- class ml.loggers.multi.MultiLogger(default_namespace: str = 'value')[source]
Bases: object
Defines an intermediate container which holds values to log somewhere else.
- log_scalar(key: str, value: Callable[[], int | float | Tensor] | int | float | Tensor, *, namespace: str | None = None) None [source]
Logs a scalar value.
- Parameters:
key – The key being logged
value – The scalar value being logged
namespace – An optional logging namespace
- log_string(key: str, value: Callable[[], str] | str, *, namespace: str | None = None) None [source]
Logs a string value.
- Parameters:
key – The key being logged
value – The string value being logged
namespace – An optional logging namespace
- log_image(key: str, value: Callable[[], Tensor] | Tensor, *, namespace: str | None = None, keep_resolution: bool = False) None [source]
Logs an image.
- Parameters:
key – The key being logged
value – The image being logged; can be (C, H, W), (H, W, C) or (H, W) as an RGB (3 channel) or grayscale (1 channel) image
namespace – An optional logging namespace
keep_resolution – If set, keep the image resolution the same, otherwise upscale or downscale the image to a standard resolution
- log_labeled_image(key: str, value: Callable[[], tuple[torch.Tensor, str]] | tuple[torch.Tensor, str], *, namespace: str | None = None, max_line_length: int | None = None, keep_resolution: bool = False, centered: bool = True) None [source]
Logs an image with a label.
- Parameters:
key – The key being logged
value – The image and label being logged; the image can be (C, H, W), (H, W, C) or (H, W) as an RGB (3 channel) or grayscale (1 channel) image
namespace – An optional logging namespace
max_line_length – Labels longer than this length are wrapped around
keep_resolution – If set, keep the image resolution the same, otherwise upscale or downscale the image to a standard resolution
centered – If set, center the text labels, otherwise align to the left
- log_images(key: str, value: Callable[[], Tensor] | Tensor, *, namespace: str | None = None, keep_resolution: bool = False, max_images: int | None = None, sep: int = 0) None [source]
Logs a set of images.
The images are tiled to be nearly-square.
- Parameters:
key – The key being logged
value – The images being logged; can be (B, C, H, W), (B, H, W, C) or (B, H, W) as RGB (3 channel) or grayscale (1 channel) images
namespace – An optional logging namespace
keep_resolution – If set, keep the image resolution the same, otherwise upscale or downscale the image to a standard resolution
max_images – The maximum number of images to show; extra images are clipped
sep – An optional separation amount between adjacent images
- log_labeled_images(key: str, value: Callable[[], tuple[torch.Tensor, Sequence[str]]] | tuple[torch.Tensor, Sequence[str]], *, namespace: str | None = None, max_line_length: int | None = None, keep_resolution: bool = False, max_images: int | None = None, sep: int = 0, centered: bool = True) None [source]
Logs a set of images with labels.
The images are tiled to be nearly-square.
- Parameters:
key – The key being logged
value – The images and labels being logged; images can be (B, C, H, W), (B, H, W, C) or (B, H, W) as an RGB (3 channel) or grayscale (1 channel) image, with exactly B labels
namespace – An optional logging namespace
max_line_length – Labels longer than this length are wrapped around
keep_resolution – If set, keep the image resolution the same, otherwise upscale or downscale the image to a standard resolution
max_images – The maximum number of images to show; extra images are clipped
sep – An optional separation amount between adjacent images
centered – If set, center the text labels, otherwise align to the left
- log_audio(key: str, value: Callable[[], Tensor] | Tensor, *, namespace: str | None = None, sample_rate: int = 44100, log_spec: bool = True, n_fft_ms: float = 32.0, hop_length_ms: float | None = None, channel_select_mode: Literal['first', 'last', 'mean'] = 'first', keep_resolution: bool = False) None [source]
Logs an audio clip.
- Parameters:
key – The key being logged
value – The audio clip being logged; can be (C, T) or (T) as a mono (1 channel) or stereo (2 channel) audio clip
namespace – An optional logging namespace
sample_rate – The sample rate of the audio clip
log_spec – If set, also log the spectrogram
n_fft_ms – FFT size, in milliseconds
hop_length_ms – The FFT hop length, in milliseconds
channel_select_mode – How to select the channel if the audio is stereo; can be “first”, “last”, or “mean”; this is only used for the spectrogram
keep_resolution – If set, keep the resolution of the spectrogram; otherwise, make human-viewable
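Because `n_fft_ms` and `hop_length_ms` are given in milliseconds, they have to be converted to sample counts using `sample_rate` before a short-time Fourier transform can be computed. The sketch below shows that arithmetic. The rounding and the fallback hop of `n_fft // 4` (borrowed from the default in `torch.stft`) are assumptions about what happens when `hop_length_ms` is `None`, and `stft_sizes` is a hypothetical helper.

```python
from typing import Optional, Tuple


def stft_sizes(
    sample_rate: int = 44100,
    n_fft_ms: float = 32.0,
    hop_length_ms: Optional[float] = None,
) -> Tuple[int, int]:
    """Converts millisecond window settings to (n_fft, hop_length) in samples."""
    n_fft = round(sample_rate * n_fft_ms / 1000)
    if hop_length_ms is None:
        # Assumption: fall back to a quarter of the window.
        hop_length = n_fft // 4
    else:
        hop_length = round(sample_rate * hop_length_ms / 1000)
    return n_fft, hop_length


print(stft_sizes())  # defaults from the signature above
print(stft_sizes(sample_rate=16000, hop_length_ms=10.0))
```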
- log_audios(key: str, value: Callable[[], Tensor] | Tensor, *, namespace: str | None = None, sep_ms: float = 0.0, max_audios: int | None = None, sample_rate: int = 44100, log_spec: bool = True, n_fft_ms: float = 32.0, hop_length_ms: float | None = None, channel_select_mode: Literal['first', 'last', 'mean'] = 'first', spec_sep: int = 0, keep_resolution: bool = False) None [source]
Logs multiple audio clips.
- Parameters:
key – The key being logged
value – The audio clips being logged; can be (B, C, T) or (B, T) as mono (1 channel) or stereo (2 channel) audio clips, with exactly B clips
namespace – An optional logging namespace
sep_ms – An optional separation amount between adjacent audio clips
max_audios – An optional maximum number of audio clips to log
sample_rate – The sample rate of the audio clip
log_spec – If set, also log the spectrogram
n_fft_ms – FFT size, in milliseconds
hop_length_ms – The FFT hop length, in milliseconds
channel_select_mode – How to select the channel if the audio is stereo; can be “first”, “last”, or “mean”; this is only used for the spectrogram
spec_sep – An optional separation amount between adjacent spectrograms
keep_resolution – If set, keep the resolution of the spectrogram; otherwise, make human-viewable
- log_spectrogram(key: str, value: Callable[[], Tensor] | Tensor, *, namespace: str | None = None, sample_rate: int = 44100, n_fft_ms: float = 32.0, hop_length_ms: float | None = None, channel_select_mode: Literal['first', 'last', 'mean'] = 'first', keep_resolution: bool = False) None [source]
Logs the spectrogram of an audio clip.
- Parameters:
key – The key being logged
value – The audio clip being logged; can be (C, T) or (T) as a mono (1 channel) or stereo (2 channel) audio clip
namespace – An optional logging namespace
sample_rate – The sample rate of the audio clip
n_fft_ms – FFT size, in milliseconds
hop_length_ms – The FFT hop length, in milliseconds
channel_select_mode – How to select the channel if the audio is stereo; can be “first”, “last”, or “mean”; this is only used for the spectrogram
keep_resolution – If set, keep the resolution of the spectrogram; otherwise, make human-viewable
- log_spectrograms(key: str, value: Callable[[], Tensor] | Tensor, *, namespace: str | None = None, max_audios: int | None = None, sample_rate: int = 44100, n_fft_ms: float = 32.0, hop_length_ms: float | None = None, channel_select_mode: Literal['first', 'last', 'mean'] = 'first', spec_sep: int = 0, keep_resolution: bool = False) None [source]
Logs spectrograms of audio clips.
- Parameters:
key – The key being logged
value – The audio clips being logged; can be (B, C, T) or (B, T) as mono (1 channel) or stereo (2 channel) audio clips, with exactly B clips
namespace – An optional logging namespace
max_audios – An optional maximum number of audio clips to log
sample_rate – The sample rate of the audio clip
n_fft_ms – FFT size, in milliseconds
hop_length_ms – The FFT hop length, in milliseconds
channel_select_mode – How to select the channel if the audio is stereo; can be “first”, “last”, or “mean”; this is only used for the spectrogram
spec_sep – An optional separation amount between adjacent spectrograms
keep_resolution – If set, keep the resolution of the spectrogram; otherwise, make human-viewable
- log_video(key: str, value: Callable[[], Tensor] | Tensor, *, namespace: str | None = None, fps: int | None = None, length: float | None = None) None [source]
Logs a video.
- Parameters:
key – The key being logged
value – The video being logged; the video can be (T, C, H, W), (T, H, W, C) or (T, H, W) as an RGB (3 channel) or grayscale (1 channel) video
namespace – An optional logging namespace
fps – The video frames per second
length – The desired video length, in seconds, at the target FPS
- log_videos(key: str, value: Callable[[], Tensor | list[torch.Tensor]] | Tensor | list[torch.Tensor], *, namespace: str | None = None, max_videos: int | None = None, sep: int = 0, fps: int | None = None, length: int | None = None) None [source]
Logs a set of videos.
- Parameters:
key – The key being logged
value – The videos being logged; the videos can be (B, T, C, H, W), (B, T, H, W, C) or (B, T, H, W) as RGB (3 channel) or grayscale (1 channel) videos
namespace – An optional logging namespace
max_videos – The maximum number of videos to show; extra videos are clipped
sep – An optional separation amount between adjacent videos
fps – The video frames per second
length – The desired video length, in seconds, at the target FPS
- log_histogram(key: str, value: Callable[[], Tensor] | Tensor, *, namespace: str | None = None) None [source]
Logs a histogram.
- Parameters:
key – The key being logged
value – The values to create a histogram from, with arbitrary shape
namespace – An optional logging namespace
- log_point_cloud(key: str, value: Callable[[], Tensor] | Tensor, *, namespace: str | None = None, max_points: int = 1000) None [source]
Logs a point cloud.
- Parameters:
key – The key being logged
value – The point cloud values, with shape (N, 3) or (B, …, 3); can pass multiple batches in order to show multiple point clouds
namespace – An optional logging namespace
max_points – An optional maximum number of points in the point cloud
- write_dict(loggers: list[ml.loggers.base.BaseLogger], values: dict[str, dict[str, Callable[[], LogT]]], state: State, func: Callable[[BaseLogger], Callable[[str, Callable[[], LogT], State, str], None]]) None [source]
- write(loggers: list[ml.loggers.base.BaseLogger], state: State) None [source]
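The signatures above suggest a two-phase design: the `log_*` methods store values (or zero-argument callables that produce them) under a namespace and key, and `write` later passes each stored callable to every child logger. The sketch below illustrates that pattern for scalars only. `TinyMultiLogger` and `ListLogger` are hypothetical classes, and a plain integer `step` stands in for the `State` object.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple, Union

Scalar = Union[int, float]


class ListLogger:
    """Hypothetical child logger that records what it receives."""

    def __init__(self) -> None:
        self.records: List[Tuple[str, str, int, Scalar]] = []

    def log_scalar(self, key: str, value: Callable[[], Scalar], step: int, namespace: str) -> None:
        # The callable is only evaluated here, when the value is consumed.
        self.records.append((namespace, key, step, value()))


class TinyMultiLogger:
    """Hypothetical container holding namespace -> key -> lazy value."""

    def __init__(self, default_namespace: str = "value") -> None:
        self.default_namespace = default_namespace
        self.scalars: Dict[str, Dict[str, Callable[[], Scalar]]] = defaultdict(dict)

    def log_scalar(
        self,
        key: str,
        value: Union[Callable[[], Scalar], Scalar],
        *,
        namespace: Optional[str] = None,
    ) -> None:
        ns = self.default_namespace if namespace is None else namespace
        # Wrap plain values so everything stored is a zero-argument callable.
        self.scalars[ns][key] = value if callable(value) else (lambda v=value: v)

    def write(self, loggers: List[ListLogger], step: int) -> None:
        # Multiplex every stored value to every child logger, then reset.
        for ns, values in self.scalars.items():
            for key, value_fn in values.items():
                for logger in loggers:
                    logger.log_scalar(key, value_fn, step, ns)
        self.scalars.clear()


multi, sink = TinyMultiLogger(), ListLogger()
multi.log_scalar("loss", 0.5)
multi.log_scalar("lr", lambda: 1e-3, namespace="optim")
multi.write([sink], step=10)
print(sink.records)
```

Deferring evaluation this way means an expensive value is never computed unless a child logger actually consumes it.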