ml.utils.video

Defines utilities for saving and loading video streams.

The main API for using this module is:

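from typing import Iterator

import numpy as np

from ml.utils.video import read_video, write_video

def frame_iterator() -> Iterator[np.ndarray]:
    # read_video yields each frame as a NumPy array.
    for frame in read_video("/path/to/video.mp4"):
        yield frame

# Re-encode the frames to a new file.
write_video(frame_iterator(), "/path/to/other/video.mp4")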

By default both functions use PyAV, which binds the FFmpeg libraries, so this should be reasonably quick.

ml.utils.video.ffmpeg_python_available() → bool

ml.utils.video.mpl_available() → bool

ml.utils.video.cv2_available() → bool
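These three helpers take no arguments and return a bool, which suggests they report whether the corresponding optional backend (ffmpeg-python, Matplotlib, OpenCV) can be imported. A minimal sketch of such a check, assuming it is import-based; module_available is a hypothetical name and not part of ml.utils.video:

```python
import importlib.util


def module_available(name: str) -> bool:
    """Returns True if a top-level module can be located without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


# Top-level import names of the three optional backends.
has_ffmpeg_python = module_available("ffmpeg")  # ffmpeg-python installs as "ffmpeg"
has_matplotlib = module_available("matplotlib")
has_opencv = module_available("cv2")
```

A check like this lets calling code pick a reader or writer that is actually installed before asking for it by name.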

class ml.utils.video.VideoProps(frame_width: int, frame_height: int, frame_count: int, fps: fractions.Fraction)

Bases: object

frame_width: int

frame_height: int

frame_count: int

fps: Fraction

classmethod from_file_av(fpath: str | Path) → VideoProps

classmethod from_file_opencv(fpath: str | Path) → VideoProps

classmethod from_file_ffmpeg(fpath: str | Path) → VideoProps
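Storing fps as a Fraction keeps rates such as 30000/1001 exact, which a float cannot do. The sketch below shows how the documented fields could be combined to get a clip duration. VideoPropsSketch is a hypothetical stand-in that only mirrors the four fields above, and duration_seconds is not part of the module:

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class VideoPropsSketch:
    """Hypothetical stand-in mirroring the documented VideoProps fields."""

    frame_width: int
    frame_height: int
    frame_count: int
    fps: Fraction


def duration_seconds(props: VideoPropsSketch) -> Fraction:
    # Dividing an int by a Fraction gives an exact Fraction, with no float rounding.
    return props.frame_count / props.fps


# NTSC-style rate of 30000/1001 frames per second (about 29.97).
props = VideoPropsSketch(1920, 1080, 300, Fraction(30000, 1001))
duration = duration_seconds(props)
```

With the real class, the same arithmetic should apply to the object returned by get_video_props or one of the from_file_* constructors.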

ml.utils.video.read_video_av(in_file: str | Path, *, target_dims: tuple[int | None, int | None] | None = None) → Iterator[ndarray]

Function that reads a video file to a stream of numpy arrays using PyAV.

Parameters:

in_file – The input video to read
target_dims – If not None, resize each frame to this size

Yields:

Frames from the video as numpy arrays with shape (H, W, C)

ml.utils.video.read_video_ffmpeg(in_file: str | Path, *, output_fmt: str = 'rgb24', channels: int = 3) → Iterator[ndarray]

Function that reads a video file to a stream of numpy arrays using FFMPEG.

Parameters:

in_file – The input video to read
output_fmt – The output image format
channels – Number of output channels for each video frame

Yields:

Frames from the video as numpy arrays with shape (H, W, C)
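With output_fmt='rgb24' and channels=3, each decoded frame is packed 8-bit data occupying height × width × 3 bytes. The sketch below shows how a raw byte stream could be cut into frame-sized chunks, assuming the decoder writes frames back to back. iter_raw_frames is a hypothetical helper, not the module's implementation:

```python
import io
from typing import BinaryIO, Iterator


def iter_raw_frames(
    stream: BinaryIO, width: int, height: int, channels: int = 3
) -> Iterator[bytes]:
    """Yields one frame's worth of packed 8-bit pixel data at a time."""
    frame_size = width * height * channels
    while True:
        chunk = stream.read(frame_size)
        if len(chunk) < frame_size:
            # End of stream; a trailing partial frame is dropped.
            return
        yield chunk


# Two synthetic 4x2 RGB frames: one all zeros, one all 255.
raw = bytes(4 * 2 * 3) + bytes([255]) * (4 * 2 * 3)
frames = list(iter_raw_frames(io.BytesIO(raw), width=4, height=2))
```

Each chunk could then be viewed as an (H, W, C) array, for example with numpy.frombuffer(chunk, dtype=numpy.uint8).reshape(height, width, channels).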

async ml.utils.video.read_video_with_timestamps_ffmpeg(in_file: str | Path, *, output_fmt: str = 'rgb24', channels: int = 3, target_dims: tuple[int | None, int | None] | None = None) → AsyncGenerator[tuple[numpy.ndarray, float], None]

Like read_video_ffmpeg but also returns timestamps.

Parameters:

in_file – The input video to read
output_fmt – The output image format
channels – Number of output channels for each video frame
target_dims – (width, height) dimensions for images being loaded, with None meaning that the aspect ratio should be kept the same

Yields:

Frames from the video as numpy arrays with shape (H, W, C), along with the frame timestamps
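The target_dims convention above, (width, height) with None meaning "keep the aspect ratio", can be made concrete with a small helper. This is a sketch under assumptions: resolve_target_dims is a hypothetical name, and rounding to the nearest pixel is a choice made here, not something the source specifies:

```python
from typing import Optional, Tuple


def resolve_target_dims(
    src_width: int,
    src_height: int,
    target_dims: Optional[Tuple[Optional[int], Optional[int]]],
) -> Tuple[int, int]:
    """Fills in missing target dimensions so the aspect ratio is preserved."""
    if target_dims is None:
        return src_width, src_height
    width, height = target_dims
    if width is None and height is None:
        return src_width, src_height
    if width is None:
        # Only the height was given: scale the width by the same factor.
        width = round(src_width * height / src_height)
    elif height is None:
        # Only the width was given: scale the height by the same factor.
        height = round(src_height * width / src_width)
    return width, height
```

For a 1920×1080 source, (640, None) resolves to (640, 360) and (None, 540) resolves to (960, 540); passing both values uses them as given.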

ml.utils.video.read_video_opencv(in_file: str | Path) → Iterator[ndarray]

Reads a video as a stream using OpenCV.

Parameters:

in_file – The input video to read

Yields:

Frames from the video as numpy arrays with shape (H, W, C)

ml.utils.video.write_video_opencv(itr: Iterator[ndarray | Tensor], out_file: str | Path, *, keep_resolution: bool = False, fps: int | Fraction = 30, codec: str = 'MP4V') → None

Function that writes a video from a stream of numpy arrays using OpenCV.

Parameters:

itr – The image iterator, yielding images with shape (H, W, C).
out_file – The path to the output file.
keep_resolution – If set, don’t change the image resolution, otherwise resize to a human-friendly resolution.
fps – Frames per second for the video.
codec – FourCC code specifying OpenCV video codec type. Examples are MPEG, MP4V, DIVX, AVC1, H264.
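A FourCC code is four ASCII characters packed into one 32-bit integer, first character in the lowest byte. The sketch below shows that packing in pure Python; fourcc is a hypothetical helper written for illustration, and its result should match what cv2.VideoWriter_fourcc(*"MP4V") computes:

```python
def fourcc(code: str) -> int:
    """Packs a four-character code into a little-endian 32-bit integer."""
    if len(code) != 4:
        raise ValueError(f"FourCC codes must have exactly 4 characters, got {code!r}")
    return int.from_bytes(code.encode("ascii"), "little")


mp4v = fourcc("MP4V")  # 'M' is the lowest byte, 'V' the highest
```

Whether a given code is usable still depends on the codecs available in the local OpenCV build.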

ml.utils.video.write_video_av(itr: Iterator[ndarray | Tensor], out_file: str | Path, *, keep_resolution: bool = False, fps: int | Fraction = 30, codec: str = 'libx264', input_fmt: str = 'rgb24', output_fmt: str = 'yuv420p') → None

Function that writes a video from a stream of numpy arrays using PyAV.

Parameters:

itr – The image iterator, yielding images with shape (H, W, C).
out_file – The path to the output file.
keep_resolution – If set, don’t change the image resolution, otherwise resize to a human-friendly resolution.
fps – Frames per second for the video.
codec – The video codec to use for the output video
input_fmt – The input pixel format
output_fmt – The output pixel format

ml.utils.video.write_video_ffmpeg(itr: Iterator[ndarray | Tensor], out_file: str | Path, *, keep_resolution: bool = False, fps: int | Fraction = 30, out_fps: int | Fraction = 30, vcodec: str = 'libx264', input_fmt: str = 'rgb24', output_fmt: str = 'yuv420p') → None

Function that writes a video from a stream of numpy arrays using FFMPEG.

Parameters:

itr – The image iterator, yielding images with shape (H, W, C).
out_file – The path to the output file.
keep_resolution – If set, don’t change the image resolution, otherwise resize to a human-friendly resolution.
fps – Frames per second for the video.
out_fps – Frames per second for the saved video.
vcodec – The video codec to use for the output video
input_fmt – The input image format
output_fmt – The output image format

ml.utils.video.write_video_matplotlib(itr: Iterator[ndarray | Tensor], out_file: str | Path, *, keep_resolution: bool = False, dpi: int = 50, fps: int | Fraction = 30, title: str = 'Video', comment: str | None = None, writer: str = 'ffmpeg') → None

Function that writes a video from a stream of input tensors.

Parameters:

itr – The image iterator, yielding images with shape (H, W, C).
out_file – The path to the output file.
keep_resolution – If set, don’t change the image resolution, otherwise resize to a human-friendly resolution.
dpi – Dots per inch for output image.
fps – Frames per second for the video.
title – Title for the video metadata.
comment – Comment for the video metadata.
writer – The Matplotlib video writer to use (if you use the default one, make sure you have ffmpeg installed on your system).

ml.utils.video.get_video_props(in_file: str | Path, *, reader: Literal['ffmpeg', 'av', 'opencv'] = 'av') → VideoProps

ml.utils.video.read_video(in_file: str | Path, *, prefetch_n: int = 1, reader: Literal['ffmpeg', 'av', 'opencv'] = 'av') → Iterator[ndarray]

Function that reads a video from a file to a stream of Numpy arrays.

Parameters:

in_file – The path to the input file.
prefetch_n – Number of chunks to prefetch.
reader – The video reader to use.

Yields:

The video frames as Numpy arrays.
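The prefetch_n parameter suggests that frames are decoded ahead of the consumer. How the module does this is not shown here; the sketch below illustrates one common approach, assuming a worker thread that fills a bounded queue. prefetch is a hypothetical helper, not the module's implementation:

```python
import queue
import threading
from typing import Iterator, TypeVar

T = TypeVar("T")


def prefetch(itr: Iterator[T], prefetch_n: int = 1) -> Iterator[T]:
    """Reads up to prefetch_n items ahead of the consumer on a worker thread."""
    buffer: queue.Queue = queue.Queue(maxsize=max(prefetch_n, 1))

    def worker() -> None:
        try:
            for item in itr:
                buffer.put(("item", item))  # Blocks while the buffer is full.
        except Exception as exc:
            buffer.put(("error", exc))  # Forward failures to the consumer.
        else:
            buffer.put(("done", None))

    # Daemon thread, so an abandoned iterator does not keep the process alive.
    threading.Thread(target=worker, daemon=True).start()
    while True:
        kind, value = buffer.get()
        if kind == "done":
            return
        if kind == "error":
            raise value
        yield value


items = list(prefetch(iter(range(5)), prefetch_n=2))
```

The bounded queue caps memory use: the worker decodes at most prefetch_n items beyond what the consumer has taken, then waits.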

ml.utils.video.write_video(itr: Iterator[ndarray | Tensor], out_file: str | Path, *, fps: int | Fraction = 30, keep_resolution: bool = False, writer: Literal['ffmpeg', 'matplotlib', 'av', 'opencv'] = 'av') → None

Function that writes a video from a stream of input tensors.

Parameters:

itr – The image iterator, yielding images with shape (H, W, C).
out_file – The path to the output file.
fps – Frames per second for the video.
keep_resolution – If set, don’t change the image resolution, otherwise resize to a human-friendly resolution.
writer – The video writer to use.

Raises:

ValueError – If the writer is invalid.
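Because an unknown writer name raises ValueError, it can help to validate a user-supplied name before starting a long job. The sketch below checks a string against the four documented choices. The Writer alias and check_writer are hypothetical names introduced here, and the error message is illustrative rather than the module's own:

```python
from typing import Literal, get_args

# The writer names listed in the write_video signature.
Writer = Literal["ffmpeg", "matplotlib", "av", "opencv"]


def check_writer(writer: str) -> str:
    """Returns the name unchanged if it is a documented writer, else raises."""
    choices = get_args(Writer)
    if writer not in choices:
        raise ValueError(f"Invalid writer {writer!r}; expected one of {choices}")
    return writer


default_writer = check_writer("av")
```

Deriving the choices from the Literal type with get_args keeps the runtime check and the type annotation in one place.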