ml.utils.video

Defines utilities for saving and loading video streams.

The main API for using this module is:

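from typing import Iterator

import numpy as np

from ml.utils.video import read_video, write_video

def frame_iterator() -> Iterator[np.ndarray]:
    # read_video yields one numpy array per frame.
    for frame in read_video("/path/to/video.mp4"):
        yield frame

# write_video consumes the iterator lazily and encodes the frames to the output path.
write_video(frame_iterator(), "/path/to/other/video.mp4")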

Both functions default to the PyAV backend ('av'), which binds to the FFmpeg libraries, so reading and writing should be reasonably quick.

ml.utils.video.ffmpeg_python_available() -> bool
ml.utils.video.mpl_available() -> bool
ml.utils.video.cv2_available() -> bool
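
These three checks report whether an optional backend can be imported. One way such a check could work, and how a caller might use the results to pick a backend, is sketched below. `module_available` and `pick_backend` are hypothetical helpers, not part of `ml.utils.video`, and the library's own checks may be implemented differently.

```python
import importlib.util
from typing import Mapping, Sequence


def module_available(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


def pick_backend(preferred: Sequence[str], available: Mapping[str, bool]) -> str:
    """Return the first backend in `preferred` that is marked as available."""
    for backend in preferred:
        if available.get(backend, False):
            return backend
    raise RuntimeError(f"No video backend available among {list(preferred)}")


# The module names are the usual import names of each package (assumed here).
available = {
    "av": module_available("av"),
    "opencv": module_available("cv2"),
    "ffmpeg": module_available("ffmpeg"),
}
```

With the real module, the dictionary values would come from `ffmpeg_python_available()`, `cv2_available()` and similar, and the chosen name could be passed as `reader=` to `read_video`.
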
class ml.utils.video.VideoProps(frame_width: int, frame_height: int, frame_count: int, fps: fractions.Fraction)

Bases: object

frame_width: int
frame_height: int
frame_count: int
fps: Fraction
classmethod from_file_av(fpath: str | Path) -> VideoProps
classmethod from_file_opencv(fpath: str | Path) -> VideoProps
classmethod from_file_ffmpeg(fpath: str | Path) -> VideoProps
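
Storing `fps` as a `Fraction` keeps rates such as NTSC's 30000/1001 exact. The sketch below uses a stand-in dataclass with the same four fields to show the arithmetic this enables; `VideoPropsSketch` and its `duration_seconds` property are hypothetical and are not part of the documented class.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class VideoPropsSketch:
    """Stand-in with the same four fields as the documented VideoProps."""

    frame_width: int
    frame_height: int
    frame_count: int
    fps: Fraction

    @property
    def duration_seconds(self) -> Fraction:
        # Hypothetical convenience property; exact because fps is a Fraction.
        return self.frame_count / self.fps


props = VideoPropsSketch(1920, 1080, 300, Fraction(30000, 1001))
print(props.duration_seconds)         # 1001/100
print(float(props.duration_seconds))  # 10.01
```
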
ml.utils.video.read_video_av(in_file: str | Path, *, target_dims: tuple[int | None, int | None] | None = None) -> Iterator[ndarray]

Function that reads a video file to a stream of numpy arrays using PyAV.

Parameters:
  • in_file – The input video to read

  • target_dims – If not None, resize each frame to this size

Yields:

Frames from the video as numpy arrays with shape (H, W, C)
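
The `target_dims` type allows either element to be `None`. The timestamped FFmpeg reader in this module documents the tuple as (width, height), with `None` meaning the aspect ratio is preserved. Assuming the same convention applies here, the final frame size could be resolved roughly as follows. `resolve_target_dims` is a hypothetical helper, and the rounding rule is an assumption.

```python
from typing import Optional, Tuple

Dims = Tuple[Optional[int], Optional[int]]


def resolve_target_dims(src_w: int, src_h: int, target_dims: Optional[Dims]) -> Tuple[int, int]:
    """Fill in a missing width or height so the source aspect ratio is kept."""
    if target_dims is None:
        return src_w, src_h
    width, height = target_dims
    if width is None and height is None:
        return src_w, src_h
    if width is None:
        # Only the height was given: scale the width by the same factor.
        width = round(src_w * height / src_h)
    elif height is None:
        # Only the width was given: scale the height by the same factor.
        height = round(src_h * width / src_w)
    return width, height


print(resolve_target_dims(1920, 1080, (640, None)))  # (640, 360)
```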

ml.utils.video.read_video_ffmpeg(in_file: str | Path, *, output_fmt: str = 'rgb24', channels: int = 3) -> Iterator[ndarray]

Function that reads a video file to a stream of numpy arrays using FFmpeg.

Parameters:
  • in_file – The input video to read

  • output_fmt – The output image format

  • channels – Number of output channels for each video frame

Yields:

Frames from the video as numpy arrays with shape (H, W, C)
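
`output_fmt` and `channels` have to agree: `rgb24` packs three bytes per pixel, so it pairs with `channels=3`, while a single-channel format such as `gray` would need `channels=1`. Assuming the reader consumes raw video bytes from an FFmpeg pipe (an assumption, since the implementation is not shown here), each frame occupies `width * height * channels` bytes. The hypothetical helper below splits such a buffer into per-frame chunks.

```python
from typing import Iterator


def iter_raw_frames(buf: bytes, width: int, height: int, channels: int = 3) -> Iterator[bytes]:
    """Yield consecutive frame-sized chunks from a raw video byte buffer."""
    frame_size = width * height * channels
    if frame_size <= 0:
        raise ValueError("width, height and channels must all be positive")
    # A trailing partial frame (fewer than frame_size bytes) is dropped.
    for start in range(0, len(buf) - frame_size + 1, frame_size):
        yield buf[start:start + frame_size]


# Two 2x2 RGB frames: 2 * 2 * 3 = 12 bytes each.
frames = list(iter_raw_frames(bytes(range(24)), width=2, height=2, channels=3))
print(len(frames))  # 2
```

With numpy, each chunk could then be viewed as an (H, W, C) array via `np.frombuffer(chunk, np.uint8).reshape(height, width, channels)`.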

async ml.utils.video.read_video_with_timestamps_ffmpeg(in_file: str | Path, *, output_fmt: str = 'rgb24', channels: int = 3, target_dims: tuple[int | None, int | None] | None = None) -> AsyncGenerator[tuple[numpy.ndarray, float], None]

Like read_video_ffmpeg but also returns timestamps.

Parameters:
  • in_file – The input video to read

  • output_fmt – The output image format

  • channels – Number of output channels for each video frame

  • target_dims – (width, height) dimensions for images being loaded, with None meaning that the aspect ratio should be kept the same

Yields:

Frames from the video as numpy arrays with shape (H, W, C), along with the frame timestamps
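
Because this reader is an async generator, it has to be consumed with `async for` inside a coroutine rather than with a plain `for` loop. The sketch below shows the consumption pattern using a stand-in generator; `fake_timestamped_reader` is hypothetical and only mimics the documented (frame, timestamp) shape of each yielded item.

```python
import asyncio
from typing import AsyncGenerator, List, Tuple


async def fake_timestamped_reader(n_frames: int, fps: float) -> AsyncGenerator[Tuple[bytes, float], None]:
    """Stand-in that yields (frame, timestamp) pairs like the documented reader."""
    for i in range(n_frames):
        await asyncio.sleep(0)  # Give control back to the event loop, as real I/O would.
        yield b"frame", i / fps


async def collect_timestamps(n_frames: int, fps: float) -> List[float]:
    # `async for` (here in a comprehension) is required for async generators.
    return [ts async for _, ts in fake_timestamped_reader(n_frames, fps)]


timestamps = asyncio.run(collect_timestamps(3, 2.0))
print(timestamps)  # [0.0, 0.5, 1.0]
```

With the real function, the `async for` target would be `read_video_with_timestamps_ffmpeg("/path/to/video.mp4")`.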

ml.utils.video.read_video_opencv(in_file: str | Path) -> Iterator[ndarray]

Reads a video as a stream using OpenCV.

Parameters:

in_file – The input video to read

Yields:

Frames from the video as numpy arrays with shape (H, W, C)

ml.utils.video.write_video_opencv(itr: Iterator[ndarray | Tensor], out_file: str | Path, *, keep_resolution: bool = False, fps: int | Fraction = 30, codec: str = 'MP4V') -> None

Function that writes a video from a stream of numpy arrays using OpenCV.

Parameters:
  • itr – The image iterator, yielding images with shape (H, W, C).

  • out_file – The path to the output file.

  • keep_resolution – If set, don’t change the image resolution, otherwise resize to a human-friendly resolution.

  • fps – Frames per second for the video.

  • codec – FourCC code specifying OpenCV video codec type. Examples are MPEG, MP4V, DIVX, AVC1, H263.
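
A FourCC is four ASCII characters packed into a 32-bit integer, with the first character in the lowest byte. OpenCV exposes this packing as `cv2.VideoWriter_fourcc(*"MP4V")`; the pure-Python sketch below reproduces the same arithmetic without needing OpenCV installed. The helper name `fourcc` is hypothetical.

```python
def fourcc(code: str) -> int:
    """Pack a four-character codec code into its 32-bit integer form."""
    if len(code) != 4:
        raise ValueError(f"FourCC must be exactly four characters, got {code!r}")
    # Little-endian: the first character ends up in the least significant byte.
    return int.from_bytes(code.encode("ascii"), "little")


print(hex(fourcc("MP4V")))  # 0x5634504d
```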

ml.utils.video.write_video_av(itr: Iterator[ndarray | Tensor], out_file: str | Path, *, keep_resolution: bool = False, fps: int | Fraction = 30, codec: str = 'libx264', input_fmt: str = 'rgb24', output_fmt: str = 'yuv420p') -> None

Function that writes a video from a stream of numpy arrays using PyAV.

Parameters:
  • itr – The image iterator, yielding images with shape (H, W, C).

  • out_file – The path to the output file.

  • keep_resolution – If set, don’t change the image resolution, otherwise resize to a human-friendly resolution.

  • fps – Frames per second for the video.

  • codec – The video codec to use for the output video

  • input_fmt – The input pixel format

  • output_fmt – The output pixel format
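
The default `libx264` and `yuv420p` combination requires even frame dimensions, because 4:2:0 chroma subsampling halves both axes. When `keep_resolution` is set and the frames have an odd width or height, a caller may need to crop or pad them first; whether the library does this itself is not stated here. A minimal sketch with a hypothetical helper:

```python
from typing import Tuple


def crop_to_even(width: int, height: int) -> Tuple[int, int]:
    """Return the largest even-sized dimensions that fit inside (width, height)."""
    return width - width % 2, height - height % 2


print(crop_to_even(641, 481))  # (640, 480)
```

A frame with shape (H, W, C) could then be sliced as `frame[:new_h, :new_w]` before being yielded to the writer.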

ml.utils.video.write_video_ffmpeg(itr: Iterator[ndarray | Tensor], out_file: str | Path, *, keep_resolution: bool = False, fps: int | Fraction = 30, out_fps: int | Fraction = 30, vcodec: str = 'libx264', input_fmt: str = 'rgb24', output_fmt: str = 'yuv420p') -> None

Function that writes a video from a stream of numpy arrays using FFmpeg.

Parameters:
  • itr – The image iterator, yielding images with shape (H, W, C).

  • out_file – The path to the output file.

  • keep_resolution – If set, don’t change the image resolution, otherwise resize to a human-friendly resolution.

  • fps – Frames per second for the video.

  • out_fps – Frames per second for the saved video.

  • vcodec – The video codec to use for the output video

  • input_fmt – The input image format

  • output_fmt – The output image format
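
Both `fps` and `out_fps` accept a `Fraction`, which matters for rates that are not whole numbers, such as 30000/1001. FFmpeg's frame-rate option accepts rational values written as `num/den`, so an exact string form is available without rounding to a float. How this function actually forwards the value is not documented here; `fps_to_rate_arg` below is a hypothetical illustration.

```python
from fractions import Fraction
from typing import Union


def fps_to_rate_arg(fps: Union[int, Fraction]) -> str:
    """Format a frame rate as an exact 'num/den' (or integer) string."""
    return str(Fraction(fps))


print(fps_to_rate_arg(30))                     # 30
print(fps_to_rate_arg(Fraction(30000, 1001)))  # 30000/1001
```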

ml.utils.video.write_video_matplotlib(itr: Iterator[ndarray | Tensor], out_file: str | Path, *, keep_resolution: bool = False, dpi: int = 50, fps: int | Fraction = 30, title: str = 'Video', comment: str | None = None, writer: str = 'ffmpeg') -> None

Function that writes a video from a stream of input tensors.

Parameters:
  • itr – The image iterator, yielding images with shape (H, W, C).

  • out_file – The path to the output file.

  • keep_resolution – If set, don’t change the image resolution, otherwise resize to a human-friendly resolution.

  • dpi – Dots per inch for output image.

  • fps – Frames per second for the video.

  • title – Title for the video metadata.

  • comment – Comment for the video metadata.

  • writer – The Matplotlib video writer to use (if you use the default one, make sure you have ffmpeg installed on your system).
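
In Matplotlib, a figure's pixel size is its size in inches multiplied by `dpi`. Assuming the writer sizes the figure so that one frame pixel maps to one output pixel (an assumption; the implementation is not shown here), the figure size for a given frame would be computed as below. `figsize_inches` is a hypothetical helper.

```python
from typing import Tuple


def figsize_inches(width_px: int, height_px: int, dpi: int = 50) -> Tuple[float, float]:
    """Figure size in inches that renders to (width_px, height_px) at `dpi`."""
    return width_px / dpi, height_px / dpi


print(figsize_inches(640, 480, dpi=50))  # (12.8, 9.6)
```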

ml.utils.video.get_video_props(in_file: str | Path, *, reader: Literal['ffmpeg', 'av', 'opencv'] = 'av') -> VideoProps
ml.utils.video.read_video(in_file: str | Path, *, prefetch_n: int = 1, reader: Literal['ffmpeg', 'av', 'opencv'] = 'av') -> Iterator[ndarray]

Function that reads a video from a file to a stream of Numpy arrays.

Parameters:
  • in_file – The path to the input file.

  • prefetch_n – Number of chunks to prefetch.

  • reader – The video reader to use.

Yields:

The video frames as Numpy arrays.
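
Prefetching lets decoding run ahead of the consumer, so the two can overlap. The mechanism used by `read_video` is not documented here; the sketch below is a generic thread-and-queue version of the idea, with a hypothetical `prefetch` helper whose `n` plays the role of `prefetch_n`.

```python
import queue
import threading
from typing import Iterator, TypeVar

T = TypeVar("T")
_DONE = object()  # Sentinel marking the end of the stream.


def prefetch(itr: Iterator[T], n: int = 1) -> Iterator[T]:
    """Yield items from `itr` while a background thread reads up to `n` ahead."""
    buf: queue.Queue = queue.Queue(maxsize=max(n, 1))

    def worker() -> None:
        try:
            for item in itr:
                buf.put(item)  # Blocks once `n` items are waiting.
        finally:
            # Always signal completion; producer errors are not re-raised here.
            buf.put(_DONE)

    # Daemon thread: if the consumer stops early, it will not keep the process alive.
    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = buf.get()
        if item is _DONE:
            return
        yield item


print(list(prefetch(iter(range(5)), n=2)))  # [0, 1, 2, 3, 4]
```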

ml.utils.video.write_video(itr: Iterator[ndarray | Tensor], out_file: str | Path, *, fps: int | Fraction = 30, keep_resolution: bool = False, writer: Literal['ffmpeg', 'matplotlib', 'av', 'opencv'] = 'av') -> None

Function that writes a video from a stream of input tensors.

Parameters:
  • itr – The image iterator, yielding images with shape (H, W, C).

  • out_file – The path to the output file.

  • fps – Frames per second for the video.

  • keep_resolution – If set, don’t change the image resolution, otherwise resize to a human-friendly resolution.

  • writer – The video writer to use.

Raises:

ValueError – If the writer is invalid.
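
The `writer` argument selects one of the backend-specific functions, and an unknown name raises `ValueError`. One common way to structure such a dispatcher is a lookup table, sketched below with stub backends. `WRITERS`, `_make_stub` and `write_video_sketch` are hypothetical, and the library's actual dispatch may look different.

```python
from typing import Callable, Dict, Iterable

Writer = Callable[[Iterable[object], str], str]


def _make_stub(name: str) -> Writer:
    """Build a stub backend that only counts frames instead of encoding them."""
    def write(frames: Iterable[object], out_file: str) -> str:
        n = sum(1 for _ in frames)
        return f"{name}: wrote {n} frames to {out_file}"
    return write


# One entry per documented writer name.
WRITERS: Dict[str, Writer] = {
    name: _make_stub(name) for name in ("ffmpeg", "matplotlib", "av", "opencv")
}


def write_video_sketch(frames: Iterable[object], out_file: str, *, writer: str = "av") -> str:
    """Dispatch to the selected backend, rejecting unknown writer names."""
    try:
        backend = WRITERS[writer]
    except KeyError:
        raise ValueError(f"Invalid writer: {writer!r}") from None
    return backend(frames, out_file)


print(write_video_sketch(iter([1, 2, 3]), "out.mp4"))  # av: wrote 3 frames to out.mp4
```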