ml.utils.video
Defines utilities for saving and loading video streams.
The main API for using this module is:
from typing import Iterator
from numpy import ndarray
from ml.utils.video import read_video, write_video

def frame_iterator() -> Iterator[ndarray]:
    for frame in read_video("/path/to/video.mp4"):
        yield frame

write_video(frame_iterator(), "/path/to/other/video.mp4")
By default this goes through PyAV, which binds the FFmpeg libraries, so it should be reasonably quick.
- class ml.utils.video.VideoProps(frame_width: int, frame_height: int, frame_count: int, fps: fractions.Fraction)
Bases: object
- frame_width: int
- frame_height: int
- frame_count: int
- fps: Fraction
- classmethod from_file_av(fpath: str | Path) → VideoProps
- classmethod from_file_opencv(fpath: str | Path) → VideoProps
- classmethod from_file_ffmpeg(fpath: str | Path) → VideoProps
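As a rough illustration of how these four fields fit together, the sketch below defines a stand-in dataclass with the same fields and derives a clip duration from them. `VideoPropsSketch` and `duration_seconds` are hypothetical names invented for this example and are not part of ml.utils.video; the real class is meant to be populated through the from_file_* classmethods.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class VideoPropsSketch:
    """Hypothetical stand-in mirroring the documented VideoProps fields."""

    frame_width: int
    frame_height: int
    frame_count: int
    fps: Fraction

    def duration_seconds(self) -> Fraction:
        # Hypothetical helper: number of frames divided by frames per second.
        return Fraction(self.frame_count) / self.fps


# An NTSC-style rate; a Fraction keeps 30000/1001 exact instead of rounding it.
props = VideoPropsSketch(1920, 1080, frame_count=300, fps=Fraction(30000, 1001))
print(float(props.duration_seconds()))
```

Storing the frame rate as a Fraction, as the documented `fps` field does, avoids the drift that accumulates when a rate such as 30000/1001 is rounded to a float.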
- ml.utils.video.read_video_av(in_file: str | Path, *, target_dims: tuple[int | None, int | None] | None = None) → Iterator[ndarray]
Reads a video file as a stream of NumPy arrays using PyAV.
- Parameters:
in_file – The input video to read
target_dims – If not None, resize each frame to this size
- Yields:
Frames from the video as numpy arrays with shape (H, W, C)
- ml.utils.video.read_video_ffmpeg(in_file: str | Path, *, output_fmt: str = 'rgb24', channels: int = 3) → Iterator[ndarray]
Reads a video file as a stream of NumPy arrays using FFmpeg.
- Parameters:
in_file – The input video to read
output_fmt – The output image format
channels – Number of output channels for each video frame
- Yields:
Frames from the video as numpy arrays with shape (H, W, C)
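To make the relationship between `output_fmt` and `channels` concrete: in the rgb24 pixel format each pixel takes three bytes, so one packed frame occupies height × width × 3 bytes. The sketch below splits a raw byte buffer into per-frame chunks on that basis. `split_raw_frames` is a hypothetical helper written for this example; it is not claimed to be how read_video_ffmpeg is implemented.

```python
from typing import Iterator


def split_raw_frames(
    buf: bytes, width: int, height: int, channels: int = 3
) -> Iterator[bytes]:
    """Hypothetical helper: yields one bytes object per packed frame."""
    frame_size = width * height * channels  # one byte per channel for rgb24
    if len(buf) % frame_size != 0:
        raise ValueError("Buffer does not contain a whole number of frames")
    for start in range(0, len(buf), frame_size):
        yield buf[start : start + frame_size]


# Two synthetic 4x2 RGB frames: the first all zeros, the second all 255.
raw = bytes(4 * 2 * 3) + bytes([255]) * (4 * 2 * 3)
frames = list(split_raw_frames(raw, width=4, height=2))
print(len(frames), len(frames[0]))
```

Each chunk could then be viewed as an (H, W, C) array with `numpy.frombuffer(chunk, dtype=numpy.uint8).reshape(height, width, channels)`.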
- async ml.utils.video.read_video_with_timestamps_ffmpeg(in_file: str | Path, *, output_fmt: str = 'rgb24', channels: int = 3, target_dims: tuple[int | None, int | None] | None = None) → AsyncGenerator[tuple[numpy.ndarray, float], None]
Like read_video_ffmpeg, but also yields the timestamp of each frame.
- Parameters:
in_file – The input video to read
output_fmt – The output image format
channels – Number of output channels for each video frame
target_dims – (width, height) dimensions for images being loaded, with None meaning that the aspect ratio should be kept the same
- Yields:
Frames from the video as numpy arrays with shape (H, W, C), along with the frame timestamps
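One plausible reading of the `target_dims` description is that a None entry is filled in from the source size so that the aspect ratio is preserved. The sketch below works through that arithmetic. `resolve_target_dims` is a hypothetical helper, and both the rounding rule and the handling of `(None, None)` are assumptions made for this example rather than documented behavior.

```python
from typing import Optional, Tuple


def resolve_target_dims(
    src_width: int,
    src_height: int,
    target_dims: Optional[Tuple[Optional[int], Optional[int]]],
) -> Tuple[int, int]:
    """Hypothetical helper: returns the concrete (width, height) to resize to."""
    if target_dims is None:
        return src_width, src_height  # no resizing requested
    width, height = target_dims
    if width is None and height is None:
        return src_width, src_height  # assumption: nothing to scale against
    if width is None:
        # Derive the width from the requested height, keeping the aspect ratio.
        width = round(src_width * height / src_height)
    elif height is None:
        # Derive the height from the requested width, keeping the aspect ratio.
        height = round(src_height * width / src_width)
    return width, height


print(resolve_target_dims(1920, 1080, (640, None)))
print(resolve_target_dims(1920, 1080, (None, 720)))
```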
- ml.utils.video.read_video_opencv(in_file: str | Path) → Iterator[ndarray]
Reads a video as a stream using OpenCV.
- Parameters:
in_file – The input video to read
- Yields:
Frames from the video as numpy arrays with shape (H, W, C)
- ml.utils.video.write_video_opencv(itr: Iterator[ndarray | Tensor], out_file: str | Path, *, keep_resolution: bool = False, fps: int | Fraction = 30, codec: str = 'MP4V') → None
Writes a video from a stream of NumPy arrays using OpenCV.
- Parameters:
itr – The image iterator, yielding images with shape (H, W, C).
out_file – The path to the output file.
keep_resolution – If set, don’t change the image resolution, otherwise resize to a human-friendly resolution.
fps – Frames per second for the video.
codec – FourCC code specifying the OpenCV video codec. Examples are MPEG, MP4V, DIVX, AVC1, H264.
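For context on the `codec` argument: a FourCC is a four-character code that OpenCV packs into a single 32-bit integer, usually through `cv2.VideoWriter_fourcc(*"MP4V")`. The pure-Python sketch below reproduces that packing (first character in the lowest byte) so it can be inspected without OpenCV installed. `pack_fourcc` is a hypothetical name used only for this example.

```python
def pack_fourcc(code: str) -> int:
    """Hypothetical helper: packs a four-character code into a 32-bit integer."""
    if len(code) != 4:
        raise ValueError(f"FourCC must have exactly four characters, got {code!r}")
    value = 0
    for i, ch in enumerate(code):
        # The first character lands in the lowest byte, the last in the highest.
        value |= (ord(ch) & 0xFF) << (8 * i)
    return value


print(hex(pack_fourcc("MP4V")))
```

A code longer or shorter than four characters is rejected up front, which is the most common mistake when passing codec names by hand.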
- ml.utils.video.write_video_av(itr: Iterator[ndarray | Tensor], out_file: str | Path, *, keep_resolution: bool = False, fps: int | Fraction = 30, codec: str = 'libx264', input_fmt: str = 'rgb24', output_fmt: str = 'yuv420p') → None
Writes a video from a stream of NumPy arrays using PyAV.
- Parameters:
itr – The image iterator, yielding images with shape (H, W, C).
out_file – The path to the output file.
keep_resolution – If set, don’t change the image resolution, otherwise resize to a human-friendly resolution.
fps – Frames per second for the video.
codec – The video codec to use for the output video
input_fmt – The input pixel format
output_fmt – The output pixel format
- ml.utils.video.write_video_ffmpeg(itr: Iterator[ndarray | Tensor], out_file: str | Path, *, keep_resolution: bool = False, fps: int | Fraction = 30, out_fps: int | Fraction = 30, vcodec: str = 'libx264', input_fmt: str = 'rgb24', output_fmt: str = 'yuv420p') → None
Writes a video from a stream of NumPy arrays using FFmpeg.
- Parameters:
itr – The image iterator, yielding images with shape (H, W, C).
out_file – The path to the output file.
keep_resolution – If set, don’t change the image resolution, otherwise resize to a human-friendly resolution.
fps – Frames per second for the video.
out_fps – Frames per second for the saved video.
vcodec – The video codec to use for the output video
input_fmt – The input image format
output_fmt – The output image format
- ml.utils.video.write_video_matplotlib(itr: Iterator[ndarray | Tensor], out_file: str | Path, *, keep_resolution: bool = False, dpi: int = 50, fps: int | Fraction = 30, title: str = 'Video', comment: str | None = None, writer: str = 'ffmpeg') → None
Writes a video from a stream of input tensors using Matplotlib.
- Parameters:
itr – The image iterator, yielding images with shape (H, W, C).
out_file – The path to the output file.
keep_resolution – If set, don’t change the image resolution, otherwise resize to a human-friendly resolution.
dpi – Dots per inch for output image.
fps – Frames per second for the video.
title – Title for the video metadata.
comment – Comment for the video metadata.
writer – The Matplotlib video writer to use (if you use the default one, make sure you have ffmpeg installed on your system).
- ml.utils.video.get_video_props(in_file: str | Path, *, reader: Literal['ffmpeg', 'av', 'opencv'] = 'av') → VideoProps
- ml.utils.video.read_video(in_file: str | Path, *, prefetch_n: int = 1, reader: Literal['ffmpeg', 'av', 'opencv'] = 'av') → Iterator[ndarray]
Reads a video file as a stream of NumPy arrays.
- Parameters:
in_file – The path to the input file.
prefetch_n – Number of chunks to prefetch.
reader – The video reader to use.
- Yields:
The video frames as NumPy arrays.
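The `prefetch_n` parameter suggests that frames are decoded ahead of the consumer. One common way to get that effect is a background thread feeding a bounded queue, sketched below. `prefetch` is a hypothetical helper for illustration only; it is not claimed to match how read_video implements prefetching, and it neither propagates producer errors nor cleans up if the consumer stops early.

```python
import queue
import threading
from typing import Iterator, TypeVar

T = TypeVar("T")
_DONE = object()  # sentinel marking the end of the stream


def prefetch(itr: Iterator[T], prefetch_n: int = 1) -> Iterator[T]:
    """Hypothetical helper: keeps up to prefetch_n items buffered ahead."""
    buf: "queue.Queue[object]" = queue.Queue(maxsize=prefetch_n)

    def worker() -> None:
        try:
            for item in itr:
                buf.put(item)  # blocks while the buffer is full
        finally:
            buf.put(_DONE)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = buf.get()
        if item is _DONE:
            return
        yield item  # type: ignore[misc]


# Any iterator works; integers stand in for decoded frames here.
print(list(prefetch(iter(range(5)), prefetch_n=2)))
```

The bounded queue is the important design choice: it caps memory use at roughly `prefetch_n` frames while still letting decoding overlap with whatever the consumer does per frame.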
- ml.utils.video.write_video(itr: Iterator[ndarray | Tensor], out_file: str | Path, *, fps: int | Fraction = 30, keep_resolution: bool = False, writer: Literal['ffmpeg', 'matplotlib', 'av', 'opencv'] = 'av') → None
Writes a video from a stream of input tensors.
- Parameters:
itr – The image iterator, yielding images with shape (H, W, C).
out_file – The path to the output file.
fps – Frames per second for the video.
keep_resolution – If set, don’t change the image resolution, otherwise resize to a human-friendly resolution.
writer – The video writer to use.
- Raises:
ValueError – If the writer is invalid.
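The `writer` argument and the documented ValueError suggest a dispatch on the backend name. The sketch below shows one way such a dispatch could look, with stub backends that only record what they were asked to write. `write_video_sketch`, `WRITERS`, and the stubs are all hypothetical and stand in for the real write_video_* functions; only the four backend names and the default come from the signature above.

```python
from typing import Callable, Dict, Iterator, List

calls: List[str] = []  # records what each stub backend was asked to do


def _stub_writer(name: str) -> Callable[[Iterator[object], str], None]:
    """Builds a hypothetical backend that counts frames instead of encoding."""

    def write(itr: Iterator[object], out_file: str) -> None:
        n_frames = sum(1 for _ in itr)
        calls.append(f"{name}:{out_file}:{n_frames}")

    return write


# The four backend names come from the documented Literal type.
WRITERS: Dict[str, Callable[[Iterator[object], str], None]] = {
    name: _stub_writer(name) for name in ("ffmpeg", "matplotlib", "av", "opencv")
}


def write_video_sketch(itr: Iterator[object], out_file: str, *, writer: str = "av") -> None:
    if writer not in WRITERS:
        # Mirrors the documented behavior: unknown writers raise ValueError.
        raise ValueError(f"Invalid writer: {writer!r}")
    WRITERS[writer](itr, out_file)


write_video_sketch(iter(range(3)), "out.mp4", writer="opencv")
print(calls)
```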