ml.tasks.datasets.async_iterable

Defines a dataset for asynchronous iteration.

This class is useful when you want to use Python’s async / await syntax to produce the items of a dataset. It starts a separate thread that runs the async iterator and puts the results into a queue; the dataset’s regular (synchronous) iterator then reads items from that queue.

Example:

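from typing import AsyncIterator

from ml.tasks.datasets.async_iterable import AsyncIterableDataset

class MyDataset(AsyncIterableDataset):
    async def __aiter__(self) -> AsyncIterator[int]:
        for i in range(10):
            yield i

# Iterate synchronously; the items are produced by the background thread.
for i in MyDataset():
    print(i)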

async ml.tasks.datasets.async_iterable.add_to_queue(async_iter: AsyncIterator[T], q: Queue[T | None]) -> None
ml.tasks.datasets.async_iterable.thread_worker(async_iter: AsyncIterator[T], q: Queue[T | None]) -> None
ml.tasks.datasets.async_iterable.thread_async_iter(async_iter: AsyncIterator[T], max_queue_size: int) -> Iterator[T]
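
The three helpers above are listed without descriptions. The sketch below shows one way functions with these signatures could cooperate to implement the thread-and-queue approach described in the module summary. The function bodies are assumptions written for illustration, not the library's actual implementation; `count_to` is a hypothetical async generator used only to exercise the sketch.

```python
import asyncio
import queue
import threading
from typing import AsyncIterator, Iterator, Optional, TypeVar

T = TypeVar("T")


async def add_to_queue(async_iter: AsyncIterator[T], q: "queue.Queue[Optional[T]]") -> None:
    # Assumed behavior: drain the async iterator into the queue, then put
    # None as an end-of-stream sentinel (so items themselves must not be None).
    try:
        async for item in async_iter:
            q.put(item)  # blocks while the queue is full, giving backpressure
    finally:
        q.put(None)  # always signal completion, even if the iterator raised


def thread_worker(async_iter: AsyncIterator[T], q: "queue.Queue[Optional[T]]") -> None:
    # The worker thread runs its own event loop until the iterator is exhausted.
    asyncio.run(add_to_queue(async_iter, q))


def thread_async_iter(async_iter: AsyncIterator[T], max_queue_size: int) -> Iterator[T]:
    # Bounded queue: the producer thread can run at most max_queue_size items ahead.
    q: "queue.Queue[Optional[T]]" = queue.Queue(maxsize=max_queue_size)
    thread = threading.Thread(target=thread_worker, args=(async_iter, q), daemon=True)
    thread.start()
    while True:
        item = q.get()
        if item is None:  # sentinel: the async iterator is finished
            break
        yield item
    thread.join()


async def count_to(n: int) -> AsyncIterator[int]:
    # Hypothetical async source used only for this demonstration.
    for i in range(n):
        await asyncio.sleep(0)
        yield i


result = list(thread_async_iter(count_to(5), max_queue_size=2))
print(result)  # [0, 1, 2, 3, 4]
```

In this sketch the bounded queue keeps the producer from running far ahead of the consumer, which is presumably the role of `max_async_queue_size` on the dataset class. The thread is marked as a daemon so that a consumer that stops iterating early does not keep the process alive.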
class ml.tasks.datasets.async_iterable.AsyncIterableDataset(max_async_queue_size: int = 2)

Bases: IterableDataset[T]