producer

openseize.core.producer

Producers are the heart of Openseize. They are iterables that can produce values from a variety of data sources including:

    - Sequences
    - Numpy ndarrays
    - Generating functions
    - Openseize Reader instances
    - Other Producers

All DSP algorithms in Openseize can accept and return a producer. This allows data that is too large to be stored in an in-memory numpy array to be analyzed. This module contains the producer constructing function. It is the only publicly available method of this module.

Examples:

>>> # Build a producer from an EDF file reader
>>> from openseize.demos import paths
>>> filepath = paths.locate('recording_001.edf')
>>> from openseize.file_io.edf import Reader
>>> reader = Reader(filepath)
>>> # build a producer that produces 10k-sample chunks of this file
>>> pro = producer(reader, chunksize=10e3, axis=-1)
>>> pro.shape # print the producer's shape
(4, 18875000)
>>> # print the shape of each arr in the producer
>>> for idx, arr in enumerate(pro):
...     msg = 'Array num. {} has shape {}'
...     print(msg.format(idx, arr.shape))
>>> # Build a producer from a numpy array with samples on the 0th axis
>>> import numpy as np
>>> x = np.random.random((100000, 5))
>>> xpro = producer(x, chunksize=10e3, axis=0)
>>> for arr in xpro:
...     print(arr.shape)
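The examples above cover a Reader and an ndarray. For a generating function, the parameter descriptions state that shape must be supplied and that the function's own arguments travel through kwargs. The following is a minimal sketch of preparing those inputs with NumPy only. The generating function `noise_blocks` and its arguments are hypothetical, and the producer call is left as a comment because it requires Openseize to be installed.

```python
import numpy as np

def noise_blocks(count, channels, samples, seed=0):
    """Hypothetical generating function yielding (channels, samples) arrays."""
    rng = np.random.default_rng(seed)
    for _ in range(count):
        yield rng.random((channels, samples))

# Arguments for the generating function, collected as keyword arguments.
gen_kwargs = {'count': 3, 'channels': 4, 'samples': 2500}

# Combined shape of everything the function yields: the blocks are joined
# along the last axis, so only that axis grows.
shape = (gen_kwargs['channels'], gen_kwargs['count'] * gen_kwargs['samples'])

# Check the claimed shape against the blocks actually generated.
blocks = list(noise_blocks(**gen_kwargs))
combined = np.concatenate(blocks, axis=-1)
assert combined.shape == shape

# With Openseize available, these inputs would be passed as (not run here):
# pro = producer(noise_blocks, chunksize=1000, axis=-1, shape=shape, **gen_kwargs)
```

Note that the function itself, not the generator it returns, is what gets handed to producer in this sketch.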

producer(data, chunksize, axis, shape=None, mask=None, **kwargs)

Constructs an iterable that produces ndarrays of length chunksize along axis during iteration.

This constructor returns an object that is capable of producing ndarrays or masked ndarrays during iteration from a single ndarray, a sequence of ndarrays, a file Reader instance (see io.bases.Reader), an ndarray generating function, or a pre-existing producer of ndarrays. The produced ndarrays from this object will have length chunksize along axis.
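To illustrate what "length chunksize along axis" means, here is a small NumPy-only sketch. The helper `iter_chunks` is a hypothetical stand-in written for this illustration and is not part of Openseize; it simply slices an in-memory array into consecutive pieces along one axis.

```python
import numpy as np

def iter_chunks(x, chunksize, axis):
    """Hypothetical stand-in: yield consecutive slices of x along axis."""
    chunksize = int(chunksize)
    for start in range(0, x.shape[axis], chunksize):
        # Select everything on all axes, then narrow the chunked axis.
        index = [slice(None)] * x.ndim
        index[axis] = slice(start, start + chunksize)
        yield x[tuple(index)]

x = np.arange(50).reshape(2, 25)
chunk_shapes = [arr.shape for arr in iter_chunks(x, chunksize=10, axis=-1)]
print(chunk_shapes)  # [(2, 10), (2, 10), (2, 5)]
```

In this sketch the final chunk is shorter because 25 is not a multiple of 10, and concatenating the chunks along the same axis recovers the original array.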

Parameters:

Name Type Description Default
data Union[npt.NDArray, Iterable[npt.NDArray], Reader, Callable, Producer]

An object from which ndarrays will be produced. Supported types are Reader instances, ndarrays, sequences, generating functions yielding ndarrays, or other Producers. For sequences and generating functions, each subarray must have the same shape along all axes except the axis along which chunks will be produced.

required
chunksize int

The desired length along axis of each produced ndarray.

required
axis int

The sample axis of data that will be partitioned into chunks of length chunksize.

required
shape Optional[Sequence[int]]

The combined shape of all ndarrays from this producer. This parameter is only required when data is a generating function and will be ignored otherwise.

None
mask Optional[npt.NDArray[np.bool_]]

A boolean array describing which values of data along axis should be produced. Values that are True will be produced and values that are False will be ignored. If None (Default), producer will produce all values from data.

None
kwargs

Keyword arguments specific to the type of data that ndarrays will be produced from. For Reader instances, valid kwargs are padvalue (see io.bases.Reader and io.edf.Reader). For generating functions, all the positional and keyword arguments must be passed to the function through these kwargs to avoid name collisions with the producer function's arguments.

{}
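To make the mask parameter concrete, here is a sketch of the selection it describes, applied to a small in-memory array with NumPy's `compress`. It only illustrates the convention that True values are kept; how producer applies the mask chunk by chunk, or handles a mask whose length differs from the data, is not covered by this sketch.

```python
import numpy as np

# 2 channels with 6 samples on the last axis.
x = np.arange(12).reshape(2, 6)
mask = np.array([True, False, True, True, False, True])

# Keep only the samples along axis=-1 where mask is True.
kept = np.compress(mask, x, axis=-1)
print(kept)
# [[ 0  2  3  5]
#  [ 6  8  9 11]]
```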
Source code in openseize/core/producer.py
def producer(data: Union[npt.NDArray, Iterable[npt.NDArray], Reader,
                         Callable, 'Producer'],
             chunksize: int,
             axis: int,
             shape: Optional[Sequence[int]] = None,
             mask: Optional[npt.NDArray[np.bool_]] = None,
             **kwargs,
) -> 'Producer':
    """Constructs an iterable that produces ndarrays of length chunksize
    along axis during iteration.

    This constructor returns an object that is capable of producing ndarrays
    or masked ndarrays during iteration from a single ndarray, a sequence of
    ndarrays, a file Reader instance (see io.bases.Reader), an ndarray
    generating function, or a pre-existing producer of ndarrays. The
    produced ndarrays from this object will have length chunksize along axis.

    Args:
        data:
            An object from which ndarrays will be produced. Supported
            types are Reader instances, ndarrays, sequences, generating
            functions yielding ndarrays, or other Producers.  For sequences
            and generator functions it is required that each subarray has
            the same shape along all axes except for the axis along which
            chunks will be produced.
        chunksize:
            The desired length along axis of each produced ndarray.
        axis:
            The sample axis of data that will be partitioned into
            chunks of length chunksize.
        shape:
            The combined shape of all ndarrays from this producer. This
            parameter is only required when data is a generating function
            and will be ignored otherwise.
        mask:
            A boolean array describing which values of data along axis
            should be produced. Values that are True will be produced and
            values that are False will be ignored. If None (Default),
            producer will produce all values from data.
        kwargs:
            Keyword arguments specific to data type that ndarrays will be
            produced from. For Reader instances, valid kwargs are padvalue
            (see io.bases.Reader and io.edf.Reader). For generating
            functions, all the positional and keyword arguments must be
            passed to the function through these kwargs to avoid name
            collisions with the producer func arguments.

    Returns: An iterable of ndarrays of length chunksize along axis.
    """

    if isinstance(data, Producer):
        data.chunksize = int(chunksize)
        data.axis = axis
        result = data

    elif isinstance(data, Reader):
        result = ReaderProducer(data, chunksize, axis, **kwargs)

    elif inspect.isgeneratorfunction(data):
        result = GenProducer(data, chunksize, axis, shape, **kwargs)

    elif isinstance(data, np.ndarray):
        result = ArrayProducer(data, chunksize, axis, **kwargs)

    elif isinstance(data, abc.Sequence):
        x = np.concatenate(data, axis)
        result = ArrayProducer(x, chunksize, axis, **kwargs)

    else:
        msg = 'unproducible type: {}'
        raise TypeError(msg.format(type(data)))

    # apply mask if passed
    if mask is None:
        return result

    return MaskedProducer(result, mask, chunksize, axis, **kwargs)
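The dispatch in the listing above relies on `inspect.isgeneratorfunction` and `abc.Sequence`. The standard-library sketch below shows how those two checks classify a few inputs; the function `blocks` is hypothetical. It suggests why a generating function should be passed uncalled: the generator object returned by calling it satisfies neither check, so in the listing it would fall through to the TypeError branch.

```python
import inspect
from collections import abc

def blocks():
    """Hypothetical generating function."""
    yield [1, 2, 3]

# The function itself passes the generator-function check.
func_is_genfunc = inspect.isgeneratorfunction(blocks)

# The generator object returned by calling it does not pass that check,
# and it is not a Sequence either.
obj_is_genfunc = inspect.isgeneratorfunction(blocks())
obj_is_sequence = isinstance(blocks(), abc.Sequence)

# Lists and tuples are Sequences, so they would reach the concatenate branch.
list_is_sequence = isinstance([], abc.Sequence)
tuple_is_sequence = isinstance((), abc.Sequence)

print(func_is_genfunc, obj_is_genfunc, obj_is_sequence)  # True False False
print(list_is_sequence, tuple_is_sequence)               # True True
```

The listing also shows that when data is already a Producer, its chunksize and axis attributes are reassigned and the same object is returned rather than a copy.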