batch_load_audio_file_data

ketos.neural_networks.dev_utils.detection.batch_load_audio_file_data(loader, batch_size, logger=None)

This function generates batches of audio data from a given AudioFrameLoader.

Each batch consists of the spectrogram data, filename, start time, and end time of each audio segment. The loader loads the audio files, splits them into smaller segments, and converts each segment into a spectrogram.
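The batching described above can be pictured with a small stand-alone sketch. It does not use ketos: `segments` is a hypothetical iterable of `(data, filename, start, end)` tuples standing in for what the loader produces, and `batch_segments` is an illustrative helper, not the library's implementation. Whether the real function emits a shorter final batch is an assumption made here.

```python
from itertools import islice


def batch_segments(segments, batch_size):
    """Group (data, filename, start, end) tuples into dict batches.

    Hypothetical helper for illustration only; not part of ketos.
    """
    it = iter(segments)
    while True:
        # Take up to batch_size segments from the iterator.
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        # Assumption: a final, shorter batch is emitted rather than dropped.
        yield {
            "data": [seg[0] for seg in chunk],
            "filename": [seg[1] for seg in chunk],
            "start": [seg[2] for seg in chunk],
            "end": [seg[3] for seg in chunk],
        }


# Five made-up 3-second segments from one file. Nested lists stand in
# for spectrogram arrays of shape (time_bins, freq_bins).
segments = [([[0.0, 0.0]], "example.wav", 3.0 * i, 3.0 * (i + 1)) for i in range(5)]
batches = list(batch_segments(segments, batch_size=2))
```

With five segments and a batch size of two, this sketch produces two full batches followed by one batch holding the single remaining segment.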

Args:
loader: ketos.audio.audio_loader.AudioFrameLoader

An AudioFrameLoader object that computes spectrograms from the audio files as requested.

batch_size: int

The number of audio segments (spectrograms) to include in each batch.

logger: logging.Logger or KetosLogger

An optional Logger instance used to log errors encountered while loading audio file data. Defaults to None.

Returns:
dict: A dictionary containing the batch data. The dictionary keys are ‘data’, ‘filename’, ‘start’, and ‘end’. The ‘data’ field is a numpy array of shape (batch_size, time_bins, freq_bins), where time_bins and freq_bins correspond to the shape of the spectrogram data. ‘filename’ contains the list of filenames for the audio segments in the batch. ‘start’ and ‘end’ are lists containing the start and end times (in seconds) of each audio segment.
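As a rough illustration of how the fields of one batch line up, the sketch below pairs row `i` of ‘data’ with entry `i` of ‘filename’, ‘start’, and ‘end’. The batch is built by hand with made-up values, nested lists stand in for the numpy array so the example has no dependencies, and `describe_batch` is a hypothetical helper rather than part of ketos.

```python
def describe_batch(batch):
    """Return one summary line per segment in a batch dictionary.

    Hypothetical helper for illustration only; not part of ketos.
    """
    lines = []
    rows = zip(batch["filename"], batch["start"], batch["end"])
    for i, (name, start, end) in enumerate(rows):
        spec = batch["data"][i]   # one spectrogram: (time_bins, freq_bins)
        time_bins = len(spec)     # len() also works on a numpy array
        freq_bins = len(spec[0])
        lines.append(f"{name} [{start:.1f}s, {end:.1f}s] {time_bins}x{freq_bins}")
    return lines


# Hand-built batch of two segments, each with 2 time bins and 3 frequency bins.
example_batch = {
    "data": [
        [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
        [[0.6, 0.5, 0.4], [0.3, 0.2, 0.1]],
    ],
    "filename": ["a.wav", "a.wav"],
    "start": [0.0, 3.0],
    "end": [3.0, 6.0],
}
summary = describe_batch(example_batch)
```

The same indexing applies to a real batch, where `batch["data"][i]` would be a two-dimensional slice of the numpy array.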