load_audio_for_spec

ketos.audio.spectrogram.load_audio_for_spec(path, channel, rate, window, step, offset, duration, resample_method, id=None, normalize_wav=False, waveform_transforms=None, smooth=0.01, **kwargs)

Load audio data from a wav file for the specific purpose of computing the spectrogram.

The loaded audio covers a time interval that extends slightly beyond that specified, [offset, offset+duration], as needed to compute the full spectrogram without padding with zeros at either end.

Moreover, the returned instance has two extra attributes not usually associated with instances of the Waveform class:

  • stft_args: dict

    Parameters to be used for the computation of the Short-Time Fourier transform

  • len_extend: tuple(int,int)

    Length (no. samples) by which the time interval has been extended at both ends (left, right).

Returns None if the requested data segment is empty.
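To illustrate why the loaded interval extends beyond [offset, offset+duration], the sketch below estimates how many extra samples are needed on each side. It assumes that every STFT frame is centered on its time bin; this is an assumption made for illustration, not a description of how ketos computes len_extend, and the helper name estimate_extension is hypothetical.

```python
def estimate_extension(rate, window, step):
    """Estimate the samples needed beyond the requested interval (left, right).

    Assumption: frames are centered on their time bins, so the first frame
    starts (window - step) / 2 seconds before `offset` and the last frame
    ends the same amount after `offset + duration`.
    """
    win_len = int(round(window * rate))    # window size in samples
    step_len = int(round(step * rate))     # step size in samples
    extra = max(win_len - step_len, 0)     # total overhang in samples
    left = extra // 2
    right = extra - left
    return left, right


# Example: 1 kHz sampling rate, 0.2 s window, 0.02 s step
left, right = estimate_extension(rate=1000, window=0.2, step=0.02)
print(left, right)
```

Under this assumption, a window that is no longer than the step needs no extension at all. The values actually reported in len_extend may differ.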

Args:
path: str

Path to wav file

channel: int

Channel to read from. Only relevant for stereo recordings

rate: float

Desired sampling rate in Hz. If None, the original sampling rate will be used

window: float

Window size in seconds that will be used for computing the spectrogram

step: float

Step size in seconds that will be used for computing the spectrogram

offset: float

Start time of spectrogram in seconds, relative to the start of the wav file.

duration: float

Length of spectrogram in seconds.

resample_method: str

Resampling method. Only relevant if rate is specified. Options are:

  • kaiser_best

  • kaiser_fast

  • scipy (default)

  • polyphase

See https://librosa.github.io/librosa/generated/librosa.core.resample.html for details on the individual methods.

id: str

Unique identifier (optional). If None, the filename will be used.

normalize_wav: bool

Normalize the waveform to have a mean of zero (mean=0) and a standard deviation of unity (std=1). Default is False.

smooth: float

Width in seconds of the smoothing region used for stitching together audio files.

**kwargs: additional keyword arguments

Keyword arguments to be passed to ketos.audio.Waveform.from_wav().
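The effect of normalize_wav=True can be sketched in plain Python as follows. This is only an illustration of zero-mean, unit-variance scaling; the function name normalize is hypothetical, and the use of the population standard deviation and the handling of a constant signal are assumptions, not confirmed ketos behavior.

```python
import statistics


def normalize(samples):
    """Scale samples to mean 0 and standard deviation 1."""
    mean = statistics.fmean(samples)
    std = statistics.pstdev(samples)   # population standard deviation (assumed)
    if std == 0:
        # A constant signal cannot be scaled to unit variance;
        # return zeros rather than dividing by zero (assumed behavior).
        return [0.0 for _ in samples]
    return [(x - mean) / std for x in samples]


normalized = normalize([1.0, 2.0, 3.0, 4.0])
```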

Returns:
audio: Waveform

The audio signal
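A minimal usage sketch based on the signature above. The wrapper load_segment and the parameter values are hypothetical choices for illustration. ketos is imported inside the function so that the snippet can be defined without the package installed.

```python
def load_segment(path, offset=1.0, duration=3.0):
    """Load the audio needed for a spectrogram of [offset, offset+duration]."""
    # Imported here so that defining the function does not require ketos
    from ketos.audio.spectrogram import load_audio_for_spec

    audio = load_audio_for_spec(
        path=path,
        channel=0,
        rate=None,                # keep the original sampling rate
        window=0.2,               # window size in seconds
        step=0.02,                # step size in seconds
        offset=offset,
        duration=duration,
        resample_method="scipy",  # only relevant if rate is specified
    )
    if audio is None:
        # The requested data segment is empty
        return None

    # Extra attributes attached for the spectrogram computation
    print("STFT parameters:", audio.stft_args)
    print("Extension (left, right) in samples:", audio.len_extend)
    return audio
```

Calling load_segment with the path to a wav file returns either a Waveform or None, so callers should check for None before computing the spectrogram.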