AuralFeatures

class ketos.audio.gammatone.AuralFeatures(data, filename=None, offset=0, label=None, annot=None, waveform_transform_log=None, **kwargs)[source]

Aural features computed with the aural-features package (https://pypi.org/project/aural-features/).

Args:
data: 1d numpy array

Feature values

filename: str

Name of the source audio file, if available.

offset: float

Position in seconds of the left edge of the audio segment within the source audio file, if available.

label: int

Label. Optional

annot: AnnotationHandler

AnnotationHandler object. Optional

Methods

from_wav(path[, filter_pad_samples, ...])

Compute aural features directly from wav file.

from_waveform(audio[, filter_pad_samples, ...])

Compute aural features from an instance of audio_signal.Waveform.

get_repres_attrs()

Get audio representation attributes

classmethod from_wav(path, filter_pad_samples=64, global_km_window_seconds=0.25, local_km_window_seconds=0.008, filter_n=100, filter_min_hz=50, channel=0, rate=None, offset=0, duration=None, resample_method='scipy', id=None, normalize_wav=False, waveform_transforms=None, **kwargs)[source]

Compute aural features directly from wav file.

The arguments offset and duration can be used to select a portion of the wav file.

Note that values specified for the arguments offset and duration may be subject to slight adjustments to ensure that the selected portion corresponds to an integer number of samples.
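The adjustment described in this note can be pictured with a small sketch. The helper snap_to_samples is hypothetical and not part of ketos. Rounding to the nearest sample is an assumption made for illustration, and the rule ketos actually applies may differ.

```python
def snap_to_samples(offset, duration, rate):
    """Adjust a selection given in seconds so it spans whole samples.

    Rounding to the nearest sample is an assumption of this sketch,
    not a statement about how ketos performs the adjustment.
    """
    start = round(offset * rate)          # index of the first selected sample
    num_samples = round(duration * rate)  # selection length in samples
    # Convert back to seconds to obtain the adjusted values
    return start / rate, num_samples / rate


# A request that does not fall on sample boundaries at 1000 Hz
adj_offset, adj_duration = snap_to_samples(offset=0.33333, duration=1.00004, rate=1000)
print(adj_offset, adj_duration)
```

The adjusted values differ from the requested ones by less than one sample period (1/rate seconds).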

Args:
path: str

Path to wav file

filter_min_hz: float

Min filter frequency in Hz

local_km_window_seconds: float

Length of local KM window in seconds

channel: int

Channel to read from. Only relevant for stereo recordings

rate: float

Desired sampling rate in Hz. If None, the original sampling rate will be used

offset: float

Start time of selection in seconds, relative to the start of the wav file.

duration: float

Length of selection in seconds.

resample_method: str
Resampling method. Only relevant if rate is specified. Options are:
  • kaiser_best

  • kaiser_fast

  • scipy (default)

  • polyphase

See https://librosa.github.io/librosa/generated/librosa.core.resample.html for details on the individual methods.

id: str

Unique identifier (optional). If None, the filename will be used.

normalize_wav: bool

Normalize the waveform to zero mean (mean=0) and unit standard deviation (std=1) before computing the aural features. Default is False.

waveform_transforms: list(dict)

List of dictionaries, each specifying the name of a transformation to apply to the waveform before computing the aural features, along with any parameters it takes. For example, {"name": "add_gaussian_noise", "sigma": 0.5}
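As an illustration of what the normalize_wav option describes, the sketch below rescales a short list of samples to zero mean and unit standard deviation. It uses plain Python so that it runs without dependencies. The function name normalize and the use of the population standard deviation are assumptions of this sketch, not details taken from ketos.

```python
from statistics import fmean, pstdev


def normalize(samples):
    """Shift and scale samples to zero mean and unit standard deviation."""
    mean = fmean(samples)
    std = pstdev(samples)  # population standard deviation (assumed convention)
    if std == 0:
        raise ValueError("cannot normalize a constant signal")
    return [(x - mean) / std for x in samples]


normalized = normalize([0.2, 0.4, 0.6, 0.8])
print(fmean(normalized), pstdev(normalized))
```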

Returns:
: AuralFeatures

Aural features
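A possible way to call from_wav is sketched below, using keyword names from the signature documented above. The file name recording.wav and the chosen parameter values are hypothetical. The import is guarded so that the snippet still runs when ketos is not installed, in which case no features are computed.

```python
import os

# Keyword arguments named as in the from_wav signature
params = {
    "filter_n": 100,
    "filter_min_hz": 50,
    "offset": 2.0,        # start 2 s into the file
    "duration": 3.0,      # select 3 s of audio
    "rate": 8000,         # resample to 8 kHz
    "resample_method": "scipy",
    "normalize_wav": True,
    "waveform_transforms": [{"name": "add_gaussian_noise", "sigma": 0.5}],
}

wav_path = "recording.wav"  # hypothetical file name

try:
    from ketos.audio.gammatone import AuralFeatures
except ImportError:
    AuralFeatures = None  # ketos is not installed

features = None
if AuralFeatures is not None and os.path.exists(wav_path):
    features = AuralFeatures.from_wav(wav_path, **params)
    print(features.get_repres_attrs())
```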

classmethod from_waveform(audio, filter_pad_samples=64, global_km_window_seconds=0.25, local_km_window_seconds=0.008, filter_n=100, filter_min_hz=50)[source]

Compute aural features from an instance of audio_signal.Waveform.

Args:
audio: Waveform

Audio signal

filter_pad_samples: int

Number of samples used for padding

global_km_window_seconds: float

Length of global KM window in seconds

local_km_window_seconds: float

Length of local KM window in seconds

filter_n: int

Number of filters

filter_min_hz: float

Min filter frequency in Hz

Returns:
: AuralFeatures

Aural features
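One reason to prefer from_waveform over from_wav is that the audio can be read once and reused with several filter settings. The sketch below shows this pattern under stated assumptions: the import path ketos.audio.waveform and the Waveform.from_wav loader are assumed and may differ between ketos versions, and the file name and the second configuration are hypothetical.

```python
import os

wav_path = "recording.wav"  # hypothetical file name

# Two filter-bank settings to compare; keys follow the from_waveform signature
configs = [
    {"filter_n": 100, "filter_min_hz": 50},
    {"filter_n": 64, "filter_min_hz": 100, "local_km_window_seconds": 0.016},
]

try:
    # Assumed import path for the Waveform class; adjust to your ketos version
    from ketos.audio.waveform import Waveform
    from ketos.audio.gammatone import AuralFeatures
except ImportError:
    Waveform = AuralFeatures = None  # ketos is not installed

results = []
if Waveform is not None and os.path.exists(wav_path):
    audio = Waveform.from_wav(wav_path)  # read the file once
    for cfg in configs:
        # Unspecified arguments keep the defaults from the signature
        results.append(AuralFeatures.from_waveform(audio, **cfg))
```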

get_repres_attrs()[source]

Get audio representation attributes