AuralFeatures

class ketos.audio.gammatone.AuralFeatures(data, filename=None, offset=0, label=None, annot=None, waveform_transform_log=None, **kwargs)[source]

Aural features computed with the aural-features package (https://pypi.org/project/aural-features/).

Args:

data: 1d numpy array: Feature values
filename: str: Name of the source audio file, if available.
offset: float: Position in seconds of the left edge of the audio segment within the source audio file, if available.
label: int: Label. Optional
annot: AnnotationHandler: AnnotationHandler object. Optional

Methods

`from_wav`(path[, filter_pad_samples, ...])	Compute aural features directly from wav file.
`from_waveform`(audio[, filter_pad_samples, ...])	Compute aural features from an instance of `audio_signal.Waveform`.
`get_repres_attrs`()	Get audio representation attributes

classmethod from_wav(path, filter_pad_samples=64, global_km_window_seconds=0.25, local_km_window_seconds=0.008, filter_n=100, filter_min_hz=50, channel=0, rate=None, offset=0, duration=None, resample_method='scipy', id=None, normalize_wav=False, waveform_transforms=None, **kwargs)[source]

Compute aural features directly from wav file.

The arguments offset and duration can be used to select a portion of the wav file.

Note that values specified for the arguments offset and duration may be subject to slight adjustments to ensure that the selected portion corresponds to an integer number of samples.
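The adjustment described above can be sketched as follows. The helper `snap_to_samples` is hypothetical and exists only for illustration. Rounding to the nearest sample is an assumption, since the exact rule ketos applies is not stated here.

```python
def snap_to_samples(offset, duration, rate):
    """Snap an (offset, duration) selection in seconds to whole samples.

    Hypothetical helper for illustration only. Rounding to the nearest
    sample is an assumption, not necessarily what ketos does internally.
    """
    start = round(offset * rate)          # index of the first selected sample
    num_samples = round(duration * rate)  # length of the selection in samples
    return start / rate, num_samples / rate, num_samples


# At 1000 Hz, a requested selection of 0.3333 s starting at 1.2504 s
# is snapped to 333 samples starting at sample 1250.
adj_offset, adj_duration, n = snap_to_samples(offset=1.2504, duration=0.3333, rate=1000)
print(adj_offset, adj_duration, n)
```

The adjusted values differ from the requested ones by less than one sample period (1/rate seconds).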

Args:

path: str

Path to wav file

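filter_pad_samples: int

Number of samples used for padding

global_km_window_seconds: float

Length of global KM window in seconds

local_km_window_seconds: float

Length of local KM window in seconds

filter_n: int

Number of filters

filter_min_hz: float

Min filter frequency in Hz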

channel: int

Channel to read from. Only relevant for stereo recordings

rate: float

Desired sampling rate in Hz. If None, the original sampling rate will be used

offset: float

Start time of selection in seconds, relative to the start of the wav file.

duration: float

Length of selection in seconds.

resample_method: str

Resampling method. Only relevant if rate is specified. Options are:

kaiser_best
kaiser_fast
scipy (default)
polyphase

See https://librosa.github.io/librosa/generated/librosa.core.resample.html for details on the individual methods.

id: str

Unique identifier (optional). If None, the filename will be used.

normalize_wav: bool

Normalize the waveform to zero mean (mean=0) and unit standard deviation (std=1) before computing the aural features. Default is False.

waveform_transforms: list(dict)

List of dictionaries, where each dictionary specifies the name of a transformation to be applied to the waveform before computing the aural features, along with its parameters. For example, {"name":"add_gaussian_noise", "sigma":0.5}
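As a rough illustration of what the normalize_wav option describes, the sketch below rescales a short list of samples to zero mean and unit standard deviation. It is a dependency-free approximation, not ketos code: the function name `normalize` is hypothetical, and both the use of the population standard deviation and the handling of a constant signal are assumptions.

```python
import statistics


def normalize(samples):
    """Shift and scale samples to zero mean and unit standard deviation."""
    mean = statistics.fmean(samples)
    std = statistics.pstdev(samples)  # population std (assumed convention)
    if std == 0:
        # Constant signal: nothing to scale, so return zeros (assumed behaviour).
        return [0.0 for _ in samples]
    return [(x - mean) / std for x in samples]


normalized = normalize([0.2, -0.1, 0.4, 0.3])
print(statistics.fmean(normalized), statistics.pstdev(normalized))
```

In practice the waveform would be a NumPy array rather than a Python list, but the arithmetic is the same.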

Returns:

AuralFeatures: Aural features
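A call to from_wav might look like the sketch below, which selects three seconds of audio, resamples it, and applies one waveform transform. The keyword names and the import path come from the signatures on this page. The file name `recording.wav`, the wrapper function `compute_aural_features`, and the specific argument values are hypothetical. Running the call requires ketos and the aural-features package to be installed.

```python
# Keyword arguments named exactly as in the from_wav signature.
# The values are illustrative, not recommended settings.
from_wav_kwargs = {
    "offset": 2.0,             # start 2 s into the file
    "duration": 3.0,           # select 3 s of audio
    "rate": 8000,              # resample to 8 kHz
    "resample_method": "scipy",
    "normalize_wav": True,
    "waveform_transforms": [{"name": "add_gaussian_noise", "sigma": 0.5}],
}


def compute_aural_features(path):
    """Compute aural features for a portion of a wav file (requires ketos)."""
    # Imported lazily so the snippet can be inspected without ketos installed.
    from ketos.audio.gammatone import AuralFeatures

    return AuralFeatures.from_wav(path, **from_wav_kwargs)


# features = compute_aural_features("recording.wav")  # hypothetical file
```

Arguments left out of the dictionary, such as filter_n or channel, keep the defaults shown in the signature.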

classmethod from_waveform(audio, filter_pad_samples=64, global_km_window_seconds=0.25, local_km_window_seconds=0.008, filter_n=100, filter_min_hz=50)[source]

Compute aural features from an instance of audio_signal.Waveform.

Args:

audio: Waveform: Audio signal
filter_pad_samples: int: Number of samples used for padding
global_km_window_seconds: float: Length of global KM window in seconds
local_km_window_seconds: float: Length of local KM window in seconds
filter_n: int: Number of filters
filter_min_hz: float: Min filter frequency in Hz

Returns:

AuralFeatures: Aural features

get_repres_attrs()[source]: Get audio representation attributes