AuralFeatures
- class ketos.audio.gammatone.AuralFeatures(data, filename=None, offset=0, label=None, annot=None, waveform_transform_log=None, **kwargs)[source]
Aural features computed with the aural-features package (https://pypi.org/project/aural-features/).
- Args:
- data: 1d numpy array
Feature values
- filename: str
Name of the source audio file, if available.
- offset: float
Position in seconds of the left edge of the audio segment within the source audio file, if available.
- label: int
Label. Optional
- annot: AnnotationHandler
AnnotationHandler object. Optional
Methods
- from_wav(path[, filter_pad_samples, ...])
Compute aural features directly from wav file.
- from_waveform(audio[, filter_pad_samples, ...])
Compute aural features from an instance of audio_signal.Waveform.
Get audio representation attributes
- classmethod from_wav(path, filter_pad_samples=64, global_km_window_seconds=0.25, local_km_window_seconds=0.008, filter_n=100, filter_min_hz=50, channel=0, rate=None, offset=0, duration=None, resample_method='scipy', id=None, normalize_wav=False, waveform_transforms=None, **kwargs)[source]
Compute aural features directly from wav file.
The arguments offset and duration can be used to select a portion of the wav file.
Note that values specified for the arguments offset and duration may be subject to slight adjustments to ensure that the selected portion corresponds to an integer number of samples.
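The adjustment described above can be illustrated with a minimal sketch. It assumes the selection is snapped to the nearest whole sample; the exact rounding rule used by ketos is not stated here, and `align_to_samples` is a hypothetical helper, not part of the library.

```python
def align_to_samples(offset, duration, rate):
    """Snap a time selection (in seconds) to a whole number of samples.

    Assumption: round to the nearest sample. ketos may round differently.
    """
    start_sample = round(offset * rate)    # first sample of the selection
    num_samples = round(duration * rate)   # selection length in samples
    return start_sample / rate, num_samples / rate


# A request that does not fall on sample boundaries at 1000 Hz
adj_offset, adj_duration = align_to_samples(offset=0.12345, duration=1.00001, rate=1000)
print(adj_offset, adj_duration)
```

With a sampling rate of 1000 Hz, the requested offset and duration are both moved onto the 1 ms sample grid, so the returned values differ slightly from the ones passed in.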
- Args:
- path: str
Path to wav file
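- filter_pad_samples: int
Number of samples used for padding
- global_km_window_seconds: float
Length of global KM window in seconds
- local_km_window_seconds: float
Length of local KM window in seconds
- filter_n: int
Number of filters
- filter_min_hz: float
Min filter frequency in Hz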
- channel: int
Channel to read from. Only relevant for stereo recordings
- rate: float
Desired sampling rate in Hz. If None, the original sampling rate will be used
- offset: float
Start time of selection in seconds, relative to the start of the wav file.
- duration: float
Length of selection in seconds.
- resample_method: str
Resampling method. Only relevant if rate is specified. Options are kaiser_best, kaiser_fast, scipy (default), and polyphase.
See https://librosa.github.io/librosa/generated/librosa.core.resample.html for details on the individual methods.
- id: str
Unique identifier (optional). If None, the filename will be used.
- normalize_wav: bool
Normalize the waveform to have a mean of zero (mean=0) and a standard deviation of unity (std=1) before computing the aural features. Default is False.
- waveform_transforms: list(dict)
List of dictionaries, where each dictionary specifies the name of a transformation to be applied to the waveform before computing the aural features. For example, {"name": "add_gaussian_noise", "sigma": 0.5}
- Returns:
- : AuralFeatures
Aural features
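To make the normalize_wav and waveform_transforms arguments more concrete, here is a minimal pure-Python sketch of what they describe. It is not the ketos implementation: the `TRANSFORMS` registry, the helper names, the `seed` parameter, the use of the population standard deviation, and the order in which the two steps are applied are all assumptions made for illustration.

```python
import random
import statistics


def normalize(samples):
    """Shift samples to zero mean and scale them to unit standard deviation."""
    mean = statistics.fmean(samples)
    std = statistics.pstdev(samples)  # assumption: population std (ddof=0)
    return [(x - mean) / std for x in samples]


def add_gaussian_noise(samples, sigma, seed=None):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, sigma) for x in samples]


# Hypothetical registry mapping transform names to functions
TRANSFORMS = {"add_gaussian_noise": add_gaussian_noise}


def apply_transforms(samples, transforms):
    """Apply each {"name": ..., **params} specification in order."""
    for spec in transforms:
        params = dict(spec)               # copy so the caller's dict is untouched
        func = TRANSFORMS[params.pop("name")]
        samples = func(samples, **params)
    return samples


waveform = [0.0, 1.0, 2.0, 3.0, 4.0]
normalized = normalize(waveform)
noisy = apply_transforms(normalized, [{"name": "add_gaussian_noise", "sigma": 0.5}])
```

Each dictionary carries the transform name under "name" and passes every remaining key as a keyword argument, which mirrors the {"name": ..., "sigma": ...} shape shown in the argument description.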
- classmethod from_waveform(audio, filter_pad_samples=64, global_km_window_seconds=0.25, local_km_window_seconds=0.008, filter_n=100, filter_min_hz=50)[source]
Compute aural features from an instance of audio_signal.Waveform.
- Args:
- audio: Waveform
Audio signal
- filter_pad_samples: int
Number of samples used for padding
- global_km_window_seconds: float
Length of global KM window in seconds
- local_km_window_seconds: float
Length of local KM window in seconds
- filter_n: int
Number of filters
- filter_min_hz: float
Min filter frequency in Hz
- Returns:
- : AuralFeatures
Aural features
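Because the two KM windows are given in seconds, their length in samples depends on the sampling rate of the waveform. The sketch below shows that conversion for the default values. It assumes a plain multiply-and-round conversion and an illustrative rate of 48000 Hz; how the aural-features package converts these values internally is not documented here, and `window_length_in_samples` is a hypothetical helper.

```python
def window_length_in_samples(window_seconds, rate):
    """Convert a window length in seconds to a whole number of samples."""
    return round(window_seconds * rate)


rate = 48000  # illustrative sampling rate in Hz
global_window = window_length_in_samples(0.25, rate)   # default global_km_window_seconds
local_window = window_length_in_samples(0.008, rate)   # default local_km_window_seconds
print(global_window, local_window)
```

At this rate the default global window spans 12000 samples and the default local window spans 384 samples, so resampling the waveform changes both counts proportionally.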