BaseAudio
- class ketos.audio.base_audio.BaseAudio(data, filename='', offset=0, duration=None, label=None, annot=None, transforms=None, transform_log=None, **kwargs)
Parent class for all audio classes.
While the underlying data array can be accessed via the data attribute, it is recommended to always use the get_data() function to access the data array, i.e.,
>>> import numpy as np
>>> from ketos.audio.base_audio import BaseAudio
>>> x = np.ones(6)
>>> audio_sample = BaseAudio(data=x)
>>> audio_sample.get_data()
array([1., 1., 1., 1., 1., 1.])
- Args:
- data: numpy array
Data
- filename: str
Filename of the original data file, if available (optional)
- offset: float
Position within the original data file, in seconds measured from the start of the file. Defaults to 0 if not specified.
- duration: float
Duration in seconds.
- label: int
Data label. Optional
- annot: AnnotationHandler
AnnotationHandler object. Optional
- transforms: list(dict)
List of dictionaries, where each dictionary specifies the name of a transformation and its arguments, if any. For example, {"name":"normalize", "mean":0.5, "std":1.0}
- Attributes:
- data: numpy array
Data
- ndim: int
Dimensionality of data.
- filename: str
Filename of the original data file, if available (optional)
- offset: float
Position within the original data file, in seconds measured from the start of the file. Defaults to 0 if not specified.
- label: int
Data label.
- annot: AnnotationHandler or pandas DataFrame
AnnotationHandler object.
- allowed_transforms: dict
Transforms that can be applied via the apply_transforms method
- transform_log: list
List of transforms that have been applied to this object
Methods
adjust_range([range]): Applies a linear transformation to the data array that puts the values within the specified range.
annotate(**kwargs): Add an annotation or a collection of annotations.
apply_transforms(transforms): Apply specified transforms to the audio object.
average([axis]): Average value along selected axis
deepcopy(): Make a deep copy of the present instance
duration(): Data array duration in seconds
get(): Get a copy of this instance
Get annotations.
get_data(): Get underlying data.
Get filename.
Get instance attributes
get_kwargs(): Get keyword arguments required to create a copy of this instance.
get_label([id]): Get label.
Get offset.
Get audio representation attributes
infer_shape(**kwargs): Infers the data shape that would result if the class were instantiated with a specific set of parameter values.
max([axis]): Maximum data value along selected axis
median([axis]): Median value along selected axis
min([axis]): Minimum data value along selected axis
normalize([mean, std]): Normalize the data array to specified mean and standard deviation.
std([axis]): Standard deviation along selected axis
view_allowed_transforms(): View allowed transformations for this audio object.
- adjust_range(range=(0, 1))
Applies a linear transformation to the data array that puts the values within the specified range.
- Args:
- range: tuple(float,float)
Minimum and maximum value of the desired range. Default is (0,1)
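The linear transformation described above can be sketched in plain Python. This is an illustration only, not the ketos implementation: the standalone function, the `new_range` parameter name, and the handling of constant input (returned unchanged) are assumptions, and a list stands in for the numpy array.

```python
def adjust_range(values, new_range=(0.0, 1.0)):
    """Linearly rescale values so the minimum maps to new_range[0]
    and the maximum maps to new_range[1]."""
    lo, hi = new_range
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        # Assumption: constant data cannot be stretched, so return a copy unchanged
        return list(values)
    scale = (hi - lo) / span
    return [lo + (v - v_min) * scale for v in values]


print(adjust_range([2.0, 4.0, 6.0]))            # [0.0, 0.5, 1.0]
print(adjust_range([1.0, 3.0], (-1.0, 1.0)))    # [-1.0, 1.0]
```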
- annotate(**kwargs)
Add an annotation or a collection of annotations.
Input arguments are described in ketos.audio.annotation.AnnotationHandler.add()
- apply_transforms(transforms)
Apply specified transforms to the audio object.
- Args:
- transforms: list(dict)
List of dictionaries, where each dictionary specifies the name of a transformation and its arguments, if any. For example, {"name":"normalize", "mean":0.5, "std":1.0}
- Returns:
None
- Example:
>>> from ketos.audio.waveform import Waveform
>>> # read audio signal from wav file
>>> wf = Waveform.from_wav('ketos/tests/assets/grunt1.wav')
>>> # print allowed transforms
>>> wf.view_allowed_transforms()
['normalize', 'adjust_range', 'crop', 'add_gaussian_noise', 'bandpass_filter']
>>> # apply gaussian normalization followed by cropping
>>> transforms = [{'name':'normalize','mean':0.5,'std':1.0},{'name':'crop','start':0.2,'end':0.7}]
>>> wf.apply_transforms(transforms)
>>> # inspect record of applied transforms
>>> wf.transform_log
[{'name': 'normalize', 'mean': 0.5, 'std': 1.0}, {'name': 'crop', 'start': 0.2, 'end': 0.7, 'length': None}]
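To illustrate how a list of transform dictionaries might be dispatched and logged, here is a minimal sketch that does not depend on ketos. The `ToyAudio` class and its `offset` and `scale` transforms are hypothetical, and raising `ValueError` for an unknown name is an assumption rather than documented ketos behaviour.

```python
class ToyAudio:
    """Hypothetical stand-in for an audio object; not the ketos class."""

    def __init__(self, data):
        self.data = list(data)
        self.transform_log = []
        # Map each allowed transform name to the method that implements it
        self.allowed_transforms = {"offset": self.offset, "scale": self.scale}

    def offset(self, value=0.0):
        self.data = [x + value for x in self.data]

    def scale(self, factor=1.0):
        self.data = [x * factor for x in self.data]

    def apply_transforms(self, transforms):
        for spec in transforms:
            kwargs = dict(spec)        # copy so the caller's dict is not modified
            name = kwargs.pop("name")  # remaining items are the arguments
            if name not in self.allowed_transforms:
                # Assumption: unknown names are treated as an error
                raise ValueError(f"transform not allowed: {name}")
            self.allowed_transforms[name](**kwargs)
            self.transform_log.append(dict(spec))


audio = ToyAudio([1.0, 2.0])
audio.apply_transforms([{"name": "offset", "value": 1.0},
                        {"name": "scale", "factor": 2.0}])
print(audio.data)           # [4.0, 6.0]
print(audio.transform_log)
```

Transforms are applied in list order, so the log doubles as a recipe for reproducing the same processing on another object.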
- average(axis=0)
Average value along selected axis
- Args:
- axis: int
Axis along which metric is computed
- Returns:
- : array-like
Average value of the data array
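The meaning of the axis argument can be shown with a small two-dimensional example. This sketch uses nested lists and the standard library in place of a numpy array; the standalone `average` function is hypothetical and only mirrors the usual numpy axis convention, which ketos is assumed to follow.

```python
from statistics import mean


def average(rows, axis=0):
    """Average a 2D list along the given axis (numpy-style convention)."""
    if axis == 0:
        # Collapse the first dimension: one value per column
        return [mean(col) for col in zip(*rows)]
    if axis == 1:
        # Collapse the second dimension: one value per row
        return [mean(row) for row in rows]
    raise ValueError("axis must be 0 or 1 for 2D data")


data = [[1.0, 2.0, 3.0],
        [3.0, 4.0, 5.0]]
print(average(data, axis=0))  # [2.0, 3.0, 4.0]
print(average(data, axis=1))  # [2.0, 4.0]
```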
- deepcopy()
Make a deep copy of the present instance
See https://docs.python.org/2/library/copy.html
- Returns:
- : BaseAudio
Deep copy.
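The difference between a deep copy and a shallow copy matters when the instance holds a mutable data array. The sketch below uses the standard `copy` module with a hypothetical `Sample` class standing in for an audio object.

```python
import copy


class Sample:
    """Hypothetical container holding a mutable data list and a label."""

    def __init__(self, data, label=None):
        self.data = data
        self.label = label


original = Sample(data=[1.0, 1.0, 1.0], label=1)
shallow = copy.copy(original)    # shares the same data list
deep = copy.deepcopy(original)   # owns an independent data list

original.data[0] = 99.0
print(shallow.data[0])  # 99.0, the change is visible through the shallow copy
print(deep.data[0])     # 1.0, the deep copy is unaffected
```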
- duration()
Data array duration in seconds
TODO: rename to get_duration()
- Returns:
- : float
Duration in seconds
- get_kwargs()
Get keyword arguments required to create a copy of this instance.
Does not include the data array and annotation handler.
- static infer_shape(**kwargs)
Infers the data shape that would result if the class were instantiated with a specific set of parameter values.
Returns a None value if duration or rate are not specified.
- Args:
- duration: float
Duration in seconds
- rate: float
Sampling rate in Hz
- Returns:
- : tuple
Inferred shape. If the parameter values do not allow the shape to be inferred, a None value is returned.
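As a rough illustration of the behaviour described above, the following sketch infers a shape from a duration and a sampling rate. It assumes one-dimensional, waveform-like data with one sample every 1/rate seconds and truncation to a whole number of samples; the actual ketos rule, in particular for subclasses, may differ.

```python
def infer_shape(**kwargs):
    """Return the inferred 1D shape, or None if it cannot be inferred."""
    duration = kwargs.get("duration")
    rate = kwargs.get("rate")
    if duration is None or rate is None:
        return None
    # Assumption: number of samples = duration * rate, truncated to an integer
    return (int(duration * rate),)


print(infer_shape(duration=3.0, rate=1000))  # (3000,)
print(infer_shape(duration=3.0))             # None
```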
- max(axis=0)
Maximum data value along selected axis
- Args:
- axis: int
Axis along which metric is computed
- Returns:
- : array-like
Maximum value of the data array
- median(axis=0)
Median value along selected axis
- Args:
- axis: int
Axis along which metric is computed
- Returns:
- : array-like
Median value of the data array
- min(axis=0)
Minimum data value along selected axis
- Args:
- axis: int
Axis along which metric is computed
- Returns:
- : array-like
Minimum value of the data array
- normalize(mean=0, std=1)
Normalize the data array to specified mean and standard deviation.
For the data array to be normalizable, it must have non-zero standard deviation. If this is not the case, the array is unchanged by calling this method.
- Args:
- mean: float
Mean value of the normalized array. The default is 0.
- std: float
Standard deviation of the normalized array. The default is 1.
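The normalization and its zero-variance guard can be sketched as follows. This is a plain-Python illustration, not the ketos code: the standalone function is hypothetical, a list stands in for the numpy array, and the use of the population standard deviation (the default in numpy's `std`) is an assumption.

```python
import statistics


def normalize(values, mean=0.0, std=1.0):
    """Shift and scale values to the requested mean and standard deviation."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population standard deviation
    if sigma == 0:
        # Data with zero standard deviation cannot be normalized; leave unchanged
        return list(values)
    return [mean + std * (v - mu) / sigma for v in values]


print(normalize([1.0, 3.0]))                     # [-1.0, 1.0]
print(normalize([1.0, 3.0], mean=0.5, std=2.0))  # [-1.5, 2.5]
print(normalize([4.0, 4.0]))                     # [4.0, 4.0]
```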