Annotation

‘audio.annotation’ module within the ketos library

This module provides utilities to handle annotations associated with waveform and spectrogram objects.

Contents:

AnnotationHandler class

class ketos.audio.annotation.AnnotationHandler(df=None)[source]

Bases: object

Class for handling annotations of acoustic data.

An annotation is characterized by

  • start and end time in seconds

  • minimum and maximum frequency in Hz (optional)

  • label (integer)

The AnnotationHandler stores annotations in a pandas DataFrame and offers methods to add/get annotations and perform various manipulations such as cropping, shifting, and segmenting.

Multiple levels of indexing are used to handle several stacked annotation sets:

  • level 0: annotation set

  • level 1: individual annotation
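The two index levels can be pictured with a small plain-Python sketch. This is only an illustration: the handler itself stores annotations in a pandas DataFrame, and the dictionary layout and the helper name `select_set` below are hypothetical.

```python
# Hypothetical stand-in for a two-level index:
# (annotation set id, annotation id) -> annotation record.
annotations = {
    (0, 0): {"label": 1, "start": 4.0, "end": 6.0},
    (0, 1): {"label": 2, "start": 8.0, "end": 12.0},
    (1, 0): {"label": 1, "start": 60.0, "end": 120.0},
}


def select_set(annots, set_id):
    """Return the annotations of one set, keyed by the level-1 index only."""
    return {
        annot_id: record
        for (sid, annot_id), record in annots.items()
        if sid == set_id
    }


set0 = select_set(annotations, 0)
print(sorted(set0))       # level-1 indices found in set 0
print(set0[1]["label"])   # label of the second annotation in set 0
```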

Args:
df: pandas DataFrame

Annotations to be passed on to the handler. Must contain the columns ‘label’, ‘start’, and ‘end’, and optionally also ‘freq_min’ and ‘freq_max’.

add(label=None, start=None, end=None, freq_min=None, freq_max=None, df=None, id=0)[source]

Add an annotation or a collection of annotations to the handler module.

Individual annotations may be added using the arguments label, start, end, freq_min, and freq_max.

Groups of annotations may be added by first collecting them in a pandas DataFrame or dictionary and then adding them using the ‘df’ argument.

Args:
label: int

Integer label.

start: str or float

Start time. Can be specified either as a float, in which case the unit will be assumed to be seconds, or as a string with an SI unit, for example, ‘22min’.

end: str or float

Stop time. Can be specified either as a float, in which case the unit will be assumed to be seconds, or as a string with an SI unit, for example, ‘22min’.

freq_min: str or float

Lower frequency. Can be specified either as a float, in which case the unit will be assumed to be Hz, or as a string with an SI unit, for example, ‘3.1kHz’.

freq_max: str or float

Upper frequency. Can be specified either as a float, in which case the unit will be assumed to be Hz, or as a string with an SI unit, for example, ‘3.1kHz’.

df: pandas DataFrame or dict

Annotations stored in a pandas DataFrame or dict. Must have columns/keys ‘label’, ‘start’, ‘end’, and optionally also ‘freq_min’ and ‘freq_max’.

id: int or tuple

Unique identifier of the annotation subset.

Returns:

None

Example:
>>> import pandas as pd
>>> from ketos.audio.annotation import AnnotationHandler
>>> # Create an annotation table containing two annotations
>>> annots = pd.DataFrame({'label':[1,2], 'start':[4.,8.], 'end':[6.,12.]})
>>> # Initialize the annotation handler
>>> handler = AnnotationHandler(annots)
>>> # Add a couple more annotations
>>> handler.add(label=1, start='1min', end='2min')
>>> handler.add(label=3, start='11min', end='12min')
>>> # Inspect the annotations
>>> annot = handler.get()
>>> print(annot)
   label  start    end  freq_min  freq_max
0      1    4.0    6.0       NaN       NaN
1      2    8.0   12.0       NaN       NaN
2      1   60.0  120.0       NaN       NaN
3      3  660.0  720.0       NaN       NaN
copy()[source]
crop(start=0, end=None, freq_min=None, freq_max=None, make_copy=False)[source]

Crop annotations along the time and/or frequency dimension.

Args:
start: float or str

Lower edge of time cropping interval. Can be specified either as a float, in which case the unit will be assumed to be seconds, or as a string with an SI unit, for example, ‘22min’.

end: float or str

Upper edge of time cropping interval. Can be specified either as a float, in which case the unit will be assumed to be seconds, or as a string with an SI unit, for example, ‘22min’.

freq_min: float or str

Lower edge of frequency cropping interval. Can be specified either as a float, in which case the unit will be assumed to be Hz, or as a string with an SI unit, for example, ‘3.1kHz’.

freq_max: float or str

Upper edge of frequency cropping interval. Can be specified either as a float, in which case the unit will be assumed to be Hz, or as a string with an SI unit, for example, ‘3.1kHz’.

Returns:

None

Example:
>>> from ketos.audio.annotation import AnnotationHandler
>>> # Initialize an empty annotation handler
>>> handler = AnnotationHandler()
>>> # Add a couple of annotations
>>> handler.add(label=1, start='1min', end='2min', freq_min='20Hz', freq_max='200Hz')
>>> handler.add(label=2, start='180s', end='300s', freq_min='60Hz', freq_max='1000Hz')
>>> # Crop the annotations in time
>>> handler.crop(start='30s', end='4min')
>>> # Inspect the annotations
>>> annot = handler.get()
>>> print(annot)
   label  start    end  freq_min  freq_max
0      1   30.0   90.0      20.0     200.0
1      2  150.0  210.0      60.0    1000.0
>>> # Note how all the start and stop times are shifted by -30 s due to the cropping operation.
>>> # Crop the annotations in frequency
>>> handler.crop(freq_min='50Hz')
>>> annot = handler.get()
>>> print(annot)
   label  start    end  freq_min  freq_max
0      1   30.0   90.0      50.0     200.0
1      2  150.0  210.0      60.0    1000.0
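
The time-cropping behaviour seen in the example (clip to the window, then measure times from the window start) can be sketched in plain Python. This is a minimal illustration and not the ketos implementation: the function name `crop_time` is hypothetical, and dropping annotations that fall entirely outside the window is an assumption.

```python
def crop_time(annots, start=0.0, end=None):
    """Clip annotations to [start, end] and re-reference times to `start`.

    Illustrative only. Annotations that do not overlap the window are
    dropped, which is an assumption about the intended behaviour.
    """
    cropped = []
    for a in annots:
        lo = max(a["start"], start)
        hi = a["end"] if end is None else min(a["end"], end)
        if hi <= lo:
            continue  # no overlap with the cropping window
        cropped.append({**a, "start": lo - start, "end": hi - start})
    return cropped


annots = [
    {"label": 1, "start": 60.0, "end": 120.0},   # '1min' to '2min'
    {"label": 2, "start": 180.0, "end": 300.0},  # '180s' to '300s'
]
cropped = crop_time(annots, start=30.0, end=240.0)  # '30s' to '4min'
for a in cropped:
    print(a)
```

With these inputs the sketch gives the same start and end times as the table above: 30.0 to 90.0 for label 1 and 150.0 to 210.0 for label 2.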
get(label=None, id=None, squeeze=True, drop_freq=False, key_error=False)[source]

Get annotations managed by the handler module.

Note: This returns a view (not a copy) of the pandas DataFrame used by the handler module to manage the annotations.

Args:
label: int or list(int)

Get only annotations with this label

id: int or tuple

Unique identifier of the annotation subset. If None is specified, all annotations are returned.

squeeze: bool

If the handler is managing a single annotation set, drop the 0th-level index. Default is True.

drop_freq: bool

Drop the frequency columns.

key_error: bool

If set to True, raise an error if the specified annotation set does not exist. If set to False, return None. Default is False.

Returns:
ans: pandas DataFrame

Annotations

Example:
>>> from ketos.audio.annotation import AnnotationHandler
>>> # Initialize an empty instance of the annotation handler
>>> handler = AnnotationHandler()
>>> # Add a couple of annotations
>>> handler.add(label=1, start='1min', end='2min')
>>> handler.add(label=2, start='11min', end='12min')
>>> # Retrieve the annotations
>>> annot = handler.get()
>>> print(annot)
   label  start    end  freq_min  freq_max
0      1   60.0  120.0       NaN       NaN
1      2  660.0  720.0       NaN       NaN
>>> # Retrieve only annotations with label 2
>>> annot = handler.get(label=2)
>>> print(annot)
   label  start    end  freq_min  freq_max
1      2  660.0  720.0       NaN       NaN
num_annotations(id=None)[source]

Get number of annotations managed by the handler.

Args:
id: int or tuple

Unique identifier of the annotation set. If None is specified, the total number of annotations is returned.

Returns:
num: int

Number of annotations

num_sets()[source]

Get number of annotation subsets managed by the handler.

Returns:
num: int

Number of annotation sets

segment(num_segs, window, step=None, offset=0)[source]

Divide the time axis into segments of uniform length, which may or may not be overlapping.

Args:
num_segs: int

Number of segments

window: float or str

Duration of each segment. Can be specified either as a float, in which case the unit will be assumed to be seconds, or as a string with an SI unit, for example, ‘22min’.

step: float or str

Step size. Can be specified either as a float, in which case the unit will be assumed to be seconds, or as a string with an SI unit, for example, ‘22min’. If no value is specified, the step size is set equal to the window size, implying non-overlapping segments.

offset: float or str

Start time for the first segment. Can be specified either as a float, in which case the unit will be assumed to be seconds, or as a string with an SI unit, for example, ‘22min’. Negative times are permitted.

Returns:
ans: AnnotationHandler
Stacked annotation handler with three levels of indexing where
  • level 0: annotation set

  • level 1: segment

  • level 2: individual annotation

Example:
>>> from ketos.audio.annotation import AnnotationHandler
>>> # Initialize an empty annotation handler
>>> handler = AnnotationHandler()
>>> # Add a couple of annotations
>>> handler.add(label=1, start='1s', end='3s')
>>> handler.add(label=2, start='5.2s', end='7.0s')
>>> # Apply segmentation
>>> handler = handler.segment(num_segs=10, window='1s', step='0.8s', offset='0.1s')
>>> # Inspect the annotations
>>> annots = handler.get(drop_freq=True)
>>> print(annots)
     label  start  end
0 0      1    0.9  1.0
1 0      1    0.1  1.0
2 0      1    0.0  1.0
3 0      1    0.0  0.5
6 1      2    0.3  1.0
7 1      2    0.0  1.0
8 1      2    0.0  0.5
>>> # Note the double index, where the first index refers to the segment
>>> # while the second index refers to the original annotation.
>>> # We can get the annotations for a single segment like this,
>>> annots3 = handler.get(id=3, drop_freq=True)
>>> print(annots3)
   label  start  end
0      1    0.0  0.5
>>> # If we attempt to retrieve annotations for a segment that does not 
>>> # have any annotations, we get None,
>>> annots4 = handler.get(id=4, drop_freq=True)
>>> print(annots4)
None
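
The segmentation logic can be sketched without ketos to show where the numbers in the example come from. This is an illustration under stated assumptions, not the library's code: the function name `segment_sketch` and the nested-dictionary result are hypothetical, and times are rounded to six decimals only to hide floating-point noise.

```python
def segment_sketch(annots, num_segs, window, step=None, offset=0.0):
    """Map segment index -> {annotation index -> clipped annotation}.

    Times in the result are measured from the start of each segment.
    Segments without any overlapping annotation are left out.
    """
    step = window if step is None else step  # default: non-overlapping
    segments = {}
    for seg in range(num_segs):
        seg_start = offset + seg * step
        seg_end = seg_start + window
        for idx, a in enumerate(annots):
            lo = max(a["start"], seg_start)
            hi = min(a["end"], seg_end)
            if hi <= lo:
                continue  # annotation does not overlap this segment
            segments.setdefault(seg, {})[idx] = {
                "label": a["label"],
                "start": round(lo - seg_start, 6),
                "end": round(hi - seg_start, 6),
            }
    return segments


annots = [
    {"label": 1, "start": 1.0, "end": 3.0},
    {"label": 2, "start": 5.2, "end": 7.0},
]
segments = segment_sketch(annots, num_segs=10, window=1.0, step=0.8, offset=0.1)
for seg, content in sorted(segments.items()):
    print(seg, content)
```

Segments 4, 5, and 9 do not overlap either annotation, so they are absent from the result, which mirrors `handler.get(id=4)` returning None above.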
set_ids()[source]

Get the IDs of the annotation subsets managed by the handler.

Returns:
: numpy array

IDs of the annotation sets

shift(delta_time=0)[source]

Shift all annotations by a fixed amount along the time dimension.

If the shift places some of the annotations (partially) before time zero, these annotations are removed or cropped.

Args:
delta_time: float or str

Amount by which annotations will be shifted. Can be specified either as a float, in which case the unit will be assumed to be seconds, or as a string with an SI unit, for example, ‘22min’.
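
The shift-and-crop rule described above can be sketched in plain Python. This is a hedged illustration, not the ketos implementation: the function name `shift_sketch` is hypothetical, and treating an annotation that ends exactly at time zero as removed is an assumption.

```python
def shift_sketch(annots, delta_time=0.0):
    """Shift annotations in time; drop or crop those falling before zero."""
    shifted = []
    for a in annots:
        start = a["start"] + delta_time
        end = a["end"] + delta_time
        if end <= 0:
            continue  # entirely before time zero: removed (assumption)
        # Partially before time zero: cropped so that it starts at zero.
        shifted.append({**a, "start": max(start, 0.0), "end": end})
    return shifted


annots = [
    {"label": 1, "start": 1.0, "end": 3.0},
    {"label": 2, "start": 5.0, "end": 7.0},
]
print(shift_sketch(annots, -2.0))  # first annotation is cropped at zero
print(shift_sketch(annots, -4.0))  # first annotation is removed
```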

ketos.audio.annotation.add_index_level(df, key=0)[source]

Ensure the DataFrame has at least two indexing levels.

Args:
df: pandas DataFrame

Input DataFrame

Returns:
df: pandas DataFrame

Output DataFrame

ketos.audio.annotation.convert(x, unit)[source]

Convert a quantity specified as a string with SI units, e.g. “7kg” to a float with the specified unit, e.g. ‘g’.

If the input is not a string, the output will be the same as the input.

Args:
x: str

Value given as a string with SI units, e.g. “11kHz”

unit: str

Desired conversion unit, e.g. “Hz”

Returns:
y: float

Value in specified unit.
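
As a rough sketch of this kind of conversion, the snippet below parses a number followed by a prefixed unit and scales it by the SI prefix. It is not the ketos implementation: the name `convert_sketch`, the prefix table, and the restriction that the target is the unprefixed base unit are all assumptions made for illustration.

```python
import re

# Small, illustrative subset of SI prefixes.
SI_PREFIXES = {"": 1.0, "k": 1e3, "M": 1e6, "G": 1e9, "m": 1e-3, "u": 1e-6}


def convert_sketch(x, unit):
    """Convert e.g. '7kg' to a float in the base unit `unit` (e.g. 'g').

    Non-string input is returned unchanged, as described above.
    """
    if not isinstance(x, str):
        return x
    match = re.fullmatch(r"\s*([-+]?\d*\.?\d+)\s*([A-Za-z]+)\s*", x)
    if match is None:
        raise ValueError(f"cannot parse quantity: {x!r}")
    value, src_unit = float(match.group(1)), match.group(2)
    if not src_unit.endswith(unit):
        raise ValueError(f"unit {src_unit!r} is not compatible with {unit!r}")
    prefix = src_unit[: len(src_unit) - len(unit)]
    if prefix not in SI_PREFIXES:
        raise ValueError(f"unknown SI prefix: {prefix!r}")
    return value * SI_PREFIXES[prefix]


print(convert_sketch("7kg", "g"))
print(convert_sketch("11kHz", "Hz"))
print(convert_sketch(5.0, "Hz"))
```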

ketos.audio.annotation.convert_to_Hz(x)[source]

Convert a frequency specified as a string with SI units, e.g. “11kHz” to a float with units of Hz.

Args:
x: str

Frequency specified as a string with SI units, e.g. “11kHz”

Returns:
: float

Frequency in Hz.

ketos.audio.annotation.convert_to_sec(x)[source]

Convert a time duration specified as a string with SI units, e.g. “22min” to a float with units of seconds.

Args:
x: str

Time duration specified as a string with SI units, e.g. “22min”

Returns:
: float

Time duration in seconds.
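
Minutes and hours are not power-of-ten multiples of the second, so a time conversion needs a lookup table rather than SI prefixes alone. The sketch below shows one way to do this; the name `to_seconds` and the set of supported units are assumptions and are not taken from ketos.

```python
import re

# Seconds per unit. Only a few units are covered in this illustration.
SECONDS_PER_UNIT = {"ms": 1e-3, "s": 1.0, "min": 60.0, "h": 3600.0}


def to_seconds(x):
    """Convert e.g. '22min' to seconds; non-string input passes through."""
    if not isinstance(x, str):
        return x
    match = re.fullmatch(r"\s*([-+]?\d*\.?\d+)\s*([A-Za-z]+)\s*", x)
    if match is None or match.group(2) not in SECONDS_PER_UNIT:
        raise ValueError(f"cannot parse time: {x!r}")
    return float(match.group(1)) * SECONDS_PER_UNIT[match.group(2)]


print(to_seconds("22min"))
print(to_seconds("180s"))
print(to_seconds(4.0))
```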

ketos.audio.annotation.stack_annotations(handlers, keys=None, level=0)[source]

Create a handler to manage a stack of annotation sets.

The annotation sets will be indexed in the order they are provided.

Args:
handlers: list(AnnotationHandler)

Annotation handlers

keys: list

Keys for indexing the sets. If None is specified, the keys are set to 0,1,2,…

level: int

Set index level. Default is 0.

Returns:
handler: AnnotationHandler

Stacked annotation handler
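
The stacking idea, including the default keys 0, 1, 2, …, can be sketched with plain dictionaries. This only illustrates the indexing scheme and is not the ketos implementation: `stack_sketch` is a hypothetical name, and lists of dictionaries stand in for AnnotationHandler objects.

```python
def stack_sketch(annotation_sets, keys=None):
    """Combine several annotation lists under a (key, annotation id) index.

    If no keys are given, the sets are numbered 0, 1, 2, ... in the order
    they are provided.
    """
    if keys is None:
        keys = range(len(annotation_sets))
    stacked = {}
    for key, annots in zip(keys, annotation_sets):
        for idx, a in enumerate(annots):
            stacked[(key, idx)] = a
    return stacked


set_a = [{"label": 1, "start": 4.0, "end": 6.0}]
set_b = [
    {"label": 2, "start": 8.0, "end": 12.0},
    {"label": 3, "start": 60.0, "end": 120.0},
]
stacked = stack_sketch([set_a, set_b])
print(list(stacked))  # two-level index of the stacked sets
named = stack_sketch([set_a, set_b], keys=["a", "b"])
print(list(named))
```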