Selection Table

selection_table module within the ketos library.

This module provides functions for handling annotation tables and creating selection tables.

A Ketos annotation table always has the column ‘label’. For call-level annotations, the table also contains the columns ‘start’ and ‘end’, giving the start and end time of the call measured in seconds since the beginning of the file. The table may also contain the columns ‘freq_min’ and ‘freq_max’, giving the minimum and maximum frequencies of the call in Hz, but this is not required. The user may add any number of additional columns. Note that the table uses two levels of indices, the first index being the filename and the second index being an integer to identify annotations pertaining to the same file.

Here is a minimal example of an annotation table,

                    label
filename  annot_id
file1.wav 0             2
          1             1
          2             2
file2.wav 0             2
          1             2
          2             1

And here is a more extensive example with time information (call-level annotations) and a few extra columns (‘min_freq’, ‘max_freq’ and ‘file_time_stamp’),

                    start   end  label  min_freq  max_freq      file_time_stamp
filename  annot_id
file1.wav 0           7.0   8.1      2     180.6     294.3  2019-02-24 13:15:00
          1           8.5  12.5      1     174.2     258.7  2019-02-24 13:15:00
          2          13.1  14.0      2     183.4     292.3  2019-02-24 13:15:00
file2.wav 0           2.2   3.1      2     148.8     286.6  2019-02-24 13:30:00
          1           5.8   6.8      2     156.6     278.3  2019-02-24 13:30:00
          2           9.0  13.0      1     178.2     304.5  2019-02-24 13:30:00
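The two-level indexing described for annotation tables can be sketched with plain pandas. This is only an illustration of the table layout, not the code ketos uses internally; the variable names (`flat`, `annot`, `file1`) are made up for the example, and the values are copied from the example table.

```python
import pandas as pd

# Flat table with one row per annotation (values from the example table).
flat = pd.DataFrame({
    "filename": ["file1.wav"] * 3 + ["file2.wav"] * 3,
    "start": [7.0, 8.5, 13.1, 2.2, 5.8, 9.0],
    "end": [8.1, 12.5, 14.0, 3.1, 6.8, 13.0],
    "label": [2, 1, 2, 2, 2, 1],
})

# Number the annotations within each file: 0, 1, 2, ...
flat["annot_id"] = flat.groupby("filename").cumcount()

# First index level is the filename, second is the per-file integer id.
annot = flat.set_index(["filename", "annot_id"])

# Selecting on the first level returns all annotations for one file.
file1 = annot.loc["file1.wav"]
```

Indexing with a `(filename, annot_id)` tuple, for example `annot.loc[("file2.wav", 2)]`, returns a single annotation.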

Ketos selection tables also use two levels of indices. The first index is a unique integer identifier, while the second index is the filename. Moreover, selection tables always contain the columns ‘start’ and ‘end’, giving the start and end time of the selection window measured in seconds since the beginning of the file. Because the filename is part of the index, a single selection identifier can appear with more than one file, which allows selections to span multiple files. The user may add any number of additional columns to a selection table.

Here is a minimal example of a selection table,

                  start   end
sel_id filename
0      file1.wav    1.5   4.5
1      file1.wav   12.0  15.0
       file2.wav    0.0   5.0
2      file2.wav    2.0  10.0
3      file2.wav    7.0  15.0
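As a rough illustration of how a selection can span multiple files, the example selection table can be rebuilt in plain pandas. This is a sketch of the layout only and does not call any ketos function; the names `rows`, `sel`, `duration` and `files_per_sel` are chosen for the example.

```python
import pandas as pd

# One row per (selection, file) pair; selection 1 covers the end of
# file1.wav and the beginning of file2.wav.
rows = [
    (0, "file1.wav", 1.5, 4.5),
    (1, "file1.wav", 12.0, 15.0),
    (1, "file2.wav", 0.0, 5.0),
    (2, "file2.wav", 2.0, 10.0),
    (3, "file2.wav", 7.0, 15.0),
]
sel = pd.DataFrame(rows, columns=["sel_id", "filename", "start", "end"])

# First index level is the selection id, second is the filename.
sel = sel.set_index(["sel_id", "filename"])

# Total duration of each selection, summed over the files it touches.
duration = (sel["end"] - sel["start"]).groupby(level="sel_id").sum()

# Number of files each selection touches.
files_per_sel = sel.groupby(level="sel_id").size()
```

Here selection 1 is the only one with two rows, so `files_per_sel` is 2 for it and 1 for the others.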

unfold(table[, sep])

Unfolds rows containing multiple labels.

rename_columns(table, mapper)

Renames the table headings to conform with the ketos naming convention.

empty_annot_table()

Create an empty call-level annotation table.

standardize([annotations, sep, labels, ...])

Standardize the annotation table format.

use_multi_indexing(df, level_1_name)

Change from single-level indexing to double-level indexing.

trim(table[, extra_cols])

Keep only the columns prescribed by the Ketos annotation format.

is_standardized(table[, has_time, verbose])

Check if the table has the correct indices and the minimum required columns.

label_occurrence(table)

Identify the unique labels occurring in the table and determine how often each label occurs.

cast_to_str(labels[, nested])

Convert every label to str format.

select(annotations, length[, step, ...])

Generate a selection table by defining intervals of fixed length around annotated sections of the audio data.

time_shift(annot, time_ref, length, step, ...)

Create multiple instances of the same selection by stepping in time, both forward and backward.

file_duration_table(path[, search_subdirs, ...])

Create file duration table.

create_rndm_selections(files, length, num[, ...])

Create selections of uniform length, randomly distributed across the data set and not overlapping with any annotations.

random_choice(df, siz)

Randomly select a specified number of elements from a table.

select_by_segmenting(files, length[, ...])

Generate a selection table by stepping across the audio files, using a fixed step size (step) and fixed selection window size (length).

segment_files(table, length[, step, pad])

Generate a selection table by stepping across the audio files, using a fixed step size (step) and fixed selection window size (length).

segment_annotations(table, num, length[, ...])

Generate a segmented annotation table by stepping across the audio files, using a fixed step size (step) and fixed selection window size (length).

query(selections[, annotations, filename, ...])

Query selection table for selections from certain audio files and/or with certain labels.

query_labeled(table[, filename, label, ...])

Query selection table for selections from certain audio files and/or with certain labels.

query_annotated(selections, annotations[, ...])

Query selection table for selections from certain audio files and/or with certain labels.

aggregate_duration(table[, label])

Compute the aggregate duration of the annotations.
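The ideas behind a few of the functions listed above (counting label occurrences, summing annotation durations, and stepping a fixed-length window across files) can be sketched in plain pandas. The code below does not call ketos and is not its implementation: `segment` is a hypothetical helper, and it assumes that windows extending past the end of a file are simply dropped, whereas the ketos functions expose further options such as padding.

```python
import pandas as pd

# Small call-level annotation table (values from the first file of the
# extended example).
annot = pd.DataFrame(
    {"start": [7.0, 8.5, 13.1], "end": [8.1, 12.5, 14.0], "label": [2, 1, 2]},
    index=pd.MultiIndex.from_tuples(
        [("file1.wav", 0), ("file1.wav", 1), ("file1.wav", 2)],
        names=["filename", "annot_id"],
    ),
)

# How often each label occurs.
counts = annot["label"].value_counts().to_dict()

# Aggregate duration of all annotations, in seconds.
total = (annot["end"] - annot["start"]).sum()


def segment(durations, length, step=None):
    """Hypothetical helper: step a window of fixed `length` across each file.

    `durations` maps filename -> file duration in seconds. Windows that
    would extend past the end of a file are dropped (an assumption made
    for this sketch).
    """
    if step is None:
        step = length
    rows = []
    for filename, duration in durations.items():
        # Number of complete windows that fit in the file.
        n = int((duration - length) // step) + 1 if duration >= length else 0
        for i in range(n):
            start = i * step
            rows.append((filename, start, start + length))
    table = pd.DataFrame(rows, columns=["filename", "start", "end"])
    table.index.name = "sel_id"
    # Selection-table layout: sel_id first, filename second.
    return table.set_index("filename", append=True)


seg = segment({"a.wav": 10.0, "b.wav": 5.0}, length=3.0, step=2.0)
```

With a 3 s window and a 2 s step, the 10 s file yields four windows and the 5 s file yields two, each with its own `sel_id`.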