AudioWriter

class ketos.data_handling.database_interface.AudioWriter(output_file, max_size=1000000000.0, verbose=False, mode='w', discard_wrong_shape=False, allow_resizing=1, include_source=True, include_label=True, include_attrs=True, max_filename_len=100, data_name=None, index_cols=None, create_dir=True, annot_type=None, table_path='/', table_name='audio')

Saves waveform or spectrogram objects to a database file (.h5).

If the combined size of the saved data exceeds max_size (1 GB by default), the output database file will be split into several files, with _000, _001, etc, appended to the filename.
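The naming scheme for split files can be sketched in a few lines. This is only an illustration: split_filename is a hypothetical helper, not part of ketos, and the three-digit zero-padded suffix is inferred from the _000, _001 pattern described above.

```python
import os


def split_filename(output_file, file_counter):
    """Hypothetical helper: name of the n-th file in a split series.

    Assumes the counter is zero-padded to three digits and inserted
    between the filename base and the extension.
    """
    base, ext = os.path.splitext(output_file)
    return f"{base}_{file_counter:03d}{ext}"


# Names of the first three files in a split series
names = [split_filename("data/db.h5", i) for i in range(3)]
print(names)
```

The base and ext values computed here correspond to the base and ext attributes listed further down, and file_counter to the attribute of the same name.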

Args:
output_file: str

Full path to output database file (.h5)

max_size: int

Maximum size of output database file in bytes. If the file exceeds this size, it is split into several files with _000, _001, etc., appended to the filename. The default value is max_size=1E9 (1 GB). If None, no restriction is imposed on the file size (i.e. the file is never split).

verbose: bool

Print relevant information during execution, such as the number of files written to disk

mode: str
The mode to open the file. It can be one of the following:

w: Write; a new file is created (an existing file with the same name would be deleted). This is the default.

a: Append; an existing file is opened for reading and writing, and if the file does not exist it is created.

r+: Similar to a, but the file must already exist.

discard_wrong_shape: bool

Discard objects that do not have the same shape as previously saved objects. Default is False.

allow_resizing: int

If the object shape differs from previously saved objects, the object will be resized using the resize method of the scikit-image package, provided the mismatch is no greater than allow_resizing in either dimension.

include_source: bool

If True, the name of the wav file from which the waveform or spectrogram was generated and the offset within that file are saved to the table. Default is True.

include_label: bool

Include integer label column in data table. Default is True.

include_attrs: bool

If True, attributes returned by the get_instance_attrs() method will also be saved to the table. Default is True.

max_filename_len: int

Maximum allowed length of filename. Only used if include_source is True.

data_name: str or list(str)

Name(s) of the data columns. If None is specified, the data column is named ‘data’, or ‘data0’, ‘data1’, … if the table contains multiple data columns.

create_dir: bool

If the output directory does not exist, it will be automatically created. Default is True. Only applies if the mode is w or a.

annot_type: str

Specify the annotation type. Options are weak and strong. If not specified, the type will be inferred from the first instance to be written to the database file. For weakly labelled data, an extra column named label is included in the data table. For strongly labelled data, the annotations are saved to a separate table.

table_path: str

Path to the group containing the table

table_name: str

Name of the table
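The shape tolerance described under allow_resizing can be illustrated with a small check. This is a minimal sketch, not the ketos implementation: within_resize_tolerance is a hypothetical name, and it assumes that "mismatch" means the absolute difference in size along each dimension.

```python
def within_resize_tolerance(shape, ref_shape, allow_resizing=1):
    """Hypothetical check: may an object of this shape be resized to ref_shape?

    Assumes the mismatch is measured per dimension as an absolute
    difference and that both shapes have the same number of dimensions.
    """
    if len(shape) != len(ref_shape):
        return False
    return all(abs(a - b) <= allow_resizing for a, b in zip(shape, ref_shape))


# Off by one bin along the first dimension: within the default tolerance
print(within_resize_tolerance((101, 50), (100, 50)))
# Off by three bins: outside the default tolerance
print(within_resize_tolerance((103, 50), (100, 50)))
```

What happens to an object that falls outside the tolerance depends on discard_wrong_shape, as described above; the sketch only covers the tolerance test itself.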

Attributes:
base: str

Output filename base

ext: str

Output filename extension (.h5)

file: tables.File

Database file

file_counter: int

Keeps track of how many files have been written to disk

item_counter: int

Keeps track of how many audio objects have been written to files

path: str

Path to table within database filesystem

name: str

Name of table

max_size: int

Maximum size of output database file in bytes. If the file exceeds this size, it is split into several files with _000, _001, etc., appended to the filename. The default value is max_size=1E9 (1 GB). Disabled if writing in ‘append’ mode.

verbose: bool

Print relevant information during execution such as files written to disk

mode: str
The mode to open the file. It can be one of the following:

w: Write; a new file is created (an existing file with the same name would be deleted).

a: Append; an existing file is opened for reading and writing, and if the file does not exist it is created.

r+: Similar to a, but the file must already exist.

discard_wrong_shape: bool

Discard objects that do not have the same shape as previously saved objects. Default is False.

allow_resizing: int

If the object shape differs from previously saved objects, the object will be resized using the resize method of the scikit-image package, provided the mismatch is no greater than allow_resizing in either dimension.

num_ignore: int

Number of ignored objects

data_shape: tuple

Data shape

include_source: bool

If True, the name of the wav file from which the waveform or spectrogram was generated and the offset within that file are saved to the table. Default is True.

include_label: bool

Include integer label column in data table. Default is True.

include_attrs: bool

If True, attributes returned by the get_instance_attrs() method will also be saved to the table. Default is True.

filename_len: int

Maximum allowed length of filename. Only used if include_source is True.

data_name: str or list(str)

Name(s) of the data columns. If None is specified, the data column is named ‘data’, or ‘data0’, ‘data1’, … if the table contains multiple data columns.

index_cols: str or list(str)

Create indices for the specified columns in the data table to allow for faster queries. For example, index_cols="filename" or index_cols=["filename", "label"]

create_dir: bool

If the output directory does not exist, it will be automatically created. Default is True. Only applies if the mode is w or a.

annot_type: str

Annotation type. Options are weak and strong. If not specified, the type will be inferred from the first instance to be written to the database file. For strongly labelled data, the annotations are saved to a separate table.
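The default naming of data columns described under data_name can be written out as follows. This is an illustrative sketch; default_data_names is a hypothetical helper and not a ketos function.

```python
def default_data_names(num_columns):
    """Hypothetical helper: default data column names when data_name is None.

    A single column is called 'data'; several columns are numbered
    'data0', 'data1', ... in order.
    """
    if num_columns == 1:
        return ["data"]
    return [f"data{i}" for i in range(num_columns)]


print(default_data_names(1))
print(default_data_names(3))
```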

Methods

close([final])

Close the currently open database file, if any

set_table(path, name)

Change the current table

write(x[, path, name])

Write waveform or spectrogram object to a table in the database file

write_attr(attr_name, attr_value[, path, name])

Write attribute to a table in the database file

close(final=True)

Close the currently open database file, if any

Args:
final: bool

If True, this instance of AudioWriter will not be able to save more spectrograms to file

set_table(path, name)

Change the current table

Args:
path: str

Path to the group containing the table

name: str

Name of the table

write(x, path=None, name=None)

Write waveform or spectrogram object to a table in the database file

If path and name are not specified, the object will be saved to the current table (as set with the set_table() method).

Args:
x: instance of BaseAudio or list

Object(s) to be saved

path: str

Path to the group containing the table

name: str

Name of the table
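A typical write loop might look like the sketch below. To keep the example self-contained, it runs against RecordingWriter, a hypothetical stand-in that only mimics the write and close signatures documented on this page; write_all is likewise a hypothetical helper. With ketos installed, an AudioWriter instance would take the stand-in's place.

```python
def write_all(writer, objects, path=None, name=None):
    """Write every object with writer.write(), then close the writer.

    The writer is closed even if a write fails, so the file is not left open.
    Returns the number of objects written.
    """
    count = 0
    try:
        for x in objects:
            writer.write(x, path=path, name=name)
            count += 1
    finally:
        writer.close()
    return count


class RecordingWriter:
    """Hypothetical stand-in for AudioWriter that records the calls it receives."""

    def __init__(self):
        self.calls = []

    def write(self, x, path=None, name=None):
        self.calls.append(("write", x, path, name))

    def close(self, final=True):
        self.calls.append(("close", final))


# With ketos available, this would instead be something like:
#   writer = AudioWriter(output_file="db.h5")
demo = RecordingWriter()
n = write_all(demo, ["spec_a", "spec_b"], path="/train", name="audio")
print(n, demo.calls)
```

Passing path and name on each call directs the objects to a specific table; leaving them as None relies on the writer's current table.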

write_attr(attr_name, attr_value, path=None, name=None)

Write attribute to a table in the database file

If path and name are not specified, the attribute will be written to the current table (as set with the set_table() method).

See https://www.pytables.org/usersguide/libref/declarative_classes.html#the-attributeset-class for details on how various Python types are saved as attributes to HDF5 tables.

Args:
attr_name: str

Attribute name

attr_value:

Value to be saved

path: str

Path to the group containing the table

name: str

Name of the table
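Several attributes can be written in one go by looping over a dictionary. The sketch below assumes only the write_attr signature documented above; write_attrs and AttrRecorder are hypothetical names introduced for illustration, and the attribute names and values are made up.

```python
def write_attrs(writer, attrs, path=None, name=None):
    """Write each key/value pair in attrs with writer.write_attr()."""
    for attr_name, attr_value in attrs.items():
        writer.write_attr(attr_name, attr_value, path=path, name=name)


class AttrRecorder:
    """Hypothetical stand-in for AudioWriter that stores attributes in a dict."""

    def __init__(self):
        self.attrs = {}

    def write_attr(self, attr_name, attr_value, path=None, name=None):
        # Key by (path, name, attr_name) so different tables stay separate
        self.attrs[(path, name, attr_name)] = attr_value


recorder = AttrRecorder()
write_attrs(recorder, {"rate": 1000, "window": 0.2}, path="/train", name="audio")
print(recorder.attrs)
```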