AudioWriter
- class ketos.data_handling.database_interface.AudioWriter(output_file, max_size=1000000000.0, verbose=False, mode='w', discard_wrong_shape=False, allow_resizing=1, include_source=True, include_label=True, include_attrs=True, max_filename_len=100, data_name=None, index_cols=None, create_dir=True, annot_type=None, table_path='/', table_name='audio')
Saves waveform or spectrogram objects to a database file (.h5).
If the combined size of the saved data exceeds max_size (1 GB by default), the output database file will be split into several files, with _000, _001, etc, appended to the filename.
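As a rough illustration of the naming scheme described above, the sketch below derives split filenames from an output path. The helper `split_filename` is hypothetical and not part of ketos; the three-digit zero padding is inferred from the `_000`, `_001` pattern.

```python
import os


def split_filename(output_file, file_counter):
    """Hypothetical helper: build the name of the n-th split file."""
    # Separate "db.h5" into base "db" and extension ".h5"
    base, ext = os.path.splitext(output_file)
    # Append a zero-padded counter between base and extension (assumed format)
    return f"{base}_{file_counter:03d}{ext}"


names = [split_filename("db.h5", i) for i in range(3)]
print(names)
```

The `base` and `ext` variables mirror the attributes of the same name listed further down.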
- Args:
- output_file: str
Full path to output database file (.h5)
- max_size: int
Maximum size of the output database file in bytes. If the file exceeds this size, it will be split into several files with _000, _001, etc. appended to the filename. The default value is max_size=1E9 (1 GB). If None, no restriction is imposed on the file size (i.e. the file is never split).
- verbose: bool
Print relevant information during execution, such as the number of files written to disk
- mode: str
- The mode to open the file. It can be one of the following:
w: Write; a new file is created (an existing file with the same name would be deleted). This is the default.
a: Append; an existing file is opened for reading and writing, and if the file does not exist it is created.
r+: It is similar to a, but the file must already exist.
- discard_wrong_shape: bool
Discard objects that do not have the same shape as previously saved objects. Default is False.
- allow_resizing: int
If the object shape differs from previously saved objects, the object will be resized using the resize method of the scikit-image package, provided the mismatch is no greater than allow_resizing in either dimension.
- include_source: bool
If True, the name of the wav file from which the waveform or spectrogram was generated and the offset within that file are saved to the table. Default is True.
- include_label: bool
Include integer label column in data table. Default is True.
- include_attrs: bool
If True, attributes returned by the get_instance_attrs() method will also be saved to the table. Default is True.
- max_filename_len: int
Maximum allowed length of filename. Only used if include_source is True.
- data_name: str or list(str)
Name(s) of the data columns. If None is specified, the data column is named ‘data’, or ‘data0’, ‘data1’, … if the table contains multiple data columns.
- create_dir: bool
If the output directory does not exist, it will be automatically created. Default is True. Only applies if the mode is w or a.
- annot_type: str
Specify the annotation type. Options are weak and strong. If not specified, the type will be inferred from the first instance to be written to the database file. For weakly labelled data, an extra column named label is included in the data table. For strongly labelled data, the annotations are saved to a separate table.
- table_path: str
Path to the group containing the table
- table_name: str
Name of the table
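The interplay between discard_wrong_shape and allow_resizing can be sketched as a small decision function. This is only an illustration under stated assumptions: `shape_action` is a hypothetical helper, the order in which the two options are checked is assumed, and the "reject" outcome for an unresolvable mismatch is a placeholder, since the behaviour in that case is not described here.

```python
def shape_action(expected, actual, allow_resizing=1, discard_wrong_shape=False):
    """Hypothetical helper: decide what to do with an object of shape `actual`."""
    # First object ever written, or shapes agree: write as-is
    if expected is None or tuple(actual) == tuple(expected):
        return "write"
    # Small mismatch in every dimension: resizing is permitted
    if len(actual) == len(expected):
        mismatch = max(abs(a - e) for a, e in zip(actual, expected))
        if mismatch <= allow_resizing:
            return "resize"
    # Larger mismatch: discard silently if requested, otherwise reject (assumed)
    return "discard" if discard_wrong_shape else "reject"


print(shape_action((100, 64), (101, 64)))
print(shape_action((100, 64), (103, 64), discard_wrong_shape=True))
```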
- Attributes:
- base: str
Output filename base
- ext: str
Output filename extension (.h5)
- file: tables.File
Database file
- file_counter: int
Keeps track of how many files have been written to disk
- item_counter: int
Keeps track of how many audio objects have been written to files
- path: str
Path to table within database filesystem
- name: str
Name of table
- max_size: int
Maximum size of the output database file in bytes. If the file exceeds this size, it will be split into several files with _000, _001, etc. appended to the filename. The default value is max_size=1E9 (1 GB). Disabled if writing in ‘append’ mode.
- verbose: bool
Print relevant information during execution such as files written to disk
- mode: str
- The mode to open the file. It can be one of the following:
w: Write; a new file is created (an existing file with the same name would be deleted).
a: Append; an existing file is opened for reading and writing, and if the file does not exist it is created.
r+: It is similar to a, but the file must already exist.
- discard_wrong_shape: bool
Discard objects that do not have the same shape as previously saved objects. Default is False.
- allow_resizing: int
If the object shape differs from previously saved objects, the object will be resized using the resize method of the scikit-image package, provided the mismatch is no greater than allow_resizing in either dimension.
- num_ignore: int
Number of ignored objects
- data_shape: tuple
Data shape
- include_source: bool
If True, the name of the wav file from which the waveform or spectrogram was generated and the offset within that file are saved to the table. Default is True.
- include_label: bool
Include integer label column in data table. Default is True.
- include_attrs: bool
If True, attributes returned by the get_instance_attrs() method will also be saved to the table. Default is True.
- filename_len: int
Maximum allowed length of filename. Only used if include_source is True.
- data_name: str or list(str)
Name(s) of the data columns. If None is specified, the data column is named ‘data’, or ‘data0’, ‘data1’, … if the table contains multiple data columns.
- index_cols: str or list(str)
Create indices for the specified columns in the data table to allow for faster queries. For example, index_cols="filename" or index_cols=["filename", "label"]
- create_dir: bool
If the output directory does not exist, it will be automatically created. Default is True. Only applies if the mode is w or a.
- annot_type: str
Annotation type. Options are weak and strong. If not specified, the type will be inferred from the first instance to be written to the database file. For strongly labelled data, the annotations are saved to a separate table.
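To make the roles of file_counter, item_counter, and max_size more concrete, here is a minimal bookkeeping sketch. `RolloverTracker` is a hypothetical class, not ketos code, and the exact point at which ketos starts a new file is an assumption: here a file is considered full once its size exceeds max_size after a write.

```python
class RolloverTracker:
    """Hypothetical bookkeeping for size-based file splitting."""

    def __init__(self, max_size=1e9):
        self.max_size = max_size  # None disables splitting
        self.file_counter = 0     # files completed so far
        self.item_counter = 0     # objects written so far
        self.current_size = 0     # bytes in the file currently open

    def record_write(self, nbytes):
        """Register one written object; return True if the file was rolled over."""
        self.item_counter += 1
        self.current_size += nbytes
        if self.max_size is not None and self.current_size > self.max_size:
            # Current file is over the limit: count it and start a fresh one
            self.file_counter += 1
            self.current_size = 0
            return True
        return False


tracker = RolloverTracker(max_size=1000)
rolled = [tracker.record_write(400) for _ in range(5)]
print(rolled, tracker.file_counter, tracker.item_counter)
```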
Methods
- close([final]): Close the currently open database file, if any
- set_table(path, name): Change the current table
- write(x[, path, name]): Write waveform or spectrogram object to a table in the database file
- write_attr(attr_name, attr_value[, path, name]): Write attribute to a table in the database file
- close(final=True)
Close the currently open database file, if any
- Args:
- final: bool
If True, this instance of AudioWriter will not be able to save more spectrograms to file
- set_table(path, name)
Change the current table
- Args:
- path: str
Path to the group containing the table
- name: str
Name of the table
- write(x, path=None, name=None)
Write waveform or spectrogram object to a table in the database file
If path and name are not specified, the object will be saved to the current table (as set with the set_table() method).
- Args:
- x: instance of BaseAudio or list
Object(s) to be saved
- path: str
Path to the group containing the table
- name: str
Name of the table
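A typical write loop might look like the sketch below. It uses only the constructor arguments and methods documented on this page; the wrapper function `save_audio_objects`, the file name `db.h5`, and the table location `/train/audio` are placeholders chosen for illustration. The objects are assumed to be BaseAudio instances (waveforms or spectrograms) created elsewhere.

```python
def save_audio_objects(objects, output_file="db.h5"):
    """Write a sequence of ketos audio objects to an HDF5 database file."""
    # Imported lazily so the sketch can be defined without ketos installed
    from ketos.data_handling.database_interface import AudioWriter

    # mode="w" creates a new file; the table lives at /train/audio (placeholder)
    writer = AudioWriter(output_file=output_file, mode="w",
                         table_path="/train", table_name="audio")
    try:
        for obj in objects:
            # No path/name given, so the object goes to the current table
            writer.write(obj)
    finally:
        # Always close the file, even if a write fails
        writer.close()
```

Passing path and name to write() directs a single object to a different table without changing the current one.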
- write_attr(attr_name, attr_value, path=None, name=None)
Write attribute to a table in the database file
If path and name are not specified, the attribute will be saved to the current table (as set with the set_table() method).
See https://www.pytables.org/usersguide/libref/declarative_classes.html#the-attributeset-class for details on how various Python types are saved as attributes to HDF5 tables.
- Args:
- attr_name: str
Attribute name
- attr_value:
Value to be saved
- path: str
Path to the group containing the table
- name: str
Name of the table
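The following sketch shows how set_table() and write_attr() could be combined. The function `tag_table` and the attribute names and values are made up for illustration; `writer` is assumed to be an open AudioWriter whose target table already exists.

```python
def tag_table(writer, path="/train", name="audio"):
    """Attach descriptive attributes to a table via an open AudioWriter."""
    # Make the target table the current one
    writer.set_table(path, name)
    # With no path/name, the attribute is written to the current table
    writer.write_attr("sampling_rate", 1000)
    # The table can also be given explicitly for a single call
    writer.write_attr("species", "unknown", path=path, name=name)
```

Because the function only relies on the two documented method signatures, it can be exercised with a stand-in object before running it against a real database file.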