Tutorial: Ketos Metrics

The ketos-metrics module, part of the Ketos commands suite, simplifies the evaluation of detections produced by the ketos-run command.

Quick Start

This quick start guide will demonstrate how to generate performance metrics for a set of detections, comparing them against a ground truth of annotations. You can use the detection results obtained from the ketos-run tutorial.

Alternatively, we have already prepared a set of detections and annotations for your convenience. You can find these resources at the following link: Metrics Data Set. This dataset is designed to help you quickly get started with the ketos-metrics module without needing to generate your own detections and annotations from scratch.

The ketos-metrics module adopts a one-vs-all approach for calculating metrics and will list performance metrics for each class, including their micro and macro averages.
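As a rough illustration of what micro and macro averaging mean in a one-vs-all setting, the sketch below computes precision, recall, and F1 per class from made-up TP/FP/FN counts and then averages them both ways. The function names and the counts are hypothetical, and this is not the ketos-metrics implementation; in particular, returning 0.0 when a ratio is undefined is an assumption.

```python
from typing import Dict, Tuple

Counts = Tuple[int, int, int]  # (TP, FP, FN) for one class, one-vs-all


def prf(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall and F1 from raw counts (0.0 when undefined, by assumption)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_average(counts: Dict[str, Counts]) -> Tuple[float, ...]:
    """Macro average: compute the metrics per class, then take the plain mean."""
    per_class = [prf(*c) for c in counts.values()]
    n = len(per_class)
    return tuple(sum(m[i] for m in per_class) / n for i in range(3))


def micro_average(counts: Dict[str, Counts]) -> Tuple[float, float, float]:
    """Micro average: pool the counts over all classes, then compute the metrics once."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return prf(tp, fp, fn)


# Hypothetical counts for two classes at a single threshold.
counts = {"A": (8, 2, 2), "B": (1, 1, 3)}

for label, c in counts.items():
    print(label, prf(*c))
print("macro", macro_average(counts))
print("micro", micro_average(counts))
```

Macro averaging weights every class equally, while micro averaging weights every detection equally, so the two can differ noticeably when one class is much rarer than the others.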

Navigate to the directory containing your detections and annotations files, and then execute the following command. It is the same on Windows, Linux, and Mac:

ketos-metrics detections_continuous.csv annotations_continuous.csv --output_folder metrics

Upon execution, the command saves the results in two new files: metrics.csv and results.csv, both located inside a folder named ‘metrics’.

  • The metrics.csv file contains performance metrics like precision, recall, and F1 score for each class at various detection thresholds.

  • The results.csv file details the True Positives (TP), False Positives (FP), and False Negatives (FN) for each class and threshold.
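To make the relationship between thresholds and TP/FP/FN counts concrete, here is a simplified sketch for a single class with clip-level scores. It is only an illustration under stated assumptions: the function name, the sample scores, and the choice of treating a score greater than or equal to the threshold as a detection are all hypothetical and may not match how ketos-metrics matches detections to annotations.

```python
from typing import Sequence, Tuple


def count_at_threshold(
    scores: Sequence[float],
    is_target: Sequence[bool],
    threshold: float,
) -> Tuple[int, int, int]:
    """Return (TP, FP, FN) for one class at one threshold.

    A clip counts as detected when its score is >= threshold (an assumption).
    """
    tp = fp = fn = 0
    for score, target in zip(scores, is_target):
        detected = score >= threshold
        if detected and target:
            tp += 1  # detected and annotated
        elif detected:
            fp += 1  # detected but not annotated
        elif target:
            fn += 1  # annotated but missed
    return tp, fp, fn


# Hypothetical scores for four clips and whether each clip truly contains the class.
scores = [0.9, 0.6, 0.4, 0.2]
is_target = [True, False, True, False]

for threshold in (0.3, 0.5, 0.7):
    print(threshold, count_at_threshold(scores, is_target, threshold))
```

Raising the threshold generally removes false positives at the cost of more false negatives, which is why the metrics are reported at several thresholds.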

Note

For a more detailed walkthrough, please refer to the Examples section.

The ketos-metrics command has a few parameters that can be used to adjust its behavior.

ketos-metrics -h

ketos_metrics.py [-h] [--type TYPE] [--threshold_min THRESHOLD_MIN]
                 [--threshold_max THRESHOLD_MAX] [--threshold_inc THRESHOLD_INC]
                 [--total_time_units TOTAL_TIME_UNITS] [--output_folder OUTPUT_FOLDER]
                 [--add_background_reference ADD_BACKGROUND_REFERENCE]
                 evaluation reference

positional arguments:

evaluation        Path to the .csv file containing evaluation results, which may include detection scores or annotations.
reference         Path to the .csv file containing ground truth reference.

options:

-h, --help        show this help message and exit
--type TYPE       Type of evaluation: "clips" for short clips or "continuous" for long continuous files. Default is 'continuous'.
--threshold_min THRESHOLD_MIN
                  Minimum threshold for detection. Default is 0.
--threshold_max THRESHOLD_MAX
                  Maximum threshold for detection. Default is 1.
--threshold_inc THRESHOLD_INC
                  Threshold increment for each step. Default is 0.05.
--total_time_units TOTAL_TIME_UNITS
                  The total duration over which the detections were made, in any time unit (e.g., hours, minutes). The ketos-metrics command is agnostic to the specific unit chosen. Required for the FPR.
--output_folder OUTPUT_FOLDER
                  Location to output the performance results. For example: metrics/
--add_background_reference ADD_BACKGROUND_REFERENCE
                  Create background reference for audio files given a set of existing annotations. The 'reference' annotations will be updated to include the new annotations. Pass two parameters: [path_to_audio_folder, label]. Only relevant for type 'continuous'.
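The threshold and time-unit options can be pictured with the short sketch below. It is a guess at the behavior rather than the actual ketos-metrics code: it assumes that both threshold endpoints are included in the sweep and that the FPR is the number of false positives per time unit. The function names are hypothetical.

```python
from typing import List


def threshold_grid(t_min: float = 0.0, t_max: float = 1.0, t_inc: float = 0.05) -> List[float]:
    """Evenly spaced thresholds from t_min to t_max, both included (an assumption)."""
    steps = int(round((t_max - t_min) / t_inc))
    # Round each value to hide floating-point noise such as 0.15000000000000002.
    return [round(t_min + i * t_inc, 10) for i in range(steps + 1)]


def false_positives_per_time_unit(false_positives: int, total_time_units: float) -> float:
    """False positives divided by the total duration, in whatever unit was supplied."""
    if total_time_units <= 0:
        raise ValueError("total_time_units must be positive")
    return false_positives / total_time_units


grid = threshold_grid()
print(len(grid), grid[0], grid[-1])

# Hypothetical numbers: 12 false positives over 24 hours of audio.
print(false_positives_per_time_unit(12, 24))
```

Because the duration is only used as a divisor, the resulting rate is expressed in whatever unit was passed to --total_time_units.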

Examples

Example 1: Basic Evaluation

Just like ketos-run, ketos-metrics can work with the output from clip detections as well as with continuous files. To adapt the command for clip detections, set the --type parameter to ‘clips’.

ketos-metrics detections_clips.csv annotations_clips.csv --type clips --output_folder metrics

  • detections_clips.csv: This is the first argument, representing the path to the .csv file that contains detection results from the ketos-run command. These results are specifically for clipped audio segments.

  • annotations_clips.csv: This is the second argument, denoting the path to the .csv file containing the ground truth annotations for the clipped audio segments. This file is used as a reference for evaluating the accuracy of the detections.

  • --type clips: This option specifies the type of evaluation to be performed. Setting this to ‘clips’ indicates that the input files are based on discrete audio clips, as opposed to long continuous recordings.

  • --output_folder metrics: This parameter defines the destination folder where the evaluation results will be stored. In this case, the results will be saved in a folder named ‘metrics’.

With --type set to ‘clips’, the ketos-metrics command handles and analyzes detections as discrete, individual clips. Use this mode when your data, and therefore your detections, consist of shorter, segmented audio clips rather than long, continuous recordings.

Example 2: Multi-Class Evaluation

In multi-class evaluation, ketos-metrics analyzes the performance of a model across multiple classes. The command remains the same as for binary detection, and the output includes performance metrics for each class individually, following a one-vs-all approach.

For this part, we will be using the detections_continuous_multi_class.csv and annotations_continuous_multi_class.csv files.

ketos-metrics detections_continuous_multi_class.csv annotations_continuous_multi_class.csv --type continuous --output_folder metrics

  • detections_continuous_multi_class.csv: The CSV file containing detection results, potentially including multiple classes.

  • annotations_continuous_multi_class.csv: The ground truth annotations file, with labels corresponding to the different classes.

  • --type continuous: Indicates that the input files contain long, continuous recordings.

  • --output_folder metrics: Specifies where to save the evaluation results.

The output files (metrics.csv and results.csv) will provide detailed metrics (precision, recall, F1 score, etc.) for each class.

Example 3: Using add_background_reference Option

The --add_background_reference option allows the user to include background noise as a reference class in the evaluation.

ketos-metrics detections_for_add_background_reference.csv annotations_continuous.csv --type continuous --output_folder metrics --add_background_reference audio_list.txt 0 audio/

  • --add_background_reference audio_list.txt 0 audio/: This option instructs ketos-metrics to create a background reference. The first parameter (audio_list.txt) specifies the path to a text file containing a list of audio file paths. The second parameter (0) is the label assigned to the background noise. The optional third parameter (audio/) is the path to the root folder containing audio files, which will be prepended to each path in the file list if provided.

Usage Examples:

  1. Basic Usage
     Command: ketos-metrics detections.csv annotations.csv --type continuous --output_folder metrics --add_background_reference audio_list.txt 0
       - audio_list.txt contains file paths like sample1.wav, sample2.wav.
       - Annotations created for these files will have filenames like sample1.wav, sample2.wav in the annotations file.

  2. Using Root Audio Folder
     Command: ketos-metrics detections.csv annotations.csv --type continuous --output_folder metrics --add_background_reference audio_list.txt 0 audio/
       - audio_list.txt contains file names like sample1.wav.
       - The root audio folder audio/ is prepended to each path from audio_list.txt.
       - Annotations will be named like audio/sample1.wav.

  3. Subfolders in File List
     Command: ketos-metrics detections.csv annotations.csv --type continuous --output_folder metrics --add_background_reference audio_list.txt 0 audio/
       - audio_list.txt contains paths like subfolder1/sample1.wav, subfolder2/sample2.wav.
       - The root audio folder audio/ is prepended to these paths.
       - Annotations will be named like audio/subfolder1/sample1.wav, audio/subfolder2/sample2.wav.

  4. File List with Subpaths and No Root Audio Folder
     Command: ketos-metrics detections.csv annotations.csv --type continuous --output_folder metrics --add_background_reference audio_list.txt 0
       - audio_list.txt contains paths like audio/sample1.wav, audio/sample2.wav.
       - Annotations will be named as they appear in the file list, e.g., audio/sample1.wav.
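The four cases above all follow the same rule: the filename written to the annotations is the path from the file list, with the root audio folder prepended when one is given. The sketch below mimics that rule; the function name is hypothetical, and the use of forward-slash joining is an assumption rather than a description of the ketos-metrics internals.

```python
import posixpath
from typing import Iterable, List, Optional


def background_annotation_filenames(
    file_list: Iterable[str],
    root_audio_folder: Optional[str] = None,
) -> List[str]:
    """Return the filenames the background annotations would carry.

    Without a root folder the paths are used as listed; with one, it is
    prepended to every path (forward slashes assumed).
    """
    if root_audio_folder is None:
        return list(file_list)
    return [posixpath.join(root_audio_folder, path) for path in file_list]


# Case 1: plain file names, no root folder.
print(background_annotation_filenames(["sample1.wav", "sample2.wav"]))
# Case 2: plain file names with a root folder.
print(background_annotation_filenames(["sample1.wav"], "audio/"))
# Case 3: subfolders in the file list with a root folder.
print(background_annotation_filenames(["subfolder1/sample1.wav"], "audio/"))
# Case 4: subpaths already in the file list, no root folder.
print(background_annotation_filenames(["audio/sample1.wav"]))
```

Comparing the resulting names against the filename column of the existing annotations file is a quick way to check that the two styles line up before running the evaluation.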

Important Note: The filenames in the annotations created by --add_background_reference must follow the same style as those in the existing annotation CSV. The root_audio_folder parameter can be used to align the two by supplying the missing leading part of the path. Consistent filenames are crucial for correctly associating annotations with their respective audio files during evaluation.

This option will not be relevant in most cases, as background segments far outnumber actual signals. For instance, in the above example you may notice that we got a precision and recall of 1 for the ‘0’ class.

Note: The add_background_reference option is only relevant for continuous file types and is not used when evaluating clipped audio segments (--type clips).