cqt
- ketos.audio.utils.misc.cqt(x, rate, step, bins_per_oct, freq_min, freq_max=None, window_func='hamming')
Compute the CQT spectrogram of an audio signal.
Uses the librosa implementation.
To compute the CQT spectrogram, the user must specify the step size, the minimum and maximum frequencies, $f_{min}$ and $f_{max}$, and the number of bins per octave, $m$. While $f_{min}$ and $m$ are fixed to the input values, the step size and $f_{max}$ are adjusted as detailed below, attempting to match the input values as closely as possible.
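The total number of bins is given by $n = k \cdot m$, where $m$ is the number of bins per octave and $k$ denotes the number of octaves, computed as
$$k = \lceil \log_2 (f_{max} / f_{min}) \rceil$$
For example, with $f_{min} = 10$, $f_{max} = 16000$, and $m = 32$, the number of octaves is $k = 11$ and the total number of bins is $n = 352$. The frequency of a given bin, $i$, is given by
$$f_i = 2^{i/m} \cdot f_{min}$$
This implies that the maximum frequency is given by $f_{max} = f_n = 2^{n/m} \cdot f_{min}$. For the above example, we find $f_{max} = 20480$ Hz, i.e., somewhat larger than the requested maximum value.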
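The bin layout described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the ketos implementation: the helper names `cqt_bin_layout` and `bin_frequency` are hypothetical, and the input values (10 Hz, 16000 Hz, 32 bins per octave) are chosen purely as an example.

```python
import math


def cqt_bin_layout(freq_min, freq_max, bins_per_oct):
    """Return (number of octaves, total number of bins, effective maximum frequency).

    Hypothetical helper: the number of octaves is log2(freq_max / freq_min)
    rounded up, and the total number of bins is octaves * bins_per_oct.
    """
    num_octaves = math.ceil(math.log2(freq_max / freq_min))
    num_bins = num_octaves * bins_per_oct
    # Frequency of the last bin edge, 2^(n/m) * freq_min
    effective_freq_max = freq_min * 2 ** (num_bins / bins_per_oct)
    return num_octaves, num_bins, effective_freq_max


def bin_frequency(i, freq_min, bins_per_oct):
    """Frequency of bin i, i.e. 2^(i/m) * freq_min."""
    return freq_min * 2 ** (i / bins_per_oct)


k, n, f_max = cqt_bin_layout(freq_min=10, freq_max=16000, bins_per_oct=32)
print(k, n, f_max)               # 11 352 20480.0
print(bin_frequency(32, 10, 32))  # 20.0 (one octave above freq_min)
```

Because the number of octaves is rounded up, the effective maximum frequency is never smaller than the requested one, before any Nyquist limit is applied.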
Note that if $f_{max}$ exceeds the Nyquist frequency, $f_{nyquist} = 0.5 \cdot s$, where $s$ is the sampling rate, the number of octaves, $k$, is reduced to ensure that $f_{max} \leq f_{nyquist}$.
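One way to picture the Nyquist constraint is the following sketch, which lowers the number of octaves one at a time until the effective maximum frequency fits below half the sampling rate. The function name `limit_octaves` and the loop are assumptions made for illustration; the actual ketos code may perform the reduction differently.

```python
import math


def limit_octaves(freq_min, freq_max, rate):
    """Number of octaves after enforcing the Nyquist limit (hypothetical helper)."""
    num_octaves = math.ceil(math.log2(freq_max / freq_min))
    nyquist = 0.5 * rate
    # Drop octaves until the effective maximum frequency, 2^k * freq_min,
    # no longer exceeds the Nyquist frequency.
    while num_octaves > 0 and freq_min * 2 ** num_octaves > nyquist:
        num_octaves -= 1
    return num_octaves


# Illustrative values: at 44.1 kHz the Nyquist frequency is 22050 Hz.
# Rounding up gives 12 octaves (40960 Hz), which is reduced to 11 (20480 Hz).
print(limit_octaves(freq_min=10, freq_max=22050, rate=44100))  # 11
```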
The CQT algorithm requires the step size to be an integer multiple of $2^k$. To ensure that this is the case, the step size is computed as follows,
$$h = \lceil s \cdot \Delta t / 2^k \rceil \cdot 2^k$$
where $h$ is the adjusted step size in samples, $s$ is the sampling rate in Hz, and $\Delta t$ is the step size in seconds as specified via the argument step. For example, assuming a sampling rate of 32 kHz ($s = 32000$) and a step size of 0.02 seconds ($\Delta t = 0.02$) and adopting the same frequency limits as above ($f_{min} = 10$ and $f_{max} = 16000$), the actual step size is determined to be $h = 2^{11} = 2048$, corresponding to a physical bin size of $t_{res} = 2048 / 32000\ \mathrm{Hz} = 0.064$ s, i.e., about three times as large as the requested step size.
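The step size adjustment can be reproduced with a short calculation. The sketch below is a hedged illustration only: `adjusted_step` is a hypothetical helper, not part of ketos, and it takes the number of octaves as an input rather than deriving it.

```python
import math


def adjusted_step(step, rate, num_octaves):
    """Round the requested step up to a multiple of 2**num_octaves samples.

    Returns the step size in samples and in seconds (hypothetical helper).
    """
    multiple = 2 ** num_octaves
    hop = math.ceil(step * rate / multiple) * multiple
    return hop, hop / rate


# Values from the example in the text: 32 kHz, 0.02 s requested, 11 octaves.
hop, step_sec = adjusted_step(step=0.02, rate=32000, num_octaves=11)
print(hop, step_sec)  # 2048 0.064
```

Since the requested step is always rounded up, the adjusted step is never shorter than the requested one, and it can be considerably longer when the number of octaves is large.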
TODO: If possible, remove librosa dependency
- Args:
- x: numpy.array
Audio signal
- rate: float
Sampling rate in Hz
- step: float
Step size in seconds
- bins_per_oct: int
Number of bins per octave
- freq_min: float
Minimum frequency in Hz
- freq_max: float
Maximum frequency in Hz. If None, it is set equal to half the sampling rate.
- window_func: str
Window function (optional). Select between
- bartlett
- blackman
- hamming (default)
- hanning
- Returns:
- img: numpy.array
Resulting CQT spectrogram image.
- step: float
Adjusted step size in seconds.
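For readers unfamiliar with the options accepted by window_func, the sketch below evaluates the four windows from their standard textbook definitions. It assumes that the names refer to the usual symmetric window functions of the same names (as provided, for example, by NumPy); how ketos and librosa construct the window internally is not shown here, and `make_window` is a hypothetical helper.

```python
import math


def make_window(name, size):
    """Return `size` samples (size >= 2) of a standard symmetric window."""
    last = size - 1
    if name == "bartlett":
        # Triangular window with zero-valued end points
        return [1.0 - abs(2.0 * n / last - 1.0) for n in range(size)]
    if name == "blackman":
        return [0.42 - 0.5 * math.cos(2 * math.pi * n / last)
                + 0.08 * math.cos(4 * math.pi * n / last) for n in range(size)]
    if name == "hamming":
        return [0.54 - 0.46 * math.cos(2 * math.pi * n / last) for n in range(size)]
    if name == "hanning":
        return [0.5 - 0.5 * math.cos(2 * math.pi * n / last) for n in range(size)]
    raise ValueError(f"unknown window function: {name}")


for name in ("bartlett", "blackman", "hamming", "hanning"):
    print(name, [round(w, 3) for w in make_window(name, 5)])
```

All four windows peak at the centre and taper towards the edges; the Hamming window is the only one of the four that does not fall to (approximately) zero at its end points.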