| author | Jim Tittsler <[email protected]> | 2023-12-14 11:55:49 +0900 |
|---|---|---|
| committer | Jim Tittsler <[email protected]> | 2023-12-14 11:55:49 +0900 |
| commit | d104a5019cbbb7e46f532ccc760f588bc5ef2ae7 (patch) | |
| tree | 950c08ae038e7dd96ee36fe552a0d962a249dfa1 /doc/codec2.tex | |
| parent | ba1c314ab26651bf99518f465c55652252bc556a (diff) | |
Revert s->z spelling changes
Diffstat (limited to 'doc/codec2.tex')
| -rw-r--r-- | doc/codec2.tex | 6 |
1 files changed, 3 insertions, 3 deletions
diff --git a/doc/codec2.tex b/doc/codec2.tex
index 4bbffee..310db85 100644
--- a/doc/codec2.tex
+++ b/doc/codec2.tex
@@ -200,7 +200,7 @@ Often the errors interact, for example the fine pitch error shown above will mea
 \begin{center}
 \begin{tikzpicture}[auto, node distance=2cm,>=triangle 45,x=1.0cm,y=1.0cm,align=center,text width=2cm]
-\node [input] (input) {};
+\node [input] (rinput) {};
 \node [block, right of=rinput,node distance=2cm] (dequantise) {Dequantise Interpolate};
 \node [block, right of=dequantise,node distance=3cm] (recover) {Recover Amplitudes};
 \node [block, right of=recover,node distance=3cm] (synthesise) {Synthesise Speech};
@@ -230,7 +230,7 @@ Table \ref{tab:bit_allocation} presents the bit allocation for two popular Codec
 At very low bit rates such as 700 bits/s, we use Vector Quantisation (VQ) to represent the spectral amplitudes. We construct a table such that each row of the table has a set of spectral amplitude samples. In Codec 2 700C the table has 512 rows. During the quantisation process, we choose the table row that best matches the spectral amplitudes for this frame, then send the \emph{index} of the table row. The decoder has a similar table, so can use the index to look up the spectral amplitude values. If the table is 512 rows, we can use a 9 bit number to quantise the spectral amplitudes. In Codec 2 700C, we use two tables of 512 entries each (18 bits total), the second one helps fine tune the quantisation from the first table.
-Vector Quantisation can only represent what is present in the tables, so if it sees anything unusual (for example, a different microphone frequency response or background noise), the quantization can become very rough and speech quality poor. We train the tables at design time using a database of speech samples and a training algorithm - an early form of machine learning.
+Vector Quantisation can only represent what is present in the tables, so if it sees anything unusual (for example, a different microphone frequency response or background noise), the quantisation can become very rough and speech quality poor. We train the tables at design time using a database of speech samples and a training algorithm - an early form of machine learning.
 Codec 2 3200 uses the method of fitting a filter to the spectral amplitudes, this approach tends to be more forgiving of small variations in the input speech spectrum, but is not as efficient in terms of bit rate.
@@ -345,7 +345,7 @@ r &= \frac{\omega_0 N_{dft}}{2 \pi}
 \end{equation}
 The DFT indexes $a_m, b_m$ select the band of $S_w(k)$ containing the $m$-th harmonic; $r$ maps the harmonic number $m$ to the nearest DFT index, and $\lfloor x \rceil$ is the rounding operator. This method of estimating $A_m$ is relatively insensitive to small errors in $F0$ estimation and works equally well for voiced and unvoiced speech. Figure $\ref{fig:hts2a_time}$ plots $S_w$ (blue) and $\{A_m\}$ (red) for a sample frame of female speech.
-The phase is sampled at the centre of the band. For all practical Codec 2 modes, the phase is not transmitted to the decoder, so it does not need to be computed. However, speech synthesized using the phase is useful as a control during development and is available using the \emph{c2sim} utility.
+The phase is sampled at the centre of the band. For all practical Codec 2 modes, the phase is not transmitted to the decoder, so it does not need to be computed. However, speech synthesised using the phase is useful as a control during development and is available using the \emph{c2sim} utility.
 \subsection{Sinusoidal Synthesis}
