

Silero English STT Models V6

We have published the new en_v6 speech-to-text models

Please see the metrics here

A large number of new validation datasets were added for dialects and VOIP

The model family now includes variations of small and xlarge models

Single-digit quality gains for both CE and EE models, though the gains are less pronounced for EE models

The best gains are reserved for the xsmall models, which will not be public for the time being. They have almost reached the small models in quality, but are 2x smaller (14M params)

The models seem to fit the data quite well, but the returns are diminishing compared to V3 => V4 => V5. We are already investigating radically new ways to make the models better, stay tuned

Also, we have started packaging the utils for the public Silero models into a pip package (it will work similarly to torch.hub.load)
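
For reference, the existing torch.hub.load route that the pip package is meant to resemble looks roughly like this. It is a minimal sketch, assuming the public snakers4/silero-models repository and its silero_stt entry point. load_silero_stt is a hypothetical helper name used only for illustration, and the post does not say whether this call picks up the en_v6 weights by default.

```python
# Assumption: repo and entry point names as used by the public Silero models repo.
SILERO_REPO = "snakers4/silero-models"
SILERO_ENTRY_POINT = "silero_stt"


def load_silero_stt(language="en", device="cpu"):
    """Hypothetical helper: fetch a Silero STT model via torch.hub.

    Returns the (model, decoder, utils) triple from the hub entry point.
    The first call needs network access to download the repo and weights.
    """
    # Imported lazily so that defining the helper does not require PyTorch.
    import torch

    model, decoder, utils = torch.hub.load(
        repo_or_dir=SILERO_REPO,
        model=SILERO_ENTRY_POINT,
        language=language,
        device=torch.device(device),
    )
    return model, decoder, utils
```

Calling load_silero_stt() requires PyTorch to be installed. The utils returned alongside the model are presumably what the planned pip package would bundle.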