Digest 2022-01 # Speech AI that understands speech by lookin | Spark in me

Digest 2022-01

# Speech

AI that understands speech by looking as well as hearing - https://ai.facebook.com/blog/ai-that-understands-speech-by-looking-as-well-as-hearing

HuBERT: Self-supervised representation learning for speech recognition, generation, and compression - https://ai.facebook.com/blog/hubert-self-supervised-representation-learning-for-speech-recognition-generation-and-compression

# ML

Graph Neural Networks (in Russian) - https://dyakonov.org/2021/12/30/gnn/
A Gentle Introduction to Graph Neural Networks - https://distill.pub/2021/gnn-intro/
GPT-3, Foundation Models, and AI Nationalism - https://lastweekin.ai/p/gpt-3-foundation-models-and-ai-nationalism
The Illustrated Retrieval Transformer - https://jalammar.github.io/illustrated-retrieval-transformer/
You get what you measure: New NLU benchmarks for few-shot learning and robustness evaluation - https://www.microsoft.com/en-us/research/blog/you-get-what-you-measure-new-nlu-benchmarks-for-few-shot-learning-and-robustness-evaluation/
Azure AI milestone: New foundation model Florence v1.0 advances state of the art, topping popular computer vision leaderboards - https://www.microsoft.com/en-us/research/blog/azure-ai-milestone-new-foundation-model-florence-v1-0-pushing-vision-and-vision-language-state-of-the-art/
Language modelling at scale: Gopher, ethical considerations, and retrieval - https://deepmind.com/blog/article/language-modelling-at-scale
Sequence-to-sequence learning with Transducers - https://lorenlugosch.github.io/posts/2020/11/transducer/
A contemplation of logsumexp - https://lorenlugosch.github.io/posts/2020/06/logsumexp/
Meta claims its AI improves speech recognition quality by reading lips - https://venturebeat.com/2022/01/07/meta-claims-its-ai-improves-speech-recognition-quality-by-reading-lips/
Training 100B models is fucking hard - https://github.com/bigscience-workshop/bigscience/blob/master/train/lessons-learned.md
Scaling Vision with Sparse Mixture of Experts - https://ai.googleblog.com/2022/01/scaling-vision-with-sparse-mixture-of.html
Model interpretation and data shift diagnostics: LIME, SHAP, and Shapley Flow (in Russian) - https://habr.com/ru/company/ods/blog/599573/
A ConvNet for the 2020s - https://arxiv.org/pdf/2201.03545.pdf
LaMDA: Towards Safe, Grounded, and High-Quality Dialog Models for Everything - https://ai.googleblog.com/2022/01/lamda-towards-safe-grounded-and-high.html
Separating Birdsong in the Wild for Classification - https://ai.googleblog.com/2022/01/separating-birdsong-in-wild-for.html
Accurate Alpha Matting for Portrait Mode Selfies on Pixel 6 - https://ai.googleblog.com/2022/01/accurate-alpha-matting-for-portrait.html
The Gradient Update #16: China's World-leading Surveillance Research and a ConvNet for the 2020s - https://thegradientpub.substack.com/p/the-gradient-update-16-chinas-world
Does Gradient Flow Over Neural Networks Really Represent Gradient Descent? - http://www.offconvex.org/2022/01/06/gf-gd/
Does Your Medical Image Classifier Know What It Doesn’t Know? - https://ai.googleblog.com/2022/01/does-your-medical-image-classifier-know.html
Introducing Text and Code Embeddings in the OpenAI API - https://openai.com/blog/introducing-text-and-code-embeddings/
Steering Towards Effective Autonomous Vehicle Policy - https://thegradient.pub/engaging-with-disengagement/

Introducing StylEx: A New Approach for Visual Explanation of Classifiers - https://ai.googleblog.com/2022/01/introducing-stylex-new-approach-for.html
- tldr: very cool, but most likely requires a lot of compute
