Meta just released Omnilingual speech recognition tech that recognizes 1,600+ languages

모키

12 hours ago


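Meta just put out something really impressive! It's called Omnilingual Automatic Speech Recognition (ASR), and it can recognize more than 1,600 languages. What makes it extra special is that 500 of those are low-coverage languages that no ASR system has ever supported before. Most speech recognition systems only cover a handful of major languages, but Meta went for real breadth here. There's a demo site where you can explore the languages yourself, and Matt Setzler, a researcher on the project, has posted a detailed breakdown. Meta says it's releasing a full suite of models plus a dataset, and the post names two models. First, Omnilingual ASR, a suite of ASR models ranging from 300M to 7B parameters and supporting 1,600+ languages. Second, Omnilingual w2v 2.0, a 7B-parameter multilingual speech representation model that can be used for other speech-related tasks too. With tech like this, people who speak low-coverage languages should get to benefit from speech technology as well. Pretty huge step forward, right? 🦉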

Attached media

@AIatMeta

12 hours ago

Introducing Meta Omnilingual Automatic Speech Recognition (ASR), a suite of models providing ASR capabilities for over 1,600 languages, including 500 low-coverage languages never before served by any ASR system.

While most ASR systems focus on a limited set of languages that are https://t.co/D6Xv6c1MLy

Head over to the Omnilingual demo to explore the languages in the dataset: https://t.co/0ailqmwdaB

Omnilingual ASR was made possible by combining the capabilities of several other models developed by Meta. Matt Setzler, a researcher on the project, breaks it all down here. https://t.co/FZ0SeAvP3v

Today we’re releasing a full suite of models and a dataset:

1️⃣ Omnilingual ASR: A suite of ASR models ranging from 300M to 7B parameters, supporting 1600+ languages

2️⃣ Omnilingual w2v 2.0: a 7B-parameter multilingual speech representation model that can be leveraged for other
