Mozilla hoping to open source voice samples for future AI devs

Prying open speech recognition

His master's voice

Mozilla has decided speech recognition should be open source, and has launched Project Common Voice to achieve just that.

What the browser builder wants, it says, is an open source data set for voice recognition apps.

The open source community, Mozilla's Daniel Kessler writes, is the “next wave of innovators” – but with speech datasets locked up behind proprietary walls, they're left out.

That also skews speech recognition to the most lucrative markets (English, Chinese and “a select group of languages”), whereas Mozilla hopes enough participants will let speakers of less-common languages talk to their browsers.

And that's where the open data-gathering comes in: if you're interested, the Project Common Voice site lets users record their own voice (reading sentences to the system, starting for now with English), or review how accurately the software recognises other speakers.


(Vulture South's observation is that the page works better in Firefox than in Chrome – surprise! – and that naturally enough, you have to give the page permission to use your microphone.)

Ultimately the company wants to gather 10,000 hours of recordings for release in Q4 of this year. Presumably, once developers and researchers have their hands on the initial sample, the project will move on to other languages.


Biting the hand that feeds IT © 1998–2017
