For many people around the world, speaking to devices has become second nature, whether to get directions, check the news, or dictate voice notes. However, this convenience often disappears when technology cannot understand local languages, a reality for hundreds of millions of people, particularly in Sub-Saharan Africa, where over 2,000 distinct languages are spoken. The main challenge in developing inclusive voice technology for the region has been the lack of accessible, high-quality speech data.
To address this gap, researchers have introduced WAXAL, a dataset named after the Wolof word for "speak." Developed over three years, WAXAL aims to empower researchers and support the creation of inclusive technology across Africa. The dataset covers 21 languages, including Acholi, Hausa, Luganda, and Yoruba, and comprises over 11,000 hours of speech data from nearly two million recordings. It includes approximately 1,250 hours of transcribed speech for automatic speech recognition (ASR) and more than 20 hours of studio recordings for text-to-speech (TTS) applications.
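For researchers who want to work with a corpus of this shape, a typical first step is to load the transcribed ASR portion and iterate over audio–transcript pairs. The sketch below assumes distribution through the Hugging Face `datasets` library; the dataset identifier ("waxal/asr"), configuration name ("hausa"), and column names ("audio", "transcription") are hypothetical placeholders, not confirmed release details.

```python
# Minimal sketch of inspecting a transcribed ASR split, assuming a Hugging Face
# `datasets` release. The dataset ID, config, and column names are hypothetical.
from datasets import load_dataset

asr = load_dataset("waxal/asr", "hausa", split="train")

total_seconds = 0.0
for example in asr.select(range(100)):       # inspect the first 100 recordings
    audio = example["audio"]                 # decoded waveform plus sampling rate
    total_seconds += len(audio["array"]) / audio["sampling_rate"]
    # example["transcription"] would hold the paired text used for ASR training

print(f"Audio in sample: {total_seconds / 3600:.2f} hours")
```

The same pattern extends to the TTS studio recordings, where the paired text and audio can feed voice-building pipelines directly.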
The project is a collaborative effort led by African institutions and experts. Makerere University in Uganda and the University of Ghana collected data for 13 languages, while Digital Umuganda in Rwanda led data collection for five additional languages. High-quality studio recordings were produced in partnership with Media Trust and Loud n Clear, and the African Institute for Mathematical Sciences (AIMS) contributed multilingual datasets for future expansions. The framework ensures that partners retain ownership of the data they collected while making the resources available to the global research community.