February 21, 2024
Learning a new language has never been easier. Thanks to programs like Rosetta Stone, Pimsleur, and the popular language app Duolingo, more people are learning languages each year. In fact, at the onset of the COVID-19 pandemic in 2020, more than 30 million people started using Duolingo. At the end of 2023, Duolingo reported more than 40 languages available in its app, an impressive feat for artificial intelligence software.
However, not everyone is quick to join Big Tech by offering their services and, most importantly, their language. Minority communities, especially indigenous communities, are cautious about sharing their collected data and funneling it into AI programs. In an attempt to preserve their languages and heritage, communities are pursuing data sovereignty to protect their indigenous identities.
In essence, data sovereignty builds on the idea that any data collected and stored is subject to the governance and laws of the country where it is collected. This means that if you want to use data from another country, you need to adhere to that country’s specific laws regarding that information. Those laws can cover how the data is used, stored, and shared.
For indigenous communities that have historically suffered from colonialism and wrongful acquisition (of businesses, works of art, and land, for example), the idea of data sovereignty acts as a form of protection. It reinforces their right of ownership and control over their data.
Restoring entire societies’ native languages is an uphill battle, especially since many native speakers have passed on. Those who remain must teach their language to younger generations. Another avenue, of course, is to use AI technology and the other tools available in our ever-growing digital world.
Peter-Lucas Jones and Keoni Mahelona of New Zealand work at the news outlet Te Hiku Media. They’ve engineered a computer with AI technology to help digitize hundreds of hours of recordings of te reo Māori, New Zealand’s native language. As with many other indigenous languages, the number of te reo Māori speakers plummeted, falling from 90% to 12% of the Māori population within a generation as a direct impact of colonization.
Jones and Mahelona are on a mission to collect the language, sound, and stories of Aotearoa (the te reo Māori name for New Zealand) and prevent Big Tech companies from taking that data to sell for profit. This is where data sovereignty comes into play.
Advancements in AI have taken off in recent years and continue to develop at a rapid pace. It may be hard to control the sharing of digital data at the global level, but it’s not impossible. For preservationists like Jones and Mahelona, the issue isn’t whether to use AI tools; they already do, with great success. The concern is how indigenous communities can legally manage and control the use of their data if and when they decide to share it.
Data sovereignty provides indigenous communities with a leg to stand on, so to speak, if companies like Duolingo or Google seek out these communities’ data for their own financial benefit.
Many indigenous languages are lost to the world. For those that remain, we must ensure not only that we learn and preserve them, but that we protect them as well. By taking action through data privacy and establishing data sovereignty laws, the people who speak these languages can retain control over how others may use their data.