Automatically translated from Basque; the translation may contain errors.

How has artificial intelligence been used to spread the Basque language?

  • We spoke with Naiara Perez, researcher at the HiTZ Language Technology Center of the UPV/EHU; Itziar Cortes, researcher at Elhuyar; and Eli Pombo, manager at Iametza. Among other things, they talk about the place Basque has in artificial intelligence and about its possibilities, challenges and difficulties.

It's on everyone's lips, and in everyday life artificial intelligence is used more than people think. But what is it? “To give a simple definition, I would say that it is a field between mathematics and computer science in which computers are given human abilities, such as linguistic ability, vision, movement…”, explains Naiara Pérez. Artificial intelligence can be specific or general. Specific intelligence serves for particular actions, for example switching on a car's lights automatically on entering a tunnel, or answering questions put to a voice assistant. General intelligence, by contrast, is closer to that of human beings and can perform more than one kind of action, as is the case with OpenAI's conversational system ChatGPT.

Artificial intelligence works through neural networks, which try to mimic the human nervous system and its way of learning using computational tools. To do this, the networks have to be supplied with information, and learning algorithms have to be designed that enable the tools to learn patterns. “We're not there yet, but we've started to think seriously about whether artificial intelligence will be able to represent the mind of a human being in the coming years,” Pérez said.

Pombo, however, recalls that artificial intelligence is a tool: “I wouldn’t let us sit back. We must use technology rationally, ethically, and for truly useful purposes.”

Eli Pombo, Iametza: “I feel like we’re trying not to miss the train and to make room for ourselves, but I’m not pessimistic. Despite the difficulties, I think we also have a favorable wind.”

Artificial intelligence can be applied in many sectors, including language processing. This covers, among other things, systems that recognize and translate written language in real time, systems that take in spoken language and write it down, and systems that convert text into speech. According to Pérez, in recent years there have been “many” researchers working on language technology and its development in the Basque Country, and the sector is “strong”: “We are emerging from us”. So there is a sector doing research in Basque, but in all this technological whirlwind, what place does Basque have? What advantages and disadvantages does it have? What difficulties and possibilities?

“Artificial intelligence is the reality around us and Basque has to be there. Otherwise, it loses opportunities for digital presence,” Pombo stressed, adding that this is also an opportunity for Basque to spread to other places. Pérez agreed and pointed out that large-scale tools such as ChatGPT are usually in the hands of large companies: “We cannot wait to see what big companies are going to do. The priority of companies like Google and Microsoft is not to take all the languages of the world into account. These companies focus on the languages with the highest number of customers, that is, the hegemonic languages.” However, she added that companies of this kind have also begun to integrate Basque.

Pombo feels that those working in Basque in this field are “fighting”: “I feel that we are trying not to miss the train and to make room for ourselves, but I am not pessimistic. Despite the difficulties, I think we also have a favorable wind.” Pombo also sees “a lot of will” to do things on the part of public institutions and citizens. At the same time, according to Pombo, it is “impossible” to compete with the advances made by large companies: “We have to keep doing things with common sense and without getting frustrated.”

Elhuyar's tools include Elia and Entzun. The neural translator Elia translates plain text and formatted documents in a few seconds. The Entzun platform processes previously recorded audio and video files and creates transcripts and subtitles for them.

Technological sovereignty

Because technology is generally based on data, Pombo believes that, to avoid handing information over to large companies, it has to be managed “in a sovereign way”: “Free software allows you to improve what has been built for you and to strengthen the local economy.” In the same vein, Cortés stressed that users' data has to be used “rationally”. For that reason, Elhuyar does not use customer data to train its tools: “We have to be vigilant: we don’t read the small print, or nobody warns us, and we end up feeding these systems unintentionally.”

Pérez also stressed the importance of where the data comes from and said that in free models the ethical side is “cleaner”. The interviewees also underlined the value of working from local needs and aspirations. “If we do it, we will create content on the topics that interest us,” said Pérez. She added that creating “cutting-edge technology” helps to “nourish” the technology and research sectors in the Basque Country: “If we work openly, we can promote collaboration between the research centers here.”

Naiara Perez, HiTZ Center: “We cannot wait to see what big companies do. The priority of companies like Google and Microsoft is not to take all the languages of the world into account.”

Many make their tools available to others so that they can develop technology. One example is Latxa, created by the HiTZ Language Technology Center of the UPV/EHU. It is a large language model: when models of this kind are given a sequence of text, they return the most likely next word. “Latxa does nothing on its own; it’s an engine for building other applications,” explains Pérez. It can be used, for example, to create spelling checkers, question-answering applications and automatically generated exercises for teaching: “We, for example, have included Latxa as a user in the game Egunean Behin (‘Once a day’) to give answers.” It is available online and anyone can download it.

Lack of data, a difficulty

Data is the treasure of artificial intelligence: training the tools requires as much high-quality data as possible. For example, building a system that listens to speech and transcribes it directly requires many recordings and their transcripts. The researchers interviewed emphasized that one of the great difficulties Basque faces is obtaining large amounts of data. “Compared to the hegemonic languages, Basque does not have as much content, so it is harder to obtain results,” said Pérez. Cortés agreed, although she believes that, compared to other minority languages, there is “more” content in Basque.

Most of the current tools are trained on unified Basque (batua), although there are also tools that work with the dialects. The Batua translation service, for example, is based on artificial intelligence and neural networks; it was developed by the Vicomtech technology center and promoted by Euskaltel, Mondragonlingua and EITB. It supports Basque, French, Spanish, English and Biscayan, and can translate between Basque and all of those languages. “In the case of the Basque dialects it is much harder to obtain data; if it is hard in batua, imagine in Biscayan or Labourdin,” said Pérez. With the dialects, the lack of a standard and the variants found in each area hinder the process: “Biscayan is not unified, so if we feed the machine with the speech of Getxo, Gernika or Ondarroa, it is very difficult for it to build a pattern.”

Despite the difficulties, this does not mean that quality content is not produced in Basque. In fact, Cortés stressed that the content generated in Basque is looked after “a great deal”: “With the data we have, we’re getting very decent results.” She added that artificial intelligence has opened new doors for Basque, since with the old systems they were “limited”: with the first systems they were unable to create and release “truly useful” tools.

To put it in practical terms, she points to the advances in machine translation: “It didn’t work well before, but now it does.” In 2007, Elhuyar and the UPV/EHU created Matxin, the first free automatic translator. The system at that time was not based on artificial intelligence. “In the case of Basque, the results it gave were nowhere near those we have today. Until 2016, we did not achieve a quality system for translation between Basque and Spanish.” Today, the automatic translator is known as Elia.

By contrast, for other language pairs, for example translation between Spanish and Galician, older systems that are not based on artificial intelligence are still in use: “For closely related or similar languages, very good results were obtained. Basque has declensions, its verbs are different and its word order is free. That makes it difficult to write rules for going from one language to another. Thanks to artificial intelligence, today there are almost no limits.”

Itziar Cortes, Elhuyar: “If we use the automatic translator to translate into Basque what is not in Basque, and we do not check whether the translation is right, we are doing Basque no favors.”

As technology is able to do more and more things, Cortés believes that we have to be “reasonable”. Otherwise, she says, instead of being an instrument for spreading Basque, it can be counterproductive: “If we use it to translate content created in Basque, we are in some way spreading content that was created in Basque. But if we use the automatic translator to translate into Basque what is not in Basque, and we do not check whether the translation is right, we are doing Basque no favors.” She added that this will leave us with “low-quality” texts in the medium term: “If we use these texts to train future systems, the quality of their Basque will be low.”

Besides reviewing the generated content, what can be done to ensure quality? The HiTZ Language Technology Center has taken several routes. First, it has used content from Basque media published under a Creative Commons license. Because that data is not enough, large collections of files have also been used, with their content filtered. “We’ve got 4,000,000 documents, which is not enough to create a tool like ChatGPT; however, we’ve achieved good results.”

Another example is Elhuyar's Elia, Entzun and TTS tools. All three are based on artificial intelligence. The neural translator Elia translates plain text and formatted documents in a few seconds. The Entzun platform processes previously recorded audio and video files and creates transcripts and subtitles for them. The neural TTS turns text into speech. Elhuyar's technologies support Basque, Spanish, French, English, Catalan and Galician, which means that Elia, Entzun and TTS can be used in those six languages. This is Elhuyar's latest development: “We’ve seen that customers are multilingual and want to use our technology in languages other than Basque. With Basque as the main axis, they can use a single tool in more languages.”

Elhuyar's zientzia.eus website has an integrated neural translator. The notice reads: “Text written in Basque and automatically translated through Elia, without subsequent supervision.” In this example, the text appears in Catalan.

Cortés believes that all these tools can help Basque “expand”: “We don’t have to be afraid to create in Basque. If we want to reach more people, today we have many tools in addition to professional translators.” At Elhuyar, for example, they create content in Basque most of the time, but they translate some of it with an automatic translator for audiences outside the Basque Country, always alerting the user and offering access to the original version.

For example, the zientzia.eus portal has an integrated automatic translator. This means that if someone searches the Internet for “Como xurdiu o sistema solar?” (“How did the solar system come about?” in Galician), they can reach the zientzia.eus portal. The website offers the option of reading in another language and of seeing the original version. “We are clear that the reader has to know that the text was produced by an automatic translator, and not by a person,” said Cortés.

ARGIA recently took part in the Itzulinguru project and, like the zientzia.eus website, has integrated the automatic translator as part of the experiment proposed by the Sociolinguistics Cluster and the Innoklab research group of the UPV/EHU. The project has been supported by Elhuyar, AEK, Orai, Center for Artificial Intelligence, Osakidetza, the Department of Education of the Basque Government and Hekimen.

Although an automatic translator is used, not everything is left in the hands of the tools. Elhuyar staff review the content to ensure its quality; sometimes researchers do it, and at other times professional translators: “We are clear: professional translators are needed because automatic translation does not reach 100% quality. Moreover, you cannot leave it to just anyone to decide whether a text is good or not, because we don’t all have the same criteria.”

All the tools and resources now available can help in language learning. But what if, instead of awakening the desire to learn, they take away the wish to learn the language? According to Pérez, the desire or need to learn a language is linked to the need to communicate, but not only to that: “As far as Basque is concerned, I don’t think anyone learns it just to communicate. It's a choice, and there's a lot closer together. Really, if you want to learn Basque, French, Arabic or any other language, these kinds of tools will make the way easier, but a translator will not give you the pleasure of reading directly in Basque.”

