Automatically translated from Basque; the translation may contain errors.

Latxa: Hitz creates the largest free language model in Basque

  • The large open Catalan language model Aina Flor was presented recently, and in last week's news we reported that Eneko Agirre, director of the Hitz Basque centre, had announced that a Basque counterpart was coming shortly. Just yesterday, the Hitz centre made it public: Latxa. An LLM is a large language model, a kind of super-database on which artificial intelligence initiatives are built. LLMs underlie the versions of OpenAI's ChatGPT, for example. Now we have one of these in Basque (well, strictly speaking a family of models, made up of three).
This article is reproduced under the CC BY-SA 3.0 license.

30 January 2024 - 07:30

According to Hitz Zentroa, Latxa is a "family of open models" that includes the "largest language model in Basque". It is built on Llama 2, the language model from Meta (Facebook), and follows its license. Meta has already shown excellent results in Basque: its Seamless M4T product can produce correct spoken machine translation in Basque. Latxa's logo plays precisely on the link between the llama and the Basque sheep, and the name makes the same connection (as we suspected).

Latxa comprises models ranging from 7 to 70 billion parameters. To build the models, the Basque researchers used EusCrawl, a corpus of Basque texts containing 1.72 million documents and 288 million words. EusCrawl was extracted from 33 quality websites, so it offers higher quality than other training corpora gathered from the Internet.

For now, Latxa is not aimed at the general public; that will come later. However, the three models are available on the Hugging Face platform, and an expert engineer can use them by consulting the “model card”, which contains the technical information and the instructions for getting started with the models.

Latxa is the result of a research, development and innovation initiative that is part of the IKER-GAITIK project, supported by the Basque Government, in cooperation with the European EuroHPC programme.

Today's language models perform remarkably well in English, as ChatGPT and Bard show. For minority languages, Basque among them, that is not the case. With these models, Hitz Zentroa has taken a step towards turning the situation around, and according to its data, Latxa responds better than other systems to prompts in Basque.

More information, here.

On Hugging Face: Latxa.

