According to HiTZ Zentroa, Latxa "is a family of open models" that includes the "largest language model in Basque". It is built on Llama 2, the language model from Meta (Facebook), and follows its license. Meta has already achieved excellent results in Basque: its Seamless M4T product can produce accurate spoken machine translation in Basque. Latxa’s logo deliberately links the llama with the Basque latxa sheep, and the name makes the same connection (as we suspected).
The Latxa family comprises models of between 7 and 70 billion parameters. To build them, the Basque researchers used EusCrawl, a Basque text corpus of 1.72 million documents and 288 million words. EusCrawl was extracted from 33 quality websites, which gives it higher quality than other training corpora gathered from the Internet.
Latxa has not yet been built for the general public; that will come later. However, the three models are available on the Hugging Face platform, and expert engineers can start using them by consulting the “model card”, which contains the technical information and the instructions for getting started.
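For engineers who want to try the models, the following is a minimal sketch of loading one of them with the Hugging Face `transformers` library. It is not taken from the model card: the repository identifiers (`HiTZ/latxa-7b-v1` and its siblings), the intermediate 13-billion-parameter size, and the helper names `latxa_repo_id` and `generate` are assumptions made for illustration, so check the model card for the exact names and recommended settings.

```python
# Assumption: the three Latxa sizes are published under these repository
# names on Hugging Face. Verify against the model card before use.
LATXA_SIZES = ("7b", "13b", "70b")


def latxa_repo_id(size: str = "7b") -> str:
    """Build the (assumed) Hugging Face repository id for a Latxa model size."""
    if size not in LATXA_SIZES:
        raise ValueError(f"unknown size {size!r}; expected one of {LATXA_SIZES}")
    return f"HiTZ/latxa-{size}-v1"


def generate(prompt: str, size: str = "7b", max_new_tokens: int = 50) -> str:
    """Download the model (many gigabytes) and continue a Basque prompt."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = latxa_repo_id(size)
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo)

    # Tokenize the prompt into PyTorch tensors and let the model extend it.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Euskal Herriko hiriburuak hauek dira:")` would download the smallest model and return the prompt followed by the model's continuation. The larger sizes need considerably more memory, so starting with the 7-billion-parameter model is the more practical choice.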
Latxa was developed through a research, development and innovation initiative that is part of the IKER-GAITU project, supported by the Basque Government in cooperation with the European EuroHPC programme.
Today's language models perform remarkably well in English, as ChatGPT and Bard show. That is not the case, however, for minority languages, Basque among them. With these models, HiTZ Zentroa has taken a step toward turning the situation around: according to its data, Latxa responds better than other systems to prompts in Basque.
More information, here.
On Hugging Face: Latxa.