BriefGPT.xyz
Jun, 2023
Massively Multilingual Corpus of Sentiment Datasets and Multi-faceted Sentiment Classification Benchmark
Łukasz Augustyniak, Szymon Woźniak, Marcin Gruza, Piotr Gramacki, Krzysztof Rajda...
TL;DR
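This study provides an open multilingual corpus of 79 datasets that can be used to train sentiment models. It also presents a multi-faceted sentiment classification benchmark that summarizes hundreds of experiments across different base models, training objectives, dataset collections, and fine-tuning strategies.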
Abstract
Despite impressive advancements in multilingual corpora collection and model training, developing large-scale deployments of multilingual models still presents a significant challenge. This is particularly true f