In recent years, Large Language Models (LLMs) have demonstrated exceptional proficiency across a broad spectrum of Natural Language Processing (NLP) tasks, including Machine Translation. However, previous methods predominantly relied on iterative processes such as instruction fine-tuning or continual pre-training, leaving unexplored the challenges of training LLMs solely on parallel data. In this work, we introduce PLUME (Parallel Language Model), a collection of three 2B LLMs featuring varying vocabulary sizes (32k, 128k, and 256k) trained exclusively on Catalan-centric parallel examples. These models perform comparably to previous encoder-decoder architectures on 16 supervised translation directions and 56 zero-shot ones. Utilizing this set of models, we conduct a thorough investigation into the translation capabilities of LLMs, probing their performance, the impact of the different elements of the prompt, and their cross-lingual representation space.
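The direction counts quoted above (16 supervised, 56 zero-shot) are what a Catalan-centric setup would give if Catalan were paired with eight other languages. That figure of eight is an inference from the two counts, not something the abstract states. The minimal sketch below enumerates the directions under that assumption; the function name and the `lang1` to `lang8` labels are hypothetical placeholders, not the actual languages used.

```python
from itertools import permutations


def count_directions(pivot, others):
    """Split translation directions into supervised and zero-shot sets.

    Assumption: every training pair has the pivot on one side, so any
    direction between two non-pivot languages is never seen in training.
    """
    # Supervised: pivot -> x and x -> pivot for every other language x.
    supervised = [(pivot, x) for x in others] + [(x, pivot) for x in others]
    # Zero-shot: every ordered pair of distinct non-pivot languages.
    zero_shot = list(permutations(others, 2))
    return supervised, zero_shot


# Placeholder labels (hypothetical); "ca" stands for Catalan.
others = [f"lang{i}" for i in range(1, 9)]
supervised, zero_shot = count_directions("ca", others)
print(len(supervised), len(zero_shot))  # prints: 16 56
```

With n non-pivot languages this gives 2n supervised and n(n-1) zero-shot directions, and n = 8 is the only value that matches both reported counts.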

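This work introduces PLUME (Parallel Language Model), a collection of three 2B LLMs with different vocabulary sizes (32k, 128k, and 256k), trained exclusively on Catalan-centric parallel data. The models perform comparably to previous encoder-decoder architectures on 16 supervised and 56 zero-shot translation directions. Using these models, we investigate the translation capabilities of LLMs in depth, examining their performance, the impact of the different elements of the prompt, and their cross-lingual representation space.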

Investigating the Translation Capabilities of Large Language Models Trained on Parallel Data Only