Pretrained transformer-based models have shown high performance in natural
language generation tasks. Recently, interest has surged in a related problem:
automatic programming language generation, the task of translating natural
language instructions into source code. Although well-known pretrained
language generation models have achieved good performance in learning
programming languages, automatic code generation still requires further
effort. In this paper, we introduce JaCoText, a model based on the Transformer
neural network. It aims to generate Java source code from natural language
text. JaCoText leverages the advantages of both natural language and code
generation models. More specifically, we study some findings from the state of
the art and use them to (1) initialize our model from powerful pretrained
models, (2) explore additional pretraining on our Java dataset, (3) carry out
experiments combining unimodal and bimodal data during training, and (4)
scale the input and output length during the fine-tuning of the model.
Experiments conducted on the CONCODE dataset show that JaCoText achieves new
state-of-the-art results.

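This paper introduces JaCoText, a model based on the Transformer neural network that aims to generate Java source code from natural language text. By initializing from powerful pretrained models, exploring additional pretraining on a Java dataset, combining unimodal and bimodal data during training, and scaling the input and output length during fine-tuning, JaCoText achieves new state-of-the-art results in experiments on the CONCODE dataset.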

JaCoText: A Pretrained Model for Java Code-Text Generation

Pretrained transformer-based models such as BERT have demonstrated
state-of-the-art predictive performance when adapted to a range of natural
language processing tasks. An open problem is how to improve the faithfulness
of explanations (rationales) for the predictions of these models. In this
paper, we hypothesize that salient information extracted a priori from the
training data can complement the task-specific information learned by the model
during fine-tuning on a downstream task. To this end, we propose SaLoss, an
auxiliary loss function that guides the multi-head attention mechanism during
training to stay close to salient information extracted a priori using
TextRank, so that BERT does not forget to assign importance to informative
input tokens when making predictions. Experiments on explanation
faithfulness across five datasets show that models trained with SaLoss
consistently provide more faithful explanations across four different feature
attribution methods compared to vanilla BERT. Using the rationales extracted
from vanilla BERT and SaLoss models to train inherently faithful classifiers,
we further show that the latter result in higher predictive performance in
downstream tasks.

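This paper examines how to improve the faithfulness of the explanations (rationales) that pretrained Transformer-based models give for their predictions. It proposes SaLoss, an auxiliary loss function that uses salient information extracted from the training data with TextRank to guide BERT during fine-tuning on downstream tasks. Experiments show that models trained with SaLoss provide more faithful explanations than vanilla BERT, and that rationales extracted from SaLoss models yield higher predictive performance on downstream tasks.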
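
The idea of an auxiliary loss that keeps attention close to a priori salience can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how closeness is measured, so the KL divergence, the weight `lam`, and all function names below are assumptions chosen for illustration. The salience scores stand in for per-token TextRank scores, and the attention scores stand in for one head's unnormalized attention over the same tokens.

```python
import math


def softmax(scores):
    """Turn unnormalized attention scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def normalize(weights):
    """Rescale non-negative salience scores so they sum to 1."""
    total = sum(weights)
    if total <= 0:
        # No salience signal: fall back to a uniform distribution.
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def combined_loss(task_loss, attention_scores, salience, lam=0.1):
    """Task loss plus a weighted penalty for attention that drifts from salience.

    Assumption: the penalty is KL(salience || attention) and lam is a
    hypothetical hyperparameter; the source does not specify either.
    """
    attention = softmax(attention_scores)
    target = normalize(salience)
    return task_loss + lam * kl_divergence(target, attention)


# Hypothetical four-token input: the first token is the most salient.
salience = [7.0, 1.0, 1.0, 1.0]
uniform_attention = [0.0, 0.0, 0.0, 0.0]
aligned_attention = [math.log(7.0), 0.0, 0.0, 0.0]

print(combined_loss(1.0, uniform_attention, salience))  # penalized above 1.0
print(combined_loss(1.0, aligned_attention, salience))  # approximately 1.0
```

In this sketch the penalty vanishes when the attention distribution matches the normalized salience and grows as attention spreads away from salient tokens, which is the behavior the abstract describes at a high level.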