This work suggests fundamentally rethinking the current practice of pruning
large language models (LLMs). The prevailing approach is divide and conquer:
split the model into submodels, sequentially prune them, and reconstruct
predictions of the dense counterparts on small calibration data one at a time;
the final model is obtained simply by putting the resulting sparse submodels
together. While this approach enables pruning under memory constraints, it
generates high reconstruction errors. In this work, we first present an array
of reconstruction techniques that can significantly reduce this error by more
than $90\%$. Unexpectedly, however, we discover that minimizing reconstruction
error is not always ideal and can overfit the given calibration data, resulting
in higher language perplexity and worse performance on downstream
tasks. We find that a strategy of self-generating calibration data can
mitigate this trade-off between reconstruction and generalization, suggesting
new directions in the presence of both benefits and pitfalls of reconstruction
for pruning LLMs.
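
The reconstruction error discussed above can be illustrated for a single linear layer. The following is a minimal sketch, not the authors' method: it assumes plain magnitude pruning and a relative squared-error metric on calibration inputs, and the names `magnitude_prune` and `relative_reconstruction_error` are hypothetical.

```python
def matvec(weights, x):
    """Multiply a weight matrix (list of rows) by an input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of entries with the smallest magnitude.

    Assumption: unstructured magnitude pruning stands in for whatever
    pruning criterion is actually used.
    """
    entries = sorted(
        (abs(w), i, j) for i, row in enumerate(weights) for j, w in enumerate(row)
    )
    n_prune = int(len(entries) * sparsity)
    pruned = [row[:] for row in weights]
    for _, i, j in entries[:n_prune]:
        pruned[i][j] = 0.0
    return pruned


def relative_reconstruction_error(dense, sparse, calibration_inputs):
    """Squared output difference between dense and sparse layers,
    normalized by the squared dense output, summed over calibration inputs."""
    num = 0.0
    den = 0.0
    for x in calibration_inputs:
        y_dense = matvec(dense, x)
        y_sparse = matvec(sparse, x)
        num += sum((a - b) ** 2 for a, b in zip(y_dense, y_sparse))
        den += sum(a ** 2 for a in y_dense)
    return num / den


dense = [[2.0, 0.1], [0.2, -3.0]]
calibration = [[1.0, 1.0], [1.0, -1.0]]
sparse = magnitude_prune(dense, sparsity=0.5)
error = relative_reconstruction_error(dense, sparse, calibration)
```

A reconstruction technique would then adjust the remaining nonzero weights to lower `error` on the calibration inputs. The abstract's caution is that driving this number down on a small calibration set does not guarantee lower perplexity on held-out text.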

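LLM pruning is typically done by splitting the model into submodels, pruning them sequentially, reconstructing the predictions of their dense counterparts, and assembling the resulting sparse submodels. This paper presents a set of reconstruction techniques that substantially reduce the reconstruction error, finds that minimizing this error is not always ideal, and introduces self-generated calibration data to balance the trade-off between reconstruction and generalization, pointing to new directions for pruning large language models.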

Rethinking Pruning Large Language Models: Benefits and Pitfalls of Reconstruction Error Minimization

Relation Extraction (RE) serves as a crucial technology for transforming
unstructured text into structured information, especially within the framework
of Knowledge Graph development. Its importance is emphasized by its essential
role in various downstream tasks. Besides conventional RE methods based on
neural networks and pre-trained language models, large language models
(LLMs) are also used for RE. However, on low-resource
languages (LRLs), both conventional and LLM-based methods perform
poorly due to data scarcity. To address this, this paper
constructs low-resource relation extraction datasets in 10 LRLs in three
regions (Central Asia, Southeast Asia and Middle East). The corpora are
constructed by translating the original publicly available English RE datasets
(NYT10, FewRel and CrossRE) using an effective multilingual machine
translation system. Then, we use language perplexity (PPL) to filter out the
low-quality data from the translated datasets. Finally, we conduct an empirical
study and validate the performance of several open-source LLMs on these
generated LRL RE datasets.
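
The perplexity-based filtering step can be sketched as follows. This is a minimal illustration under stated assumptions: per-token log-probabilities are taken as already computed by some language model, and the threshold `max_ppl` and the function names are hypothetical, since the abstract does not specify them.

```python
import math


def perplexity(token_logprobs):
    """Perplexity from natural-log token probabilities:
    exp of the negative mean log-probability."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def filter_by_perplexity(scored_sentences, max_ppl):
    """Keep sentences whose perplexity does not exceed `max_ppl`.

    `scored_sentences` is a list of (text, token_logprobs) pairs.
    Assumption: a single global threshold; the actual filtering rule
    may differ per language or dataset.
    """
    return [
        text
        for text, logprobs in scored_sentences
        if perplexity(logprobs) <= max_ppl
    ]


samples = [
    ("fluent translation", [math.log(0.5)] * 3),    # perplexity about 2
    ("garbled translation", [math.log(0.01)] * 3),  # perplexity about 100
]
kept = filter_by_perplexity(samples, max_ppl=10.0)
```

Here only the first sample survives, because its perplexity is below the hypothetical threshold of 10.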

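This study constructs relation extraction datasets for ten low-resource languages, filters the translated data by language perplexity, and evaluates open-source large language models on these datasets.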

How Good are LLMs at Relation Extraction under Low-Resource Scenario? Comprehensive Evaluation

Prior studies have shown that pretrained language models (PLMs) can boost the
performance of text-based recommendation. In contrast to previous works that
either use a PLM to encode user history as a whole input text, or impose an
additional aggregation network to fuse multi-turn history representations, we
propose a unified local- and global-attention Transformer encoder to better
model two-level contexts of user history. Moreover, conditioned on user history
encoded by Transformer encoders, our framework leverages Transformer decoders
to estimate the language perplexity of candidate text items, which can serve as
a straightforward yet significant contrastive signal for user-item text
matching. Based on this, our framework, UniTRec, unifies the contrastive
objectives of discriminative matching scores and candidate text perplexity to
jointly enhance text-based recommendation. Extensive evaluation shows that
UniTRec delivers SOTA performance on three text-based recommendation tasks.
Code is available at this https URL
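
One way to picture how a matching score and candidate perplexity might be combined into a joint contrastive objective is sketched below. This is not UniTRec's actual formulation, which the abstract does not give. It assumes a softmax cross-entropy contrastive loss over the candidates, uses negative log-perplexity as the second score, and weights the two terms equally. All function names are hypothetical.

```python
import math


def contrastive_loss(scores, positive_index):
    """Softmax cross-entropy over candidates: -log softmax(scores)[positive]."""
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[positive_index]


def joint_loss(match_scores, candidate_ppls, positive_index):
    """Sum of two contrastive terms, one per signal.

    Assumption: lower perplexity means a better match, so the
    perplexity-based score is the negative log-perplexity.
    """
    ppl_scores = [-math.log(p) for p in candidate_ppls]
    return (
        contrastive_loss(match_scores, positive_index)
        + contrastive_loss(ppl_scores, positive_index)
    )


# The positive candidate (index 0) has the higher matching score and the
# lower perplexity, so its loss is lower than when the perplexities are swapped.
loss_good = joint_loss([1.0, 0.0], [2.0, 8.0], positive_index=0)
loss_bad = joint_loss([1.0, 0.0], [8.0, 2.0], positive_index=0)
```

Under these assumptions, training pushes the clicked item toward both a higher matching score and a lower perplexity relative to the other candidates.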

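This paper proposes UniTRec, a framework that uses pretrained language models to enhance text-based recommendation. It uses Transformer encoders and decoders to process user history and candidate text, uses language perplexity as a contrastive signal for matching, and achieves SOTA performance.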