Parameter quantization for Large Language Models (LLMs) has recently attracted
increasing attention as a way to reduce memory costs and improve computational
efficiency. Early approaches have been widely adopted. However, existing
methods perform poorly in low-bit scenarios (e.g., 2 to 3 bits). In this paper,
we present a novel and effective Column-Level
Adaptive weight Quantization (CLAQ) framework by introducing three different
types of adaptive strategies for LLM quantization. First, we propose a K-Means
clustering-based algorithm that dynamically generates quantization centroids
for each column of a parameter matrix. Second, we design an outlier-guided
adaptive precision search strategy that dynamically assigns varying bit-widths
to different columns. Finally, we develop a dynamic outlier reservation scheme
that retains some parameters in their original floating-point precision,
trading some compression for improved model performance.
Experiments on various mainstream open-source LLMs, including LLaMA-1, LLaMA-2,
and Yi, demonstrate that our methods achieve state-of-the-art results across
different bit settings, especially in extremely low-bit scenarios. Code will be
released soon.
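
The column-wise K-Means idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `kmeans_1d` and `quantize_columns`, the deterministic centroid initialization, and the fixed iteration count are all assumptions made for illustration, and the adaptive precision search and outlier reservation steps are omitted.

```python
def kmeans_1d(values, k, iters=20):
    """Lloyd's algorithm on scalars. Returns (centroids, assignments)."""
    uniq = sorted(set(values))
    k = min(k, len(uniq))  # cannot have more clusters than distinct values
    if k == 1:
        centroids = [sum(values) / len(values)]
    else:
        # Assumed init: evenly spaced picks from the sorted distinct values.
        centroids = [uniq[round(i * (len(uniq) - 1) / (k - 1))] for i in range(k)]

    def nearest(v):
        return min(range(k), key=lambda j: abs(v - centroids[j]))

    for _ in range(iters):
        assign = [nearest(v) for v in values]
        for j in range(k):
            members = [v for v, a in zip(values, assign) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = sum(members) / len(members)
    return centroids, [nearest(v) for v in values]


def quantize_columns(matrix, bits):
    """Replace every entry by the nearest of 2**bits centroids fitted per column."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    quantized = [[0.0] * n_cols for _ in range(n_rows)]
    codebooks = []
    for c in range(n_cols):
        column = [matrix[r][c] for r in range(n_rows)]
        centroids, assign = kmeans_1d(column, 2 ** bits)
        codebooks.append(centroids)
        for r in range(n_rows):
            quantized[r][c] = centroids[assign[r]]
    return quantized, codebooks
```

Each column gets its own codebook, so a column with an unusual value range is not forced onto a grid shared with the rest of the matrix. In a real deployment one would store the per-row centroid indices and the codebooks rather than the dequantized values returned here.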

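This paper introduces a parameter quantization method based on the Column-Level Adaptive weight Quantization (CLAQ) framework. By introducing three different adaptive strategies, it reduces memory usage and improves computational efficiency in large language models. Experimental results show that the method achieves state-of-the-art results across different bit settings, especially in extremely low-bit scenarios.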

CLAQ: Pushing the Limits of Low-Bit Post-Training Quantization for LLMs

Semantic communications learned on background knowledge bases (KBs) have been
identified as a promising technology for communications between intelligent
agents. Existing works assume that the transceivers in semantic communications
share the same KB. However, exchanging KB data may impose a heavy communication
burden on intelligent transceivers or raise concerns about privacy leakage.
Besides, the transceivers may independently learn from the environment and
dynamically update their KBs, making timely sharing of the KBs infeasible.
All of these factors cause a mismatch between the KBs, which may result in
semantic-level misunderstanding on the receiver side. To address this issue, we
propose a transceiver cooperative learning-assisted semantic communication
(TCL-SC) scheme against mismatched KBs. In TCL-SC, the transceivers
cooperatively train semantic encoder and decoder neural networks (NNs) of the
same structure based on their own KBs and periodically share the NN
parameters. To reduce the communication overhead of parameter sharing, the
parameters are quantized. Moreover, we discuss the impact of the number of
communication rounds on the performance of semantic communication systems.
Experiments on real-world data demonstrate that our proposed TCL-SC can reduce
the semantic-level misunderstanding on the receiver side caused by the mismatch
between the KBs, especially in the low signal-to-noise ratio (SNR) regime.
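
One round of quantized parameter sharing, as described above, might look like the following minimal sketch. The abstract does not say how the shared parameters are combined or which quantizer is used, so the uniform min-max quantizer and the simple averaging step are assumptions, as are the function names `quantize`, `dequantize`, and `cooperative_round`.

```python
def quantize(params, bits):
    """Uniform min-max quantization. Returns (integer codes, offset, step)."""
    lo, hi = min(params), max(params)
    levels = 2 ** bits - 1
    step = (hi - lo) / levels if hi > lo else 1.0  # avoid a zero step
    codes = [round((p - lo) / step) for p in params]
    return codes, lo, step


def dequantize(codes, lo, step):
    """Map integer codes back to approximate floating-point parameters."""
    return [lo + c * step for c in codes]


def cooperative_round(local_params, remote_message):
    """Assumed merge rule: average local parameters with the peer's
    dequantized parameters (the source does not specify the rule)."""
    remote_params = dequantize(*remote_message)
    return [(a + b) / 2 for a, b in zip(local_params, remote_params)]
```

Only the integer codes plus two floats (`lo` and `step`) cross the channel, so with `bits=8` each shared parameter costs one byte instead of a full floating-point value.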

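This paper studies errors in knowledge-base-driven semantic communication and proposes a neural-network-based cooperative learning method that reduces misunderstanding on the receiver side.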

Transceiver Cooperative Learning-aided Semantic Communications Against Mismatched Background Knowledge Bases

Continual learning tackles the setting of learning different tasks
sequentially. Despite many previous solutions, most still suffer from
significant forgetting or high memory cost. In this work, to address these
problems, we first study the continual learning process through the lens of
information theory and observe that a model's forgetting stems from the loss of
information gain on its parameters from previous tasks when learning a new
task. From this viewpoint, we then propose a novel continual
learning approach called Bit-Level Information Preserving (BLIP) that preserves
the information gain on model parameters through updating the parameters at the
bit level, which can be conveniently implemented with parameter quantization.
More specifically, BLIP first trains a neural network with weight quantization
on the new incoming task and then estimates information gain on each parameter
provided by the task data to determine the bits to be frozen to prevent
forgetting. We conduct extensive experiments ranging from classification tasks
to reinforcement learning tasks, and the results show that our method produces
results better than or on par with previous state-of-the-art methods. Indeed,
BLIP achieves close to zero forgetting while requiring only constant memory
overhead throughout continual learning.
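
The bit-freezing step can be illustrated with a minimal sketch. It assumes each quantized parameter is stored as an unsigned integer code of `total_bits` bits and that the number of frozen most-significant bits has already been chosen. How BLIP estimates information gain to pick that number is not covered here, and the function name `clamp_to_frozen_bits` is hypothetical.

```python
def clamp_to_frozen_bits(old_code, new_code, total_bits, frozen_bits):
    """Return new_code restricted so that the top `frozen_bits` bits of
    old_code are preserved; only the remaining low bits may change."""
    free_bits = total_bits - frozen_bits
    prefix = (old_code >> free_bits) << free_bits  # frozen high bits, low bits zeroed
    lowest = prefix                                # all free bits set to 0
    highest = prefix + (1 << free_bits) - 1        # all free bits set to 1
    return max(lowest, min(highest, new_code))


# Usage: an 8-bit parameter whose top 3 bits were frozen after an earlier task.
old = 0b10110100                                   # 180
print(clamp_to_frozen_bits(old, 200, 8, 3))        # update clipped to the frozen range
print(clamp_to_frozen_bits(old, 170, 8, 3))        # update within range is kept
```

Freezing the high bits confines a parameter to an interval around its earlier value, and the interval narrows as more bits are frozen. Under this sketch's assumption, the only extra state is one frozen-bit count per parameter, regardless of the number of tasks.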

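To address the forgetting and high memory cost that can arise in continual learning, this paper proposes a new method, Bit-Level Information Preserving (BLIP), from an information-theoretic perspective. BLIP updates model parameters through parameter quantization so as to preserve the information gain that previous tasks provide on the model parameters, thereby avoiding forgetting. Extensive experiments show that the method performs strongly on both classification and reinforcement learning tasks, achieving close to zero forgetting while requiring only constant memory overhead.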