Prompt-based Continual Learning (PCL) has gained considerable attention as a
promising continual learning solution because it achieves state-of-the-art
performance while avoiding privacy violations and memory overhead.
Nonetheless, existing PCL approaches carry a significant computational burden
because they require two Vision Transformer (ViT) feed-forward stages: one for
the query ViT, which generates a prompt query to select prompts from a prompt
pool, and one for the backbone ViT, which mixes information between the selected
prompts and the image tokens. To address this, we introduce a one-stage PCL framework by
directly using an intermediate layer's token embedding as the prompt query. This
design removes the need for an additional feed-forward stage for the query ViT,
resulting in a ~50% reduction in computational cost for both training and
inference with a marginal accuracy drop of less than 1%. We further introduce a Query-Pool
Regularization (QR) loss that regulates the relationship between the prompt
query and the prompt pool to improve representation power. The QR loss is
applied only during training, so it adds no computational overhead at
inference. With the QR loss, our approach maintains the ~50% reduction in
computational cost during inference while outperforming prior two-stage PCL
methods by ~1.4% on public class-incremental continual learning benchmarks
including CIFAR-100, ImageNet-R, and DomainNet.

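In summary, the work introduces a one-stage PCL framework that uses an intermediate layer's token embedding as the prompt query, eliminating the extra feed-forward stage of the query ViT and thereby reducing computational cost by about 50% in both training and inference, with an accuracy drop of less than 1%. It also introduces a Query-Pool Regularization (QR) loss that improves the relationship between the prompt query and the prompt pool; this loss is applied only during training, so it incurs no computational overhead at inference. With the QR loss, the approach still maintains the roughly 50% reduction in computational cost during inference and outperforms prior two-stage PCL methods by about 1.4% on public class-incremental continual learning benchmarks including CIFAR-100, ImageNet-R, and DomainNet.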
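
To make the cost argument concrete, the following is a minimal sketch of prompt selection and of a rough layer-count comparison between the two-stage and one-stage designs. It is not the authors' implementation. The cosine-similarity top-k matching rule, the function names `select_prompts` and `layer_passes`, the 12-layer depth, and the toy vectors are all assumptions chosen for illustration; the abstract does not specify them. The cost count also ignores the extra tokens that selected prompts add to the backbone.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def select_prompts(query, prompt_keys, top_k):
    """Return indices of the top_k prompt keys most similar to the query.

    Assumption: prompts are matched by cosine similarity between the query
    and one key vector per prompt; the abstract does not state the rule.
    """
    scores = [cosine(query, key) for key in prompt_keys]
    ranked = sorted(range(len(prompt_keys)), key=lambda i: scores[i], reverse=True)
    return ranked[:top_k]


def layer_passes(num_layers, two_stage):
    """Count transformer-layer evaluations per image.

    Two-stage: a full query-ViT pass plus a full backbone pass.
    One-stage: a single backbone pass whose intermediate token embedding
    doubles as the prompt query. Prompt-token overhead is ignored.
    """
    return 2 * num_layers if two_stage else num_layers


# Toy stand-in for an intermediate-layer token embedding (hypothetical values).
query = [1.0, 0.0, 0.5]

# Toy prompt pool keys (hypothetical values).
prompt_keys = [
    [1.0, 0.1, 0.4],
    [-1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.8, 0.0, 0.6],
]

selected = select_prompts(query, prompt_keys, top_k=2)

# Assumed depth of 12 layers, used only to illustrate the ratio.
cost_ratio = layer_passes(12, two_stage=False) / layer_passes(12, two_stage=True)
```

Under these simplifying assumptions, dropping the separate query pass halves the number of layer evaluations, which is consistent with the roughly 50% cost reduction reported above.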