In Environment Design, one interested party seeks to affect another agent's
decisions by applying changes to the environment. Most research on planning
environment (re)design assumes the interested party's objective is to
facilitate the recognition of goals and plans, and searches over the space of
environment modifications to find the minimal set of changes that simplifies
those tasks and optimises a particular metric. This search space is usually
intractable, so existing approaches devise metric-dependent pruning techniques
to search more efficiently. As a result, these approaches do not generalise
across different objectives or metrics. In this paper,
we argue that the interested party could have objectives and metrics that are
not necessarily related to recognising agents' goals or plans. Thus, to
generalise the task of Planning Environment Redesign, we develop a general
environment redesign approach that is metric-agnostic and leverages recent
research on top-quality planning to efficiently redesign planning environments
according to any interested party's objective and metric. Experiments over a
set of environment redesign benchmarks show that our general approach
outperforms existing approaches on well-known metrics, such as facilitating
the recognition of goals, and is effective when solving environment redesign
tasks that optimise a novel set of metrics.
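
To make the problem statement concrete, here is a minimal sketch of a metric-agnostic redesign search, written as the naive exhaustive baseline rather than the top-quality-planning approach the abstract describes. The names `redesign`, `toy_metric`, and the `block_*` modification labels are hypothetical, and the metric is assumed to be a function of the applied modification set where lower is better.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterable, Tuple

Modification = str  # hypothetical label for one environment change


def redesign(
    modifications: Iterable[Modification],
    metric: Callable[[FrozenSet[Modification]], float],
    max_changes: int,
) -> Tuple[FrozenSet[Modification], float]:
    """Brute-force search over modification sets of size <= max_changes.

    The metric is an arbitrary callable (lower is better), so nothing in
    the search depends on a specific objective. Sets are enumerated by
    increasing size and only strict improvements are accepted, so ties
    are resolved in favour of fewer changes.
    """
    candidates = list(modifications)
    best_set: FrozenSet[Modification] = frozenset()
    best_score = metric(best_set)
    for k in range(1, max_changes + 1):
        for combo in combinations(candidates, k):
            applied = frozenset(combo)
            score = metric(applied)
            if score < best_score:
                best_set, best_score = applied, score
    return best_set, best_score


def toy_metric(applied: FrozenSet[Modification]) -> float:
    """Hypothetical metric: remaining ambiguity about the agent's goal."""
    ambiguity = 3
    if "block_a" in applied:
        ambiguity -= 1
    if "block_b" in applied:
        ambiguity -= 2
    return ambiguity


result = redesign(["block_a", "block_b", "block_c"], toy_metric, max_changes=2)
```

The number of candidate sets grows combinatorially with `max_changes`, which illustrates why the abstract calls the search space intractable and why existing approaches resort to pruning.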

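In environment design, the environment is modified to influence another agent's decisions. This paper proposes a general environment redesign approach that does not depend on a specific metric or objective and that leverages recent research on top-quality planning to efficiently redesign planning environments according to any interested party's objective and metric. Experiments show that the approach outperforms existing approaches on well-known metrics (such as goal recognition) and is effective on environment redesign tasks that optimise different metrics.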

Generalising Planning Environment Redesign

In this lecture, we present a general perspective on reinforcement learning
(RL) objectives, covering three versions of the objective. The first version
is the standard definition of the objective in the RL literature. We then
extend the standard definition to a $\lambda$-return version, which unifies the
standard definition of the objective. Finally, we propose a general objective
that unifies the previous two versions. This last version provides a
high-level understanding of RL's objective: it gives a fundamental formulation
that connects some widely used RL techniques (e.g., TD$(\lambda)$ and GAE), and
it can potentially be applied to a wide range of RL algorithms.
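
As a point of reference for the techniques named above, the following is a minimal sketch of the textbook GAE recursion and the $\lambda$-return it induces. It is not the general objective proposed in the lecture. The function names and the convention that `values` holds one more entry than `rewards` (the final entry being the bootstrap value) are assumptions made for this illustration.

```python
from typing import List, Sequence


def gae(
    rewards: Sequence[float],
    values: Sequence[float],
    gamma: float,
    lam: float,
) -> List[float]:
    """Generalized advantage estimates for one trajectory.

    Assumes len(values) == len(rewards) + 1, with values[-1] the
    bootstrap value of the state after the last reward.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error at time t.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def lambda_returns(
    rewards: Sequence[float],
    values: Sequence[float],
    gamma: float,
    lam: float,
) -> List[float]:
    """Lambda-returns, obtained as advantage plus value at each step."""
    advantages = gae(rewards, values, gamma, lam)
    # zip stops at the shorter sequence, so the bootstrap value is skipped.
    return [a + v for a, v in zip(advantages, values)]
```

With `lam=0` the estimate reduces to the one-step TD error, and with `lam=1` the $\lambda$-return becomes the discounted Monte Carlo return plus the discounted bootstrap value, which is the sense in which the $\lambda$ parameter interpolates between the two.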

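This paper presents a general objective for reinforcement learning, covering the standard definition of the objective, an extended $\lambda$-return version, and a general objective that unifies the two. It offers a high-level understanding of RL's objective and connects some widely used RL techniques (e.g., TD$(\lambda)$ and GAE); the objective can potentially be applied to a wide range of RL algorithms.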

A General Perspective on Objectives of Reinforcement Learning

Data augmentation is an effective way to improve the performance of many
neural text generation models. However, current data augmentation methods need
to define or choose proper data mapping functions that map the original samples
into the augmented samples. In this work, we derive an objective to formulate
the problem of data augmentation on text generation tasks without any use of
augmented data constructed by specific mapping functions. Our proposed
objective can be efficiently optimized and applied to popular loss functions on
text generation tasks with a convergence rate guarantee. Experiments on five
datasets of two text generation tasks show that our approach can approximate or
even surpass popular data augmentation methods.

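This paper proposes an approach to data augmentation for text generation tasks that does not use augmented data constructed by specific mapping functions. The approach can be optimized efficiently and applied to popular loss functions for text generation, with a guaranteed convergence rate. Experiments show that it can match or even surpass popular data augmentation methods.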