We present a self-learning approach for synthesizing programs from integer
sequences. Our method relies on a tree search guided by a learned policy. Our
system is tested on the On-Line Encyclopedia of Integer Sequences. There, it
discovers, on its own, solutions for 27987 sequences starting from basic
operators and without human-written training examples.
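
As a rough illustration of what synthesizing a program from an integer sequence means, the sketch below searches a tiny expression language for a formula that reproduces the given terms. It is a minimal sketch under stated assumptions: the operator set (`LEAVES`, `BINARY`), the names `enumerate_programs` and `synthesize`, and the zero-based index `n` are all hypothetical, and plain exhaustive enumeration stands in for the tree search guided by a learned policy that the abstract describes.

```python
from itertools import product

# Hypothetical mini-language: expressions over the index n built from basic operators.
LEAVES = {
    "n": lambda n: n,
    "0": lambda n: 0,
    "1": lambda n: 1,
    "2": lambda n: 2,
}
BINARY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def enumerate_programs(depth):
    """Yield (text, function) pairs for every expression of nesting depth <= depth."""
    if depth == 0:
        yield from LEAVES.items()
        return
    smaller = list(enumerate_programs(depth - 1))
    yield from smaller  # shallower programs come first
    for (ta, fa), (tb, fb) in product(smaller, repeat=2):
        for op, f in BINARY.items():
            # Bind the current sub-programs as defaults so each lambda keeps its own.
            yield f"({ta} {op} {tb})", (lambda n, fa=fa, fb=fb, f=f: f(fa(n), fb(n)))


def synthesize(target, max_depth=2):
    """Return the first (text, function) whose values at n = 0, 1, ... match target."""
    for text, fn in enumerate_programs(max_depth):
        if all(fn(n) == t for n, t in enumerate(target)):
            return text, fn
    return None


if __name__ == "__main__":
    found = synthesize([1, 3, 5, 7, 9])
    print(found[0] if found else "no program found")
```

Exhaustive enumeration grows very quickly with depth, which is why a guided search is attractive for anything beyond toy formulas.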

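We propose a self-learning method for synthesizing programs from integer sequences. Tested on the OEIS, our system autonomously discovers solutions for 27987 sequences, starting from basic operators and without human-written training examples.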

Learning Program Synthesis for Integer Sequences from Scratch

Robotic manipulation in complex open-world scenarios requires both reliable
physical manipulation skills and effective and generalizable perception. In
this paper, we propose a method where general purpose pretrained visual models
serve as an object-centric prior for the perception system of a learned policy.
We devise an object-level attentional mechanism that can be used to determine
relevant objects from a few trajectories or demonstrations, and then
immediately incorporate those objects into a learned policy. A task-independent
meta-attention locates possible objects in the scene, and a task-specific
attention identifies which objects are predictive of the trajectories. The
scope of the task-specific attention is easily adjusted by showing
demonstrations with distractor objects or with diverse relevant objects. Our
results indicate that this approach exhibits good generalization across object
instances using very few samples, and can be used to learn a variety of
manipulation tasks using reinforcement learning.

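This paper proposes a method for robotic manipulation in complex open-world environments. It uses general-purpose pretrained visual models as an object-centric prior for the perception system and introduces an object-level attention mechanism that determines relevant objects from a few trajectories or demonstrations and incorporates them into a learned policy. A variety of manipulation tasks can then be learned with reinforcement learning.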

Deep Object-Centric Representations for Generalizable Robot Learning

Conventional wisdom holds that model-based planning is a powerful approach to
sequential decision-making. It is often very challenging in practice, however,
because while a model can be used to evaluate a plan, it does not prescribe how
to construct a plan. Here we introduce the "Imagination-based Planner", the
first model-based, sequential decision-making agent that can learn to
construct, evaluate, and execute plans. Before any action, it can perform a
variable number of imagination steps, which involve proposing an imagined
action and evaluating it with its model-based imagination. All imagined actions
and outcomes are aggregated, iteratively, into a "plan context" which
conditions future real and imagined actions. The agent can even decide how to
imagine: testing out alternative imagined actions, chaining sequences of
actions together, or building a more complex "imagination tree" by navigating
flexibly among the previously imagined states using a learned policy. Our
agent can also learn to plan economically, jointly optimizing for external rewards
and computational costs associated with using its imagination. We show that our
architecture can learn to solve a challenging continuous control problem, and
also learn elaborate planning strategies in a discrete maze-solving task. Our
work opens a new direction toward learning the components of a model-based
planning system and how to use them.
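
The imagine-then-act loop described above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the Imagination-based Planner itself: the toy dynamics in `model`, the function name `plan_and_act`, the fixed imagination budget `n_imagine`, and the per-step `imagine_cost` are all hypothetical, and random action proposals stand in for the learned components.

```python
import random


def model(state, action):
    """Assumed toy dynamics: the action shifts the state; reward penalizes distance from 0."""
    next_state = state + action
    return next_state, -abs(next_state)


def plan_and_act(state, n_imagine, rng, imagine_cost=0.01):
    """Imagine n_imagine candidate actions, then take one real action.

    Returns the next state and the external reward minus the imagination cost.
    """
    plan_context = []  # aggregated (imagined action, predicted reward) pairs
    for _ in range(n_imagine):
        action = rng.uniform(-1.0, 1.0)  # stand-in for a learned proposal policy
        _, predicted_reward = model(state, action)
        plan_context.append((action, predicted_reward))

    if plan_context:
        # Act on the best imagined outcome collected in the plan context.
        action = max(plan_context, key=lambda pair: pair[1])[0]
    else:
        # No imagination budget: act without a plan.
        action = rng.uniform(-1.0, 1.0)

    # Real step; in this toy setting the real environment equals the model.
    next_state, reward = model(state, action)
    return next_state, reward - imagine_cost * n_imagine


if __name__ == "__main__":
    rng = random.Random(0)
    for budget in (0, 5, 50):
        print(budget, plan_and_act(0.5, budget, rng))
```

The subtraction of `imagine_cost * n_imagine` is one simple way to express the trade-off between external reward and computation; in the abstract, the number of imagination steps is variable and learned rather than fixed by the caller.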

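We introduce an imagination-based planner that can learn to construct, evaluate, and execute plans. Using a learned policy, among other means, it can simulate multiple alternative actions, and it jointly optimizes for external rewards and computational cost.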