While diffusion-based text-to-image (T2I) models provide a simple and powerful way to generate images, guiding this generation remains a challenge. For concepts that are difficult to describe through language, users may struggle to create prompts. Moreover, many of these models are built as end-to-end systems and lack support for iteratively shaping the image. In response, we introduce PromptPaint, which combines T2I generation with interactions modeled on how we use colored paints. PromptPaint allows users to go beyond language by mixing prompts to express challenging concepts. Just as we iteratively tune colors through layered placements of paint on a physical canvas, PromptPaint allows users to apply different prompts to different areas of the canvas and at different times in the generative process. Through a set of studies, we characterize different approaches for mixing prompts, design trade-offs, and socio-technical challenges for generative models. With PromptPaint, we provide insight into future steerable generative tools.
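
As a rough illustration of what mixing prompts and applying them at different times in the generative process could look like, the sketch below blends prompt embeddings by weight and picks the active prompt for a given denoising step. It is a minimal sketch under stated assumptions, not PromptPaint's implementation: the abstract does not specify the mechanism, the toy 2-D vectors stand in for real text-encoder embeddings, and the names `mix_prompts` and `prompt_for_step` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def mix_prompts(
    embeddings: Dict[str, Sequence[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Blend prompt embeddings as a weighted average.

    Assumption: prompts are mixed by linear interpolation in embedding
    space. Weights are normalized so they sum to 1, like paint ratios.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(next(iter(embeddings.values())))
    mixed = [0.0] * dim
    for name, weight in weights.items():
        vector = embeddings[name]
        for i in range(dim):
            mixed[i] += (weight / total) * vector[i]
    return mixed


def prompt_for_step(schedule: List[Tuple[int, str]], step: int) -> str:
    """Return the prompt active at a denoising step.

    `schedule` is a hypothetical list of (start_step, prompt_name) pairs
    sorted by start_step; the last entry whose start_step has been
    reached is the active prompt.
    """
    active = schedule[0][1]
    for start, name in schedule:
        if step >= start:
            active = name
    return active


# Toy embeddings; a real system would obtain these from a text encoder.
toy_embeddings = {"cat": [1.0, 0.0], "tiger": [0.0, 1.0]}

# Three parts "cat" to one part "tiger".
blend = mix_prompts(toy_embeddings, {"cat": 3.0, "tiger": 1.0})

# Use one prompt for early steps and another from step 20 onward.
schedule = [(0, "sketch"), (20, "oil painting")]
early = prompt_for_step(schedule, 5)
late = prompt_for_step(schedule, 25)
```

Applying different prompts to different canvas areas would additionally need a spatial mask per prompt, which this sketch leaves out.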

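By combining T2I generation with interactions modeled on how we use colored paints, PromptPaint lets users go beyond language by mixing prompts to express challenging concepts and apply different prompts to different areas of the canvas and at different times in the image generation process. It also offers insight into future steerable generative tools.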

PromptPaint: Steering Text-to-Image Generation Through Paint Medium-like Interactions