Skip to main content

Showing 1–2 of 2 results for author: Lin, K H

Searching in archive cs. Search in all archives.
.
  1. arXiv:2406.07540  [pdf, other

    cs.CV cs.LG

    Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance

    Authors: Kuan Heng Lin, Sicheng Mo, Ben Klingher, Fangzhou Mu, Bolei Zhou

    Abstract: Recent controllable generation approaches such as FreeControl and Diffusion Self-guidance bring fine-grained spatial and appearance control to text-to-image (T2I) diffusion models without training auxiliary modules. However, these methods optimize the latent embedding for each type of score function with longer diffusion steps, making the generation process time-consuming and limiting their flexib… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 18 pages, 11 figures, see project page at https://genforce.github.io/ctrl-x

  2. arXiv:2312.07536  [pdf, other

    cs.CV

    FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition

    Authors: Sicheng Mo, Fangzhou Mu, Kuan Heng Lin, Yanli Liu, Bochen Guan, Yin Li, Bolei Zhou

    Abstract: Recent approaches such as ControlNet offer users fine-grained spatial control over text-to-image (T2I) diffusion models. However, auxiliary modules have to be trained for each type of spatial condition, model architecture, and checkpoint, putting them at odds with the diverse intents and preferences a human designer would like to convey to the AI models during the content creation process. In this… ▽ More

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: Project Page: https://genforce.github.io/freecontrol/