Stable Diffusion ControlNet

Repurposing Stable Diffusion Attention for Training-Free Unsupervised Interactive Segmentation

Abstract: Recent progress in interactive, point-prompt-based image segmentation significantly reduces the manual effort needed to obtain high-quality semantic labels. State-of-the-art unsupervised ...

IEEE

SynSonic: Augmenting Sound Event Detection through Text-to-Audio Diffusion ControlNet and ...

Abstract: Data synthesis and augmentation are essential for Sound Event Detection (SED) due to the scarcity of temporally labeled data. While augmentation methods like SpecAugment and Mix-up can ...

GitHub

Stable Video Infinity: Infinite-Length Video Generation with Error Recycling

main (this branch): SVI using Wan 2.1 base model (both SVI 1.0/2.0)
svi_wan22 branch: SVI using Wan 2.2 base model (both SVI 2.0/2.0 Pro)
SVI 2.0 Pro ComfyUI Workflows and Videos from the Community ...

GitHub

Diffusion Transformers with Representation Autoencoders (RAE)

We present Representation Autoencoders (RAE), a class of autoencoders that utilize pretrained, frozen representation encoders such as DINOv2 and SigLIP2 as encoders with trained ViT decoders. RAE can ...
