Indigo's Cabin in the Clouds

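AIGC Paper Reading: Stylized Generation Series

StyleAligned. What StyleAligned does: given a reference image, it produces a set of style-consistent generated images without any training or fine-tuning. How StyleAligned works: StyleAligned replaces every self-attention layer with a shared attention layer, as shown in the figure below: the first image in each batch is used as the ref...

2024-11-11 Paper Reading, AIGC

Read more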
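
As a rough illustration of the shared attention idea in the StyleAligned excerpt above, the sketch below lets each image's queries attend to its own keys and values concatenated with those of the first image in the batch. It is a simplified pure-Python sketch, not the paper's implementation: the function names (`attention`, `shared_attention`) and the nested-list tensor layout are assumptions made for illustration, and any further details of the method are left out.

```python
import math
from typing import List

Matrix = List[List[float]]  # one row per token


def softmax(scores: List[float]) -> List[float]:
    # Numerically stable softmax over a list of scores.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries: Matrix, keys: Matrix, values: Matrix) -> Matrix:
    # Plain scaled dot-product attention for a single image.
    scale = 1.0 / math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        weights = softmax(scores)
        dim = len(values[0])
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs


def shared_attention(
    batch_q: List[Matrix], batch_k: List[Matrix], batch_v: List[Matrix]
) -> List[Matrix]:
    # Assumption: index 0 of the batch is the reference image.
    ref_k, ref_v = batch_k[0], batch_v[0]
    outputs = []
    for i, (q, k, v) in enumerate(zip(batch_q, batch_k, batch_v)):
        if i == 0:
            # The reference image attends only to itself.
            outputs.append(attention(q, k, v))
        else:
            # Other images attend to the reference tokens plus their own.
            outputs.append(attention(q, ref_k + k, ref_v + v))
    return outputs
```

Because the other images see the reference keys and values in every such layer, their features are pulled toward the reference, which is one way to read the "style-consistent" behaviour described in the excerpt.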

AIGC Paper Reading: FLUX

2024-11-04 Paper Reading, AIGC

Read more

AIGC Paper Reading: PixArt Series

PixArt-alpha, PixArt-sigma

2024-11-04 Paper Reading, AIGC

Read more

AIGC Paper Reading: Stable Diffusion 3

2024-11-04 Paper Reading, AIGC

Read more

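AIGC Paper Reading: DiT and Its Variants

DiT. Main structure of DiT: DiT is derived from LDM, replacing the U-Net in LDM with a Transformer while still using the VAE from Stable Diffusion. After the last DiT block, the tokens are decoded into the output noise prediction and covariance prediction. DiT's patchify step is the same as ViT's, using a convolution to turn the compressed...

2024-11-04 Paper Reading, AIGC

Read more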
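
To make the patchify step from the DiT excerpt above more concrete, here is a minimal pure-Python sketch that cuts an H x W x C latent into non-overlapping p x p patches and flattens each one into a token. The function name `patchify` and the nested-list layout are assumptions for illustration. The learned linear projection is omitted; a convolution whose kernel size and stride both equal p would amount to this flattening followed by that projection.

```python
from typing import List

Latent = List[List[List[float]]]  # indexed as latent[row][col][channel]


def patchify(latent: Latent, patch_size: int) -> List[List[float]]:
    # Split the latent into non-overlapping patches in row-major order
    # and flatten each patch into one token of length p * p * C.
    height, width = len(latent), len(latent[0])
    if height % patch_size or width % patch_size:
        raise ValueError("latent size must be divisible by patch_size")
    tokens = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            token: List[float] = []
            for dy in range(patch_size):
                for dx in range(patch_size):
                    token.extend(latent[top + dy][left + dx])
            tokens.append(token)
    return tokens


# Hypothetical 4 x 4 latent with one channel, values 0..15 in row-major order.
example = [[[float(r * 4 + c)] for c in range(4)] for r in range(4)]
example_tokens = patchify(example, patch_size=2)  # 4 tokens of length 4
```

The token count is (H / p) * (W / p), so halving the patch size quadruples the sequence length the Transformer has to process.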

AIGC Paper Reading: Consistency Model Series

Consistency Model, Latent Consistency Model (LCM)

2024-11-01 Paper Reading, AIGC

Read more

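AIGC Paper Reading: Flow Matching Series

Flow Matching, Rectified Flow. The basic idea of Rectified Flow: a model learns the velocity v at which data points move between two distributions, and the update $x_{t+1}=x_t+vt,\ t\in(0,1)$ transforms source data into target data. v is learned by a neural network, and multiple rounds of ReFlow (using the trained model to generate paired data for a second round of model train...

2024-11-01 Paper Reading, AIGC

Read more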
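
The Rectified Flow update in the excerpt above can be sketched as a simple Euler loop that moves a source point along a velocity field from t = 0 to t = 1. This is only a toy sketch: the neural network is replaced by a hand-written stand-in (`make_straight_velocity`, a hypothetical helper) that returns the constant velocity of a straight line between one source/target pair, and the explicit step size `dt` is an assumption about how the update is discretized.

```python
from typing import Callable, List

Vector = List[float]


def euler_sample(
    x0: Vector, velocity: Callable[[Vector, float], Vector], num_steps: int
) -> Vector:
    # Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed-size Euler steps.
    dt = 1.0 / num_steps
    x = list(x0)
    for step in range(num_steps):
        t = step * dt
        v = velocity(x, t)
        x = [xi + vi * dt for xi, vi in zip(x, v)]
    return x


def make_straight_velocity(
    source: Vector, target: Vector
) -> Callable[[Vector, float], Vector]:
    # Stand-in for a trained network: on the straight path
    # x_t = (1 - t) * source + t * target the velocity is target - source.
    def velocity(x: Vector, t: float) -> Vector:
        return [b - a for a, b in zip(source, target)]

    return velocity


source = [0.0, 1.0]
target = [2.0, -1.0]
result = euler_sample(source, make_straight_velocity(source, target), num_steps=4)
```

With a perfectly straight path the velocity is constant, so even a single Euler step would land on the target; straighter trajectories are what make few-step sampling feasible.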

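AIGC Paper Reading: Controllable Series

ControlNet. Types of control conditions ControlNet can add: Canny edges, Hough lines, user scribbles, human keypoints, segmentation maps, shape normals, and depth. Multiple conditions can be combined by simply summing the ControlNet outputs. Structure of ControlNet: it takes SD's U-Net Encod...

2024-10-31 Paper Reading, AIGC

Read more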

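AIGC Paper Reading: Editing Series

Prompt-to-Prompt. What Prompt-to-Prompt does: it edits a generated image using only the text prompt, with no mask, while keeping everything outside the edited region consistent. The process requires no training or fine-tuning, only adjusting the attention maps. Motivation for Prompt-to-Prompt: the authors found that the spatial layout and geometry of the generated image are determined by the internal cross-attention layers' attention ma...

2024-10-31 Paper Reading, AIGC

Read more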

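AIGC Paper Reading: Personalization Series

Textual Inversion. What Textual Inversion does: given 3-5 images of a subject, it freezes the T2I model and optimizes the embedding of a pseudo-word so that the model learns to generate that subject. The optimization ran for 5000 steps on 2 V100s. Optimization procedure of Textual Inversion: the placeholder's embedding vector is set as a learnable embeddi...

2024-10-31 Paper Reading, AIGC

Read more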