Video Generation
Recently, numerous AGI applications have caught the attention of almost everyone on the internet. This section lists some advanced papers that elucidate their key principles and technologies.

DiT
The authors explore a new class of diffusion models based on the transformer architecture, Diffusion Transformers (DiTs) [1]. Before their work, diffusion models for image generation predominantly used a U-Net backbone rather than a transformer. The authors experiment with variants of the standard transformer block that incorporate conditioning via adaptive layer norm, cross-attention, and extra input tokens....
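
To make the adaptive layer norm variant more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It only illustrates the general idea that the scale and shift applied after normalization are regressed from a conditioning vector instead of being fixed learned parameters. The function names, the toy dimensions, and the `1 + scale` parameterization are assumptions made for illustration.

```python
import math

def layer_norm(x, eps=1e-6):
    """Normalize one token vector to zero mean and unit variance (no learned affine)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def linear(weight, bias, vec):
    """Compute weight @ vec + bias, with weight given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]

def ada_layer_norm(x, cond, w_scale, b_scale, w_shift, b_shift):
    """Adaptive layer norm: scale and shift are predicted from the conditioning vector."""
    scale = linear(w_scale, b_scale, cond)
    shift = linear(w_shift, b_shift, cond)
    normed = layer_norm(x)
    # Assumed parameterization: a zero scale leaves the normalized value unchanged.
    return [n * (1.0 + s) + t for n, s, t in zip(normed, scale, shift)]

# Hypothetical toy sizes: token dimension 4, conditioning dimension 2.
token = [1.0, 2.0, 3.0, 4.0]
cond = [0.5, -1.0]  # stands in for e.g. a timestep or class embedding

zero_w = [[0.0, 0.0] for _ in range(4)]
zero_b = [0.0] * 4

# With all-zero projections the conditioning has no effect.
plain = layer_norm(token)
out_zero = ada_layer_norm(token, cond, zero_w, zero_b, zero_w, zero_b)

# A shift projection that copies the first conditioning entry moves every feature by 0.5.
copy_first = [[1.0, 0.0] for _ in range(4)]
out_shift = ada_layer_norm(token, cond, zero_w, zero_b, copy_first, zero_b)
```

In this sketch, `out_zero` equals the plain layer-norm output, while `out_shift` is the same vector moved by the first conditioning entry. Cross-attention and extra input tokens, the other two conditioning routes mentioned, inject the conditioning through the attention layers or the input sequence instead.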