Model:

camenduru/potat1


Please follow me for new updates: https://twitter.com/camenduru
Please join our Discord server: https://discord.gg/k5BwmmvJJU

Potat 1️⃣

First Open-Source 1024x576 Text To Video Model

https://huggingface.co/vdo/potat1-5000/tree/main
https://huggingface.co/vdo/potat1-10000/tree/main
https://huggingface.co/vdo/potat1-10000-base-text-encoder/tree/main
https://huggingface.co/vdo/potat1-15000/tree/main
https://huggingface.co/vdo/potat1-20000/tree/main
https://huggingface.co/vdo/potat1-25000/tree/main
https://huggingface.co/vdo/potat1-30000/tree/main
https://huggingface.co/vdo/potat1-35000/tree/main
https://huggingface.co/vdo/potat1-40000/tree/main
https://huggingface.co/vdo/potat1-45000/tree/main
https://huggingface.co/vdo/potat1-50000/tree/main
https://huggingface.co/vdo/potat1-50000-base-text-encoder/tree/main
= https://huggingface.co/camenduru/potat1 (you are here)
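The checkpoint repositories listed above follow a regular naming pattern, so the list can be regenerated in a script instead of being copied by hand. The following is a minimal sketch under that assumption; the function name `checkpoint_urls` and its parameters are hypothetical, and it only builds URL strings without checking that each repository is reachable.

```python
# Base URL of the account hosting the intermediate checkpoints (taken from the list above).
BASE = "https://huggingface.co/vdo"


def checkpoint_urls(first=5000, last=50000, step=5000,
                    base_text_encoder_steps=(10000, 50000)):
    """Rebuild the checkpoint URL list from its step counts.

    Hypothetical helper: the step range and the two "base-text-encoder"
    variants are read off the list above, not from any official API.
    """
    urls = []
    for steps in range(first, last + 1, step):
        urls.append(f"{BASE}/potat1-{steps}/tree/main")
        # Two of the listed checkpoints also have a base-text-encoder variant.
        if steps in base_text_encoder_steps:
            urls.append(f"{BASE}/potat1-{steps}-base-text-encoder/tree/main")
    return urls


if __name__ == "__main__":
    for url in checkpoint_urls():
        print(url)
```

Run as a script, this prints the twelve checkpoint links in the same order as the list above.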

Info

Prototype model trained with https://lambdalabs.com ❤
1xA100 (40GB)
2197 clips, 68388 tagged frames (salesforce/blip2-opt-6.7b-coco)
train_steps: 10000
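For a rough sense of the dataset's density, the clip and frame counts above can be combined into an average. This is only a back-of-the-envelope sketch; the variable names are illustrative, and how the tagged frames are actually distributed across clips is not stated here.

```python
# Figures taken from the Info section above.
clips = 2197
tagged_frames = 68388

# Average number of captioned frames per clip (a mean only; the real
# per-clip distribution is not given in this card).
frames_per_clip = tagged_frames / clips

print(f"about {frames_per_clip:.0f} tagged frames per clip on average")
```

That works out to roughly 31 tagged frames per clip.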

Dataset & Config

https://huggingface.co/camenduru/potat1_dataset/tree/main

Finetuning

https://github.com/Breakthrough/PySceneDetect
https://github.com/ExponentialML/Video-BLIP2-Preprocessor
https://github.com/ExponentialML/Text-To-Video-Finetuning
https://github.com/camenduru/Text-To-Video-Finetuning-colab

Base Model

https://huggingface.co/damo-vilab/modelscope-damo-text-to-video-synthesis
https://www.modelscope.cn/models/damo/text-to-video-synthesis

Thanks to damo-vilab, ExponentialML, kabachuha, @DiffusersLib, @LambdaAPI, @cerspense, @CiaraRowles1, and @p1atdev_art

Thanks to Orellius ❤ (important bug report)

Please try it: https://github.com/camenduru/text-to-video-synthesis-colab

Potat 2️⃣ is in the oven ♨