DreaMontage

Arbitrary Frame-Guided One-Shot Video Generation

Jiawei Liu* Junqiao Li* Jiangfan Deng* Gen Li* Siyu Zhou Zetao Fang Shanshan Lao Zengde Deng Jianing Zhu Tingting Ma Jiayi Li Yunqiu Wang Qian He Xinglong Wu

* Equal contribution, Corresponding author

Intelligence Creation Team, ByteDance

Research Paper

Flexible Dreams, Seamless Montage

A short demo video by DreaMontage

Video Generation from Multi-keyframe-condition
Our model can generate videos with given keyframes placed at specified positions. Hover over the video to reveal the insertion time and input prompt.
insert_seconds: [[0, 0], [5, 5], [10, 10], [15, 15]] prompt: [0, 10] Train moves forward, static window frame; [10, 15] Window shatters into digital fragments; [15, 20] Camera flies through the window into a future cyberpunk city.
insert_seconds: [[0, 0], [2, 2], [7, 7], [12, 12]] prompt: [0, 2] Droplet turns into a swan; [2, 7] Swan turns into a mechanical pocket watch; [7, 12] Mechanical pocket watch turns into a mechanical rose.
insert_seconds: [[0, 0], [5, 5], [10, 10]] prompt: [0, 5] An ape approaches the cave and gradually stands upright; [5, 10] The primitive human walks out of the cave, starts riding a horse, becomes a soldier.
insert_seconds: [[0, 0], [5, 5], [10, 10]] prompt: [0, 5] The camera follows the dandelion; [5, 10] The dandelion dives into the deep sea and transforms into a glowing jellyfish.
insert_seconds: [[0, 0], [5, 5], [10, 10]] prompt: [0, 5] The camera pushes forward, then dives downward; [5, 10] The camera pushes in, dives downward, and enters the cabin through the doorway.
insert_seconds: [[0, 0], [5, 5], [10, 10], [15, 15], [20, 20]] prompt: [0, 5] Zoom into extreme close-up of a man's eye; camera flies through the right pupil to reveal a man walking fast; [5, 10] Man walks forward; a golden butterfly enters. He follows it with his gaze until its wings fully obscure the lens; [10, 15] Butterfly flies away revealing a golden meadow; a boy runs in to chase it; [15, 20] The boy turns around, faces the camera directly, smiling.
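The captions above follow a regular pattern: an `insert_seconds` list of `[start, end]` pairs, then a `prompt` made of `[start, end]` time ranges each followed by text. The following is a minimal sketch of reading such a caption into structured data. The function name `parse_caption` and the returned tuple layout are assumptions made for illustration; this is not the project's actual input format or API.

```python
import ast
import re

# Matches a time range such as "[0, 5]" that opens a prompt segment.
SEGMENT = re.compile(r"\[(\d+),\s*(\d+)\]")


def parse_caption(caption):
    """Split a demo caption into (insert_seconds, prompt_segments).

    Hypothetical helper: the caption layout is inferred from the page text.
    """
    head, _, prompt_part = caption.partition("prompt:")
    inserts_text = head.split("insert_seconds:", 1)[1].strip()
    # The list is written as a plain nested literal, so literal_eval is enough.
    insert_seconds = [tuple(pair) for pair in ast.literal_eval(inserts_text)]

    # Split on the time ranges rather than on ";" because a single segment
    # may itself contain semicolons.
    matches = list(SEGMENT.finditer(prompt_part))
    segments = []
    for i, match in enumerate(matches):
        stop = matches[i + 1].start() if i + 1 < len(matches) else len(prompt_part)
        text = prompt_part[match.end():stop].strip().rstrip(";").strip()
        segments.append((int(match.group(1)), int(match.group(2)), text))
    return insert_seconds, segments


if __name__ == "__main__":
    example = (
        "insert_seconds: [[0, 0], [5, 5], [10, 10]] "
        "prompt: [0, 5] The camera follows the dandelion; "
        "[5, 10] The dandelion dives into the deep sea and transforms "
        "into a glowing jellyfish."
    )
    inserts, prompts = parse_caption(example)
    print(inserts)
    for start, end, text in prompts:
        print(start, end, text)
```

Splitting on the bracketed ranges keeps multi-clause prompts intact, which matters for captions whose segments contain their own semicolons.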
Video Generation from Video-condition (Transition)
Our model can seamlessly connect multiple videos. Hover over the video to reveal the insertion time and input prompt.
insert_seconds: [[0, 3], [7, 10]] prompt: [3, 7] Camera follows a man skiing. The snow ahead transitions into the sea. His snowboard transforms into a yellow surfboard, and he starts surfing.
insert_seconds: [[0, 2], [4, 6], [8, 10]] prompt: [0, 10] seamless transition.
insert_seconds: [[0, 2], [6, 8]] prompt: [2, 6] Seamless Character Outfit Change.
insert_seconds: [[0, 2], [4, 6], [8, 10]] prompt: [2, 4] Camera push-in; [6, 8] The model puts down the silver ball, and the camera pulls back.
insert_seconds: [[0, 2], [4, 6], [8, 10]] prompt: [0, 10] seamless transition.
insert_seconds: [[0, 2], [4, 6], [8, 10]] prompt: [0, 10] seamless transition.
Video Generation from Mixed Image-video condition
Our model can insert images or videos at any user-specified timestamp. Hover over the video to reveal the insertion time and input prompt.
insert_seconds: [[0, 0], [3, 5], [10, 12]] prompt: [0, 3] The man takes off his helmet; [5, 10] The man rides the motorcycle into the sky, flies into outer space, and morphs into an astronaut.
insert_seconds: [[0, 0], [5, 5], [10, 12]] prompt: [0, 5] Seed sprouts and grows into a pine tree; [5, 10] Camera orbits pine tree as it transforms into a Christmas tree.
insert_seconds: [[0, 0], [5, 8], [12, 12]] prompt: [0, 5] A squirrel slides down the ice slide and runs onto the grass; [8, 12] The little squirrel keeps running, comes to the treehouse, and picks up a pinecone.
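In the mixed examples above, a pair with equal start and end (such as `[0, 0]`) appears to mark a single image, while a pair with start < end (such as `[3, 5]`) appears to mark a video clip; the prompt ranges then cover the remaining time. The sketch below illustrates that reading. The helper names and the `total_seconds` parameter are hypothetical, and the interpretation of the pairs is inferred from the captions rather than confirmed by the authors.

```python
def classify_conditions(insert_seconds):
    """Label each [start, end] pair as an image or a video condition.

    Assumption: start == end means a single frame, start < end means a clip.
    """
    return [("image" if start == end else "video", start, end)
            for start, end in insert_seconds]


def generated_intervals(insert_seconds, total_seconds):
    """Return the time ranges not covered by any inserted video clip.

    Under the assumption above, these are the spans left for the model
    to generate between (and around) the conditions.
    """
    gaps = []
    cursor = 0
    for start, end in sorted(insert_seconds):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < total_seconds:
        gaps.append((cursor, total_seconds))
    return gaps


if __name__ == "__main__":
    conditions = [(0, 0), (3, 5), (10, 12)]
    print(classify_conditions(conditions))
    # total_seconds is a hypothetical clip length chosen for this example.
    print(generated_intervals(conditions, total_seconds=12))
```

For the first mixed example, the uncovered spans computed this way line up with the time ranges given in its prompt, which is what motivates this interpretation.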
Video Generation from Last-frame-condition
Our model can generate videos ending with a given image. Hover over the video to reveal the insertion time and input prompt.
insert_seconds: [[5, 5]] prompt: [0, 5] An airplane flies quickly across the sky, leaving a smoke trail that spells out the word "Travel".
insert_seconds: [[5, 5]] prompt: [0, 5] A beam scans upward from the foundation, rapidly constructing the building layer by layer from bottom to top.
insert_seconds: [[5, 5]] prompt: [0, 5] A vertical assembly process. On a wooden table, a bun base, patty, cheese, lettuce, and tomatoes fall from above in sequence.
Video Generation from Video-condition (Extension)
Our model can extend videos. Hover over the video to reveal the insertion time and input prompt.
insert_seconds: [[0, 3]] prompt: [3, 10] A horse walks in. The kitten jumps from a bike onto the horse's back. Camera pans left; the kitten rides the horse to a stream.
insert_seconds: [[0, 3]] prompt: [3, 8] Static shot. Snow falling. Gingerbread man dances, swaying left and right.
insert_seconds: [[0, 5]] prompt: [5, 10] Two people hugging each other.

Ethical Considerations

The condition images and videos inserted in these examples come from publicly available sources or were generated by models, and are intended solely to demonstrate the capabilities of this research. If you have any concerns, please contact us (lijunqiao.123@bytedance.com) and we will promptly remove the relevant examples.

BibTeX

@misc{liu2025dreamontagearbitraryframeguidedoneshot,
  title={DreaMontage: Arbitrary Frame-Guided One-Shot Video Generation},
  author={Jiawei Liu and Junqiao Li and Jiangfan Deng and Gen Li and Siyu Zhou and Zetao Fang and Shanshan Lao and Zengde Deng and Jianing Zhu and Tingting Ma and Jiayi Li and Yunqiu Wang and Qian He and Xinglong Wu},
  year={2025},
  eprint={2512.21252},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2512.21252},
}