We all know the story of the first YouTube video, a grainy 19-second clip of co-founder Jawed Karim at the zoo, remarking on the elephants behind him. That video was a pivotal moment in the digital space, and in some ways, it is a reflection, or at least an inverted mirror image, of today as we digest the arrival of Veo 3.
Part of Google Gemini, Veo 3 was unveiled at Google I/O 2025 and is the first generative video platform that can, with a single prompt, generate a video with synced dialogue, sound effects, and background noises. Most of these 8-second clips arrive in under 5 minutes after you enter the prompt.
I've been playing with Veo 3 for a couple of days, and for my latest challenge, I tried to go back to the beginning of social video and that YouTube "Me at the Zoo" moment. Specifically, I wondered if Veo 3 could recreate that video.
As I've written, the key to a good Veo 3 outcome is the prompt. Without detail and structure, Veo 3 tends to make the choices for you, and you usually don't end up with what you want. For this experiment, I wondered how I could possibly describe all the details I wanted to derive from that short video and deliver them to Veo 3 in the form of a prompt. So, naturally, I turned to another AI.
Google Gemini 2.5 Pro is not currently capable of analyzing a URL, but Google AI Mode, the brand-new form of search that is quickly spreading across the US, is.
Here's the prompt I dropped into Google's AI Mode:
[IMG alt="AI Mode URL analysis"]https://cdn.mos.cms.futurecdn.net/Yd...dptN6wEVxE.png
(Image credit: Future)
Google AI Mode almost instantly returned with a detailed description, which I took and dropped into the Gemini Veo 3 prompt field.
I did do some editing, mostly removing phrases like "The video appears…" and the final analysis at the end, but otherwise, I left most of it and added this at the top of the prompt:
"Let's make a video based on these details. The output should be 4:3 ratio and look like it was shot on 8MM videotape."
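For anyone who wants to repeat this cleanup without doing it by hand, here is a minimal Python sketch of the same two steps: strip the hedging lead-ins, then put the style instruction on top. The function names, the exact hedge wording, and the sample description are all assumptions made up for illustration; only the instruction text and the "The video appears" phrase come from the steps described above.

```python
import re

# Instruction quoted from the article; it goes at the top of the prompt.
HEADER = (
    "Let's make a video based on these details. The output should be "
    "4:3 ratio and look like it was shot on 8MM videotape."
)

# Hedging lead-ins to strip. Only "The video appears" comes from the article;
# the fuller wording matched here is an assumption.
HEDGES = ["The video appears to show"]


def strip_hedges(text: str, hedges=HEDGES) -> str:
    """Remove each hedge and capitalize the word that now opens the sentence."""
    for hedge in hedges:
        pattern = re.escape(hedge) + r"\s+(\w)"
        text = re.sub(pattern, lambda m: m.group(1).upper(), text)
    return text


def build_prompt(description: str, header: str = HEADER) -> str:
    """Prepend the style instruction to the cleaned-up description."""
    return header + "\n\n" + strip_hedges(description).strip()


# Invented sample text, standing in for the real AI Mode description.
sample = "The video appears to show a man standing in front of elephants at a zoo."
print(build_prompt(sample))
```

The result is plain text that can be pasted into the Veo 3 prompt field; nothing here calls Veo 3 or AI Mode directly.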
It took a while for Veo 3 to generate the video (I think the service is getting hammered right now), and, because it only creates 8-second chunks at a time, it was incomplete, cutting off the dialogue mid-sentence.
Still, the result is impressive, even if the main character looks nothing like Karim. To be fair, the prompt doesn't describe, for instance, Karim's haircut, the shape of his face, or his deep-set eyes. Google AI Mode's description of his outfit was also probably insufficient. I'm sure it would have done a better job if I had fed it a screenshot of the original video.
Note to self: You can never offer enough detail in a generative prompt.
[HEADING=1]8 seconds at a time[/HEADING]
The Veo 3 video zoo is nicer than the one Karim visited, and the elephants are much further away, though they are in motion back there.
Veo 3 got the film quality right, giving it a nice 2005 look, but not the 4:3 aspect ratio. It also added archaic and unnecessary labels at the top that thankfully disappear quickly. I realize now I should have removed the "Title" bit from my prompt.
The audio is particularly good. Dialogue syncs well with my main character and, if you listen closely, you'll hear the background noises, as well.
The biggest issue is that this was only half of the brief YouTube video. I wanted a full recreation, so I decided to go back in with a much shorter prompt:
Continue with the same video and add him looking back at the elephants and then looking at the camera as he's saying this dialogue:
"fronts and that's that's cool." "And that's pretty much all there is to say."
Veo 3 complied with the setting and main character but lost some of the plot, dropping the old-school grainy video of the first generated clip. This means that when I present them together (as I do above), we lose considerable continuity. It's like a film crew time jump, where they suddenly got a much better camera.
I'm also a bit frustrated that all my Veo 3 videos have nonsensical captions. I need to remember to ask Veo 3 to remove, hide, or put them outside the video frame.
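Both prompt lessons from this experiment (drop the "Title" bit, ask for no captions) could be folded into one cleanup step before generating. The Python sketch below is only an illustration under assumptions: it guesses that the description contains lines starting with a label such as "Title:", and the wording of the caption request is invented here, with no guarantee that Veo 3 honors it.

```python
# Caption request wording is an assumption; the article does not say
# which phrasing, if any, Veo 3 responds to.
NO_CAPTIONS = "Do not show captions, subtitles, or on-screen text."


def tidy_prompt(prompt: str, labels=("Title",)) -> str:
    """Drop lines that start with a label such as 'Title:' and append a caption request."""
    prefixes = tuple(label.lower() + ":" for label in labels)
    kept = [
        line
        for line in prompt.splitlines()
        if not line.strip().lower().startswith(prefixes)
    ]
    return "\n".join(kept).strip() + "\n\n" + NO_CAPTIONS


# Invented sample prompt for illustration only.
sample = "Title: Me at the zoo\nA man stands in front of elephants at a zoo."
print(tidy_prompt(sample))
```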
I think about how hard it probably was for Karim to film, edit, and upload that first short video and how I just made essentially the same clip without the need for people, lighting, microphones, cameras, or elephants. I didn't have to transfer footage from tape or even from an iPhone. I just conjured it out of an algorithm. We have truly stepped through the looking glass, my friends.
I did learn one other thing through this project. As a Google AI Pro member, I have two Veo 3 video generations per day. That means I can do this again tomorrow. Let me know in the comments what you'd like me to create.
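As a back-of-the-envelope check on what a full recreation costs, the numbers in this piece (a 19-second original, 8-second clips, two generations per day) can be run through a few lines of Python. This is a rough sketch that assumes every generation is usable and that clips match the original's running time second for second; my own attempt squeezed the dialogue into two clips, so treat the result as an upper-end estimate.

```python
import math

ORIGINAL_SECONDS = 19   # length of "Me at the Zoo", per the article
CLIP_SECONDS = 8        # length of one Veo 3 clip, per the article
DAILY_GENERATIONS = 2   # Google AI Pro allowance, per the article


def clips_needed(total_seconds: int, clip_seconds: int) -> int:
    """Number of fixed-length clips required to cover the full running time."""
    return math.ceil(total_seconds / clip_seconds)


def days_needed(clips: int, per_day: int) -> int:
    """Number of days required when only per_day generations are allowed."""
    return math.ceil(clips / per_day)


clips = clips_needed(ORIGINAL_SECONDS, CLIP_SECONDS)
days = days_needed(clips, DAILY_GENERATIONS)
print(f"{clips} clips over {days} days")  # 3 clips over 2 days
```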