Okay. Say goodbye to the deserts. All of them. Sora & Veo update.
Edition 27: More Google vs. OpenAI warfare as the big loser is ...
Okay. False alarm. Mr. Excitable here was under the misguided impression that sand is used to manufacture all of the mighty processing chips we're going to need in the future. Say goodbye to the Sahara. Honest mistake.
Sure - there are some precious metals and poisons involved in the production process. We’ll stick that at the bottom of the page for now. Of course we’ll need sand for concrete and glass. Homes of the future. Never mind. Reading the room …
Sora, you’ve got another sibling.
We’ll move China’s Vidu to the back burner for the time being. But don’t count them out yet. They were ahead of OpenAI early on.
1. The AI Video Death Match: Veo, Sora, & any other secret projects out there.
Google's Veo, announced at its I/O 2024 conference, represents a significant leap forward in this field. Veo can generate high-quality 1080p videos (Sora: 780p) from text prompts, surpassing the one-minute limitation of previous models.
Veo was introduced at Google's I/O 2024 by actor/singer/artist Donald Glover. What's "cool" about Veo? "You can make a mistake faster," Glover said in a video shown during Google's I/O 2024 livestream. "That's all you really want at the end of the day, at least in art, is just to make mistakes fast."
At Google I/O 2024, Google introduced two new generative media models in Veo and Imagen 3. Veo is built to generate high-def video while Imagen 3 is a text-to-image model.
Veo is the newest generative media model from Google and is specifically geared toward generating 1080p videos. Google says Veo can make videos longer than a minute, but did not say how much more than a minute.
CTOs have been shying away from being too specific about the content their systems train on. OpenAI was rather non-specific as well, possibly alluding to YouTube being part of the process.
Here’s my favorite:
It understands cinematic language and can create realistic movement, opening doors for filmmakers and content creators.
Veo: What Works
Cinematic language. That’s a deal maker.
Cinematic Language: Veo's ability to understand terms like "timelapse" or "dutch angle" empowers creators with more granular control over the visual style of their projects.
This opens doors for experimenting with different genres and creating unique cinematic experiences.
It democratizes the tradigital filmmaking process, lowering the barrier to entry and replacing the value recently lost by your USC film degree.
Realistic Movement: Traditional AI-generated video often struggles to depict natural movement patterns. Veo's advancements promise smoother transitions, believable object physics, and lifelike character animations, making the generated content more immersive.
Long-form Video: Breaking the one-minute barrier with Veo creates exciting possibilities for storytelling. Creators can now envision intricate narratives or detailed product demonstrations without the limitations of shorter formats.
Integration with Existing Workflows: While details are scarce, the possibility of using video prompts suggests Veo might seamlessly integrate into existing video editing workflows. Imagine taking existing footage and using Veo to generate dreamlike sequences or seamlessly extend a scene.
Veo: Still Not Solved
Despite the advancements, generative AI for video will continue to face these challenges:
Bias and Representation: AI models are trained on vast datasets that might reflect societal biases. It's crucial to ensure these models generate fair and inclusive content, representing diverse characters and settings accurately.
Ethical Concerns: The ease of creating realistic videos raises ethical concerns about deepfakes and misinformation. Robust safeguards and digital watermarks are necessary to ensure the responsible use of such technology.
Artistic Control vs. Automation: Generative AI can be a powerful tool, but it shouldn't replace the human element of filmmaking. Finding the right balance between AI automation and creators' artistic control is an ongoing effort.
2. The Future of Generative AI Video: Beyond the Cutting Edge.
OpenAI's Sora: Unveiled in February 2024, Sora is Veo's main competitor. It boasts similar capabilities, generating high-resolution video with a focus on photorealism. OpenAI has the jump on courting Hollywood studios, highlighting the potential of Sora for visual effects and animation.
The emergence of models like Veo represents a significant step forward for AI-powered video creation. Here's a glimpse into what the future might hold:
Widespread Adoption: As technology matures and becomes more accessible, generative AI tools could become commonplace for creators of all levels, democratizing video production.
Evolution of Storytelling: With AI aiding in scene creation and animation, the focus can shift towards developing and crafting compelling narratives.
The Google factor: These guys have a tendency to go all-in. Veo will be around for a while. With Google playing, look for investor wallets to loosen up a bit in this category.
3. The Generative AI Video Process.
These models work by ingesting prompts and translating them into a sequence of images. This process is complex and involves a deep understanding of natural language processing, computer vision, and video generation techniques. Here's a simplified breakdown:
Prompt Interpretation: Users provide text descriptions, image references, or even existing video snippets as prompts. The AI model analyzes these prompts, identifying key elements like characters, setting, and actions.
Scene Generation: Based on the interpreted prompt, the model generates individual frames, considering factors like lighting, composition, and camera angles.
Motion and Sequence: The generated frames are then stitched together to create a moving sequence with realistic motion patterns for objects, people, and the environment.
Style and Refinement: Advanced models like Veo can incorporate specific cinematic styles, implementing techniques like time-lapse or aerial shots upon user request. Iterations and refinements can be made by providing additional prompts throughout the process.
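For the code-curious, the four steps above can be sketched as a toy pipeline. To be clear, this is not how Veo or Sora actually work under the hood (neither company has published that level of detail); it only mirrors the flow described here, with plain dictionaries standing in for real frames. Every name in it (`STYLE_TERMS`, `Shot`, `interpret_prompt`, `generate_frames`, `sequence_frames`, `refine`) is hypothetical and made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical vocabulary of cinematic terms; an assumption, not Veo's real list.
STYLE_TERMS = ("timelapse", "dutch angle", "aerial shot")


@dataclass
class Shot:
    """What the toy pipeline 'understood' from a prompt."""
    subject: str
    styles: list[str] = field(default_factory=list)


def interpret_prompt(prompt: str) -> Shot:
    """Step 1: split a text prompt into a subject and recognized style terms."""
    lowered = prompt.lower()
    styles = [term for term in STYLE_TERMS if term in lowered]
    subject = lowered
    for term in styles:
        subject = subject.replace(term, "")
    # Drop commas and collapse leftover whitespace.
    subject = " ".join(subject.replace(",", " ").split())
    return Shot(subject=subject, styles=styles)


def generate_frames(shot: Shot, n_frames: int) -> list[dict]:
    """Step 2: emit one placeholder 'frame' per time step (no real pixels)."""
    return [
        {"index": i, "subject": shot.subject, "styles": list(shot.styles)}
        for i in range(n_frames)
    ]


def sequence_frames(frames: list[dict], fps: int) -> list[dict]:
    """Step 3: attach timestamps so the frames form an ordered clip."""
    return [dict(frame, time_s=frame["index"] / fps) for frame in frames]


def refine(shot: Shot, extra_prompt: str) -> Shot:
    """Step 4: fold style terms from a follow-up prompt into the shot."""
    extra = interpret_prompt(extra_prompt)
    merged = shot.styles + [s for s in extra.styles if s not in shot.styles]
    return Shot(subject=shot.subject, styles=merged)


shot = interpret_prompt("A city street at dusk, timelapse")
shot = refine(shot, "dutch angle")
clip = sequence_frames(generate_frames(shot, n_frames=48), fps=24)
```

In a real model each of these steps is learned rather than hand-written, and frames are generated jointly rather than one at a time, so treat this as a map of the stages and nothing more.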
Feel free to share comments below. If you enjoyed this read, please smash the heart icon at the top or bottom of the page so others can find this article. Re-stacking is great too.
Have a great week, people!
4. In case you’re curious about what goes into chips, here you go.
Semiconductors: The foundation of processors lies in semiconductors, most commonly silicon wafers. These wafers are meticulously treated with various elements like boron, phosphorus, and arsenic to create the electrical properties needed for transistors, the building blocks of processors.
Rare Earth Elements: Rare earth elements like neodymium and dysprosium are prized for their unique magnetic properties. Those matter more for the magnets in the drives, fans, and motors around a processor than for the transistors themselves.
Metals: Various metals like copper and aluminum are used for electrical wiring and heat dissipation within the processor package.
Other Materials: Depending on the specific processor design, additional materials like ceramics and polymers might be used for packaging and structural support.