Avatar Technologies
Tim Piatenko
Created on October 29, 2025
Avatar Technologies
Generative AI Research and POC
Overview
With the emergence of generative AI technologies for video, we wanted to research the current landscape and understand the capabilities, limitations, and costs of these tools, their applicability to our current roadmap, the level of effort required to integrate them into products, and potential future directions. This is a summary of our findings, with examples of inputs and outputs.
( Reference for criteria to use when evaluating various products: )
Original POC Structure
Original Video
We start with the current BombBomb business, and assume we have an existing video from a client
Direct AI feedback
We want to see if there's a way to get direct feedback with suggested improvements from AI
Full AI overhaul
Alternatively, can we submit the video to an AI platform and have it re-generate it based on best practices?
Avatar generated
And finally, can we apply digital replica / Avatar technologies to redo the parts of the video we want altered, and stitch the results back into the original seamlessly?
Roadmap features to consider
Personalization
Can we use gen-AI technologies and tools to personalize parts of an original video at scale, changing names, titles, or locations dynamically for each client?
Fallback videos
Can we use Avatar tech to record "fallback" videos that would play when none of the personalization options work?
Remove filler words
Can AI help us remove unwanted content in a video by removing filler words and eliminating awkward pauses?
Persistent eye contact
Can we fix inconsistent eye contact in a pre-recorded video using AI?
Bifurcation of video AI tech
Avatar = Create your digital replica, tell it what to say and how to say it
Generative = You tell us what you want, we try to make it happen
Generative
- Prompt-based
- Flexible
- Hallucination-prone
- Designed to work on the whole video
Avatar
- Structured
- Separates text and video
- Input-dependent
- Modular
Examples
Walkthrough
Let's start with a video provided by our very own Kevin Andrews and see what we can do
We can use simple tools such as OpenCV (cv2) and MoviePy in Python, plus AssemblyAI and Creatomate, to extract frames, extract and transcribe the audio, trim the clip down to the parts we want, and recombine the edits with transition effects
Hey, John, it is Kevin here. Hey. I wanted to reach out to you because I know you are a manager of a sales team, and if you're anything like the other managers I've been talking with, you guys are having a hard time right now. You've invested a lot with the technologies, with the team members that you have, but for some reason it's just not working and it's hard to figure out why. Well, I would love to spend a little bit of time and show you why video is helping people not replace their other technologies, but enhance so that the things you've already invested in, the team you've already invested in, are working the best way possible. Schedule some time with me. Would love to talk more.
Now we can do a number of things with these:
- Audio can be used to clone Kevin's voice using e.g. ElevenLabs
- Text can be imported into e.g. Python and customized programmatically
- An individual frame is enough to generate a simple "talking head" consent video
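As a minimal sketch of customizing the text programmatically, the opening of the transcript above can be turned into a template and filled in per recipient. The placeholder names (`first_name`, `title`) and the second recipient are assumptions made for illustration, not part of the POC code.

```python
from string import Template

# Opening lines of the transcript above, with the recipient-specific parts
# replaced by placeholders. The placeholder names are hypothetical.
SCRIPT_TEMPLATE = Template(
    "Hey, $first_name, it is Kevin here. I wanted to reach out to you "
    "because I know you are a $title."
)


def personalize(template: Template, recipient: dict) -> str:
    """Fill the placeholders; raises KeyError if a field is missing."""
    return template.substitute(recipient)


# Illustrative recipients; the second one is made up for the example.
recipients = [
    {"first_name": "John", "title": "manager of a sales team"},
    {"first_name": "Maria", "title": "director of customer success"},
]
scripts = [personalize(SCRIPT_TEMPLATE, r) for r in recipients]
for script in scripts:
    print(script)
```

Each resulting script can then be handed to the voice or avatar service of choice. Using `substitute` rather than `safe_substitute` makes a missing field fail loudly instead of leaving a raw placeholder in the spoken text.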
We can now generate a replica / avatar for Kevin in e.g. Tavus and feed it the updated script
But ideally, we would only need the part of the clip where the names and titles are. Customize that and stitch it back in, using e.g. Creatomate
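One way to find "the part of the clip where the names and titles are" is to look the phrase up in word-level timestamps from the transcription step. The sketch below assumes each word is a dict with `text`, `start`, and `end` in milliseconds; the timestamp values are invented for illustration, and the 2-second minimum is a parameter chosen here to reflect the typical minimum length of AI-generated clips.

```python
import string


def normalize(token: str) -> str:
    """Lowercase a token and strip surrounding punctuation."""
    return token.strip(string.punctuation).lower()


def find_phrase_span(words, phrase, min_len_ms=2000):
    """Return (start_ms, end_ms) covering the first occurrence of `phrase`.

    The span is widened to at least `min_len_ms`. Returns None when the
    phrase is absent. The end is not clamped to the clip length.
    """
    target = [normalize(t) for t in phrase.split()]
    tokens = [normalize(w["text"]) for w in words]
    for i in range(len(tokens) - len(target) + 1):
        if tokens[i:i + len(target)] == target:
            start = words[i]["start"]
            end = words[i + len(target) - 1]["end"]
            duration = end - start
            if duration < min_len_ms:
                pad = (min_len_ms - duration) // 2
                start = max(0, start - pad)
                end = start + min_len_ms
            return start, end
    return None


# Made-up timestamps (ms) for the first sentence of the transcript.
words = [
    {"text": "Hey,", "start": 200, "end": 480},
    {"text": "John,", "start": 520, "end": 900},
    {"text": "it", "start": 1100, "end": 1200},
    {"text": "is", "start": 1220, "end": 1320},
    {"text": "Kevin", "start": 1340, "end": 1700},
    {"text": "here.", "start": 1720, "end": 2000},
]
name_span = find_phrase_span(words, "John")
```

The returned span is the segment to regenerate and stitch back in; everything outside it can be reused from the original unchanged.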
Issues
- AI-generated clips are usually at least 2 sec long
  - Requires some prompt engineering
- Need to match FPS and resolution
  - Not hard, just extra processing
- Need to match transition to original
  - Can be tricky...
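The FPS and resolution matching mentioned above can be done with ffmpeg's `scale` and `fps` video filters. The following sketch only builds the command line; the file names and the target values (1280x720 at 30 fps) are assumptions, and in practice they would be read from the original clip.

```python
import shlex


def normalize_cmd(src, dst, width, height, fps):
    """Build an ffmpeg command that rescales `src` and resamples its frame rate."""
    return [
        "ffmpeg", "-y",          # overwrite the output file if it exists
        "-i", src,
        "-vf", f"scale={width}:{height},fps={fps}",
        "-c:a", "copy",          # leave any audio stream untouched
        dst,
    ]


# Hypothetical file names and target values.
cmd = normalize_cmd(
    "generated_clip.mp4", "generated_clip_matched.mp4", 1280, 720, 30
)
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed.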
Solution 1
- Use pause identification routines to find better places to splice the original
- Use First and Last (F&L) frame AI technology to generate the filler with natural motion
  - Tweak to match better
    - Pick better F&L frames
    - Upscale still frame images
    - Speed up the filler segment
- Use a combination of techniques and APIs
  - MoviePy for manipulating the clips
    - extract audio, cut, recombine
  - AssemblyAI for transcription
  - Cutout or similar for image upscaling etc.
  - ffmpeg for pause identification
  - Vidu for F&L frame → Video filler
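For the pause identification step, one option is ffmpeg's `silencedetect` audio filter, run as e.g. `ffmpeg -i input.mp4 -af silencedetect=noise=-30dB:d=0.3 -f null -`, which reports `silence_start` and `silence_end` lines on stderr. The sketch below parses that text into candidate splice points. The threshold values, the input file name, and the sample log (hand-written in the filter's output format) are assumptions for illustration.

```python
import re

SILENCE_START = re.compile(r"silence_start:\s*(-?\d+(?:\.\d+)?)")
SILENCE_END = re.compile(r"silence_end:\s*(-?\d+(?:\.\d+)?)")


def parse_silences(log: str):
    """Return a list of (start_s, end_s) pauses found in the log text."""
    pauses, start = [], None
    for line in log.splitlines():
        m = SILENCE_START.search(line)
        if m:
            start = float(m.group(1))
            continue
        m = SILENCE_END.search(line)
        if m and start is not None:
            pauses.append((start, float(m.group(1))))
            start = None
    return pauses


def cut_points(pauses):
    """Midpoint of each pause, a candidate place to splice the original."""
    return [round((s + e) / 2, 3) for s, e in pauses]


# Hand-written sample in the format silencedetect prints; not real output.
SAMPLE_LOG = """\
[silencedetect @ 0x0] silence_start: 3.52
[silencedetect @ 0x0] silence_end: 4.1 | silence_duration: 0.58
[silencedetect @ 0x0] silence_start: 12.0
[silencedetect @ 0x0] silence_end: 12.75 | silence_duration: 0.75
"""
pauses = parse_silences(SAMPLE_LOG)
splice_candidates = cut_points(pauses)
```

Cutting at the middle of a pause leaves a little silence on both sides of the splice, which tends to hide the transition better than cutting mid-word.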
Solution 2
- Extract audio from the original (MoviePy)
- Transcribe the audio (AssemblyAI)
- Personalize the script where applicable
- Can use an LLM to identify words or phrases (e.g. names, places, and titles)
- Use a voice cloning service (e.g. ElevenLabs) to regenerate the audio
- Use a lip-syncing tool (e.g. Vidu) to match the new audio to the original video
Example of using an LLM to analyze the original script for personalization potential →
Solutions pros and cons
Slicing and dicing
Pros:
- Modular approach
- Can minimize reprocessing
- Can remove filler words
- Can remove awkward pauses
Cons:
- More work to identify good cut points
- More work on recombination
Redo the whole thing
Pros:
- No need for any matching
- AI for lip-sync is cheaper than full regeneration
- Fewer steps in general
Cons:
- Can't remove anything, only substitute with words of similar length
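The filler-word and pause removal available in the slicing approach can be sketched as a computation over word-level timestamps: drop filler words, break wherever the gap between kept words is too long, and keep the remaining spans. The word format (`text`, `start`, `end` in milliseconds), the filler list, the 600 ms gap threshold, and the sample data are all assumptions for illustration.

```python
FILLERS = {"um", "uh", "er", "hmm"}  # illustrative list, not exhaustive


def keep_segments(words, max_gap_ms=600, fillers=FILLERS):
    """Return (start_ms, end_ms) spans to keep, skipping fillers and long pauses."""
    segments = []
    force_cut = False  # set after a filler so its audio is never bridged over
    for w in words:
        token = w["text"].strip(".,!?").lower()
        if token in fillers:
            force_cut = True
            continue
        close_enough = (
            bool(segments) and w["start"] - segments[-1][1] <= max_gap_ms
        )
        if close_enough and not force_cut:
            segments[-1][1] = w["end"]  # extend the current span
        else:
            segments.append([w["start"], w["end"]])
        force_cut = False
    return [tuple(s) for s in segments]


# Made-up timestamps: one filler word and one long pause.
sample_words = [
    {"text": "I", "start": 0, "end": 200},
    {"text": "um", "start": 250, "end": 600},
    {"text": "think", "start": 650, "end": 900},
    {"text": "video", "start": 950, "end": 1300},
    {"text": "helps.", "start": 2600, "end": 2900},
]
segments = keep_segments(sample_words)
```

The spans can then be cut from the original and concatenated, for example with MoviePy. These are hard cuts, so the recombination step still needs transitions or filler frames to look natural.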
Reimagine?
And finally, we can feed the resulting video to a video generation / reimagining / modification AI platform e.g. Runway
So far:
Full Avatar generated content
Video → Frame → Consent → Avatar / Replica → Video
Personalization
Video → Audio → Text → Customized → Video
Fallback content
Just add logic to your Python code
AI reimagined
Need to be very careful with prompt...
...
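The fallback logic mentioned above could be as simple as the following sketch: serve the personalized render when the recipient has all required fields and a render exists, otherwise serve the pre-recorded fallback. The field names, the email-keyed lookup, and the video identifiers are hypothetical.

```python
REQUIRED_FIELDS = ("first_name", "title")  # hypothetical field names


def choose_video(recipient: dict, personalized: dict, fallback_id: str) -> str:
    """Pick the personalized render when possible, else the fallback video."""
    has_fields = all(recipient.get(f) for f in REQUIRED_FIELDS)
    video_id = personalized.get(recipient.get("email")) if has_fields else None
    return video_id or fallback_id


# Hypothetical mapping from recipient email to a rendered video id.
rendered = {"john@example.com": "vid_john"}

complete = {"email": "john@example.com", "first_name": "John", "title": "manager"}
missing_title = {"email": "john@example.com", "first_name": "John", "title": ""}
not_rendered = {"email": "ann@example.com", "first_name": "Ann", "title": "VP"}

choices = [
    choose_video(r, rendered, "vid_fallback")
    for r in (complete, missing_title, not_rendered)
]
```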
But what about AI feedback?
Getting feedback on existing content is quite difficult... However, getting live coaching and analysis using an Avatar-driven Agent or a "Persona" is possible!
Resources
Code repository
POC Overview
Technical artifacts
POC Outcomes