Recently, I had a big goal. I wanted to build an app that had a character. Not just any character. I wanted one that could talk with a real-sounding voice, and I wanted its lips to move perfectly with the words.
Source: https://monkcubed.com/
Sounds hard, right? I thought so too. But after a few late nights (and a lot of coffee ☕), I figured it out. I used two amazing tools: ElevenLabs for the voice and CapCut for the lip sync.
The best part? You do not need to be a tech wizard. If you are 12 years old and love computers, you can do this. Let me show you exactly how I did it.

Why Bother with a Talking Face?
Before we start, let me tell you why this is so cool. When users open my app, they see a face. When that face talks to them, they listen longer. It feels like a real person is helping them. It makes my app feel smart and friendly.
Plus, using a custom voice (not the boring robot voice) makes my brand special.
The Two Super Tools
- ElevenLabs: This is a website that makes AI voices sound like real humans. You can even clone a voice or create a totally new one.
- CapCut: This is a free video editor (you might know it from TikTok). It has a secret power: auto lip sync. It can take a face and a voice and make the mouth move perfectly.
Step 1: Creating Your Custom Voice in ElevenLabs
First, you need a voice. Do not just use the basic one. Let’s make it yours.
Here is what I did:
- Go to ElevenLabs: I signed up for a free account. (They give you a free allowance of text to turn into speech, so you can start without paying.)
- Click on “Voice Lab”: This is where the magic happens.
- Click “Add new voice”: You have a few choices:
  - Instant Voice Clone: You record yourself saying a few sentences. The AI copies YOU.
  - Professional Voice Clone: You need a clean studio recording.
  - Voice Design: You create a totally new voice.
- My tip: For my app, I used “Voice Design.” I picked “Friendly Male” and adjusted the “Stability” and “Clarity” sliders until it sounded like a kind cartoon teacher.
- Test the voice: I typed “Hello, welcome to my app!” and pressed play. I smiled so big when it sounded real.
- Save it: Name your voice (e.g., “My App Buddy”).
Step 2: Making the Voice Say Your App’s Script
Now, we need the voice to say the exact words we want.
- Go to “Speech Synthesis” in ElevenLabs.
- Pick your custom voice from the dropdown menu.
- Type your script. For my app, I typed: “Hi there! I am your AI helper. Let’s learn something new today.”
- Adjust the settings:
  - Stability: Make this higher (like 70%) so the voice sounds steady.
  - Similarity Boost: Turn this up to 75% so it matches your custom sound.
- Click “Generate”.
- Download the audio. Click the three little dots (⋮) and hit “Download Audio”. You will get an MP3 file. Save it somewhere easy to find, like your Desktop.
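If you would rather not click through the website every time, the same step can be scripted. Here is a minimal sketch in Python, assuming the public ElevenLabs text-to-speech REST endpoint and its `xi-api-key` header. The voice ID, the API key, and the file name are placeholders you would swap for your own, and the function names are just ones I made up for this example. Check the current ElevenLabs API docs before relying on it.

```python
import json
import urllib.request

# Assumed endpoint: the ElevenLabs text-to-speech REST API.
API_BASE = "https://api.elevenlabs.io/v1/text-to-speech"


def build_tts_request(voice_id, text, api_key, stability=0.70, similarity_boost=0.75):
    """Build (but do not send) a request that asks for one line of speech.

    stability and similarity_boost mirror the 70% and 75% sliders from
    the steps above, written as numbers between 0 and 1.
    """
    payload = {
        "text": text,
        "voice_settings": {
            "stability": stability,
            "similarity_boost": similarity_boost,
        },
    }
    return urllib.request.Request(
        url=f"{API_BASE}/{voice_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "xi-api-key": api_key,
            "Content-Type": "application/json",
            "Accept": "audio/mpeg",
        },
        method="POST",
    )


def save_speech(request, out_path):
    """Send the request and write the returned MP3 bytes to out_path."""
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return len(audio)


# Example use (needs a real key and voice ID, so it is left commented out):
# request = build_tts_request(
#     voice_id="YOUR_VOICE_ID",   # placeholder
#     text="Hi there! I am your AI helper. Let's learn something new today.",
#     api_key="YOUR_API_KEY",     # placeholder
# )
# save_speech(request, "helper_intro.mp3")
```

Building the request and sending it are separate functions on purpose, so you can look at exactly what will be sent before spending any of your free allowance.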
Step 3: Getting a Face to Talk (Lip Sync with CapCut)
You have the voice (the MP3). Now you need a face. I drew a simple cartoon character in Canva. You can also use a photo of a person (but be careful with copyrights!).
Here is the secret lip sync trick using CapCut:
- Download CapCut (it is free on your computer or phone).
- Open a new project.
- Import your face picture: Click “Import” and choose your character’s photo.
- Drag the photo onto the timeline (the big empty area at the bottom).
- Import the voice MP3 you made in ElevenLabs. Drag that onto the timeline too, right below your photo.
Now watch this – it’s like magic:
- Click on your photo on the timeline so it turns white.
- Look at the menu on the right side of CapCut. Find the tab that says “Video” (or “Basic”).
- Scroll down until you see a button that says “Lip Sync”. (In some versions, it is under “Auto Cutout” or “Motion”, so just look for the mouth icon.)
- Click “Lip Sync.” A little box will pop up.
- Select the audio source: Choose your MP3 track from the timeline.
- Click “Generate” (or “Start”).
Wait for 5-10 seconds (longer clips can take more time). Watch as your still picture starts to move! The mouth opens and closes in time with your ElevenLabs voice. It looks alive!
Step 4: Making It Look Professional (Pro Features in CapCut)
Just lip sync is cool, but I wanted my app to look amazing. Here are the extra CapCut features I used:
Feature 1: Remove Background
I did not want a square photo. I wanted just my character.
- Click on your photo.
- Go to “Cutout” > “Auto Cutout”.
- CapCut removes the background in about a second. Now my character floats!
Feature 2: Add Blinking Eyes
A talking face that never blinks is creepy.
- Search in the “Stickers” tab for “Eyes blink”.
- Place the blink sticker over your character’s eyes every 3 seconds. (Or use the “Animation” tab to make the photo fade in and out to look like breathing).
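To know where each blink sticker goes on the timeline, you can work out the timestamps ahead of time. This is a tiny sketch in Python. The function name `blink_times` is made up for this example, and the 3-second gap is just the spacing from the tip above.

```python
def blink_times(audio_seconds, interval=3.0):
    """Return the timestamps (in seconds) where a blink should start.

    The first blink lands one interval in, and blinks stop before the
    audio ends so no sticker hangs off the end of the clip.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    times = []
    n = 1
    while n * interval < audio_seconds:
        times.append(round(n * interval, 2))
        n += 1
    return times


# A 10-second clip gets blinks at 3, 6 and 9 seconds.
print(blink_times(10.0))
```

If perfectly even blinks look robotic, try nudging a few of them by a fraction of a second when you place the stickers.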
Feature 3: Text Captions
Not everyone watches with the sound on.
- Click “Text” > “Auto Captions”.
- CapCut listens to your ElevenLabs MP3 and types the words automatically.
- Change the font to something fun and big.
Feature 4: Zoom & Move
Make the video less boring.
- Click on your character.
- Add a keyframe at the start (a diamond icon).
- Add another keyframe at the end and zoom in 10%.
- Now the camera slowly moves closer. It feels like the character is leaning in to talk to you.
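What the two keyframes do is simple math: the editor fills in every zoom value between the start and the end. Here is a small sketch in Python of that idea, assuming a straight-line (linear) change. CapCut may smooth the motion differently, so treat this as an illustration and not as how CapCut works inside. The function name `scale_at` is made up for this example.

```python
def scale_at(t, duration, start_scale=1.00, end_scale=1.10):
    """Zoom level at time t, moving in a straight line between two keyframes.

    start_scale=1.00 is the normal size at the first keyframe and
    end_scale=1.10 is the 10% zoom at the last keyframe.
    """
    if duration <= 0:
        raise ValueError("duration must be positive")
    # Clamp progress to the 0..1 range so times outside the clip stay sane.
    progress = min(max(t / duration, 0.0), 1.0)
    return start_scale + (end_scale - start_scale) * progress


# Zoom level at the start, middle and end of an 8-second clip.
for t in (0, 4, 8):
    print(t, round(scale_at(t, 8), 3))
```

Halfway through the clip the character is zoomed in by about 5%, which is why the move feels slow and gentle instead of jumpy.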
Step 5: Exporting for Your App
You have a talking, blinking, lip-syncing character. Now let’s get it ready for your app.
- Click the “Export” button in the top right of CapCut.
- Change the resolution to 1080p (that is high quality).
- Change the frame rate to 30 fps.
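- If you want a transparent video, make sure “Remove background” is checked. Keep in mind that a standard MP4 cannot store transparency, so either pick an export format that supports it (if your version of CapCut offers one) or give your character a solid background that matches your app.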
- Click “Export”.
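Now you have a video file (MP4). You can put this video inside your app. Every time a user opens a page, the video plays, and your custom ElevenLabs voice talks to them. (In a web app, most browsers block videos from autoplaying with sound, so the user may need to tap once before the voice starts.)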
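Before dropping the file into your app, it can help to double-check that the export really came out at 1080p and 30 fps. Here is a minimal sketch in Python that uses `ffprobe`, a command-line tool that ships with FFmpeg. It is not part of CapCut or ElevenLabs, so you would have to install it separately. The function names and the file name are made up for this example.

```python
import json
import subprocess
from fractions import Fraction


def probe_video(path):
    """Ask ffprobe for the size and frame rate of the first video stream."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=width,height,r_frame_rate",
         "-of", "json", path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def check_export(probe, short_side=1080, fps=30):
    """Return a list of problems, or an empty list if the export looks right."""
    stream = probe["streams"][0]
    problems = []
    # Use the shorter side so both landscape and portrait videos count as 1080p.
    actual_short_side = min(stream["width"], stream["height"])
    if actual_short_side != short_side:
        problems.append(f"expected {short_side}p, got {actual_short_side}p")
    # ffprobe reports the frame rate as a fraction such as "30/1".
    actual_fps = round(Fraction(stream["r_frame_rate"]))
    if actual_fps != fps:
        problems.append(f"expected {fps} fps, got {actual_fps} fps")
    return problems


# Example use (needs ffprobe installed and a real file, so it is commented out):
# print(check_export(probe_video("my_character.mp4")))
```

An empty list means the settings match Step 5. Anything else tells you which export setting to go back and fix.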
The Big Result
After I did these steps, my app changed completely. Users told me, “Wow, this feels like a game!” They stayed on the app 3x longer.
The best part? I can make a new video in 10 minutes.
- Type new script in ElevenLabs.
- Download new MP3.
- Drop MP3 into CapCut.
- Click “Lip Sync”.
- Export.
That is it. Five steps. I have made over 50 different talking characters for my app now. Some are teachers. Some are funny. One is a talking dog (don’t ask).
Your Turn!
You do not need to be a coder. You do not need expensive cameras. You just need ElevenLabs for the perfect voice and CapCut for the magic lip sync.
Imagine your own app with a custom character that talks directly to your users. It will make you look like a pro.
So go ahead. Draw a face. Record a voice. Make it talk.
🚀 Stop dreaming and start building! 🚀
Happy creating, friends! Let me know when your talking app goes live. 😎