
Getting Started with CloviNarrate Narration & Audio

What This Tool Does

CloviNarrate turns your slide narration scripts into professionally voiced MP4 videos without a recording studio, a timeline editor, or a freelance voice artist. You paste the speaking notes for each slide, choose a voice, and receive finished narrated videos ready to upload to your course platform or LMS. The tool generates spoken audio from your text and syncs it to each slide image automatically.

Quick Start

  1. Create a project. After signing in, click New Project, give it a name, and upload your slide images as PNG or JPG files.
  2. Add narration text. For each slide, paste the script you want spoken. A live character count updates as you type.
  3. Choose a voice. Open the Voice dropdown and pick from the preset library. Each option includes a short audio sample you can play first.
  4. Click Generate. CloviNarrate sends each slide's script to the AI voice engine and syncs the resulting audio to that slide's image. A per-slide tracker shows progress in real time.
  5. Download your videos. Once complete, download individual per-slide MP4 files or the single combined lesson video from the Export panel.

Core Features

AI Voice Generation

Paste narration text and CloviNarrate converts it to natural-sounding spoken audio using a professional AI voice engine. Voices are available in multiple styles — conversational, authoritative, and instructional. Character counts are shown before each generation so you always know where you stand against your plan limit.

Slide-to-Video Assembly

After audio is generated, CloviNarrate automatically pairs each audio file with its matching slide image and renders a synchronized MP4. No timeline editing or separate video software is required. A combined full-lesson video is then assembled from the per-slide clips in slide order.

Per-Slide Regeneration

Select any slide, update its script, and click Regenerate for that slide only. The rest of the project stays intact, so correcting a changed price or a rebranded term takes under a minute rather than requiring the whole lesson to be redone.

SSML Pronunciation Control

For technical content with scientific terms, product names, or abbreviations, CloviNarrate supports SSML — a markup standard that gives the voice engine precise instructions for pronunciation, pause length, and emphasis. Most users never need it, but it is fully available when you do.
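As a rough illustration, the sketch below builds a small SSML fragment in Python and checks that it is well-formed XML. The `<speak>`, `<sub>`, and `<break>` elements come from the W3C SSML standard; which elements CloviNarrate's voice engine actually honors is an assumption here, so confirm in the SSML editor before relying on them. The helper name `build_ssml` is hypothetical and not part of CloviNarrate.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def build_ssml(before: str, abbreviation: str, spoken_form: str,
               after: str, pause_ms: int = 400) -> str:
    """Return SSML that reads `abbreviation` aloud as `spoken_form`.

    Hypothetical helper: element support depends on the voice engine.
    """
    return (
        "<speak>"
        f"{escape(before)} "
        # <sub alias="..."> asks the engine to speak the alias instead of the text.
        f"<sub alias={quoteattr(spoken_form)}>{escape(abbreviation)}</sub> "
        f"{escape(after)}"
        # <break> requests a pause of the given length after the sentence.
        f'<break time="{pause_ms}ms"/>'
        "</speak>"
    )


ssml = build_ssml(
    before="Revenue for",
    abbreviation="Q3 YOY",
    spoken_form="third quarter, year over year",
    after="grew steadily.",
)

# Parsing confirms the markup is well-formed XML. It does not confirm
# that a given voice engine supports every element used.
root = ET.fromstring(ssml)
print(ssml)
```

Escaping the text with `escape` and `quoteattr` keeps characters such as `&` from breaking the markup when a script contains them.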

Project Storage and Export

On paid plans, generated audio and video files are stored in your CloviNarrate account and available for re-download within your plan's storage limit. Free plan files are available as a local download only. Paid plan exports are clean, unbranded MP4 and MP3 files. Free plan exports include a watermark.

Tips for Best Results

  • Keep slide scripts between 80 and 250 words. Shorter scripts can sound rushed; very long ones introduce inconsistent pacing from the voice engine.
  • Write for the ear, not the eye. Abbreviations obvious on a slide ("Q3 YOY") may be mispronounced. Spell them out in the narration text or use SSML to specify pronunciation.
  • Always end scripts with a period. The voice engine treats punctuation as pacing cues. Text without terminal punctuation can run into the next sentence or cut off abruptly.
  • Preview voices on your actual content. The demo clips in the voice library use generic text. Test a voice on two or three sentences from your real script before committing it to a full project.
  • Queue all edits before regenerating. If multiple slides need updates after a review pass, update all scripts first, then regenerate the batch together for consistent results.
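Several of the tips above can be checked before you paste a script. The following is a minimal sketch, not a CloviNarrate feature: the function name `check_script` is hypothetical, the abbreviation pattern is a simple heuristic, and treating `!` and `?` as acceptable endings is an assumption (the tip itself only mentions a period).

```python
import re

MIN_WORDS, MAX_WORDS = 80, 250          # range suggested in the tips above
TERMINAL_PUNCTUATION = (".", "!", "?")  # assumption: "!" and "?" also work as pacing cues
# Heuristic (assumption): a capital followed by more capitals or digits,
# such as "Q3" or "YOY", is an abbreviation worth reviewing.
ABBREVIATION = re.compile(r"\b[A-Z][A-Z0-9]+\b")


def check_script(text: str) -> list[str]:
    """Return human-readable warnings for one slide's narration script."""
    warnings = []
    stripped = text.strip()
    word_count = len(stripped.split())

    if word_count < MIN_WORDS:
        warnings.append(f"too short: {word_count} words (aim for {MIN_WORDS}+)")
    elif word_count > MAX_WORDS:
        warnings.append(f"too long: {word_count} words (aim for {MAX_WORDS} or fewer)")

    if stripped and not stripped.endswith(TERMINAL_PUNCTUATION):
        warnings.append("missing terminal punctuation")

    abbreviations = sorted(set(ABBREVIATION.findall(stripped)))
    if abbreviations:
        warnings.append("spell out or mark up: " + ", ".join(abbreviations))

    return warnings


for warning in check_script("Sales rose in Q3 YOY"):
    print(warning)
```

Running the check over every slide after a review pass gives one list of fixes to make before regenerating the batch.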

Frequently Asked Questions

Can I use my own voice or a custom voice recording? Voice cloning is available on Pro plans and above. Supply a clean audio recording of at least two minutes and CloviNarrate creates a private voice profile in your account that appears alongside the preset library.

What file formats do my slides need to be in? Slides must be PNG or JPG. A 16:9 aspect ratio is recommended for video output. Most presentation tools export compatible image files by default. PDF import is not currently supported — export your slides as images first.
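If you export many slides at once, a quick pre-upload check can catch the wrong format or aspect ratio. This is a sketch under assumptions: `check_slide` is a hypothetical helper, you supply the pixel dimensions yourself (for example from your image tool), and accepting the `.jpeg` extension alongside `.jpg` is assumed rather than documented.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg"}  # assumption: .jpeg is treated like .jpg


def check_slide(filename: str, width: int, height: int) -> list[str]:
    """Return a list of problems for one exported slide image."""
    problems = []
    if Path(filename).suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append("unsupported format: export as PNG or JPG")
    # Cross-multiply so the 16:9 comparison uses exact integer arithmetic.
    if width * 9 != height * 16:
        problems.append(f"{width}x{height} is not 16:9")
    return problems


print(check_slide("intro.png", 1920, 1080))
print(check_slide("deck.pdf", 1024, 768))
```

The ratio test is deliberately strict. Since 16:9 is recommended rather than required, a near-match such as 1366x768 is flagged here but may still be acceptable.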

What happens if I exceed my monthly character limit? Generation pauses at your plan limit. You will see your remaining character count before each generation. Additional characters are available at a per-character overage rate, or you can upgrade your plan at any time.

Will my script text be used to train AI models? No. Your narration scripts and generated audio are private to your account. CloviNarrate does not use your content for model training or share it with third parties.

Can I export audio only, without video? Yes. In the Export panel, switch to Audio Only mode to download MP3 files per slide with no video assembly — useful for podcast production or importing audio into a separate editing tool.

Usage Limits by Plan

| Feature | Free | Starter | Pro | Business |
| --- | --- | --- | --- | --- |
| Characters per month | 15,000 | 60,000 | 200,000 | 600,000 |
| Max slides per project | 5 | 20 | 60 | Unlimited |
| Team seats | 1 | 1 | 1 | 5 |
| Video export quality | 720p (watermarked) | 720p | 1080p | 1080p + batch |
| File storage | Local download only | 2 GB | 10 GB | 50 GB |
| Voice cloning | No | No | Yes | Yes |
| SSML editor | Yes | Yes | Yes | Yes |
| API access | No | No | No | Yes |
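To estimate which plan a project needs, you can compare its slide count and total script length against the limits in the table above. The sketch below is illustrative only: `Plan` and `smallest_plan` are hypothetical names, and counting characters with `len()` (spaces and punctuation included) is an assumption, since the in-app character count may treat SSML tags or whitespace differently. It also covers a single generation pass and ignores any characters used by regeneration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    name: str
    characters_per_month: int
    max_slides: Optional[int]  # None means unlimited


# Values copied from the plan table, ordered from smallest to largest plan.
PLANS = [
    Plan("Free", 15_000, 5),
    Plan("Starter", 60_000, 20),
    Plan("Pro", 200_000, 60),
    Plan("Business", 600_000, None),
]


def smallest_plan(scripts: list[str]) -> Optional[Plan]:
    """Return the first plan whose limits cover one project, or None."""
    # Assumption: one character of script text costs one character of quota.
    total_characters = sum(len(script) for script in scripts)
    for plan in PLANS:
        fits_slides = plan.max_slides is None or len(scripts) <= plan.max_slides
        if fits_slides and total_characters <= plan.characters_per_month:
            return plan
    return None


# Twelve slides of about 1,200 characters each: too many slides for Free.
project = ["x" * 1200] * 12
plan = smallest_plan(project)
print(plan.name if plan else "No plan covers this project in one month")
```

A result of `None` means the project exceeds even the Business character allowance for a single month, in which case the overage rate applies.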