Documentation
Getting Started
1. Install Planetanium
Download the app for your platform and install it.
2. Configure API Keys
Open Settings (gear icon in the top bar) and add your API keys:
| Provider | Purpose | Get a Key |
|---|---|---|
| ElevenLabs | Audio transcription | elevenlabs.io |
| OpenAI | Image generation (GPT Image) | platform.openai.com |
| Fal.ai | Image generation (Flux) | fal.ai |
| xAI | Image generation (Grok-2) | console.x.ai |
| OpenAI | Video animation (Sora) | platform.openai.com |
| Google AI | Video animation (Veo) | aistudio.google.com |
| Fal.ai | Video animation (LTX) | fal.ai |
You only need the keys for the services you want to use. At minimum, you need ElevenLabs (transcription) and one image generation provider.
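The minimum-key rule above can be expressed as a small check. This is only a sketch: the provider identifiers (`elevenlabs`, `openai`, `fal`, `xai`) and the `missing_requirements` helper are hypothetical names chosen for illustration, not Planetanium's actual settings format.

```python
# Hypothetical provider identifiers; the app's real setting names may differ.
TRANSCRIPTION_PROVIDERS = {"elevenlabs"}
IMAGE_PROVIDERS = {"openai", "fal", "xai"}


def missing_requirements(configured_keys: dict[str, str]) -> list[str]:
    """Return a list of unmet minimum requirements (empty if the setup is usable)."""
    # Treat blank or whitespace-only keys as not configured.
    present = {name for name, key in configured_keys.items() if key.strip()}
    problems = []
    if not present & TRANSCRIPTION_PROVIDERS:
        problems.append("ElevenLabs key (transcription)")
    if not present & IMAGE_PROVIDERS:
        problems.append("at least one image generation key (OpenAI, Fal.ai, or xAI)")
    return problems


print(missing_requirements({"elevenlabs": "my-key", "fal": "my-key"}))
```

An empty list means the configuration meets the minimum; video animation keys stay optional.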
3. Create Your First Project
- Click New Project
- Upload or record your audio narration
- Click Transcribe — AI will transcribe with word-level timing
- Click Detect Scenes — AI segments your narration into scenes
- Click Generate Images — AI creates visuals for each scene
- Optionally, click Animate to turn scenes into video clips
- Click Export to render the final video
Core Features
Prompt Editor
All generation features open a dialog with full prompt control:
- Detect Characters — Identify and extract characters from your narration
- Detect Scenes — Segment narration into logical scenes
- Generate Description — Create detailed descriptions for scenes
- Generate Shots — Break scenes into individual shots
- Generate Image — Create visuals with customizable prompts
- Animate — Convert images to video clips
Each dialog lets you review and edit the prompt before generation starts, so you decide exactly what is sent to the model.
Transcription
Transcription uses ElevenLabs Scribe v2, which provides:
- Word-level timing for precise synchronization
- Speaker diarization (identifies different speakers)
- High accuracy across languages
When you edit transcription text, your changes are stored as overlays — the original transcription is preserved, and your edits are applied on top.
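One way to picture the overlay idea is sketched below. The `Word` type, the `apply_overlays` function, and the choice to key edits by word index are all assumptions made for illustration; the app's internal storage may look quite different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    start: float  # seconds from the beginning of the audio
    end: float    # seconds from the beginning of the audio


def apply_overlays(words: list[Word], overlays: dict[int, str]) -> list[Word]:
    """Return a new word list with edits applied; the original list is untouched."""
    return [
        Word(overlays.get(index, word.text), word.start, word.end)
        for index, word in enumerate(words)
    ]


original = [Word("Helo", 0.0, 0.4), Word("world", 0.5, 0.9)]
edited = apply_overlays(original, {0: "Hello"})

print(" ".join(w.text for w in edited))    # Hello world
print(" ".join(w.text for w in original))  # Helo world
```

Because the edit lives in a separate mapping, the word timings stay intact and the original text can be recovered by dropping the overlay.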
Shots
Shots are the building blocks of your video:
- Each shot belongs to a scene
- Shots can have character references for consistency
- Shots can be individually animated
- Adjust shot duration and timing on the timeline
Working with Scenes
After scene detection, you can:
- Split or merge scenes by adjusting boundaries on the timeline
- Edit scene text to change the narration segment
- Upload custom images for any scene
- Regenerate individual images with different prompts
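Splitting and merging by boundary can be thought of as operations on a list of time ranges. The following is a rough sketch under that assumption; `Scene`, `split_scene`, and `merge_scenes` are hypothetical names and do not describe how Planetanium stores scenes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scene:
    start: float  # seconds
    end: float    # seconds


def split_scene(scenes: list[Scene], index: int, at: float) -> list[Scene]:
    """Split the scene at `index` into two scenes at time `at`."""
    if not 0 <= index < len(scenes):
        raise IndexError("scene index out of range")
    scene = scenes[index]
    if not scene.start < at < scene.end:
        raise ValueError("split point must fall strictly inside the scene")
    halves = [Scene(scene.start, at), Scene(at, scene.end)]
    return scenes[:index] + halves + scenes[index + 1:]


def merge_scenes(scenes: list[Scene], index: int) -> list[Scene]:
    """Merge the scene at `index` with the scene that follows it."""
    if not 0 <= index < len(scenes) - 1:
        raise IndexError("no following scene to merge with")
    merged = Scene(scenes[index].start, scenes[index + 1].end)
    return scenes[:index] + [merged] + scenes[index + 2:]


scenes = [Scene(0.0, 10.0), Scene(10.0, 20.0)]
after_split = split_scene(scenes, 0, 4.0)
after_merge = merge_scenes(after_split, 0)
```

Both functions return new lists, so a split followed by a merge of the same pair gets back to the starting boundaries.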
Character Consistency
For stories with recurring characters:
- Go to the Characters panel
- Add characters with descriptions
- Upload or generate reference portraits
- Enable character references when generating scene images
Visual Styles
To maintain a consistent look:
- Open the Visual Style panel
- Upload reference images that represent your desired style
- Add a style description (e.g., “watercolor illustration, soft colors”)
- All generated images will follow this style
Export Options
Planetanium offers three export options:
| Option | Description |
|---|---|
| Assets (ZIP) | Export all images, videos, and audio as a ZIP file for use in other tools |
| Movie from Images | Render video using generated images with transitions |
| Movie from Videos | Render video using animated clips |
Export Settings
| Setting | Options |
|---|---|
| Resolution | 480p, 720p, 1080p |
| Frame Rate | 24, 30, 60 fps |
| Transitions | Fade (configurable duration) |
| Format | MP4 (H.264) |
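The allowed values in the table can be captured in a small validated settings object. This is an illustrative sketch only: the `ExportSettings` class, its field names, and the 0.5-second fade default are assumptions, not part of the app.

```python
from dataclasses import dataclass

# Allowed values taken from the Export Settings table above.
RESOLUTIONS = ("480p", "720p", "1080p")
FRAME_RATES = (24, 30, 60)


@dataclass(frozen=True)
class ExportSettings:
    """Hypothetical settings object; field names are illustrative."""
    resolution: str = "1080p"
    fps: int = 30
    fade_seconds: float = 0.5  # assumed default, not documented by the app

    def __post_init__(self) -> None:
        # Reject any value that the table does not list.
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"resolution must be one of {RESOLUTIONS}")
        if self.fps not in FRAME_RATES:
            raise ValueError(f"fps must be one of {FRAME_RATES}")
        if self.fade_seconds < 0:
            raise ValueError("fade_seconds cannot be negative")


settings = ExportSettings(resolution="720p", fps=24)
```

Validating at construction time means an unsupported value such as `"4k"` fails immediately rather than at render time.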
Customizing Prompts
Advanced users can customize the AI prompts used at every stage:
- Open Settings > Prompts
- Edit templates for scene detection, image generation, etc.
- Use template variables like {{scene_text}} and {{style_description}}
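To illustrate how double-brace variables of this kind are typically filled in, here is a minimal sketch. The `render` function and the choice to raise an error on unknown variables are assumptions; Planetanium's own template engine may behave differently.

```python
import re

# Matches {{name}} with optional whitespace inside the braces.
_VARIABLE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, values: dict[str, str]) -> str:
    """Replace each {{name}} in the template with values[name]."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"unknown template variable: {name}")
        return values[name]

    return _VARIABLE.sub(substitute, template)


prompt = render(
    "Illustrate: {{scene_text}}. Style: {{style_description}}",
    {
        "scene_text": "a fox crosses a frozen river",
        "style_description": "watercolor illustration, soft colors",
    },
)
print(prompt)
```

Failing loudly on an unknown variable makes a typo in a custom template easy to spot before any generation request is sent.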
Troubleshooting
Images not generating? Check that your API key is valid and has credits. Try a different model.
Transcription failing? Ensure your ElevenLabs key is active. Check that the audio file is not corrupted.
Export quality issues? Use 1080p resolution and 30fps for best results. Ensure source images are high quality.