Secrets AI Video Generator: How It Works, Quality, and Cost
Most AI companion platforms generate static images. A handful offer voice. Almost none offer video — and this is what makes Secrets AI's video generation feature worth examining in detail rather than treating as a footnote. Turning a companion image into a short motion clip is something Character.AI, CrushOn AI, and Janitor AI cannot do. Secrets AI can, and the output quality (rated 4.1/5 by reviewers) is good enough to be genuinely useful rather than just technically present.
This page covers the complete picture: how the feature works step by step, the real Moments cost per clip, how many videos you can realistically generate at each tier, quality expectations, and whether the investment is worth it for your usage pattern.
For context on how video fits into the broader platform, the full platform review covers all features together.
A Genuinely Rare Feature
Before the mechanics: it is worth establishing why this matters. The AI companion category has expanded significantly since 2024, but video generation remains rare. A survey of the main competitors:
- Character.AI: No video generation
- CrushOn AI: No video generation
- Janitor AI: No video generation
- Candy AI: Limited video (less capable than Secrets AI's implementation)
- GirlfriendGPT: No video generation
- PocketGirl AI: No video generation
The only platforms with comparable video capabilities in the mainstream tier are SweetDream AI and Xotic AI (which generates 4K 15-second clips). Secrets AI's video generator is a real differentiator — not a marketing claim about a barely functional feature.
This also explains the platform's positioning: if video generation from AI companions is a priority, leaving Secrets AI for an alternative almost certainly means losing this capability.
How the Video Generator Works
The process has four steps:
Step 1 — Generate or select a companion image. Video generation starts with a static image. You can use a previously generated image or create a new one specifically for video conversion. Image quality directly affects video output quality — start with a clear, well-rendered source image.
Step 2 — Write a prompt describing the desired motion. The text prompt instructs the AI on what movement or action you want. Good prompts are specific about the type of motion ("gentle head turn," "slow smile," "hair moving in wind") without being so complex that the AI struggles to execute them coherently.
Step 3 — Wait for processing. The AI processes the request in approximately 2 minutes. Generation is not instantaneous, so plan your session around the wait, especially if you're generating multiple clips in sequence.
Step 4 — Review and save. The completed video clip appears in your conversation, where you can check its quality and save it to your device.
Short clips run 3 seconds (available from Lite tier); longer clips are available on Plus and above. The character's visual identity is maintained through the clip — the companion looks like themselves throughout, not just in the first frame.
Quality Assessment
Reviewers rate Secrets AI video quality at 4.1/5. Translating that number to specific characteristics:
- Motion smoothness: Good. Clips do not stutter or show frame-rate issues in most outputs.
- Character consistency: Good. The companion's face, features, and visual identity carry through the video.
- Facial expressions: Natural in most outputs. Subtle expression changes (smile widening, eyes shifting) render correctly in the majority of clips.
- Prompt fidelity: Moderate. Simple, clear prompts (a single type of movement) produce better results than complex multi-action prompts.
- Edge cases: Occasional artifacts on complex prompts. Hand and finger motion (as with static images) can show inconsistency in certain clip types.
The 4.1/5 score is honest — this is not perfect AI video, but it is meaningfully better than the "technically functional" category of barely-working features. Most users describing their video experience use language like "looks good and moves smoothly most of the time" — which aligns with the score.
Quality improves with the Premium and Advanced generation models available on higher tiers. Lite tier video output uses standard models (the free tier has no video generation); Premium subscribers access better generation quality.
Moments Cost by Clip Type
| Clip Type | Moments Cost | Generation Time |
|---|---|---|
| Short clip (3 seconds) | ~50 Moments | ~2 minutes |
| Full/longer clip | ~600 Moments | ~2 minutes |
The 12x Moments difference between a short clip and a full clip is significant. For budget planning: a single full-length video clip costs the same as 12-24 static images. If you primarily want to see companion movement without needing extended clips, short clips provide meaningful value at 50 Moments.
Monthly Video Budget by Tier
| Tier | Monthly Moments | Short Clips (50M) | Full Clips (600M) |
|---|---|---|---|
| Lite | 1,000 | ~20 | ~1 |
| Plus | 3,000 | ~60 | ~5 |
| Premium | 8,000 | ~160 | ~13 |
| Ultimate | 15,000 | ~300 | ~25 |
These are maximums if you spent all Moments on video. Realistic mixed-use patterns (text messages, some images, some voice) will reduce these figures. On Plus with mixed use, expect 2-4 full-length clips per month alongside other media and text.
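The ceilings in the table follow from integer division of each tier's allocation by the clip cost. The sketch below reproduces that arithmetic and adds a mixed-use variant. The function names and the 1,200-Moment reserve are illustrative assumptions, not platform figures, and the clip costs are the approximate values quoted above.

```python
# Approximate clip costs in Moments, as quoted in the cost table.
SHORT_CLIP_COST = 50
FULL_CLIP_COST = 600

# Monthly Moments allocation per tier, as quoted in the budget table.
TIER_MOMENTS = {"Lite": 1_000, "Plus": 3_000, "Premium": 8_000, "Ultimate": 15_000}


def max_clips(moments: int, cost_per_clip: int) -> int:
    """Whole clips affordable if every Moment goes to one clip type."""
    return moments // cost_per_clip


def clips_after_other_use(moments: int, reserved: int, cost_per_clip: int) -> int:
    """Whole clips affordable after setting aside `reserved` Moments
    for text, images, and voice (the reserve is a hypothetical input)."""
    return max(moments - reserved, 0) // cost_per_clip


if __name__ == "__main__":
    for tier, moments in TIER_MOMENTS.items():
        short = max_clips(moments, SHORT_CLIP_COST)
        full = max_clips(moments, FULL_CLIP_COST)
        print(f"{tier}: up to {short} short clips or {full} full clips")

    # Hypothetical mixed-use case: a Plus user keeping 1,200 Moments for other media.
    print(clips_after_other_use(TIER_MOMENTS["Plus"], 1_200, FULL_CLIP_COST))
```

Integer division rounds down, so Lite yields 1 full clip (1,000 / 600 ≈ 1.67) with 400 Moments left over. With the assumed 1,200-Moment reserve, Plus comes out at 3 full clips, inside the 2-4 range given for mixed use.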
For users focused primarily on video generation, Premium ($19.99/month) is the minimum tier where the Moments allocation makes regular clip generation practical. Its ceiling is 13 full clips per month (7,800 of the 8,000 Moments), and generating fewer than that leaves room for images and text. Ultimate ($39.99) supports up to 25 full clips and is the appropriate tier for daily video users.
Additional Moments can be purchased if you exceed your monthly allocation: bundles start at 1,980 Moments for $5.99. For context on how video Moments compare to the full platform cost structure, see the Moments and pricing breakdown.
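To put these prices in currency terms, the quoted figures can be converted into an approximate dollar cost per clip. This is a rough sketch: it assumes the whole subscription fee is attributed to the Moments allocation, which ignores any other benefits of a plan, and the function and constant names are illustrative.

```python
# Prices and allocations quoted in the article: (USD, Moments).
BUNDLE = (5.99, 1_980)      # smallest top-up bundle
PREMIUM = (19.99, 8_000)    # monthly subscription
ULTIMATE = (39.99, 15_000)  # subscription price as quoted

# Approximate clip costs in Moments.
SHORT_CLIP = 50
FULL_CLIP = 600


def clip_cost_usd(clip_moments: int, price_usd: float, moments: int) -> float:
    """Dollar cost of one clip at a given price per Moment."""
    return clip_moments * price_usd / moments


if __name__ == "__main__":
    sources = {"Bundle": BUNDLE, "Premium": PREMIUM, "Ultimate": ULTIMATE}
    for label, (price, moments) in sources.items():
        short = clip_cost_usd(SHORT_CLIP, price, moments)
        full = clip_cost_usd(FULL_CLIP, price, moments)
        print(f"{label}: short clip ${short:.2f}, full clip ${full:.2f}")
```

Under these assumptions a full clip works out to roughly $1.82 at the bundle rate, about $1.50 on Premium, and about $1.60 on Ultimate, so topping up with the smallest bundle costs more per clip than either subscription.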
Video vs Other Media: The Moments Trade-Off
| Feature | Moments Cost | What You Get |
|---|---|---|
| Text message | 1-2 | Text response |
| Image (standard) | 25-50 | One static image |
| Short video (3s) | ~50 | Brief motion clip |
| Full video | ~600 | Longer motion clip |
| Voice (per minute) | 100 | Real-time audio |
For 600 Moments — the cost of one full video clip — you could alternatively generate:
- 12-24 static images, OR
- 6 minutes of voice calls, OR
- 12 short video clips
This trade-off is important context. If you're choosing between regular video generation and image variety, the same budget goes significantly further with images. The right balance depends entirely on which type of content you value more.
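The same comparison can be run for any budget, not just 600 Moments. The following is a minimal sketch using the per-item costs from the table above. The `equivalents` helper is a hypothetical name, and ranges reflect items whose cost is quoted as a range.

```python
# Per-item cost in Moments as (low, high), from the trade-off table.
COSTS = {
    "text message": (1, 2),
    "standard image": (25, 50),
    "short video": (50, 50),
    "full video": (600, 600),
    "voice minute": (100, 100),
}


def equivalents(budget: int) -> dict[str, tuple[int, int]]:
    """For each item, return (fewest, most) whole units the budget buys.

    The fewest count uses the high end of the cost range,
    the most count uses the low end.
    """
    return {
        name: (budget // high, budget // low)
        for name, (low, high) in COSTS.items()
    }


if __name__ == "__main__":
    for name, (fewest, most) in equivalents(600).items():
        count = str(fewest) if fewest == most else f"{fewest}-{most}"
        print(f"{name}: {count}")
```

For a 600-Moment budget this gives 12-24 standard images, 12 short videos, and 6 voice minutes, matching the list above.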
Tips for Better Video Results
Getting consistent quality from the video generator comes down to a few factors:
Use high-quality source images. The video generator amplifies what's in the source image. A well-composed, clear image with good lighting produces better motion than a lower-quality source.
Keep motion prompts specific and simple. "Slow head turn to the right" produces better results than "dancing while looking at camera and waving." The AI performs best with a single clear motion instruction.
Test with short clips first. At 50 Moments versus 600, a short clip is a cheap way to check a prompt before committing to a full-length version. If the prompt fails, you have lost 50 Moments rather than 600. If the short clip already satisfies you, you may not need the longer version at all.
Use the Premium generation model. If you have Premium tier access, the advanced generation model produces measurably better quality output in both images and video. The difference is most visible in facial detail and motion smoothness.
Generate images specifically for video. Images created as source material for video clips benefit from composition choices that work in motion: neutral backgrounds, well-lit faces, natural poses. Images generated for static viewing aren't always optimized for video conversion.
Who Gets Real Value from the Video Generator
Worth the Moments if:
- Visual media from your companion is a primary reason for using the platform
- You want content that goes beyond static images into motion
- You're creating or saving companion media regularly
- Multi-media experience with one character matters more than variety
Not worth the Moments if:
- You're primarily a text-based conversation user
- Your Moments budget is tight and text + images serve your needs
- You're on the Lite tier with limited allocation (video depletes it fast); the free tier has no video generation at all
- You haven't yet established whether the platform's overall quality justifies continued use
For the tier-by-tier video access breakdown, the free vs premium comparison covers which plans unlock which video capabilities.
FAQ
How long are Secrets AI video clips?
Short clips are approximately 3 seconds. Longer clips are available on Plus tier and above, with the exact length varying based on prompt and generation model. Short clips cost ~50 Moments each; longer clips cost up to ~600 Moments. Shorter clips are adequate for seeing companion movement; longer clips provide a more complete visual experience for complex motion prompts.
Is video generation available on the free tier?
No. Video generation is not available on the free tier. It requires Lite plan or above, and additionally requires Moments to cover the generation cost (50-600 Moments per clip). Free accounts can see the video generation interface but cannot use it without a paid subscription.
How many videos can I generate per month?
This depends on your tier and how you split your Moments allocation:
- Lite (1,000 Moments): ~1 full clip, ~20 short clips
- Plus (3,000 Moments): ~5 full clips, ~60 short clips
- Premium (8,000 Moments): ~13 full clips, ~160 short clips
- Ultimate (15,000 Moments): ~25 full clips, ~300 short clips
These are maximums if you spent all Moments on video alone. Mixed-use reduces these figures. Additional Moments can be purchased separately from plan upgrades if you exceed your monthly allocation.
How good is the video quality?
Reviewers rate video quality at 4.1/5. Character motion is smooth in most outputs, facial expressions look natural, and the companion's visual identity is preserved throughout the clip. Quality is best on simple, clear motion prompts with high-quality source images. Complex prompts occasionally produce artifacts, particularly in hand and finger motion. Premium generation models produce better output than standard models.