Diego Mendoza
Uni-1 is Luma’s new multimodal reasoning model that also does high-end image generation and editing, aimed at more “instruction-following” image work than traditional diffusion models.[1]
What Uni-1 is
- Described by Luma as a multimodal reasoning model that can “generate pixels,” built on their “Unified Intelligence” architecture.
- Technically a decoder-only autoregressive transformer that treats text and images as a single interleaved token sequence, so it reasons and then renders rather than just sampling from noise.
- It combines understanding and generation in one model: the same network parses instructions, plans layout, and produces the final image.
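To make the "single interleaved token sequence" idea more concrete, here is a toy sketch of how text and image tokens could share one sequence for next-token prediction. This is not Luma's implementation: the vocabulary sizes, the discrete image codebook, the begin/end-of-image markers, and all names (`Text`, `Image`, `interleave`, `next_token_pairs`) are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple, Union

# Hypothetical sizes and special ids, chosen only for this illustration.
TEXT_VOCAB_SIZE = 1000
IMAGE_CODEBOOK_SIZE = 256
BOI = TEXT_VOCAB_SIZE + IMAGE_CODEBOOK_SIZE  # begin-of-image marker
EOI = BOI + 1                                # end-of-image marker


@dataclass
class Text:
    token_ids: List[int]  # ids in [0, TEXT_VOCAB_SIZE)


@dataclass
class Image:
    codes: List[int]  # assumed discrete codes in [0, IMAGE_CODEBOOK_SIZE)


def interleave(segments: Sequence[Union[Text, Image]]) -> List[int]:
    """Flatten text and image segments into one shared token sequence."""
    seq: List[int] = []
    for seg in segments:
        if isinstance(seg, Text):
            seq.extend(seg.token_ids)
        else:
            # Shift image codes past the text vocabulary so ids never collide.
            seq.append(BOI)
            seq.extend(TEXT_VOCAB_SIZE + c for c in seg.codes)
            seq.append(EOI)
    return seq


def next_token_pairs(seq: List[int]) -> List[Tuple[List[int], int]]:
    """Return (context, target) pairs as used for autoregressive training."""
    return [(seq[:i], seq[i]) for i in range(1, len(seq))]


if __name__ == "__main__":
    tokens = interleave([Text([5, 7]), Image([0, 255]), Text([9])])
    print(tokens)
    print(len(next_token_pairs(tokens)))
```

The point of the sketch is that a decoder-only model only ever sees one stream: instruction tokens, then image tokens, then more text, each predicted from everything before it.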
Key capabilities
- Intelligent: Common‑sense scene completion, spatial reasoning (left/right, behind/under), and plausibility‑driven transformations for edits and compositions.
- Directable: Strong reference-guided generation (multi‑image refs, identity preservation, pose/composition transfer, sketch‑to‑art, iterative refinement over multiple turns).
- Cultured: Trained to be “culture‑aware” across aesthetics, memes, manga, etc., so it can follow quite niche visual styles and references.
- Handles text-to-image, image-to-image, multi-reference character sheets, complex edits, and style transfers, with reasoning happening explicitly before/during generation.
Quality and benchmarks
- Luma reports Uni-1 ranks first in human preference Elo for Overall, Style & Editing, and Reference-Based Generation, and second in pure Text-to-Image.
- On reasoning-heavy benchmarks like RISEBench and ODinW‑13, it leads or matches top models (Flux Max, Gemini / Nano Banana, GPT Image 1.5), especially on logic- and constraint-based image tasks.
- In practice this should reduce typical failures like wrong number of objects, incorrect spatial relations, or broken identities across a series.
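For readers unfamiliar with Elo, the ranking above comes from pairwise human votes between model outputs. Luma's exact rating procedure is not described here, so the following is only a sketch of the standard Elo update. The K-factor of 32 and the starting ratings are assumptions.

```python
from typing import Tuple


def expected_score(r_a: float, r_b: float) -> float:
    """Probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update(r_a: float, r_b: float, a_wins: bool, k: float = 32.0) -> Tuple[float, float]:
    """Apply one pairwise vote and return the new (r_a, r_b) ratings."""
    e_a = expected_score(r_a, r_b)
    s_a = 1.0 if a_wins else 0.0
    delta = k * (s_a - e_a)
    # Elo is zero-sum: whatever A gains, B loses.
    return r_a + delta, r_b - delta


if __name__ == "__main__":
    # Two models start level; a rater prefers model A's image.
    print(update(1500.0, 1500.0, a_wins=True))
```

A model that keeps winning votes against highly rated opponents climbs quickly, while wins against much lower-rated models move its rating very little.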
Access
Try Uni-1 for free on Luma’s site.
Overall, Uni-1 is worth watching as a pivot point from “pretty pictures” to genuinely instruction-following, reasoning-first visual models. It unifies understanding and generation in one autoregressive stack, leads or matches top models on logic-heavy benchmarks, and lands at very aggressive 2K pricing. That makes it viable not just for hero shots but also for iterative, agentic workflows where the model thinks with you across many steps.
Examples
[Image comparison: Uni-1 vs. Nano Banana Pro]
Prompt 1
Code:
candid medium-close shot of a teenage skateboarder cruising down a quiet 1970s suburban street, captured on 35mm Kodachrome film. The subject rides a vintage wooden skateboard with clay wheels, knees slightly bent, relaxed posture. He wears faded denim cutoffs, a ringer t-shirt, Vans-style sneakers, and has shaggy, sunlit hair.
The background softly falls out of focus with a natural 50mm perspective: single-story homes, parked station wagons, cracked sidewalks, and telephone poles stretching into the distance. Golden hour sunlight casts long shadows and a warm amber glow across the scene.
Technical and Film Characteristics (Crucial):
Authentic Kodachrome rendering—deep blue skies, warm highlights, rich earthy tones, and balanced skin tones. Subtle halation around bright areas, slight lens softness, and natural falloff. Fine grain visible. Film wear includes dust specks, light scratches, and faint edge fading.
Shot on a Pentax Spotmatic F with Super-Takumar 55mm f/1.8 lens.
[Image comparison: Uni-1 vs. Nano Banana Pro]
Prompt 2
Code:
Create a highly photorealistic image of a professional Chelsea football player executing a sliding tackle during a live match. The image is defined by the following optical stack: shot on a full-frame DSLR (Canon EOS-1D X Mark III), using a 24mm wide-angle lens at f/4, recorded on a high-speed digital sensor with color science resembling Kodak Portra 400.
The scene is illuminated by bright stadium floodlights during an evening match, producing high-contrast lighting with cool highlights and subtle warm tones from the pitch reflections. The setting is a professional football stadium, viewed from a bird’s-eye perspective directly above the action, showing the green pitch with visible grass texture, pitch markings, and scattered turf fragments.
The subject is mid-slide tackle, body extended with one leg reaching toward the ball, cleats kicking up dirt and grass, arms slightly raised for balance, facial expression focused and intense. The player is positioned slightly off-center in the frame, with the ball and opposing player partially visible to add context and motion.
The image must contain authentic real-world photographic imperfections such as:
- subtle lens distortion from the wide-angle perspective
- natural digital grain and slight noise in shadow areas
- motion blur on the legs and ball due to fast action
- realistic depth-of-field falloff from the elevated angle
- micro-details like grass blades, dirt particles, and fabric texture
- accurate shadow casting under stadium lighting
- mild chromatic aberration near the edges
Human subject details must include:
- realistic skin tones under stadium lighting with accurate subsurface scattering
- visible sweat, skin pores, and slight imperfections
- natural muscle tension and anatomical accuracy
- authentic fabric stretch and wrinkles in the kit
- minor scuffs and dirt stains on socks and jersey
Environmental details should include:
- textured grass with divots and torn patches from the slide
- stadium seating blurred in the background
- subtle atmospheric haze under floodlights
- realistic reflections on slightly damp grass
Colors must remain physically accurate with balanced white levels and natural sports broadcast-style grading.
Ensure accurate:
- anatomy and motion physics
- fabric dynamics and turf interaction
- lighting falloff and shadow direction
- perspective compression from the elevated wide lens
Camera behavior must simulate real optics including:
- wide-angle spatial distortion consistent with overhead framing
- correct parallax from elevated viewpoint
- slightly imperfect framing as if captured mid-action by a remote stadium camera
Include natural imperfections like:
- dirt spray mid-air
- uneven grass displacement
- slight motion softness in fast-moving elements
- minor inconsistencies in player posture
The final image should be indistinguishable from a real sports photograph captured during a professional football match and must obey real-world physics and visual logic.
[Image comparison: Uni-1 vs. Nano Banana Pro]
Prompt 3
Code:
Create a highly photorealistic image of a World War II infantry soldier wading through shallow ocean water during a beach landing assault. The image is defined by the following optical stack: shot on a 35mm full-frame DSLR (Canon EOS-1V style film simulation), using a 35mm wide-angle lens at f/4, recorded on Kodak Portra 400 film stock.
The scene is illuminated by overcast, diffused morning light with a cool, desaturated color temperature, softened by heavy sea mist and battlefield smoke. It takes place on a war-torn beach with a massive military landing ship looming in the background, partially obscured by fog, and numerous soldiers advancing through water and surf behind the main subject.
The subject is a soldier holding a rifle raised above his head to keep it dry, with a tense, focused expression, water reaching his chest as he pushes forward. He is positioned centrally in the frame at eye level, creating an immersive, documentary-style perspective with slight forward motion.
The image must contain authentic real-world photographic imperfections such as:
- subtle lens distortion from the wide-angle lens
- natural film grain consistent with Kodak Portra 400
- water droplets and splashes interacting with light
- realistic motion blur in the water surface
- atmospheric haze and diffusion from mist and smoke
- mild chromatic aberration near edges
Human details must include:
- wet fabric clinging to the uniform with visible folds and weight
- natural skin tones with subdued, cold color grading
- visible skin pores, slight dirt, and water sheen on the face
- asymmetrical facial tension and micro-expressions
Environmental details should include:
- dark, reflective water with ripples and foam
- debris and wooden obstacles scattered in the surf
- silhouettes of other soldiers partially obscured by fog
- volumetric light subtly diffused through mist
Colors must remain muted and realistic with a cinematic war-documentary grading, emphasizing greens, grays, and browns.
Ensure accurate:
- WWII-era gear and rifle proportions
- water physics and splash interaction
- realistic perspective and depth compression
- natural framing with slight imperfection and urgency
Include natural imperfections like:
- water streaks on the lens
- uneven wet textures on clothing
- minor film exposure inconsistencies
The final image should be indistinguishable from a real war photograph captured by a frontline combat photographer, grounded in physical realism and historical authenticity.
Your feedback is highly appreciated