Z-Image (造相) is an efficient 6B-parameter image generation model developed by Alibaba’s Tongyi Lab. It uses a Scalable Single-Stream DiT (S3-DiT) architecture in which text tokens, visual semantic tokens, and image VAE tokens are concatenated at the sequence level into a single unified input stream, maximizing parameter efficiency.
Model Variants:
  • 🚀 Z-Image-Turbo – A distilled version that matches or exceeds leading competitors with only 8 NFEs (Number of Function Evaluations). It offers sub-second inference latency on enterprise-grade H800 GPUs and fits on consumer devices with 16GB of VRAM.
  • 🧱 Z-Image-Base – The non-distilled foundation model for community-driven fine-tuning and custom development.
  • ✍️ Z-Image-Edit – A variant fine-tuned for image editing tasks with impressive instruction-following capabilities.
Model Highlights:
  • Photorealistic Quality: Delivers strong photorealistic image generation while maintaining excellent aesthetic quality
  • Accurate Bilingual Text Rendering: Excels at accurately rendering complex Chinese and English text
  • Prompt Enhancing & Reasoning: The Prompt Enhancer gives the model reasoning capabilities
  • Sub-second Inference: Achieves fast generation speed on supported hardware
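
The single-stream input described in the introduction can be pictured with a toy sketch. This is only an illustration of sequence-level concatenation, not the model's actual code: the hidden size, the token counts, and the helper names `make_tokens` and `single_stream` are all made up for the example.

```python
# Toy illustration only: dimensions, token counts, and helper names are hypothetical.
HIDDEN = 4  # assumed shared hidden size for every token


def make_tokens(count, fill):
    """Return `count` token vectors of width HIDDEN, each filled with `fill`."""
    return [[fill] * HIDDEN for _ in range(count)]


text_tokens = make_tokens(3, 0.1)      # stands in for the encoded prompt
semantic_tokens = make_tokens(2, 0.2)  # stands in for visual semantic tokens
vae_tokens = make_tokens(5, 0.3)       # stands in for image VAE tokens


def single_stream(*streams):
    """Concatenate token streams along the sequence axis into one list."""
    sequence = []
    for stream in streams:
        for token in stream:
            if len(token) != HIDDEN:
                raise ValueError("all tokens must share the hidden size")
            sequence.append(token)
    return sequence


unified = single_stream(text_tokens, semantic_tokens, vae_tokens)
print(len(unified), len(unified[0]))  # prints: 10 4
```

The point of the sketch is that all three token types end up in one sequence handled by the same transformer blocks, rather than in separate per-modality branches.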

Z-Image-Turbo text-to-image workflow

Download Workflow

Download the Z-Image-Turbo text-to-image workflow JSON file.

Run on ComfyUI Cloud

Run this workflow directly on ComfyUI Cloud.
Make sure your ComfyUI is up to date. The workflows in this guide can be found in the Workflow Templates. If you can’t find them there, your ComfyUI may be outdated (updates to the Desktop version may lag behind).
If nodes are missing when you load a workflow, possible reasons are:
  1. You are not using the latest (Nightly) version of ComfyUI
  2. Some nodes failed to import at startup

Z-Image-Turbo model downloads

qwen_3_4b.safetensors

Text encoder for Z-Image-Turbo.

z_image_turbo_bf16.safetensors

Diffusion model for Z-Image-Turbo.

ae.safetensors

VAE for Z-Image-Turbo.
Z-Image-Turbo Model Storage Location
📂 ComfyUI/
├── 📂 models/
│   ├── 📂 text_encoders/
│   │      └── qwen_3_4b.safetensors
│   ├── 📂 diffusion_models/
│   │      └── z_image_turbo_bf16.safetensors
│   └── 📂 vae/
│          └── ae.safetensors
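
Before loading the workflow, it can help to confirm that the three files are where the tree above puts them. The following is a minimal sketch, assuming ComfyUI lives in a folder named `ComfyUI` relative to the current directory; the helper name `missing_models` is hypothetical and not part of ComfyUI.

```python
from pathlib import Path

# Relative paths taken from the directory tree above.
REQUIRED_MODELS = [
    "models/text_encoders/qwen_3_4b.safetensors",
    "models/diffusion_models/z_image_turbo_bf16.safetensors",
    "models/vae/ae.safetensors",
]


def missing_models(comfyui_root, required=REQUIRED_MODELS):
    """Return the required relative paths that are not files under comfyui_root."""
    root = Path(comfyui_root)
    return [rel for rel in required if not (root / rel).is_file()]


if __name__ == "__main__":
    # "ComfyUI" is an assumed install location; adjust it to your setup.
    for rel in missing_models("ComfyUI"):
        print("missing:", rel)
```

An empty result means every file from the tree is present; any printed path points to a file that still needs to be downloaded or moved.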

Z-Image-Turbo Fun Union ControlNet workflow

This workflow uses the Z-Image-Turbo Fun Union ControlNet model to generate images with ControlNet guidance. It applies Canny edge detection to a reference image and uses the ControlNet to guide the generation process.
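
To give a feel for what an edge map is, here is a deliberately simplified sketch. It is not the Canny implementation the workflow uses: full Canny also applies Gaussian smoothing, non-maximum suppression, and hysteresis thresholding, while this example only thresholds the Sobel gradient magnitude. The function name `sobel_edges` and the threshold value are assumptions chosen for illustration.

```python
def sobel_edges(gray, threshold=100):
    """Return a binary edge map (0 or 255) from a 2D list of grayscale values.

    Simplified stand-in for Canny: Sobel gradient magnitude plus one threshold.
    Border pixels are left at 0 because the 3x3 kernel does not fit there.
    """
    h, w = len(gray), len(gray[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 255
    return edges


# 6x6 test image: dark left half, bright right half.
image = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
edge_map = sobel_edges(image)
for row in edge_map:
    print("".join("#" if v else "." for v in row))
```

The resulting map marks only the vertical boundary between the dark and bright halves. An outline of this kind, extracted from the reference image, is what the ControlNet uses to guide generation.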

Download Workflow

Download the Z-Image-Turbo Fun Union ControlNet workflow JSON file.

Additional model for ControlNet

Z-Image-Turbo-Fun-Controlnet-Union.safetensors

ControlNet model patch for Z-Image-Turbo.
Model Storage Location
📂 ComfyUI/
├── 📂 models/
│   ├── 📂 text_encoders/
│   │      └── qwen_3_4b.safetensors
│   ├── 📂 diffusion_models/
│   │      └── z_image_turbo_bf16.safetensors
│   ├── 📂 vae/
│   │      └── ae.safetensors
│   └── 📂 model_patches/
│          └── Z-Image-Turbo-Fun-Controlnet-Union.safetensors
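
If the files were downloaded into a single folder, a small script can sort them into the layout above. This is a sketch under assumptions: the download folder and ComfyUI root are whatever paths you pass in, and `place_models` is a hypothetical helper, not a ComfyUI command.

```python
import shutil
from pathlib import Path

# File name -> subfolder of ComfyUI/models, as listed in the tree above.
DESTINATIONS = {
    "qwen_3_4b.safetensors": "text_encoders",
    "z_image_turbo_bf16.safetensors": "diffusion_models",
    "ae.safetensors": "vae",
    "Z-Image-Turbo-Fun-Controlnet-Union.safetensors": "model_patches",
}


def place_models(download_dir, comfyui_root):
    """Move known model files from download_dir into ComfyUI/models/<subfolder>.

    Returns the names of the files that were moved; files that are not
    present in download_dir are skipped.
    """
    moved = []
    for name, subfolder in DESTINATIONS.items():
        source = Path(download_dir) / name
        if not source.is_file():
            continue  # not downloaded yet
        target_dir = Path(comfyui_root) / "models" / subfolder
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(source), str(target_dir / name))
        moved.append(name)
    return moved
```

For example, `place_models("Downloads", "ComfyUI")` would move any of the four files found in `Downloads` (both paths are placeholders) and leave everything else untouched.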