Model Description


language:
  • en
  • fr
  • zh
  • ja
  • ko
  • de
  • es
  • it
  • pt
  • ru
tags:
  • music-generation
  • text-to-music
  • ggml
  • gguf
  • cpp
  • ace-step
  • diffusion
  • flow-matching
license: mit
base_model:
  • ACE-Step/Ace-Step1.5

ACE-Step 1.5 GGUF

Pre-quantized GGUF models for acestep.cpp, a portable C++17 implementation of the ACE-Step 1.5 music generation pipeline using GGML.

Text + lyrics in, stereo 48kHz WAV out. Runs on CPU, CUDA, Metal, Vulkan.

Quick start

git clone --recurse-submodules https://github.com/ServeurpersoCom/acestep.cpp
cd acestep.cpp

pip install huggingface_hub
./models.sh # downloads Q8_0 turbo essentials (~7.7 GB)

mkdir build && cd build
cmake .. -DGGML_CUDA=ON
cmake --build . --config Release -j$(nproc)
cd ..

cat > /tmp/request.json << 'EOF'
{
    "caption": "Upbeat pop rock with driving guitars and catchy hooks",
    "inference_steps": 8,
    "shift": 3.0,
    "vocal_language": "fr"
}
EOF

# LLM: generate lyrics + audio codes
./build/ace-qwen3 \
    --request /tmp/request.json \
    --model models/acestep-5Hz-lm-4B-Q8_0.gguf

# DiT + VAE: synthesize audio
./build/dit-vae \
    --request /tmp/request0.json \
    --text-encoder models/Qwen3-Embedding-0.6B-Q8_0.gguf \
    --dit models/acestep-v15-turbo-Q8_0.gguf \
    --vae models/vae-BF16.gguf

Download options

./models.sh                # Q8_0 turbo essentials (~7.7 GB)
./models.sh --all          # every model, every quant (~97 GB)
./models.sh --quant BF16   # full precision
./models.sh --quant Q6_K   # pick a quant
./models.sh --sft          # add SFT DiT variant
./models.sh --shifts       # add shift1/shift3/continuous variants
./models.sh --lm 0.6B      # smaller LM (fast, lower quality)

Or download individual files manually from the files tab.
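
For a single file, the huggingface_hub package installed in the quick start also ships a command-line downloader. The following is a minimal sketch, not part of models.sh. It assumes the `huggingface-cli download` command (newer huggingface_hub releases may name it differently) and uses a placeholder repository ID, since the repository name is not stated here. `REPO_ID`, `RUNNER`, and `fetch_gguf` are hypothetical names introduced for illustration.

```bash
# Sketch: fetch one GGUF file into ./models with the huggingface_hub CLI.
# REPO_ID is a placeholder, not the real repository name; replace it.
REPO_ID="${REPO_ID:-your-namespace/your-gguf-repo}"

# Usage: fetch_gguf <filename>
# RUNNER is a hypothetical knob: when unset, the command is only printed
# (echo); set RUNNER="" to actually run the download.
fetch_gguf() {
    local file="$1"
    ${RUNNER-echo} huggingface-cli download "$REPO_ID" "$file" --local-dir models
}
```

Calling `fetch_gguf vae-BF16.gguf` prints the command so it can be inspected first; `RUNNER="" fetch_gguf vae-BF16.gguf` would execute it.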

Available models

Text encoder

File                              Quant   Size
Qwen3-Embedding-0.6B-BF16.gguf    BF16    1.2 GB
Qwen3-Embedding-0.6B-Q8_0.gguf    Q8_0    748 MB

LM (Qwen3 causal, audio code generation)

File                             Params  Quant    Size
acestep-5Hz-lm-4B-BF16.gguf      4B      BF16     7.9 GB
acestep-5Hz-lm-4B-Q8_0.gguf      4B      Q8_0     4.2 GB
acestep-5Hz-lm-4B-Q6_K.gguf      4B      Q6_K     3.3 GB
acestep-5Hz-lm-4B-Q5_K_M.gguf    4B      Q5_K_M   2.9 GB
acestep-5Hz-lm-1.7B-BF16.gguf    1.7B    BF16     3.5 GB
acestep-5Hz-lm-1.7B-Q8_0.gguf    1.7B    Q8_0     1.9 GB
acestep-5Hz-lm-0.6B-BF16.gguf    0.6B    BF16     1.3 GB
acestep-5Hz-lm-0.6B-Q8_0.gguf    0.6B    Q8_0     677 MB

Small LMs (0.6B/1.7B) only have BF16 + Q8_0 (too small for aggressive quantization). The 4B LM does not have Q4_K_M (breaks audio code generation).

DiT (flow matching diffusion transformer)

Available for all 6 variants: turbo, sft, base, turbo-shift1, turbo-shift3, turbo-continuous.

Quant     Size per variant
BF16      4.5 GB
Q8_0      2.4 GB
Q6_K      1.9 GB
Q5_K_M    1.6 GB
Q4_K_M    1.4 GB

Turbo preset: 8 steps, no CFG. SFT/Base preset: 32-50 steps, CFG 7.0.
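
The presets above can be turned into a small helper that picks a step count per variant and writes a request file. This is a sketch under assumptions: it treats every `turbo*` variant as sharing the turbo preset, uses 32 (the low end of the stated 32-50 range) for sft and base, and writes only keys that appear in the quick start. The request key for CFG is not shown in this document, so it is left out. `preset_steps` and `write_request` are hypothetical names.

```bash
# Print a suggested inference_steps value for a DiT variant name.
# Assumption: turbo, turbo-shift1, turbo-shift3, turbo-continuous all use 8.
preset_steps() {
    case "$1" in
        turbo*)   echo 8 ;;
        sft|base) echo 32 ;;   # low end of the documented 32-50 range
        *)        echo "unknown variant: $1" >&2; return 1 ;;
    esac
}

# Usage: write_request <variant> <caption> <output.json>
# The caption is inserted verbatim, so it must not contain double quotes
# or backslashes.
write_request() {
    local variant="$1" caption="$2" out="$3" steps
    steps=$(preset_steps "$variant") || return 1
    cat > "$out" <<EOF
{
    "caption": "$caption",
    "inference_steps": $steps
}
EOF
}
```

For example, `write_request sft "Slow jazz ballad" /tmp/request.json` writes a request with `"inference_steps": 32`.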

VAE

File             Size
vae-BF16.gguf    322 MB

Always BF16 (small, bandwidth-bound, quality-critical).
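
As a rough check on disk requirements, the sizes in these tables can be added up for a chosen set of files. The sketch below assumes the default "essentials" download is the four files used in the quick start (Q8_0 text encoder, Q8_0 4B LM, Q8_0 turbo DiT, BF16 VAE), which this document does not state outright. `total_gb` is a hypothetical helper.

```bash
# Sum sizes given in GB (one per argument) and print the total to one decimal.
# LC_ALL=C keeps "." as the decimal separator regardless of locale.
total_gb() {
    printf '%s\n' "$@" | LC_ALL=C awk '{ s += $1 } END { printf "%.1f\n", s }'
}

# 748 MB text encoder + 4.2 GB LM + 2.4 GB DiT + 322 MB VAE
total_gb 0.748 4.2 2.4 0.322   # prints 7.7
```

The result agrees with the ~7.7 GB quoted for the default models.sh download.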

Pipeline

ace-qwen3 (Qwen3 causal LM, 0.6B/1.7B/4B)
  Phase 1 (if needed): CoT generates bpm, keyscale, timesignature, lyrics
  Phase 2: audio codes (5Hz tokens, FSQ vocabulary)
  Both phases batched: N sequences per forward, weights read once
  CFG with dual KV cache per batch element (cond + uncond)
  Output: request0.json .. requestN-1.json

dit-vae
  BPE tokenize
  Qwen3-Embedding (28L text encoder)
  CondEncoder (lyric 8L + timbre 4L + text_proj)
  FSQ detokenizer (audio codes -> source latents)
  DiT (24L flow matching, Euler steps)
  VAE (AutoencoderOobleck, tiled decode)
  WAV stereo 48kHz

Both stages support batching (--batch N) for parallel generation.
LM batching produces different songs; DiT batching produces subtle
variations of the same piece (different initial noise).
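
When the LM stage emits several request files, the second stage can be run once per file. A minimal sketch, assuming the numbered files are written next to the input request (as /tmp/request0.json in the quick start suggests) and reusing the quick start's model paths. `synthesize_all` and `RUNNER` are hypothetical names, not part of acestep.cpp.

```bash
# Usage: synthesize_all <prefix>     e.g. synthesize_all /tmp/request
# Runs dit-vae for <prefix>0.json, <prefix>1.json, ... in glob order.
# RUNNER is a hypothetical knob: when unset, commands are only printed
# (echo); set RUNNER="" to actually run them.
synthesize_all() {
    local prefix="$1" f
    for f in "$prefix"[0-9]*.json; do
        [ -e "$f" ] || continue   # the glob matched nothing
        ${RUNNER-echo} ./build/dit-vae \
            --request "$f" \
            --text-encoder models/Qwen3-Embedding-0.6B-Q8_0.gguf \
            --dit models/acestep-v15-turbo-Q8_0.gguf \
            --vae models/vae-BF16.gguf
    done
}
```

The original /tmp/request.json is skipped because the pattern requires a digit after the prefix. Printing by default makes it easy to review the commands before committing to a long run.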

Acknowledgements

Independent C++/GGML implementation based on ACE-Step 1.5 by ACE Studio and StepFun.
All model weights are theirs; this is a native inference backend.

Links

GGUF File List

Filename                                    Size        Note
Qwen3-Embedding-0.6B-BF16.gguf              1.11 GB
Qwen3-Embedding-0.6B-Q8_0.gguf              747.82 MB
acestep-5Hz-lm-0.6B-BF16.gguf               1.24 GB
acestep-5Hz-lm-0.6B-Q8_0.gguf               676.96 MB
acestep-5Hz-lm-1.7B-BF16.gguf               3.46 GB
acestep-5Hz-lm-1.7B-Q8_0.gguf               1.84 GB
acestep-5Hz-lm-4B-BF16.gguf                 7.81 GB
acestep-5Hz-lm-4B-Q5_K_M.gguf               2.82 GB
acestep-5Hz-lm-4B-Q6_K.gguf                 3.21 GB
acestep-5Hz-lm-4B-Q8_0.gguf                 4.15 GB
acestep-v15-base-BF16.gguf                  4.46 GB
acestep-v15-base-Q4_K_M.gguf                1.35 GB     Recommended
acestep-v15-base-Q5_K_M.gguf                1.58 GB
acestep-v15-base-Q6_K.gguf                  1.84 GB
acestep-v15-base-Q8_0.gguf                  2.37 GB
acestep-v15-sft-BF16.gguf                   4.46 GB
acestep-v15-sft-Q4_K_M.gguf                 1.35 GB
acestep-v15-sft-Q5_K_M.gguf                 1.58 GB
acestep-v15-sft-Q6_K.gguf                   1.84 GB
acestep-v15-sft-Q8_0.gguf                   2.37 GB
acestep-v15-turbo-BF16.gguf                 4.46 GB
acestep-v15-turbo-Q4_K_M.gguf               1.35 GB
acestep-v15-turbo-Q5_K_M.gguf               1.58 GB
acestep-v15-turbo-Q6_K.gguf                 1.84 GB
acestep-v15-turbo-Q8_0.gguf                 2.37 GB
acestep-v15-turbo-continuous-BF16.gguf      4.46 GB
acestep-v15-turbo-continuous-Q4_K_M.gguf    1.35 GB
acestep-v15-turbo-continuous-Q5_K_M.gguf    1.58 GB
acestep-v15-turbo-continuous-Q6_K.gguf      1.84 GB
acestep-v15-turbo-continuous-Q8_0.gguf      2.37 GB
acestep-v15-turbo-shift1-BF16.gguf          4.46 GB
acestep-v15-turbo-shift1-Q4_K_M.gguf        1.35 GB
acestep-v15-turbo-shift1-Q5_K_M.gguf        1.58 GB
acestep-v15-turbo-shift1-Q6_K.gguf          1.84 GB
acestep-v15-turbo-shift1-Q8_0.gguf          2.37 GB
acestep-v15-turbo-shift3-BF16.gguf          4.46 GB
acestep-v15-turbo-shift3-Q4_K_M.gguf        1.35 GB
acestep-v15-turbo-shift3-Q5_K_M.gguf        1.58 GB
acestep-v15-turbo-shift3-Q6_K.gguf          1.84 GB
acestep-v15-turbo-shift3-Q8_0.gguf          2.37 GB
vae-BF16.gguf                               321.79 MB