
Supported Features

This table summarizes the status of features on Spyre. By default, these features are planned for development on the vLLM V1 engine.

Feature Status
Chunked Prefill
Automatic Prefix Caching
LoRA
Speculative Decoding
Guided Decoding
Enc-dec
Multi Modality ⚠️
LogProbs
Prompt LogProbs
Beam search
Tensor Parallel
Pipeline Parallel
Expert Parallel
Data Parallel
Prefill Decode Disaggregation
Quantization ⚠️
Sleep Mode
Embedding models
  • ✅ Fully operational.
  • ⚠️ Experimental support.
  • 🚧 Under active development.
  • 🗓️ Planned.
  • ⛔ Not planned or deprecated.