Supported Features
This table summarizes the status of features on Spyre. By default, these features are planned for development on the vLLM V1 engine.
| Feature | Status |
|---|---|
| Chunked Prefill | ✅ |
| Automatic Prefix Caching | ✅ |
| LoRA | ⛔ |
| Speculative Decoding | ⛔ |
| Guided Decoding | ⛔ |
| Encoder-Decoder Models | ⛔ |
| Multi Modality | ⚠️ |
| LogProbs | ✅ |
| Prompt LogProbs | ⛔ |
| Beam search | ✅ |
| Tensor Parallel | ✅ |
| Pipeline Parallel | ⛔ |
| Expert Parallel | ⛔ |
| Data Parallel | ⛔ |
| Prefill Decode Disaggregation | ⛔ |
| Quantization | ⚠️ |
| Sleep Mode | ⛔ |
| Embedding models | ✅ |
- ✅ Fully operational.
- ⚠️ Experimental support.
- 🚧 Under active development.
- 🗓️ Planned.
- ⛔ Not planned or deprecated.