This document lists the issues and feature requests we have collected from OSS and other channels. We will update it with relevant information from issues as we go. If a feature you need is not prioritized here, please feel free to open an RFC or feature request, or post in the Slack community channel.
Core features
Serve
- [P0] Prefix-aware router
- [P0] In-place updates for deployments, so new models can be rolled out without re-deploying the cluster
- [P1] Open router protocol API for devs / researchers
- [P1] Prefill disaggregation (PxDy pattern). There are many open questions around this architecture, so it would be interesting to see under what conditions it beats simple chunked-prefill-enabled replicas at the same resource count.
- [P1] Heterogeneous accelerator_type: a single deployment that can be scheduled with different engine settings on different accelerator types and shapes, with different priorities
- [P2] More backends beyond vLLM (e.g. SGLang) cc @Qiaolin-Yu
- [P2] Fractional GPU support
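The prefix-aware router item above is about routing requests that share a prompt prefix (e.g. a common system prompt) to the same replica so its KV cache can be reused. The sketch below illustrates the idea with simple prefix hashing; it is not the ray.serve API, and all names here are hypothetical.

```python
import hashlib

class PrefixAwareRouter:
    """Illustrative sketch: requests whose prompts share a prefix are
    routed to the same replica, improving KV-cache hit rates."""

    def __init__(self, replicas, prefix_len=256):
        self.replicas = replicas      # list of replica ids
        self.prefix_len = prefix_len  # number of characters used as the routing key

    def route(self, prompt: str) -> str:
        # Hash only the first `prefix_len` characters, so prompts with an
        # identical prefix always map to the same replica.
        key = hashlib.sha256(prompt[: self.prefix_len].encode()).digest()
        idx = int.from_bytes(key[:8], "big") % len(self.replicas)
        return self.replicas[idx]

router = PrefixAwareRouter(["replica-0", "replica-1", "replica-2"])
shared = "System: you are a helpful assistant.\n" * 10  # > 256 chars
a = router.route(shared + "User: hi")   # same routing key ...
b = router.route(shared + "User: bye")  # ... so same replica
```

A production router would also need to track replica load and fall back to a least-loaded choice when the preferred replica is saturated, which is part of what makes this design non-trivial.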
Data
- [P0] Heterogeneous accelerator_type in the same pipeline
- [P0] Multi-node TP + PP for large DeepSeek models
- [P1] More vision-language models
- [P1] TPU support
- [P2] More backends beyond vLLM (e.g. SGLang)
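The heterogeneous accelerator_type items in the Serve and Data lists above both amount to scheduling one logical deployment or pipeline stage across several accelerator types, each paired with its own engine settings, in priority order. A minimal pure-Python sketch of that selection logic (all names here are hypothetical and not part of ray.llm):

```python
def pick_accelerator(available, preferences):
    """Illustrative sketch: return the highest-priority accelerator type
    that currently has capacity, together with the engine settings tuned
    for it. Hypothetical helper, not a Ray API."""
    for accel, engine_kwargs in preferences:
        if available.get(accel, 0) > 0:
            return accel, engine_kwargs
    raise RuntimeError("no preferred accelerator type is available")

# Per-accelerator engine settings, e.g. a larger TP degree on smaller GPUs
# so the model still fits (values are made up for illustration).
prefs = [
    ("H100", {"tensor_parallel_size": 1}),
    ("A100", {"tensor_parallel_size": 2}),
    ("L4",   {"tensor_parallel_size": 4}),
]

# No H100s free, so the deployment falls back to A100s with TP=2.
accel, kwargs = pick_accelerator({"H100": 0, "A100": 3}, prefs)
```

The interesting part of the real feature is that autoscaling and placement must react as cluster availability changes, rather than picking once at deploy time as this sketch does.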
CI/CD and release pipeline
- [P0] More release tests on data pipelines
- [P0] Use gen-config on the critical path for the Serve release tests
Docs and community support
- [P0] Cover gen-config in the Serve docs
- [P0] Run doc-tests on examples
- [P0] Update the vLLM docs with a Ray cluster setup guide and Serve and Data code examples
- [P1] Example of running DeepSeek-R1 (a huge model, with multi-node Ray Serve)
kouroshHakha changed the title from "[Q2 Roadmap][ray.llm] Tentative roadmap for Data and Serve LLM APIs" to "[Roadmap][ray.llm] Tentative roadmap for Data and Serve LLM APIs" on Mar 12, 2025.