Pull requests: vllm-project/vllm
Pull requests list
[Neuron][V1] Experimental end-to-end enablement for neuron v1
documentation
Improvements or additions to documentation
v1
[V1][Bugfix][Spec Decode] Fix incorrect outputs in V1 speculative decoding due to batch indexing
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#14645
opened Mar 11, 2025 by
benchislett
Allow dynamic loading of LoRA adapters in a cache dir
frontend
#14634
opened Mar 11, 2025 by
jberkhahn
[V1][Core] using cached vocab_size for Structured Outputs
structured-output
v1
#14630
opened Mar 11, 2025 by
aarnphm
[Kernel] LoRA - Enable CUDAGraphs for V1
v1
#14626
opened Mar 11, 2025 by
varun-sundar-rabindranath
[V1][Core] Support MistralTokenizer for Structured Output
ready
structured-output
v1
#14625
opened Mar 11, 2025 by
aarnphm
[V1] [CI] Enable v1/entrypoints
v1
ci/build
ready
#14619
opened Mar 11, 2025 by
robertgshaw2-redhat
Enforce that TP > 1 is not supported for Mamba2 if Quantization is Enabled.
#14617
opened Mar 11, 2025 by
fabianlim
[Kernel] GGUF MoE kernel
ready
#14613
opened Mar 11, 2025 by
SzymonOzog
[Hardware][Intel GPU][WIP] add V1 engine support and chunked_prefill kernel
v1
ci/build
documentation
#14612
opened Mar 11, 2025 by
jikunshang
Draft
[Bugfix] Data parallel example might fail if the user's script initializes torch.cuda
documentation
#14598
opened Mar 11, 2025 by
Jackmin801
[DO NOT MERGE]Varun/v1 lora kernels tuner
#14594
opened Mar 11, 2025 by
varun-sundar-rabindranath
Draft
fix: set use_beam_search to false to avoid breaking the trace link
documentation
#14592
opened Mar 11, 2025 by
RichardoMrMu
[Core][V0] Add guidance backend for structured output
ci/build
structured-output
#14589
opened Mar 11, 2025 by
russellb
Ray named test
documentation
#14584
opened Mar 10, 2025 by
ruisearch42
Draft
[Quantization][FP8] Adding support for fp8 gemm layer input in fp8
#14578
opened Mar 10, 2025 by
gshtras
[Bugfix][TPU][V1] Disable StructuredOutputManager import on TPU
v1
#14573
opened Mar 10, 2025 by
NickLucche
permute/unpermute kernel for moe optimization
ci/build
#14568
opened Mar 10, 2025 by
CalebDu