Pull requests: vllm-project/vllm-ascend
Pull requests list
[Doc] Add Single NPU (Qwen2.5-VL-7B)
documentation
Improvements or additions to documentation
#311
opened Mar 12, 2025 by
xiemingda-1002
[Doc] Add the release note for 0.7.3rc1
documentation
Improvements or additions to documentation
#285
opened Mar 10, 2025 by
wangxiyuan
[Core] Support the features of prefix cache and chunk prefill
module:core
#282
opened Mar 9, 2025 by
rjg-lyh
[Platform] Add get_stream_cls() for platform
module:core
#261
opened Mar 7, 2025 by
shen-shanshan
Draft
[Feature] add all_to_all and reduce_scatter
module:core
#256
opened Mar 7, 2025 by
onehaitao
[Feature] Graph mode for deepseek.
module:core
module:ops
#254
opened Mar 6, 2025 by
SidaoY
[core] Support custom ascendc kernels in vllm-ascend [draft]
module:core
#233
opened Mar 4, 2025 by
ganyi1996ppo
[CI]Make UT cases in test_comm_ops.py compatible on Ascend NPU
module:core
#220
opened Mar 3, 2025 by
wwfu109
[BugFix]add int8 cache dtype when using attention quantization
module:core
#128
opened Feb 21, 2025 by
Angazenn