Pull requests: vllm-project/vllm

403 Open 3,203 Closed

Pull requests list

[Bugfix] Fix InternVL2 vision embeddings process with pipeline parallel

#8299 opened Sep 9, 2024 by Isotr0py

[Model] support minicpm3

#8297 opened Sep 9, 2024 by SUDA-HLT-ywfang

[Misc] Added num_cumulative_preemption metrics

#8294 opened Sep 9, 2024 by zeroorhero

[Bugfix] Fix weight loading issue by rename variable. ready

#8293 opened Sep 9, 2024 by wenxcs

[Bugfix] Correct adapter usage for cohere and jamba ready

#8292 opened Sep 9, 2024 by vladislavkruglikov

[Bugfix] Mapping physical device indices for e2e test utils

#8290 opened Sep 9, 2024 by ShangmingCai

[Bugfix] Fix LongRoPE bug

#8254 opened Sep 7, 2024 by garg-amit

[not-for-review] test PR multi py ver ready

#8253 opened Sep 6, 2024 by khluu

[Frontend][Core] Move guided decoding params into sampling params

#8252 opened Sep 6, 2024 by joerunde • Draft

[Bugfix][Frontend] Update all fastapi requests based on OpenAPIBase with annotations

#8251 opened Sep 6, 2024 by drikster80 • Draft

[BugFix] Propagate 'trust_remote_code' setting in internvl and minicpmv

#8250 opened Sep 6, 2024 by zifeitong

[Model] Support multiple images for qwen-vl

#8247 opened Sep 6, 2024 by alex-jw-brooks

[Kernel] Build flash-attn from source

#8245 opened Sep 6, 2024 by ProExpertProg

[Core] support LoRA and prompt adapter in content-based hashing for Block Manager v2 prefix caching

#8240 opened Sep 6, 2024 by llsj14

[BugFix] Fix metrics error for --num-scheduler-steps > 1 ready

#8234 opened Sep 6, 2024 by yuleil

[Spec Decode] Move ops.advance_step to flash attn advance_step ready

#8224 opened Sep 6, 2024 by kevin314

[Misc] Fused MoE Marlin support for GPTQ ready

#8217 opened Sep 5, 2024 by dsikka

Add VLLM_LOGGING_INTERVAL_SEC envvar to control logging rate

#8213 opened Sep 5, 2024 by mgoin • Draft

[Misc] Upgrade vllm-flash-attn to v2.6.2 ready

#8211 opened Sep 5, 2024 by WoosukKwon

Fix shutdown problem

#8209 opened Sep 5, 2024 by Bye-legumes

[Model] Adding Granite MoE. ready

#8206 opened Sep 5, 2024 by shawntan

Reshape cache to be XQA kernel compatible

#8200 opened Sep 5, 2024 by wenscarl

[Core] *Prompt* logprobs support in Multi-step

#8199 opened Sep 5, 2024 by afeldman-nm • Draft

[OpenVINO] Enable GPU support for OpenVINO vLLM backend Intel GPU

#8192 opened Sep 5, 2024 by sshlyapn

[Benchmark] Add block_size option to benchmark_throughput.py

#8175 opened Sep 5, 2024 by liangfu
