Issues: huggingface/text-generation-inference
Errors on the server when '--num-shard 4' is used to launch the server
#3140 · opened Mar 26, 2025 by mgujral · 1 of 4 tasks
Startup error when deploying TGI with AMD backend on versions >3.1.0-rocm
#3137 · opened Mar 24, 2025 by andrewrreed · 2 of 4 tasks
gemma-3-27b-it runs out of memory during warmup
#3130 · opened Mar 20, 2025 by davidkartchner · 2 of 4 tasks
NotImplementedError: Vlm do not work with prefix caching yet
#3110 · opened Mar 13, 2025 by AndriiBihun · 2 of 4 tasks
Sharding error with max_total_tokens and max_input_tokens options in Gemma3-27B-it model
#3104 · opened Mar 13, 2025 by calycekr
[Upstream dependency change] The behavior of env vars in hf-hub has changed
#3088 · opened Mar 8, 2025 by HairlessVillager
Running container rootless does not work anymore
#3082 · opened Mar 7, 2025 by scriptator · 2 of 4 tasks
Llama 3.3 70B: weird, gibberish outputs in production setup
#3043 · opened Feb 20, 2025 by andresC98 · 2 of 4 tasks
TGI metrics don't have a model_name label to indicate which model the metrics belong to
#3026 · opened Feb 17, 2025 by yashaswipiplani · wontfix (this will not be worked on)