Issues: huggingface/text-generation-inference
Errors on the server when '--num-shard 4' is used to launch the server
#3140 · opened Mar 26, 2025 by mgujral · 1 of 4 tasks
Startup error when deploying TGI with AMD backend on versions >3.1.0-rocm
#3137 · opened Mar 24, 2025 by andrewrreed · 2 of 4 tasks
gemma-3-27b-it runs out of memory during warmup
#3130 · opened Mar 20, 2025 by davidkartchner · 2 of 4 tasks
NotImplementedError: Vlm do not work with prefix caching yet
#3110 · opened Mar 13, 2025 by AndriiBihun · 2 of 4 tasks
Sharding error with max_total_tokens and max_input_tokens options in Gemma3-27B-it model
#3104 · opened Mar 13, 2025 by calycekr
[Upstream dependency change] The behavior of env vars in hf-hub has changed
#3088 · opened Mar 8, 2025 by HairlessVillager
Running container rootless does not work anymore
#3082 · opened Mar 7, 2025 by scriptator · 2 of 4 tasks
Llama 3.3 70B: weird, gibberish outputs in production setup
#3043 · opened Feb 20, 2025 by andresC98 · 2 of 4 tasks
TGI metrics don't have a model_name label to indicate which model the metrics belong to
#3026 · opened Feb 17, 2025 by yashaswipiplani · wontfix (this will not be worked on)