[Bugfix][TPU][V1] Disable StructuredOutputManager import on TPU #14573

Closed


NickLucche
Contributor

@NickLucche NickLucche commented Mar 10, 2025

V1 Engine on TPU is currently broken: StructuredOutputManager appears to have a hard dependency on triton (pulled in through xgrammar), so server startup fails with:

INFO 03-10 17:21:34 [core.py:120] init engine (profile, create kv cache, warmup model) took 49.99 seconds
ERROR 03-10 17:21:34 [core.py:324] EngineCore hit an exception: Traceback (most recent call last):
ERROR 03-10 17:21:34 [core.py:324]   File "/home/nick/vllm/vllm/v1/engine/core.py", line 316, in run_engine_core
ERROR 03-10 17:21:34 [core.py:324]     engine_core = EngineCoreProc(*args, **kwargs)
ERROR 03-10 17:21:34 [core.py:324]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-10 17:21:34 [core.py:324]   File "/home/nick/vllm/vllm/v1/engine/core.py", line 271, in __init__
ERROR 03-10 17:21:34 [core.py:324]     super().__init__(vllm_config, executor_class, log_stats)
ERROR 03-10 17:21:34 [core.py:324]   File "/home/nick/vllm/vllm/v1/engine/core.py", line 65, in __init__
ERROR 03-10 17:21:34 [core.py:324]     self.structured_output_manager = StructuredOutputManager(vllm_config)
ERROR 03-10 17:21:34 [core.py:324]                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-10 17:21:34 [core.py:324]   File "/home/nick/vllm/vllm/v1/structured_output/__init__.py", line 42, in __init__
ERROR 03-10 17:21:34 [core.py:324]     tokenizer_info = xgr.TokenizerInfo.from_huggingface(
ERROR 03-10 17:21:34 [core.py:324]                      ^^^^^^^^^^^^^^^^^
ERROR 03-10 17:21:34 [core.py:324]   File "/home/nick/vllm/vllm/utils.py", line 2357, in __getattr__
ERROR 03-10 17:21:34 [core.py:324]     self._module = self._load()
ERROR 03-10 17:21:34 [core.py:324]                    ^^^^^^^^^^^^
ERROR 03-10 17:21:34 [core.py:324]   File "/home/nick/vllm/vllm/utils.py", line 2347, in _load
ERROR 03-10 17:21:34 [core.py:324]     raise err from None
ERROR 03-10 17:21:34 [core.py:324]   File "/home/nick/vllm/vllm/utils.py", line 2341, in _load
ERROR 03-10 17:21:34 [core.py:324]     module = importlib.import_module(self.__name__)
ERROR 03-10 17:21:34 [core.py:324]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-10 17:21:34 [core.py:324]   File "/home/nick/.local/share/uv/python/cpython-3.11.11-linux-x86_64-gnu/lib/python3.11/importlib/__init__.py", line 126, in import_module
ERROR 03-10 17:21:34 [core.py:324]     return _bootstrap._gcd_import(name[level:], package, level)
ERROR 03-10 17:21:34 [core.py:324]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 03-10 17:21:34 [core.py:324]   File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
ERROR 03-10 17:21:34 [core.py:324]   File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
ERROR 03-10 17:21:34 [core.py:324]   File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
ERROR 03-10 17:21:34 [core.py:324]   File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
ERROR 03-10 17:21:34 [core.py:324]   File "<frozen importlib._bootstrap_external>", line 940, in exec_module
ERROR 03-10 17:21:34 [core.py:324]   File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
ERROR 03-10 17:21:34 [core.py:324]   File "/home/nick/vllm/.venv/lib/python3.11/site-packages/xgrammar/__init__.py", line 1, in <module>
ERROR 03-10 17:21:34 [core.py:324]     from . import testing
ERROR 03-10 17:21:34 [core.py:324]   File "/home/nick/vllm/.venv/lib/python3.11/site-packages/xgrammar/testing.py", line 11, in <module>
ERROR 03-10 17:21:34 [core.py:324]     from .matcher import GrammarMatcher, bitmask_dtype, get_bitmask_shape
ERROR 03-10 17:21:34 [core.py:324]   File "/home/nick/vllm/.venv/lib/python3.11/site-packages/xgrammar/matcher.py", line 11, in <module>
ERROR 03-10 17:21:34 [core.py:324]     from .kernels import apply_token_bitmask_inplace_cpu, apply_token_bitmask_inplace_triton
ERROR 03-10 17:21:34 [core.py:324]   File "/home/nick/vllm/.venv/lib/python3.11/site-packages/xgrammar/kernels/__init__.py", line 4, in <module>
ERROR 03-10 17:21:34 [core.py:324]     from .apply_token_bitmask_inplace_triton import apply_token_bitmask_inplace_triton
ERROR 03-10 17:21:34 [core.py:324]   File "/home/nick/vllm/.venv/lib/python3.11/site-packages/xgrammar/kernels/apply_token_bitmask_inplace_triton.py", line 4, in <module>
ERROR 03-10 17:21:34 [core.py:324]     import triton
ERROR 03-10 17:21:34 [core.py:324] ModuleNotFoundError: No module named 'triton'

Open to any way to more gracefully solve this (quickly), especially with an impl that makes mypy happy (current one does not).
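
One possible shape for a type-checker-friendly fallback, as a minimal sketch only: select an explicitly typed no-op manager when the optional dependency cannot be found, and construct the real one through a factory otherwise. Every name here (`OutputManager`, `NoOpOutputManager`, `compile_grammar`, `module_available`, `make_manager`) is hypothetical and is not part of vLLM or xgrammar; only `importlib.util.find_spec` is a real standard-library call.

```python
import importlib.util
from typing import Any, Callable, Optional, Protocol


class OutputManager(Protocol):
    """Hypothetical interface shared by the real and the no-op manager."""

    def compile_grammar(self, spec: str) -> Optional[Any]: ...


class NoOpOutputManager:
    """Typed stand-in for platforms where the backend cannot be imported."""

    def __init__(self, config: Any = None) -> None:
        self.config = config

    def compile_grammar(self, spec: str) -> Optional[Any]:
        # Explicit no-op: callers receive None instead of a compiled grammar.
        return None


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be located.

    find_spec() looks the module up without executing it.
    """
    return importlib.util.find_spec(name) is not None


def make_manager(
    real_factory: Callable[[Any], OutputManager],
    config: Any = None,
    required: str = "triton",
) -> OutputManager:
    """Pick the real manager only when `required` is installed.

    Assumption: `real_factory` performs the heavy import itself, so it is
    never executed on platforms where `required` is missing.
    """
    if not module_available(required):
        return NoOpOutputManager(config)
    return real_factory(config)
```

Because both branches return something that satisfies the same `Protocol`, a type checker sees a single interface rather than a redefined class name.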

Signed-off-by: NickLucche <nlucches@redhat.com>
@NickLucche NickLucche changed the title [Bugfix][TPU][V1] disable structuredoutputmanager on tpu [Bugfix][TPU][V1]Ddisable StructuredOutputManager import on TPU Mar 10, 2025
@mergify mergify bot added the v1 label Mar 10, 2025

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

Signed-off-by: NickLucche <nlucches@redhat.com>
@NickLucche NickLucche changed the title [Bugfix][TPU][V1]Ddisable StructuredOutputManager import on TPU [Bugfix][TPU][V1] Disable StructuredOutputManager import on TPU Mar 10, 2025
Comment on lines +39 to +45
class StructuredOutputManager:

    def __init__(self, *args, **kwargs):
        pass

    def __getattr__(self, name):
        return lambda *args, **kwargs: None
Contributor Author

I really hate this cheap workaround. Is anyone working on properly addressing this @njhill @russellb ?
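
To make the concern concrete, the short sketch below reproduces the stub from the diff above and exercises it. It illustrates one reason such a catch-all can be fragile: every attribute lookup succeeds, so a misspelled method name is silently accepted. The method names used (`add_request`, `ad_reqest`) are made up for the demonstration.

```python
class StructuredOutputManager:
    """Catch-all stub: accepts any constructor arguments and any method call."""

    def __init__(self, *args, **kwargs):
        pass

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails; returns a callable
        # that ignores its arguments and returns None.
        return lambda *args, **kwargs: None


mgr = StructuredOutputManager("any", config=None)

# A plausible call and a misspelled one behave identically.
result_ok = mgr.add_request("req-1")
result_typo = mgr.ad_reqest("req-1")

# hasattr() is True for every name, so feature detection no longer works.
looks_supported = hasattr(mgr, "no_such_feature")
```

Here `result_ok` and `result_typo` are both `None` and `looks_supported` is `True`, which is why callers cannot distinguish "feature disabled" from "bug in the caller".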

Member

I'll push something now

Member

Here's an alternative: #14575

@robertgshaw2-redhat
Collaborator

Why not just install triton?

@russellb
Member

"Why not just install triton?"

The triton README doesn't list TPU as supported, so unless that's wrong, installing it doesn't sound ideal.

My PR here will handle this more cleanly, assuming we don't expect this feature to work on TPU right now. #14575
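
The traceback in the description shows the failure happening inside the manager's `__init__`, at engine startup. One generic way to avoid that, sketched here under stated assumptions and not necessarily what #14575 does, is to defer touching the backend until a structured-output request actually arrives. `DeferredBackendManager`, `_ensure_backend`, and `handle_structured_request` are hypothetical names; `importlib.import_module` is the real standard-library call.

```python
import importlib
from types import ModuleType
from typing import Optional


class DeferredBackendManager:
    """Hypothetical manager whose constructor never imports the backend."""

    def __init__(self, backend_name: str) -> None:
        self._backend_name = backend_name
        self._backend: Optional[ModuleType] = None

    def _ensure_backend(self) -> ModuleType:
        # Import on first use and cache the module for later calls.
        if self._backend is None:
            try:
                self._backend = importlib.import_module(self._backend_name)
            except ImportError as err:
                raise RuntimeError(
                    f"structured output needs '{self._backend_name}', "
                    "which is not importable on this platform"
                ) from err
        return self._backend

    def handle_structured_request(self, payload: str) -> str:
        # Only this code path requires the backend; plain requests never reach it.
        backend = self._ensure_backend()
        return f"{backend.__name__}:{payload}"
```

With this shape the server can start on a platform that lacks the backend, and only a request that needs structured output fails, with an error that names the missing module.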

@NickLucche NickLucche closed this Mar 12, 2025