[subinterpreter] Make type version tag counter threadsafe for per interpreter GIL #95795

kumaraditya303 · 2022-08-08T14:14:51Z

The next_version_tag in typeobject.c is currently not thread safe and relies on GIL to protect against concurrent increments across threads.

cpython/Objects/typeobject.c

Lines 46 to 50 in 63140b4

    
           // bpo-42745: next_version_tag remains shared by all interpreters because of static types 
        
           // Used to set PyTypeObject.tp_version_tag 
        
           static unsigned int next_version_tag = 1; 
        
           typedef struct PySlot_Offset {

For per-interpreter GIL, it must be made thread-safe otherwise the type cache will be affected by race conditions. Static types are not affected because they are immutable so this is a issue for pure Python classes aka heap types. This issue is for discussion of possible implementations.

Possible Solutions:

Make the next_version_tag counter per interpreter, while this may seems to fix the issue but since objects are shared across the interpreters and the type version tag is also used to represent a current state of a type in the specializing interpreter, two interpreters can have same the same tag for different types this won't work as expected.
Make the next_version_tag an atomic counter and increment it atomically using the pyatomic APIs. This is my preferred solution since next_version_tag is only modified when a type is modified so is a rare operation and not a performance issue. Since this is just a counter relaxed ordering can be used here.

cc @ericsnowcurrently

Linked PRs

gh-95795: Move types.next_version_tag to PyInterpreterState #102343

The text was updated successfully, but these errors were encountered:

ericsnowcurrently · 2022-08-09T17:02:18Z

Good catch. I missed this in gh-94673.

ericsnowcurrently · 2022-08-09T17:06:40Z

Oh, I misread that as tp_version_tag. 😄

We should definitely fix next_version_tag.

ericsnowcurrently · 2022-08-09T17:08:02Z

FTR, the comment in the embedded code snippet above refers to gh-86911.

ericsnowcurrently · 2022-08-09T17:10:28Z

@markshannon, what are your thoughts?

ericsnowcurrently · 2022-08-09T17:28:37Z

As for me...

Make the next_version_tag counter per interpreter, while this may seems to fix the issue but since objects are shared across the interpreters and the type version tag is also used to represent a current state of a type in the specializing interpreter, two interpreters can have same the same tag for different types this won't work as expected.

Yeah, it would depend on the specialization semantics.

Also, the reference to static types in the snippet you pasted above implies a special relationship with those objects that we may need to revisit. For static types (especially builtin ones) we don't expect them to change after PyType_Ready() is called. So we wouldn't expect them to ever have tp_version_tag modified and therefore next_version_tag is, in practice, only used for heap types. tp_version_tag is updated when methods get cached but those are cached on each interpreter, so we can use a per-interpreter next_version_tag.

(This does imply that we may need to make tp_version_tag per-interpreter for static builtin types.)

Make the next_version_tag an atomic counter and increment it atomically using the pyatomic APIs. This is my preferred solution since next_version_tag is only modified when a type is modified so is a rare operation and not a performance issue. Since this is just a counter relaxed ordering can be used here.

Yeah, this may make sense. Ideally we could make this per-interpreter but using an atomic seems like a reasonable alternative.

markshannon · 2022-08-09T17:30:57Z

What I would prefer, is for immutable, sharable classes to set their tag during initialization, or even statically.
Then mutable classes to use a per-interpreter tag.
The global version tags should be in range(0, 1<<16) and the per-interpreter tags should use values above that.

We will want the most common types to have pre-defined tags, for rapidly getting type information from version numbers in the higher tier optimizers.

markshannon · 2022-08-09T17:33:05Z

Currently there are under 100 Py_TPFLAGS_IMMUTABLETYPE flagged classes in CPython, so 64k should be enough tags.

ericsnowcurrently · 2022-08-10T17:37:31Z

So we would do the following:

add #define _Py_MAX_GLOBAL_TYPE_VERSION_TAG ((2<<16) - 1) in pycore_typeobject.h
handle static builtin types
- move next_version_tag to _PyRuntimeState.types.next_version_tag
- set it to 1 in the static initializer (in pycore_runtime_init.h)
- in _PyStaticType_InitBuiltin(), set tp_version_tag to the next version tag (and add Py_TPFLAGS_VALID_VERSION_TAG to tp_flags)
- add a check for _Py_Runtime.types.next_version_tag <= _Py_MAX_GLOBAL_TYPE_VERSION_TAG in _PyStaticType_InitBuiltin()
handle all other types
- add PyInterpreterState.types.next_version_tag
- set it to _Py_MAX_GLOBAL_TYPE_VERSION_TAG + 1 in the static initializer (in pycore_runtime_init.h)
- update _PyType_ClearCache() (typeobject.c) to use the per-interpreter variable
- update assign_version_tag() (typeobject.c) to use the per-interpreter variable

At that point we could drop next_version_tag in typeobject.c.

There are a few other things to consider:

add various asserts to ensure code that modifies tp_version_tag never touches static builtin types?
should we treat non-builtin static types the same as the builtin ones?

That last one would require some extra work, e.g. special-casing non-heap types to use _Py_Runtime.types.next_version_tag in PyType_Ready() (and probably drop the code from _PyStaticType_InitBuiltin()), as well as adding a global lock there (or making _Py_Runtime.types.next_version_tag atomic). My inclination is to not bother if we can help it. In the end, non-builtin static types will only exist in extension modules that won't be used in multiple interpreters, so there shouldn't be a need to treat their tp_version_tag specially.

markshannon · 2022-08-11T12:09:49Z

That sounds about right. I'm not familiar with all the static initialization logic, but as long the following assertions always hold, I'm happy:

assert((tp->tp_flags & Py_TPFLAGS_IMMUTABLETYPE) == 0 || 
    (tp->tp_version_tag <= _Py_MAX_GLOBAL_TYPE_VERSION_TAG && tp->tp_version_tag));
assert((tp->tp_flags &Py_TPFLAGS_IMMUTABLETYPE) ||
    (tp->tp_version_tag == 0 || tp->tp_version_tag > _Py_MAX_GLOBAL_TYPE_VERSION_TAG));

I'd give the two next_version_tags different names.

add various asserts to ensure code that modifies tp_version_tag never touches static builtin types?

More asserts is good.

should we treat non-builtin static types the same as the builtin ones?

Whether they are static should irrelevant. But if the are immutable and can be shared, then they must have their tp_version_tag set with a process-wide lock held, and should be before they are visible to any interpreter. If then can't be shared, then no need for special treatment.

Core static types will continue to use the global value. All other types will use the per-interpreter value. They all share the same range, where the global types use values < 2^16 and each interpreter uses values higher than that.

* main: pythongh-103801: Tools/wasm linting and formatting (python#103796) pythongh-103673: Add missing ForkingUnixStreamServer and ForkingUnixDatagramServer socketservers (python#103674) pythongh-95795: Move types.next_version_tag to PyInterpreterState (pythongh-102343)

kumaraditya303 added interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-subinterpreters labels Aug 8, 2022

kumaraditya303 added this to Subinterpreters Aug 8, 2022

kumaraditya303 moved this to Todo in Subinterpreters Aug 9, 2022

kumaraditya303 self-assigned this Aug 28, 2022

kumaraditya303 added the 3.12 only security fixes label Aug 28, 2022

kumaraditya303 mentioned this issue Feb 9, 2023

_PyStaticType_Dealloc does not invalidate type version tag #101696

Closed

This was referenced Feb 9, 2023

Improve Interpreter Isolation #100227

Closed

GH-101696: invalidate type version tag in _PyStaticType_Dealloc #101697

Merged

bedevere-bot mentioned this issue Feb 28, 2023

gh-95795: Move types.next_version_tag to PyInterpreterState #102343

Merged

kumaraditya303 closed this as completed Jul 27, 2023

github-project-automation bot moved this from Todo to Done in Subinterpreters Jul 27, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[subinterpreter] Make type version tag counter threadsafe for per interpreter GIL #95795

[subinterpreter] Make type version tag counter threadsafe for per interpreter GIL #95795

kumaraditya303 commented Aug 8, 2022 •

edited by bedevere-bot

Loading

ericsnowcurrently commented Aug 9, 2022

ericsnowcurrently commented Aug 9, 2022

ericsnowcurrently commented Aug 9, 2022

ericsnowcurrently commented Aug 9, 2022

ericsnowcurrently commented Aug 9, 2022

markshannon commented Aug 9, 2022 •

edited

Loading

markshannon commented Aug 9, 2022 •

edited

Loading

ericsnowcurrently commented Aug 10, 2022

markshannon commented Aug 11, 2022 •

edited

Loading

[subinterpreter] Make type version tag counter threadsafe for per interpreter GIL #95795

[subinterpreter] Make type version tag counter threadsafe for per interpreter GIL #95795

Comments

kumaraditya303 commented Aug 8, 2022 • edited by bedevere-bot Loading

Linked PRs

ericsnowcurrently commented Aug 9, 2022

ericsnowcurrently commented Aug 9, 2022

ericsnowcurrently commented Aug 9, 2022

ericsnowcurrently commented Aug 9, 2022

ericsnowcurrently commented Aug 9, 2022

markshannon commented Aug 9, 2022 • edited Loading

markshannon commented Aug 9, 2022 • edited Loading

ericsnowcurrently commented Aug 10, 2022

markshannon commented Aug 11, 2022 • edited Loading

kumaraditya303 commented Aug 8, 2022 •

edited by bedevere-bot

Loading

markshannon commented Aug 9, 2022 •

edited

Loading

markshannon commented Aug 9, 2022 •

edited

Loading

markshannon commented Aug 11, 2022 •

edited

Loading