
Commit 511ca94

pythongh-95778: CVE-2020-10735: Prevent DoS by very large int() (python#96499)
Integer to and from text conversions via CPython's bignum `int` type are not safe against denial of service attacks due to malicious input. Very large input strings with hundreds of thousands of digits can consume several CPU seconds.

This PR comes fresh from a pile of work done in our private PSRT security response team repo.

Signed-off-by: Christian Heimes [Red Hat] <christian@python.org>
Tons-of-polishing-up-by: Gregory P. Smith [Google] <greg@krypto.org>
Reviews in the private PSRT repo by many others (see the NEWS entry in the PR).

* Issue: pythongh-95778

I wrote up [a one pager for the release managers](https://docs.google.com/document/d/1KjuF_aXlzPUxTK4BMgezGJ2Pn7uevfX7g0_mvgHlL7Y/edit#). Much of that text wound up in the Issue.

Backport PRs already exist. See the issue for links.
1 parent 656167d commit 511ca94

28 files changed

+803
-20
lines changed

Doc/library/functions.rst

+7
Original file line numberDiff line numberDiff line change
@@ -910,6 +910,13 @@ are always available. They are listed here in alphabetical order.
910910
.. versionchanged:: 3.11
911911
The delegation to :meth:`__trunc__` is deprecated.
912912

913+
.. versionchanged:: 3.12
914+
:class:`int` string inputs and string representations can be limited to
915+
help avoid denial of service attacks. A :exc:`ValueError` is raised when
916+
the limit is exceeded while converting a string *x* to an :class:`int` or
917+
when converting an :class:`int` into a string would exceed the limit.
918+
See the :ref:`integer string conversion length limitation
919+
<int_max_str_digits>` documentation.
913920

914921
.. function:: isinstance(object, classinfo)
915922

Doc/library/json.rst

+11
@@ -23,6 +23,11 @@ is a lightweight data interchange format inspired by
2323
`JavaScript <https://en.wikipedia.org/wiki/JavaScript>`_ object literal syntax
2424
(although it is not a strict subset of JavaScript [#rfc-errata]_ ).
2525

26+
.. warning::
27+
Be cautious when parsing JSON data from untrusted sources. A malicious
28+
JSON string may cause the decoder to consume considerable CPU and memory
29+
resources. Limiting the size of data to be parsed is recommended.
30+
2631
:mod:`json` exposes an API familiar to users of the standard library
2732
:mod:`marshal` and :mod:`pickle` modules.
2833

@@ -253,6 +258,12 @@ Basic Usage
253258
be used to use another datatype or parser for JSON integers
254259
(e.g. :class:`float`).
255260

261+
.. versionchanged:: 3.12
262+
The default *parse_int* of :func:`int` now limits the maximum length of
263+
the integer string via the interpreter's :ref:`integer string
264+
conversion length limitation <int_max_str_digits>` to help avoid denial
265+
of service attacks.
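The two pieces of advice above (limit the size of the data, and rely on the integer conversion limit) can be combined in a small wrapper. This is only a sketch: `load_json_guarded` and its `max_chars` parameter are hypothetical names chosen for illustration and are not part of the :mod:`json` API.

```python
import json
import sys


def load_json_guarded(text, max_chars=1_000_000):
    """Decode JSON from an untrusted source with a size cap.

    Hypothetical helper: the size cap is an application-level choice.
    Oversized integer literals are rejected by the interpreter itself
    (ValueError) when the int/str digit limit is active.
    """
    if len(text) > max_chars:
        raise ValueError(f"JSON payload too large: {len(text)} characters")
    return json.loads(text)


# Small, well-formed input decodes normally.
print(load_json_guarded('{"n": 42}'))

# A 5000-digit integer literal is refused when the limit is 4300.
if hasattr(sys, "get_int_max_str_digits") and 0 < sys.get_int_max_str_digits() < 5000:
    try:
        load_json_guarded("9" * 5000)
    except ValueError as exc:
        print("rejected:", type(exc).__name__)
```

Callers that already catch `ValueError` for malformed JSON (`json.JSONDecodeError` is a subclass) will also catch the limit error without further changes.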
266+
256267
*parse_constant*, if specified, will be called with one of the following
257268
strings: ``'-Infinity'``, ``'Infinity'``, ``'NaN'``.
258269
This can be used to raise an exception if invalid JSON numbers

Doc/library/stdtypes.rst

+166
@@ -622,6 +622,13 @@ class`. float also has the following additional methods.
622622
:exc:`OverflowError` on infinities and a :exc:`ValueError` on
623623
NaNs.
624624

625+
.. note::
626+
627+
The values returned by ``as_integer_ratio()`` can be huge. Attempts
628+
to render such integers into decimal strings may bump into the
629+
:ref:`integer string conversion length limitation
630+
<int_max_str_digits>`.
631+
625632
.. method:: float.is_integer()
626633

627634
Return ``True`` if the float instance is finite with integral
@@ -5460,6 +5467,165 @@ types, where they are relevant. Some of these are not reported by the
54605467
[<class 'bool'>]
54615468

54625469

5470+
.. _int_max_str_digits:
5471+
5472+
Integer string conversion length limitation
5473+
===========================================
5474+
5475+
CPython has a global limit for converting between :class:`int` and :class:`str`
5476+
to mitigate denial of service attacks. This limit *only* applies to decimal or
5477+
other non-power-of-two number bases. Hexadecimal, octal, and binary conversions
5478+
are unlimited. The limit can be configured.
5479+
5480+
The :class:`int` type in CPython is an arbitrary length number stored in binary
5481+
form (commonly known as a "bignum"). There exists no algorithm that can convert
5482+
a string to a binary integer or a binary integer to a string in linear time,
5483+
*unless* the base is a power of 2. Even the best known algorithms for base 10
5484+
have sub-quadratic complexity. Converting a large value such as ``int('1' *
5485+
500_000)`` can take over a second on a fast CPU.
5486+
5487+
Limiting conversion size offers a practical way to avoid `CVE-2020-10735
5488+
<https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-10735>`_.
5489+
5490+
The limit is applied to the number of digit characters in the input or output
5491+
string when a non-linear conversion algorithm would be involved. Underscores
5492+
and the sign are not counted towards the limit.
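The counting rule can be checked with a short experiment. The sketch below assumes an interpreter that provides `sys.set_int_max_str_digits`; `count_limited_digits` and `parses_within_limit` are hypothetical helpers written for this illustration and do not mirror CPython's internal implementation.

```python
import sys


def count_limited_digits(text):
    """Count only the digit characters; sign and underscores are skipped."""
    return sum(ch.isdigit() for ch in text)


def parses_within_limit(text, limit=4300):
    """Return True if int(text) succeeds while the given limit is in force."""
    if not hasattr(sys, "set_int_max_str_digits"):
        return True  # interpreter without the limit: nothing to enforce
    previous = sys.get_int_max_str_digits()
    sys.set_int_max_str_digits(limit)
    try:
        int(text)
        return True
    except ValueError:
        return False
    finally:
        sys.set_int_max_str_digits(previous)  # always restore the old limit


# 4300 digits, 4299 underscores and a sign: 8600 characters in total.
grouped = "-" + "_".join("1" * 4300)
print(len(grouped), count_limited_digits(grouped), parses_within_limit(grouped))

# One more digit than the limit allows.
print(parses_within_limit("1" * 4301))
```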
5493+
5494+
When an operation would exceed the limit, a :exc:`ValueError` is raised:
5495+
5496+
.. doctest::

   >>> import sys
   >>> sys.set_int_max_str_digits(4300)  # Illustrative, this is the default.
   >>> _ = int('2' * 5432)
   Traceback (most recent call last):
   ...
   ValueError: Exceeds the limit (4300) for integer string conversion: value has 5432 digits.
   >>> i = int('2' * 4300)
   >>> len(str(i))
   4300
   >>> i_squared = i*i
   >>> len(str(i_squared))
   Traceback (most recent call last):
   ...
   ValueError: Exceeds the limit (4300) for integer string conversion: value has 8599 digits.
   >>> len(hex(i_squared))
   7144
   >>> assert int(hex(i_squared), base=16) == i*i  # Hexadecimal is unlimited.
5515+
5516+
The default limit is 4300 digits as provided in
5517+
:data:`sys.int_info.default_max_str_digits <sys.int_info>`.
5518+
The lowest limit that can be configured is 640 digits as provided in
5519+
:data:`sys.int_info.str_digits_check_threshold <sys.int_info>`.
5520+
5521+
Verification:
5522+
5523+
.. doctest::

   >>> import sys
   >>> assert sys.int_info.default_max_str_digits == 4300, sys.int_info
   >>> assert sys.int_info.str_digits_check_threshold == 640, sys.int_info
   >>> msg = int('578966293710682886880994035146873798396722250538762761564'
   ...           '9252925514383915483333812743580549779436104706260696366600'
   ...           '571186405732').to_bytes(53, 'big')
   ...
5532+
5533+
.. versionadded:: 3.12
5534+
5535+
Affected APIs
5536+
-------------
5537+
5538+
The limitation only applies to potentially slow conversions between :class:`int`
5539+
and :class:`str` or :class:`bytes`:
5540+
5541+
* ``int(string)`` with default base 10.
5542+
* ``int(string, base)`` for all bases that are not a power of 2.
5543+
* ``str(integer)``.
5544+
* ``repr(integer)``.
5545+
* any other string conversion to base 10, for example ``f"{integer}"``,
5546+
``"{}".format(integer)``, or ``b"%d" % integer``.
5547+
5548+
The limitations do not apply to functions with a linear algorithm:
5549+
5550+
* ``int(string, base)`` with base 2, 4, 8, 16, or 32.
5551+
* :func:`int.from_bytes` and :func:`int.to_bytes`.
5552+
* :func:`hex`, :func:`oct`, :func:`bin`.
5553+
* :ref:`formatspec` for hex, octal, and binary numbers.
5554+
* :class:`str` to :class:`float`.
5555+
* :class:`str` to :class:`decimal.Decimal`.
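The two lists above can be explored with a small survey that records which operations raise :exc:`ValueError` for a value just over the limit. This is a sketch under the assumption that `sys.set_int_max_str_digits` is available; `is_limited` and `survey` are hypothetical names used only here.

```python
import sys


def is_limited(operation):
    """Return True if calling operation() raises ValueError."""
    try:
        operation()
    except ValueError:
        return True
    return False


def survey(limit=4300):
    """Map a few conversions to whether the limit rejected them."""
    if not hasattr(sys, "set_int_max_str_digits"):
        return {}  # interpreter without the limit
    previous = sys.get_int_max_str_digits()
    sys.set_int_max_str_digits(limit)
    try:
        big = 10 ** limit  # limit + 1 decimal digits, built without any text
        n_bytes = (big.bit_length() + 7) // 8
        return {
            "str(big)": is_limited(lambda: str(big)),
            "f-string": is_limited(lambda: f"{big}"),
            "int(decimal text)": is_limited(lambda: int("7" * (limit + 1))),
            "hex(big)": is_limited(lambda: hex(big)),
            "format(big, 'x')": is_limited(lambda: format(big, "x")),
            "int(text, 16)": is_limited(lambda: int("f" * (limit + 1), 16)),
            "bytes round trip": is_limited(
                lambda: int.from_bytes(big.to_bytes(n_bytes, "big"), "big")
            ),
        }
    finally:
        sys.set_int_max_str_digits(previous)  # restore the caller's limit


for name, limited in survey().items():
    print(f"{name:20} {'limited' if limited else 'unlimited'}")
```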
5556+
5557+
Configuring the limit
5558+
---------------------
5559+
5560+
Before Python starts up you can use an environment variable or an interpreter
5561+
command line flag to configure the limit:
5562+
5563+
* :envvar:`PYTHONINTMAXSTRDIGITS`, e.g.
5564+
``PYTHONINTMAXSTRDIGITS=640 python3`` to set the limit to 640 or
5565+
``PYTHONINTMAXSTRDIGITS=0 python3`` to disable the limitation.
5566+
* :option:`-X int_max_str_digits <-X>`, e.g.
5567+
``python3 -X int_max_str_digits=640``
5568+
* :data:`sys.flags.int_max_str_digits` contains the value of
5569+
:envvar:`PYTHONINTMAXSTRDIGITS` or :option:`-X int_max_str_digits <-X>`.
5570+
If both the env var and the ``-X`` option are set, the ``-X`` option takes
5571+
precedence. A value of *-1* indicates that both were unset, thus a value of
5572+
:data:`sys.int_info.default_max_str_digits` was used during initialization.
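The startup options can be tried out by launching a child interpreter and asking it for its limit. The sketch below assumes the child is the same Python as the parent (`sys.executable`) and that it supports the option; `limit_in_child` is a hypothetical helper, not part of any library.

```python
import os
import subprocess
import sys


def limit_in_child(x_option=None, env_value=None):
    """Start a child interpreter and return its sys.get_int_max_str_digits()."""
    env = dict(os.environ)
    env.pop("PYTHONINTMAXSTRDIGITS", None)  # start from a clean slate
    if env_value is not None:
        env["PYTHONINTMAXSTRDIGITS"] = str(env_value)
    cmd = [sys.executable]
    if x_option is not None:
        cmd += ["-X", f"int_max_str_digits={x_option}"]
    cmd += ["-c", "import sys; print(sys.get_int_max_str_digits())"]
    done = subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)
    return int(done.stdout.strip())


print(limit_in_child(env_value=1000))                # environment variable only
print(limit_in_child(x_option=640, env_value=1000))  # -X takes precedence
print(limit_in_child(env_value=0))                   # 0 disables the limit
```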
5573+
5574+
From code, you can inspect the current limit and set a new one using these
5575+
:mod:`sys` APIs:
5576+
5577+
* :func:`sys.get_int_max_str_digits` and :func:`sys.set_int_max_str_digits` are
5578+
a getter and setter for the interpreter-wide limit. Subinterpreters have
5579+
their own limit.
5580+
5581+
Information about the default and minimum can be found in :attr:`sys.int_info`:
5582+
5583+
* :data:`sys.int_info.default_max_str_digits <sys.int_info>` is the compiled-in
5584+
default limit.
5585+
* :data:`sys.int_info.str_digits_check_threshold <sys.int_info>` is the lowest
5586+
accepted value for the limit (other than 0 which disables it).
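A minimal sketch of these APIs in use follows. It assumes an interpreter that ships them; `describe_int_str_limit` and `try_set_limit` are hypothetical helper names. A non-zero value below the threshold is rejected with :exc:`ValueError`.

```python
import sys


def describe_int_str_limit():
    """Return the current, default and minimum limits, or None if unsupported."""
    if not hasattr(sys, "get_int_max_str_digits"):
        return None
    return {
        "current": sys.get_int_max_str_digits(),
        "default": sys.int_info.default_max_str_digits,
        "minimum": sys.int_info.str_digits_check_threshold,
    }


def try_set_limit(n):
    """Try to set the limit to n, then restore it; report whether n was accepted."""
    previous = sys.get_int_max_str_digits()
    try:
        sys.set_int_max_str_digits(n)
    except ValueError:
        return False  # rejected: non-zero value below the threshold
    sys.set_int_max_str_digits(previous)
    return True


info = describe_int_str_limit()
print(info)
if info is not None:
    print(try_set_limit(info["minimum"]))      # lowest accepted non-zero value
    print(try_set_limit(info["minimum"] - 1))  # below the threshold
    print(try_set_limit(0))                    # 0 disables the limit
```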
5587+
5588+
.. versionadded:: 3.12
5589+
5590+
.. caution::
5591+
5592+
Setting a low limit *can* lead to problems. While rare, code exists that
5593+
contains integer constants in decimal in their source that exceed the
5594+
minimum threshold. A consequence of setting the limit is that Python source
5595+
code containing decimal integer literals longer than the limit will
5596+
encounter an error during parsing, usually at startup time or import time or
5597+
even at installation time - anytime an up to date ``.pyc`` does not already
5598+
exist for the code. A workaround for source that contains such large
5599+
constants is to convert them to ``0x`` hexadecimal form as it has no limit.
5600+
5601+
Test your application thoroughly if you use a low limit. Ensure your tests
5602+
run with the limit set early via the environment or flag so that it applies
5603+
during startup and even during any installation step that may invoke Python
5604+
to precompile ``.py`` sources to ``.pyc`` files.
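The hexadecimal workaround mentioned in the caution can be sketched as follows. The helper name `as_hex_literal` and the generated variable name `HUGE` are illustrative assumptions; the point is only that a ``0x`` literal compiles regardless of the decimal digit limit.

```python
def as_hex_literal(value):
    """Render an int as a ``0x`` source literal; hex conversion is not limited."""
    return hex(value)


# Build a large value without going through decimal text.
huge = (1 << 40_000) + 12345

# Generate source that embeds the constant in hexadecimal form.
source = f"HUGE = {as_hex_literal(huge)}\n"

# The generated module compiles and yields the same value.
namespace = {}
exec(compile(source, "<generated>", "exec"), namespace)
print(namespace["HUGE"] == huge)
```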
5605+
5606+
Recommended configuration
5607+
-------------------------
5608+
5609+
The default :data:`sys.int_info.default_max_str_digits` is expected to be
5610+
reasonable for most applications. If your application requires a different
5611+
limit, set it from your main entry point using Python version agnostic code as
5612+
these APIs were added in security patch releases in versions before 3.12.
5613+
5614+
Example::

   >>> import sys
   >>> if hasattr(sys, "set_int_max_str_digits"):
   ...     upper_bound = 68000
   ...     lower_bound = 4004
   ...     current_limit = sys.get_int_max_str_digits()
   ...     if current_limit == 0 or current_limit > upper_bound:
   ...         sys.set_int_max_str_digits(upper_bound)
   ...     elif current_limit < lower_bound:
   ...         sys.set_int_max_str_digits(lower_bound)
5625+
5626+
If you need to disable it entirely, set it to ``0``.
5627+
5628+
54635629
.. rubric:: Footnotes
54645630

54655631
.. [1] Additional information on these special methods may be found in the Python

Doc/library/sys.rst

+44-13
@@ -502,9 +502,9 @@ always available.
502502
The :term:`named tuple` *flags* exposes the status of command line
503503
flags. The attributes are read only.
504504

505-
============================= ================================================================
505+
============================= ==============================================================================================================
506506
attribute flag
507-
============================= ================================================================
507+
============================= ==============================================================================================================
508508
:const:`debug` :option:`-d`
509509
:const:`inspect` :option:`-i`
510510
:const:`interactive` :option:`-i`
@@ -521,7 +521,8 @@ always available.
521521
:const:`dev_mode` :option:`-X dev <-X>` (:ref:`Python Development Mode <devmode>`)
522522
:const:`utf8_mode` :option:`-X utf8 <-X>`
523523
:const:`safe_path` :option:`-P`
524-
============================= ================================================================
524+
:const:`int_max_str_digits` :option:`-X int_max_str_digits <-X>` (:ref:`integer string conversion length limitation <int_max_str_digits>`)
525+
============================= ==============================================================================================================
525526

526527
.. versionchanged:: 3.2
527528
Added ``quiet`` attribute for the new :option:`-q` flag.
@@ -543,6 +544,9 @@ always available.
543544
.. versionchanged:: 3.11
544545
Added the ``safe_path`` attribute for :option:`-P` option.
545546

547+
.. versionchanged:: 3.12
548+
Added the ``int_max_str_digits`` attribute.
549+
546550

547551
.. data:: float_info
548552

@@ -723,6 +727,13 @@ always available.
723727

724728
.. versionadded:: 3.6
725729

730+
.. function:: get_int_max_str_digits()
731+
732+
Returns the current value for the :ref:`integer string conversion length
733+
limitation <int_max_str_digits>`. See also :func:`set_int_max_str_digits`.
734+
735+
.. versionadded:: 3.12
736+
726737
.. function:: getrefcount(object)
727738

728739
Return the reference count of the *object*. The count returned is generally one
@@ -996,19 +1007,31 @@ always available.
9961007

9971008
.. tabularcolumns:: |l|L|
9981009

999-
+-------------------------+----------------------------------------------+
1000-
| Attribute | Explanation |
1001-
+=========================+==============================================+
1002-
| :const:`bits_per_digit` | number of bits held in each digit. Python |
1003-
| | integers are stored internally in base |
1004-
| | ``2**int_info.bits_per_digit`` |
1005-
+-------------------------+----------------------------------------------+
1006-
| :const:`sizeof_digit` | size in bytes of the C type used to |
1007-
| | represent a digit |
1008-
+-------------------------+----------------------------------------------+
1010+
+----------------------------------------+-----------------------------------------------+
1011+
| Attribute | Explanation |
1012+
+========================================+===============================================+
1013+
| :const:`bits_per_digit` | number of bits held in each digit. Python |
1014+
| | integers are stored internally in base |
1015+
| | ``2**int_info.bits_per_digit`` |
1016+
+----------------------------------------+-----------------------------------------------+
1017+
| :const:`sizeof_digit` | size in bytes of the C type used to |
1018+
| | represent a digit |
1019+
+----------------------------------------+-----------------------------------------------+
1020+
| :const:`default_max_str_digits` | default value for |
1021+
| | :func:`sys.get_int_max_str_digits` when it |
1022+
| | is not otherwise explicitly configured. |
1023+
+----------------------------------------+-----------------------------------------------+
1024+
| :const:`str_digits_check_threshold` | minimum non-zero value for |
1025+
| | :func:`sys.set_int_max_str_digits`, |
1026+
| | :envvar:`PYTHONINTMAXSTRDIGITS`, or |
1027+
| | :option:`-X int_max_str_digits <-X>`. |
1028+
+----------------------------------------+-----------------------------------------------+
10091029

10101030
.. versionadded:: 3.1
10111031

1032+
.. versionchanged:: 3.12
1033+
Added ``default_max_str_digits`` and ``str_digits_check_threshold``.
1034+
10121035

10131036
.. data:: __interactivehook__
10141037

@@ -1308,6 +1331,14 @@ always available.
13081331

13091332
.. availability:: Unix.
13101333

1334+
.. function:: set_int_max_str_digits(n)
1335+
1336+
Set the :ref:`integer string conversion length limitation
1337+
<int_max_str_digits>` used by this interpreter. See also
1338+
:func:`get_int_max_str_digits`.
1339+
1340+
.. versionadded:: 3.12
1341+
13111342
.. function:: setprofile(profilefunc)
13121343

13131344
.. index::

Doc/library/test.rst

+10
@@ -1011,6 +1011,16 @@ The :mod:`test.support` module defines the following functions:
10111011
.. versionadded:: 3.10
10121012

10131013

1014+
.. function:: adjust_int_max_str_digits(max_digits)
1015+
1016+
This function returns a context manager that will change the global
1017+
:func:`sys.set_int_max_str_digits` setting for the duration of the
1018+
context to allow execution of test code that needs a different limit
1019+
on the number of digits when converting between an integer and string.
1020+
1021+
.. versionadded:: 3.12
1022+
1023+
10141024
The :mod:`test.support` module defines the following classes:
10151025

10161026

Doc/using/cmdline.rst

+13
@@ -505,6 +505,9 @@ Miscellaneous options
505505
stored in a traceback of a trace. Use ``-X tracemalloc=NFRAME`` to start
506506
tracing with a traceback limit of *NFRAME* frames. See the
507507
:func:`tracemalloc.start` for more information.
508+
* ``-X int_max_str_digits`` configures the :ref:`integer string conversion
509+
length limitation <int_max_str_digits>`. See also
510+
:envvar:`PYTHONINTMAXSTRDIGITS`.
508511
* ``-X importtime`` to show how long each import takes. It shows module
509512
name, cumulative time (including nested imports) and self time (excluding
510513
nested imports). Note that its output may be broken in multi-threaded
@@ -582,6 +585,9 @@ Miscellaneous options
582585
.. versionadded:: 3.11
583586
The ``-X frozen_modules`` option.
584587

588+
.. versionadded:: 3.12
589+
The ``-X int_max_str_digits`` option.
590+
585591
.. versionadded:: 3.12
586592
The ``-X perf`` option.
587593

@@ -763,6 +769,13 @@ conflict.
763769

764770
.. versionadded:: 3.2.3
765771

772+
.. envvar:: PYTHONINTMAXSTRDIGITS
773+
774+
If this variable is set to an integer, it is used to configure the
775+
interpreter's global :ref:`integer string conversion length limitation
776+
<int_max_str_digits>`.
777+
778+
.. versionadded:: 3.12
766779

767780
.. envvar:: PYTHONIOENCODING
768781

Doc/whatsnew/3.12.rst

+11
@@ -83,6 +83,17 @@ Other Language Changes
8383
mapping is hashable.
8484
(Contributed by Serhiy Storchaka in :gh:`87995`.)
8585

86+
* Converting between :class:`int` and :class:`str` in bases other than 2
87+
(binary), 4, 8 (octal), 16 (hexadecimal), or 32, such as base 10 (decimal),
88+
now raises a :exc:`ValueError` if the number of digits in string form is
89+
above a limit to avoid potential denial of service attacks due to the
90+
algorithmic complexity. This is a mitigation for `CVE-2020-10735
91+
<https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-10735>`_.
92+
This limit can be configured or disabled by environment variable, command
93+
line flag, or :mod:`sys` APIs. See the :ref:`integer string conversion
94+
length limitation <int_max_str_digits>` documentation. The default limit
95+
is 4300 digits in string form.
96+
8697

8798
New Modules
8899
===========

Include/internal/pycore_global_strings.h

+1
@@ -451,6 +451,7 @@ struct _Py_global_strings {
451451
STRUCT_FOR_ID(mapping)
452452
STRUCT_FOR_ID(match)
453453
STRUCT_FOR_ID(max_length)
454+
STRUCT_FOR_ID(maxdigits)
454455
STRUCT_FOR_ID(maxevents)
455456
STRUCT_FOR_ID(maxmem)
456457
STRUCT_FOR_ID(maxsplit)
