gh-53584: Prevent variable-nargs options from stealing required positional args by stephenfin · Pull Request #146513 · python/cpython

stephenfin · 2026-03-27T12:06:01Z

argparse supports options with variable numbers of args. gh-53584 tracks a long-standing bug where options with a non-fixed number of args (nargs='?', nargs='*', or nargs='+') will greedily consume argument strings that should have been reserved for required positional arguments. For example:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--foo', nargs='?')
parser.add_argument('bar')
parser.parse_args(['--foo', 'abc'])  # --foo takes 'abc'; bar gets nothing

argparse works by pattern-matching, with the bulk of its logic defined in _parse_known_args(). That method encodes each argument string as a single character ('O' for a recognised option flag, '-' for the -- separator, and 'A' for everything else, including any strings following --), which gives us a resulting string pattern (e.g. 'OAOA'). We then "consume" arguments using this string pattern, handling both positionals (via the consume_positionals() closure) and optionals (via consume_optional()).
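To make the encoding concrete, here is a minimal sketch of the idea. The `encode_pattern` helper and its `option_flags` parameter are hypothetical names for illustration; the real logic in `_parse_known_args()` also deals with prefix matching, `--opt=value` forms and negative numbers. As far as I can tell from current CPython, it is the `--` separator itself that becomes '-', and everything after it is treated as 'A'.

```python
def encode_pattern(arg_strings, option_flags):
    """Encode argument strings as a pattern of 'O', 'A' and '-' tokens.

    Simplified illustration only; not argparse's actual implementation.
    """
    parts = []
    args = iter(arg_strings)
    for arg in args:
        if arg == '--':
            # The separator becomes '-'; everything after it is a plain
            # argument, even if it looks like an option flag.
            parts.append('-')
            parts.extend('A' for _ in args)
        elif arg in option_flags:
            parts.append('O')
        else:
            parts.append('A')
    return ''.join(parts)


flags = {'--foo', '--bar'}
print(encode_pattern(['--foo', 'a', '--bar', 'b'], flags))  # OAOA
print(encode_pattern(['--foo', '--', 'abc'], flags))        # O-A
```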

The bug arises in the latter. consume_optional() processes an option by building a regex based on the option's nargs (e.g. '-*A?-*' for nargs='?') and then running this regex against the remainder of the string pattern (i.e. anything following the option flag). This is done via _match_argument(). The regex will always stop at the next option flag ('O' token), but for non-fixed nargs values like '?', '*' and '+' it may greedily consume the positionals ('A' tokens) up to that point.
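The greedy match can be reproduced with the `re` module alone. This is a sketch under simplifying assumptions: the `NARGS_REGEX` table and `consumed_by_option` helper are made up for illustration, and the regexes are reduced to their 'A' part rather than copied from argparse.

```python
import re

# Simplified regexes for each variable nargs value (assumed shapes).
NARGS_REGEX = {'?': 'A?', '*': 'A*', '+': 'A+'}


def consumed_by_option(pattern, nargs):
    """Return how many 'A' tokens an option at pattern[0] would take."""
    remainder = pattern[1:]  # everything after the option flag
    match = re.match(NARGS_REGEX[nargs], remainder)
    return len(match.group()) if match else 0


# ['--foo', 'abc'] encodes to 'OA': the only 'A' goes to --foo, so a
# required positional declared after it is left with nothing.
for nargs in NARGS_REGEX:
    print(nargs, consumed_by_option('OA', nargs))

# The match stops at the next 'O' but takes everything before it.
print(consumed_by_option('OAAOA', '*'))  # 2
```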

This fix works by manipulating the string pattern as part of our optional consumption. Any 'A' tokens that are required by remaining positionals are masked to 'O' to prevent the regex consuming them. Masking will only consider tokens up to the next option flag ('O') and it both accounts for what future options can absorb (to avoid masking more than is necessary) and ensures that at least the minimum arguments required for the optional are actually consumed.
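A rough sketch of the masking idea follows. It is not the code from this PR: `mask_reserved` and its parameters are hypothetical, and it deliberately ignores what future options can absorb as well as the intermixed case. It only shows how turning trailing 'A' tokens into 'O' stops the option's regex early while still leaving the option its minimum.

```python
import re


def mask_reserved(remainder, opt_min, required_positionals):
    """Mask 'A' tokens that required positionals still need.

    remainder: pattern after the option flag, e.g. 'AAOA'
    opt_min: fewest arguments the option itself must consume
    required_positionals: arguments still owed to required positionals

    Hypothetical helper for illustration only.
    """
    # Only the run of 'A' tokens before the next option flag is in play.
    run = len(remainder) - len(remainder.lstrip('A'))
    # Reserve what the positionals need, but never starve the option.
    reserved = min(required_positionals, max(0, run - opt_min))
    keep = run - reserved
    return remainder[:keep] + 'O' * reserved + remainder[run:]


print(mask_reserved('A', 0, 1))    # O   (nargs='?' gives way to 'bar')
print(mask_reserved('AAA', 1, 1))  # AAO (nargs='+' keeps two of three)
print(mask_reserved('A', 1, 1))    # A   (option minimum wins)

# With the mask applied, the greedy regex no longer reaches the token.
print(repr(re.match('A?', mask_reserved('A', 0, 1)).group()))  # ''
```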

In addition, we also handle the parse_intermixed_args() case. In intermixed mode, positional-typed arguments already collected to the left of the current option are also accounted for, since they will satisfy positionals in the second-pass parse.
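For background, this is what intermixed parsing looks like from the caller's side. The example uses an option with a fixed number of arguments, so it does not touch the bug itself; it only shows positionals on both sides of an option being gathered into the same positional arguments.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--foo')
parser.add_argument('cmd')
parser.add_argument('rest', nargs='*')

# Positionals appear both before and after --foo.
args = parser.parse_intermixed_args(['doit', '1', '--foo', 'bar', '2', '3'])
print(args.cmd, args.foo, args.rest)  # doit bar ['1', '2', '3']
```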

Note that this is a rather gnarly issue, and I've done my best to avoid changing the API behavior of the module without layering on too much additional complexity, in the hope that this might actually be backportable. Hopefully my proposed approach is sound but I'm happy to iterate on this if there's something I've missed or there is a better way to do this.

Most of these are marked as expected fail for now, pending the fix. Signed-off-by: Stephen Finucane <[email protected]>


Shrey-N · 2026-03-31T03:57:46Z

Hiya @stephenfin, shouldn't we precalculate the required positional counts instead of rescanning them inside the optional loop? I am a little worried about $O(N^2)$ performance on long argument lists.

savannahostrowski

Thanks for taking a look at this. I've tested this very extensively and I think it looks pretty solid from a UX perspective.

From a performance perspective, I ran a stress-test benchmark (all nargs='?' options with a required positional) scaling from 10 to 100 options. Even at 100 options, it ran in under 1ms, and realistically, CLIs have far, far fewer options. IMO, this is not a practical concern, but we could explore some optimizations like precomputing the sorted option indices?
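A benchmark along these lines could look like the sketch below. This is not the script used for the numbers above; `build_parser`, `build_argv` and the option names are invented for illustration, and the positional is placed first so the command line also parses on interpreters without the fix.

```python
import argparse
import timeit


def build_parser(n_options):
    """Parser with n nargs='?' options and one required positional."""
    parser = argparse.ArgumentParser()
    for i in range(n_options):
        parser.add_argument(f'--opt{i}', nargs='?')
    parser.add_argument('target')
    return parser


def build_argv(n_options):
    # Positional first, then every option with a value.
    argv = ['value']
    for i in range(n_options):
        argv += [f'--opt{i}', f'v{i}']
    return argv


for n in (10, 50, 100):
    parser, argv = build_parser(n), build_argv(n)
    runs = 20
    seconds = timeit.timeit(lambda: parser.parse_args(argv), number=runs)
    print(f'{n:>3} options: {seconds / runs * 1e3:.3f} ms per parse')
```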

Aside from this, I will say that any bigger change to argparse makes me a bit uneasy. Mainly because people are likely working around this bug and this will be a silent behaviour change (i.e. existing programs will parse differently without any error or warning). We definitely need to add a What's New entry (not just a NEWS blurb) calling this out, so users upgrading to 3.15 are aware.

I also left a few comments below.

savannahostrowski · 2026-04-03T19:53:45Z

+        parser.add_argument('baz')
+        parser.add_argument('bax')
+        args = parser.parse_args(['--foo', 'a', '--bar', 'b'])
+        self.assertEqual(args.baz, 'a')


Can we make sure that every test asserts both positionals and optionals? We should check to make sure that foo and bar ended up with the correct values. There are several tests that would benefit from this.

savannahostrowski · 2026-04-03T19:54:25Z

    ]

+    def test_does_not_steal_required_positional(self):
+        # https://github.com/python/cpython/issues/53584


Can you please remove the GitHub issue comment on each test? It's a bit noisy.

Will do. I was copying what I'd seen in some other tests here.

savannahostrowski · 2026-04-03T20:09:36Z

+                        extras_arg_count = sum(
+                            1 for c in extras_pattern if c == 'A'
+                        )
+                        min_pos = max(0, min_pos - extras_arg_count)


Suggested change

-                        extras_arg_count = sum(
-                            1 for c in extras_pattern if c == 'A'
-                        )
-                        min_pos = max(0, min_pos - extras_arg_count)
+                        min_pos = max(0, min_pos - extras_pattern.count('A'))

TIL str.count is a thing...
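For reference, the two spellings are equivalent for a single-character token. The sample `extras_pattern` value below is made up for the demonstration.

```python
extras_pattern = 'AOAA-A'  # hypothetical pattern, not taken from the PR

# Generator-based count, as in the original hunk.
counted_by_sum = sum(1 for c in extras_pattern if c == 'A')

# str.count does the same in one call.
counted_by_method = extras_pattern.count('A')

print(counted_by_sum, counted_by_method)  # 4 4
```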

savannahostrowski · 2026-04-03T20:15:46Z

@@ -664,6 +664,17 @@ class TestOptionalsNargs3(ParserTestCase):
        ('-x a b c', NS(x=['a', 'b', 'c'])),
    ]



We should add at least one more test for something like ['--foo', '--', 'abc'] with nargs='?' and a required positional, I think.

Makes sense. I'll tack this on.
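A standalone version of that case might look like the sketch below. As far as I can tell, this input already parses this way on current releases, because the option's regex cannot match across the '-' token, so the test would mainly guard against the masking logic regressing it. The argument names mirror the example in the PR description.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--foo', nargs='?')
parser.add_argument('bar')

# '--' ends option parsing, so 'abc' can only belong to the positional.
args = parser.parse_args(['--foo', '--', 'abc'])
print(args.foo, args.bar)  # None abc
```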

bedevere-app · 2026-04-03T20:47:28Z

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase "I have made the requested changes; please review again". I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

stephenfin · 2026-04-09T10:36:54Z

@savannahostrowski Thanks for the review. I'm trying to find a half day to come back and make the necessary changes for things you've called out, but in the interim...

> Thanks for taking a look at this. I've tested this very extensively and I think it looks pretty solid from a UX perspective.

🥳 This is something like my 5th attempt at this over the years 😅 (dating back to Mon Dec 24 18:36:40 2018 +0000 if my local repo is to be believed...) so good to be getting somewhere...

> From a performance perspective, I ran a stress-test benchmark (all nargs='?' options with a required positional) scaling from 10 to 100 options. Even at 100 options, it ran in under 1ms, and realistically, CLIs have far, far fewer options. IMO, this is not a practical concern, but we could explore some optimizations like precomputing the sorted option indices?

Can do, though I should note that I'm already unhappy with how complicated this is (a necessary evil to avoid changing any APIs or general behavior) and would be nervous to increase that complexity further. I'll see what I can do.

> Aside from this, I will say that any bigger change to argparse makes me a bit uneasy. Mainly because people are likely working around this bug and this will be a silent behaviour change (i.e. existing programs will parse differently without any error or warning).

It's a fair concern, but I should emphasise here that the existing greedy behavior always resulted in an error in the corner cases this bug focuses on. I've intentionally structured the PR with two commits to indicate this (in case you looked at the overall change rather than the per-commit view, the first commit adds tests with the unittest.expectedFailure decorator while the second commit fixes the issue and removes said decorator). The lack of changes to any other test in what is a very large test suite is pretty indicative also.

> We definitely need to add a What's New entry (not just a NEWS blurb) calling this out, so users upgrading to 3.15 are aware.

Should I make this change here in the PR or can I do it in a follow-up PR? I know that changes can take some time to merge, and I'm nervous this will introduce a high likelihood of merge conflicts (since that file presumably changes quite often) and put me on merge/rebase treadmill. I also assume this request means you're thinking this is likely not a viable target for backporting?

johnslavik · 2026-04-09T10:40:33Z

> Should I make this change here in the PR or can I do it in a follow-up PR?

We usually do it in the same PR.

> and put me on merge/rebase treadmill

We usually merge, conflicts in what's new (if any) are easy to resolve, fear not.

stephenfin added 2 commits March 27, 2026 11:17

pythongh-53584: Add reproducers for greedy opts issue

c2c4dfa

Most of these are marked as expected fail for now, pending the fix. Signed-off-by: Stephen Finucane <[email protected]>

stephenfin requested a review from savannahostrowski as a code owner March 27, 2026 12:06

bedevere-app Bot mentioned this pull request Mar 27, 2026

argparse optionals with nargs='?', '*' or '+' can't be followed by positionals #53584

Open

bedevere-app Bot added the awaiting review label Mar 27, 2026

stephenfin mentioned this pull request Mar 27, 2026

gh-85264: Optionally request actual filesize via 'os.path.getsize' #21088

Open

savannahostrowski requested changes Apr 3, 2026


bedevere-app Bot removed the awaiting review label Apr 3, 2026

bedevere-app Bot added the awaiting changes label Apr 3, 2026

johnslavik self-requested a review April 9, 2026 10:37
