
Author lys.nikolaou
Recipients christian.heimes, lys.nikolaou, pablogsal
Date 2020-06-12.15:04:54
Content
> Note that although we could just exit if the length of the line is smaller than the column offset before calling https://github.com/python/cpython/blob/master/Parser/pegen.c#L148 (I assume that is the problem) is more important to understand how are we reaching that situation.

That's because of https://github.com/python/cpython/blob/e2fb8a2c42ee60c72a40d93da69e9efc4e359023/Parser/pegen.c#L404, which decreases the size of the string to be decoded by 1 if the last character is a newline (or, equivalently, if the error offset points past the end of the line). As a result, PyUnicode_DecodeUTF8 returns an object that is one character shorter than col_offset, and that's how we get to the situation you mentioned.
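
To make the off-by-one concrete, here is a rough Python model of that step. It is only a sketch of the behavior described above, not the actual C code: `decode_error_line` is a hypothetical helper, and the assumption that `col_offset` equals the full buffer length (newline included) is mine, chosen for illustration.

```python
def decode_error_line(buf: bytes) -> str:
    """Hypothetical model: decode the error line, dropping one trailing newline."""
    size = len(buf)
    # Assumed to mirror the described behavior: shrink the size by 1
    # when the last byte of the buffer is a newline.
    if size and buf[size - 1:size] == b"\n":
        size -= 1
    return buf[:size].decode("utf-8", errors="replace")


buf = b"print(\n"
col_offset = len(buf)  # assumed: the error offset points past the end of the line
error_line = decode_error_line(buf)

# The decoded line is one character shorter than col_offset.
print(len(error_line), col_offset)
```

With these assumptions, any code that later indexes the decoded line at `col_offset` would be reading past its end, which matches the situation in the quoted message.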