Message 64157 - Python tracker

Author	ocean-city
Recipients	ocean-city
Date	2008-03-20.06:15:09

Content

The following dirty hack works around this bug. The comment on this
function says that non-ASCII-compatible encodings (e.g. UTF-16) are not
supported yet, so this probably works.

Index: Parser/tokenizer.c
===================================================================
--- Parser/tokenizer.c	(revision 61632)
+++ Parser/tokenizer.c	(working copy)
@@ -464,6 +464,7 @@
 	Py_XDECREF(tok->decoding_readline);
 	readline = PyObject_GetAttrString(stream, "readline");
 	tok->decoding_readline = readline;
+	tok->lineno = -1; /* dirty hack */
 
   cleanup:
 	Py_XDECREF(stream);

But if a line contains a multibyte character, as in the script below,
that line will not be printed in the traceback.

# coding: cp932
# 1
raise RuntimeError("あいうえお")
# 2

C:\Documents and Settings\WhiteRabbit>py3k cp932.py
Traceback (most recent call last):
  File "cp932.py", line 3, in <module>
    [22819 refs]

This is because tb_displayline() in Python/traceback.c assumes the
input line is encoded in UTF-8. (It simply reads the file through a
FILE structure with Py_UniversalNewlineFgets.)
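
To illustrate the failure mode, here is a minimal Python sketch (not the C code in traceback.c). It encodes the offending line as cp932, as it would be stored on disk, and then tries to decode it as UTF-8. The helper name decode_assuming_utf8 is hypothetical and only stands in for a reader that ignores the coding declaration.

```python
# The source line from the report, encoded as cp932 as it would be on disk.
line = 'raise RuntimeError("あいうえお")'
raw = line.encode("cp932")


def decode_assuming_utf8(data):
    """Hypothetical stand-in for a reader that assumes UTF-8.

    Returns the decoded text, or None when the bytes are not valid UTF-8.
    """
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


as_utf8 = decode_assuming_utf8(raw)   # cp932 bytes are not valid UTF-8 here
as_cp932 = raw.decode("cp932")        # decoding with the declared encoding works

print(as_utf8)
print(as_cp932)
```

The cp932 bytes for the Japanese text are not a valid UTF-8 sequence, so a reader that assumes UTF-8 has nothing usable to display, which matches the empty source line in the traceback above.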

# http://mail.python.org/pipermail/python-3000/2008-March/012546.html
# sounds nice, if we can replace every FILE structure with Python's own
# fast-enough codec-aware reader or something.
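
As a rough idea of what an encoding-aware reader could look like, here is a sketch in Python rather than C. It uses tokenize.detect_encoding from the standard library to honor the coding declaration before decoding the requested line. The function name read_source_line is hypothetical, and this is only an illustration of the idea, not the fix that was applied to the interpreter.

```python
import io
import tokenize

# The test script from the report, as cp932 bytes.
source = (
    '# coding: cp932\n'
    '# 1\n'
    'raise RuntimeError("あいうえお")\n'
    '# 2\n'
).encode("cp932")


def read_source_line(data, lineno):
    """Return 1-based line `lineno`, decoded with the declared encoding.

    Hypothetical helper: detects the encoding from the coding declaration
    (or a BOM) and returns None if the line does not exist.
    """
    encoding, _ = tokenize.detect_encoding(io.BytesIO(data).readline)
    text = io.TextIOWrapper(io.BytesIO(data), encoding=encoding)
    for current, line in enumerate(text, start=1):
        if current == lineno:
            return line.rstrip("\n")
    return None


shown = read_source_line(source, 3)
print(shown)
```

Because the line is decoded with the encoding the file declares, line 3 can be displayed even though its bytes are not valid UTF-8.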

History
Date	User	Action	Args
2008-03-20 06:15:10	ocean-city	link	issue2384 messages
2008-03-20 06:15:09	ocean-city	create