bpo-34270: added the ability to name asyncio tasks#8547
Conversation
The ability to name individual tasks helps developers debug complex asyncio applications. Co-authored-by: Antti Haapala <[email protected]>
|
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA). Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA (this might be simply due to a missing "GitHub Name" entry in your b.p.o account settings). This is necessary for legal reasons before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. When your account is ready, please add a comment in this pull request You can check yourself Thanks again for your contribution, we look forward to reviewing it! |
|
I've updated my account to include my github name, so the CLA label should be cleared. |
|
I'll review this tomorrow |
|
Co-author here (CLA signed) |
zooba
left a comment
There was a problem hiding this comment.
Wait for Yury, but it looks good to me. One question about the need to convert names to str on assignment, but not a big deal.
|
|
||
| @name.setter | ||
| def name(self, value): | ||
| self._name = str(value) |
There was a problem hiding this comment.
Is forcing this to a str really important enough to need a property? The name is basically only ever used for printing anyway, which will do the conversion when necessary.
There was a problem hiding this comment.
This is consistent with the behavior of threading.Thread. Some code might also access the name property directly, expecting a str.
There was a problem hiding this comment.
As I mentioned above I don't think we need a setter. A Task.get_name() function is enough for now.
There was a problem hiding this comment.
You can set names of threads and processes too, why not tasks as well? Where's the harm in that? I can think of a few use cases for this, like displaying the status of a task as it progresses.
I could convert this to java-style setters and getters if you want, despite how much I despise them (not to mention the fact that threading also switched away from this behavior). But your call.
There was a problem hiding this comment.
I can think of a few use cases for this, like displaying the status of a task as it progresses.
Hm, interesting. OK, that's at least one valid use-case :)
I could convert this to java-style setters and getters if you want, despite how much I despise them (not to mention the fact that threading also switched away from this behavior). But your call.
It's not about liking Java-style getters or not, it's about consistency. If asyncio's APIs were using @property it'd be equally happy about that, but now we better follow the overall style and use get_name() and set_name().
| from .coroutines import coroutine | ||
|
|
||
| # Helper to generate new task names | ||
| _counter = itertools.count(1).__next__ |
There was a problem hiding this comment.
Using itertools.count() seems like a bit of an overkill to me. I'd simply do an explicit increment. Also, _task_name_counter would be a clearer name.
There was a problem hiding this comment.
Indeed this seems like an overkill. There's no reason to make Task creation any slower than it has to be; keep in mind that PyPy uses the pure Python version. Just replace this with a simple int increment.
There was a problem hiding this comment.
This is what the threading module does, and it uses itertools.count() because _counter += 1 is not thread safe. The same applies to asyncio if there are more than one thread running an event loop (since the counter is global).
There was a problem hiding this comment.
_counter += 1 is not thread safe.
It is thread safe in Python and will likely always be with or without the GIL.
There was a problem hiding this comment.
It is thread safe in Python and will likely always be with or without the GIL.
What output does this script give you then?
There was a problem hiding this comment.
Hmmm... thank you for pointing this out! ;)
To my point: += is actually atomic, but storing the updated number in globals is another operation and we have a race between them. This actually deserves a comment in the code explaining why we use counter() and not +=.
| ----- | ||
|
|
||
| .. method:: AbstractEventLoop.create_task(coro) | ||
| .. method:: AbstractEventLoop.create_task(coro, \*, name=None) |
There was a problem hiding this comment.
We also need to update the documentation of set_task_factory
| get_stack(). The file argument is an I/O stream to which the output | ||
| is written; by default output is written to sys.stderr. | ||
|
|
||
| .. attribute:: name |
There was a problem hiding this comment.
I care way more about consistency of asyncio APIs than about maintaining any kind of artificial similarity between Task and Thread. And in asyncio we don't use properties that much. Please replace the name property with a Task.get_name() method. There's no use case for adding set_name() though, so let's not add it at least in 3.8.
|
|
||
| if task._fut_waiter is not None: | ||
| info.insert(2, f'wait_for={task._fut_waiter!r}') | ||
| info.insert(3, f'wait_for={task._fut_waiter!r}') |
There was a problem hiding this comment.
BTW, feel free to make another pull request to refactor this code. I don't like us modifying some list object with insert() calls.
There was a problem hiding this comment.
How would you prefer that we display the name of the task in repr() then?
There was a problem hiding this comment.
It would be great to refactor the code that formats things (tasks, callbacks, reprs) etc somehow. Right now it feels hacky and it's one of the areas of asyncio code that I'd like to be rewritten to be clearer. Discussing that is slightly out of the scope of this PR though, but if you have any ideas how to avoid mutating lists and having a non-linear formatting logic I'd be happy to accept a new PR. For this PR, your current code is totally fine!
| from .coroutines import coroutine | ||
|
|
||
| # Helper to generate new task names | ||
| _counter = itertools.count(1).__next__ |
There was a problem hiding this comment.
Indeed this seems like an overkill. There's no reason to make Task creation any slower than it has to be; keep in mind that PyPy uses the pure Python version. Just replace this with a simple int increment.
| self._log_destroy_pending = False | ||
| raise TypeError(f"a coroutine was expected, got {coro!r}") | ||
|
|
||
| self._name = str(name) if name is not None else 'Task-%s' % _counter() |
There was a problem hiding this comment.
This is a bit too hard to read. I'd replace it with an if statement.
|
|
||
| @name.setter | ||
| def name(self, value): | ||
| self._name = str(value) |
There was a problem hiding this comment.
As I mentioned above I don't think we need a setter. A Task.get_name() function is enough for now.
| static volatile uint64_t cached_running_holder_tsid; | ||
|
|
||
| /* Counter for autogenerated Task names */ | ||
| static unsigned long long task_name_counter = 0; |
There was a problem hiding this comment.
I had that at first but @ztane pointed out that the format string explicitly requires an unsigned long long and uint64_t may or may not be the same size. Are you sure this is ok?
There was a problem hiding this comment.
Got it. AFAIK unsigned long long is guaranteed to be large enough to store uint64_t. Using the latter makes the code more consistent, so I'd go for it anyways.
You can use this format specifier to format uint64_t specifically:
printf("... %" PRIu64 "...", (uint64_t)1);| } else { | ||
| name = PyObject_Str(name); | ||
| } | ||
|
|
There was a problem hiding this comment.
Please remove this new line to make it easier for the reader to see that we handle the case of name==NULL
|
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
|
BTW, please also add a What's New entry for 3.8. |
|
I like the idea! |
* Removed the public *name* property * Added the *Task.get_name()* method * Modified the documentation of *set_task_factory()* to include the *name* parameter * Whitespace changes
|
I have made the requested changes; please review again. |
|
I have made the requested changes; please review again. |
| raise TypeError(f"a coroutine was expected, got {coro!r}") | ||
|
|
||
| if name is None: | ||
| self._name = 'Task-%s' % _task_name_counter() |
There was a problem hiding this comment.
nit: I've recently updated asyncio to use f-strings. Please change this one to either an f-string or to a .format() call (just for maintaining consistency of the codebase).
| task._log_destroy_pending = False | ||
| coro.close() | ||
|
|
||
| def test_create_named_task_with_custom_factory(self): |
There was a problem hiding this comment.
Please also add test_create_named_task_with_default_factory. That would test a bug we fixed in this PR when the default task factory didn't pass the name=name to Task().
|
Almost there. Please fix the two remaining nit comments and I'll merge this. |
1st1
left a comment
There was a problem hiding this comment.
Just spotted an error, a PyObject* function returns -1. Please also check that the C compiler generates 0 warnings for your patch in _asynciomodule.c.
| { | ||
| PyObject *name = PyObject_Str(value); | ||
| if (name == NULL) { | ||
| return -1; |
There was a problem hiding this comment.
Oups, need to fix this to return NULL!
There was a problem hiding this comment.
With make clean; CFLAGS=-Wno-cast-function-type make, compilation of the _asyncio extension produced no warnings after the patch.
|
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
|
I have made the requested changes; please review again. |
|
Merged. Thanks for much for pushing this through, Alex! |
The ability to name individual tasks helps developers debug complex
asyncio applications.
Co-authored-by: Antti Haapala [email protected]
https://bugs.python.org/issue34270