bpo-31324: Optimize support._match_test().#4420
bpo-31324: Optimize support._match_test().#4420serhiy-storchaka wants to merge 2 commits intopython:masterfrom
Conversation
vstinner
left a comment
There was a problem hiding this comment.
Apart the wrong re.I flag, LGTM.
I tested manually your PR: it works as expected.
https://bugs.python.org/issue31324#msg306362
| return True | ||
| if match_tests != _match_tests: | ||
| match_tests_re = '|'.join(map(fnmatch.translate, match_tests)) | ||
| _match_test1 = re.compile(match_tests_re, re.I).match |
There was a problem hiding this comment.
Hum no, re.I must no be used: the match is expected to be case sensitive, not case insensitve.
| return False | ||
| if _match_test1(test_id): | ||
| return True | ||
| return any(map(_match_test1, test_id.split("."))) |
There was a problem hiding this comment.
If all match_tests patterns contain a dot ".", we could make this code even simpler:
return match_test(test_id)
But I'm unable to see a major performance difference when running:
./python -m test.bisect --fail-env-changed -o bisect test_asyncio -v
I'm checking how much time it takes before running the first test: so the time to load and filter tests.
|
support._match_test() has at least a complexity of O(n) where n is the number of patterns in support.match_tests, just to check if support.match_tests was modified :-( I proposed PR #4421 which replaces support.match_tests global variable with a new support.set_match_tests() function to cache the "matcher" object and avoid the need of checking if patterns changed. Moreover, I implemented a very simple matcher using set() for the test.bisect use case: all patterns are "full identifiers". Using "./python -m test.bisect --fail-env-changed -o bisect test_asyncio -v" as a "benchmark": I see the first test running faster with my PR than using this PR. I guess that the problem bottleneck of this PR is the minimum O(n) cost because by mutable support.mtach_tests, whereas my benchmark uses 768 patterns. |
|
I merged PR #4421 which is based on this PR but goes further in term of optimization and adds news tests. |
https://bugs.python.org/issue31324