bpo-32107 - Better merge of #4494#4576
Conversation
The original check was erroneous in two ways. First, the documentation said that "47 bit will never be set in IEEE 802 addresses obtained from network cards". I think this is referring to the universally vs. locally administered MAC addresses. Network cards from hardware manufacturers will always be universally administered in order to guarantee global uniqueness. But from https://en.wikipedia.org/wiki/MAC_address it says that this flag is the "distinguished by setting the second-least-significant bit of the first octet of the address", where a 0 means universally administered and a 1 means it's locally administered. This works out to the 42nd bit when counting the LSB as bit 1, or 1<<41. The second bug is that the original bitmask value isn't right for either description: % ./python.exe -c "from math import log2; print(log2(0x010000000000))" 40.0 This causes the test to fail on valid MAC addresses. Fix this by improving the comment, with references, and using the correct bitmask.
Also, clean up some return sites.
* Add a helper function to determine whether a MAC address is universally administered or not. * In the various MAC getter functions, keep track of the first locally administered MAC, and if no universally administered MAC is found, return the first local MAC. * In the ultimate fallback _random_getnode(), add a comment for the multicast bit being set, and use a better bitmask * In getnode(), just fall back to _random_getnode() explicitly if no other MAC has been found.
While it is required that the multicast bit (1<<40) should be set in random generated addresses, there is no requirement that it should be cleared in addresses obtained from network cards.
The original check was erroneous in two ways. First, the documentation said that "47 bit will never be set in IEEE 802 addresses obtained from network cards". I think this is referring to the universally vs. locally administered MAC addresses. Network cards from hardware manufacturers will always be universally administered in order to guarantee global uniqueness. But from https://en.wikipedia.org/wiki/MAC_address it says that this flag is the "distinguished by setting the second-least-significant bit of the first octet of the address", where a 0 means universally administered and a 1 means it's locally administered. This works out to the 42nd bit when counting the LSB as bit 1, or 1<<41. The second bug is that the original bitmask value isn't right for either description: % ./python.exe -c "from math import log2; print(log2(0x010000000000))" 40.0 This causes the test to fail on valid MAC addresses. Fix this by improving the comment, with references, and using the correct bitmask.
Also, clean up some return sites.
* Add a helper function to determine whether a MAC address is universally administered or not. * In the various MAC getter functions, keep track of the first locally administered MAC, and if no universally administered MAC is found, return the first local MAC. * In the ultimate fallback _random_getnode(), add a comment for the multicast bit being set, and use a better bitmask * In getnode(), just fall back to _random_getnode() explicitly if no other MAC has been found.
…torchaka/cpython into test_uuid This merges the relevant bits of python#4572 and should be considered instead of that PR.
The original check was erroneous in two ways. First, the documentation said that "47 bit will never be set in IEEE 802 addresses obtained from network cards". I think this is referring to the universally vs. locally administered MAC addresses. Network cards from hardware manufacturers will always be universally administered in order to guarantee global uniqueness. But from https://en.wikipedia.org/wiki/MAC_address it says that this flag is the "distinguished by setting the second-least-significant bit of the first octet of the address", where a 0 means universally administered and a 1 means it's locally administered. This works out to the 42nd bit when counting the LSB as bit 1, or 1<<41. The second bug is that the original bitmask value isn't right for either description: % ./python.exe -c "from math import log2; print(log2(0x010000000000))" 40.0 This causes the test to fail on valid MAC addresses. Fix this by improving the comment, with references, and using the correct bitmask.
Also, clean up some return sites.
* Add a helper function to determine whether a MAC address is universally administered or not. * In the various MAC getter functions, keep track of the first locally administered MAC, and if no universally administered MAC is found, return the first local MAC. * In the ultimate fallback _random_getnode(), add a comment for the multicast bit being set, and use a better bitmask * In getnode(), just fall back to _random_getnode() explicitly if no other MAC has been found.
While it is required that the multicast bit (1<<40) should be set in random generated addresses, there is no requirement that it should be cleared in addresses obtained from network cards.
| _arp_getnode, _lanscan_getnode, _netstat_getnode] | ||
|
|
||
| for getter in getters + [_random_getnode]: | ||
| for getter in getters: |
There was a problem hiding this comment.
The difference is that exceptions in _random_getnode() no longer ignored. Ignoring all exceptions is not a good idea, but what exceptions can raise _random_getnode()?
There was a problem hiding this comment.
I think the only possibility is an import error. Ultimately, random.getrandbits() boils down to os.urandom() which I think can block but AFAICT won't raise an exception. And it bothers me (as I think it bothers you) that all exceptions in all those getters are just swallowed. Here's my thinking: since I'm not sure of the best way to report any errors in the other getters, and since the last getter is _random_getnode(), and since it's pretty unlikely that the function will experience an ImportError, and if it does then I think it's better to report that than to silently swallow the exception and return None from getnode(), since that will lead to even more mysterious bugs, I think letting any rare exceptions in _random_getnode() not be swallowed is the better choice.
There was a problem hiding this comment.
It's LGTM. I just wanted to make sure that this is an intentional change.
| # `test_random_getnode()` method specifically. Another case is the | ||
| # Travis-CI case, which apparently only has locally administered MAC | ||
| # addresses. | ||
| if not random and not os.getenv('TRAVIS'): |
There was a problem hiding this comment.
If add this test then restore the network parameter. check_node() is called for addresses obtained from network cards, for random addresses, but it also is called in cases where it is not known (and is not important) whether the address is real or random.
There was a problem hiding this comment.
However, if you look at the original code, every call to check_node() except the one in test_random_getnode() called it with network=True. If I do add it back, which I'm willing to do, then I propose that we call this argument multicast, default it to False, and make it a keyword-only argument. I will see how that looks.
There was a problem hiding this comment.
And I will change random to local.
There was a problem hiding this comment.
There are three tests in which check_node() is called without network=True. Only one of them explicitly tests _random_getnode(). In the other twos we can't say what is the source of the address. We shouldn't test either 1<<40 nor 1<<41 bits in that cases.
| "%s is not an RFC 4122 node ID" % hex) | ||
|
|
||
| @unittest.skipUnless(os.name == 'posix', 'requires Posix') | ||
| @unittest.skipIf(os.getenv('TRAVIS'), |
There was a problem hiding this comment.
Don't skip this test. It should test that corresponding method don't raise an exception, crash or hang. And returns a reasonable integer or None. Just skip additional checks.
There was a problem hiding this comment.
Yep, I meant to remove those skipIfs; I'll push an update for that.
* Restored the check for the multicast bit. * Remove some now-extraneous skipIfs.
| if not random and not os.getenv('TRAVIS'): | ||
| if not local and not os.getenv('TRAVIS'): | ||
| self.assertFalse(node & (1 << 41), '%012x' % node) | ||
| is_multicast = (node & (1 << 40)) |
There was a problem hiding this comment.
We shouldn't test the multicast bit for _unix_getnode() and _windll_getnode(). Actually we should test it only for _random_getnode(), because the other methods can return an address with this bit set or clear (your original issue).
| # Travis-CI case, which apparently only has locally administered MAC | ||
| # addresses. | ||
| if not random and not os.getenv('TRAVIS'): | ||
| if not local and not os.getenv('TRAVIS'): |
There was a problem hiding this comment.
We shouldn't test the multicast bit for _unix_getnode() and _windll_getnode().
There was a problem hiding this comment.
Good point. I'm removing the multicast bit check from check_node() since only the test for _random_getnode() needs to check that.
* Rename `local` to `admin` and use this to control whether the local vs. universal admin bit is set. Generally we do want to test that, except for the random case, and the two cases where we don't know what the provenance of the MAC address is. * We only need to check the multicast bit in the random test.
| def check_node(self, node, requires=None, *, | ||
| # Additional bitmask checks to perform. | ||
| local=False, multicast=False): | ||
| def check_node(self, node, requires=None, *, admin=True): |
There was a problem hiding this comment.
The meaning of the admin flag is less obvious than the meaning of the opposite network flag. I expected this is something related to administrative privileges. Could you add an elaboration?
There was a problem hiding this comment.
For me, network wasn't really descriptive either. Something like check_universally_administered_bit would be more descriptive but prohibitively long. We could go back to local or switch the parity to universal. I'm open to other suggestions.
| self.check_node(node, 'unix') | ||
| # Since we don't know the provenance of the MAC address, don't check | ||
| # whether it is locally or universally administered. | ||
| self.check_node(node, 'unix', admin=False) |
There was a problem hiding this comment.
I would prefer to not do non-guarantied tests by default, and do them only if this is specified explicitly.
There was a problem hiding this comment.
Hmm, I obviously mildly (+0) prefer it the other way around. Would your opinion change if we named the flag universal?
There was a problem hiding this comment.
This is your check, I left it on to you.
There was a problem hiding this comment.
I chose check_bit as the parameter name, but didn't change the default value. I'm happy with this.
|
@serhiy-storchaka Thanks for the great review! |
https://bugs.python.org/issue32107