bpo-15873: add '.fromisoformat' for date, time and datetime#4841
bpo-15873: add '.fromisoformat' for date, time and datetime#4841deronnax wants to merge 1 commit intopython:masterfrom
Conversation
|
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA). Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA (this might be simply due to a missing "GitHub Name" entry in your b.p.o account settings). This is necessary for legal reasons before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. Thanks again to your contribution and we look forward to looking at it! |
SylvainDe
left a comment
There was a problem hiding this comment.
It looks mostly good to me. I've added a few comments which may or may not be relevant.
| if (!PyArg_ParseTuple(args, "U:fromisoformat", &string)) | ||
| return NULL; | ||
|
|
||
|
|
There was a problem hiding this comment.
You've used only one blank line in the other places (which is IMHO more beautiful).
Also, do you reckon it would make sense to have a single function like:
static PyObject *
strptime_fromisoformat(PyObject *cls, PyObject *args, const char *method_name)
{
static PyObject *module = NULL;
PyObject *string;
if (!PyArg_ParseTuple(args, "U:fromisoformat", &string))
return NULL;
if (module == NULL) {
module = PyImport_ImportModule("_strptime");
if (module == NULL)
return NULL;
}
return PyObject_CallMethod(module, method_name, "OO", cls, string);
}
static PyObject *
date_fromisoformat(PyObject *cls, PyObject *args)
{
return strptime_fromisoformat(cls, args, "_parse_isodate");
}
static PyObject *
time_fromisoformat(PyObject *cls, PyObject *args)
{
return strptime_fromisoformat(cls, args, "_parse_isotime");
}
static PyObject *
datetime_fromisoformat(PyObject *cls, PyObject *args)
{
return strptime_fromisoformat(cls, args, "_parse_isodatetime");
}
|
|
||
| @classmethod | ||
| def fromisoformat(cls, date_string): | ||
| """Constructs a date from an RFC 3339 string, a strict subset of ISO 8601. |
There was a problem hiding this comment.
Convention is to use imperative form ("Construct"). This comment applies in other places.
| r'(?P<microsecond>\.\d+)?(?P<tzinfo>Z|[+-]\d{2}:\d{2})?$', | ||
| ASCII|IGNORECASE) | ||
|
|
||
| _datetime_re = re_compile(_date_re.pattern[:-1] + r'[T ]' + _time_re.pattern, |
There was a problem hiding this comment.
Maybe, it be easier to define tmp variables to avoid dealing the indexing here:
date_pattern = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
time_pattern = r'(?P<hour>\d{2}):(?P<minute>\d{2}):(?P<second>\d{2})(?P<microsecond>\.\d+)?(?P<tzinfo>Z|[+-]\d{2}:\d{2})?'
datetime_pattern = date_pattern + r'[T ]' + time_pattern
date_re = re_compile(date_pattern + '$', ASCII)
time_re = re_compile(time_pattern + '$', ASCII | IGNORECASE)
datetime_re = re_compile(datetime_pattern + '$', ASCII | IGNORECASE)
There was a problem hiding this comment.
I'll note that this won't match all the outputs of datetime.isoformat, if that's a goal of this version of datetime.fromisoformat, since the sep parameter does accept non-ASCII components.
That said, it seems like _date_re is requiring a full specification of YYYY-MM-DD, in which case you can just split the string as:
date_str = dt_str[0:10]
time_str = dt_str[11:]
dt_sep = dt_str[10:11]It's then kinda trivial to enforce whatever restrictions you want on dt_sep, and continue to use the ASCII flag for date_re and time_re.
|
|
||
| Raises ValueError in case of ill-formatted or invalid string. | ||
| """ | ||
| import _strptime |
There was a problem hiding this comment.
Any reason not to have this at the top level ?
|
|
||
| Return a date object corresponding to *date_string*, according to RFC 3339, | ||
| a stricter, simpler subset of ISO 8601, and such as is returned by | ||
| :func:`date.isoformat`. Microseconds are rounded to 6 digits. |
There was a problem hiding this comment.
Shouldn’t microseconds in a date string be illegal? Maybe clarify which parts of the RFC are relevant; perhaps the full-date format?
| :exc:`ValueError` is raised if *date_string* is not a valid RFC 3339 | ||
| date string. | ||
|
|
||
| .. `RFC 3339`: https://www.ietf.org/rfc/rfc3339.txt |
There was a problem hiding this comment.
How does this get marked up? I tend to prefer the IETF’s HTML versions (e.g. https://tools.ietf.org/html/rfc3339), and a lot of the documentation seems to link to those directly via :rfc:`3339` syntax.
| a stricter, simpler subset of ISO 8601, and such as is returned by | ||
| :func:`time.isoformat`. Microseconds are rounded to 6 digits. | ||
| :exc:`ValueError` is raised if *time_string* is not a valid RFC 3339 | ||
| time string.. |
|
|
||
| Return a datetime object corresponding to *datetime_string*, according to RFC 3339, | ||
| a stricter, simpler subset of ISO 8601, and such as is returned by | ||
| :func:`datetime.isoformat`. Microseconds are rounded to 6 digits. |
There was a problem hiding this comment.
Perhaps it would be clearer to say something like “The time is rounded to a whole number of microseconds”?
Also, I suggest clarifying that the separator may be a space instead of T. This is suggested, but not required by the RFC profile, and means that the output of format(datetime) is supported.
| kw = {k: int(v) for k, v in kw.items()} | ||
| if us: | ||
| us = round(float(us), 6) | ||
| kw['microsecond'] = int(us * 1e6) |
There was a problem hiding this comment.
Both the time and datetime classes require microsecond to be strictly less than 1 million, and it looks like you don’t handle rolling over seconds, minutes, etc. Test case:
datetime.fromisoformat('2017-12-31 23:59:59.9999995') -> datetime(2018, 1, 1, 0, 0, 0)
For datetime, I might do the rolling over by adding a timedelta. For time, maybe it is not worth doing any rounding.
Also, float is approximate, e.g. for me float(".000_001_499_999_999_999_999_99") rounds up over 1.5 µs, which round would round up to 2 µs. I wonder if it is worth doing the rounding without float; then you could claim in the documentation that rounding is always half-to-even. Untested code:
_time_re = ... r'(?:(?P<microsecond>\.\d{1,6})(?P<us_frac>\d*))?' ...
...
us = int(float(us) * 1e6)
frac = kw.pop('us_frac').rstrip("0")
# Round halfway up to even rather than down to odd
us += frac > "5" or frac == "5" and us % 2|
Closing since an alternative implementation has been merged. See GH-4699. |
adding method 'fromisoformat' to date, time and datetime classes
https://bugs.python.org/issue15873