stdcxx-dev mailing list archives

From Martin Sebor <se...@roguewave.com>
Subject Re: expectation vs requirements for locale facets
Date Wed, 22 Aug 2007 19:24:28 GMT
Travis Vitek wrote:
>  
[...]
> Not sure about that. If you write a date out using POSIX strftime(...,
> "%x", ...) and you can read it back using POSIX strptime(..., "%x",
> ...), then I would hope that you could do so using time_put and
> time_get.

Yes. *If* strptime() can parse it, so should time_get (modulo
the multibyte character issue).

[...]
>> Yes, but don't the get_time() and get_date() functions also
>> say that they only parse the output produced by "%H:%M:%S"
>> and "%m/%d/%y" (or some such combination of the individual
>> directives)?
> 
> No, I haven't seen anything that indicates that is the case. The only
> requirement that I've seen is the one on do_get_[date,time]() that I've
> quoted several times. It just says that it must extract the struct tm
> members and read the format characters used by time_put<>::put to
> produce the format specified by 'x' ['X'] or until it encounters an
> error.

Darn it! Those committee people! They've slipped in a change
behind my back that I've been assuming was part of the original
standard when it's just in the latest working paper. That should
teach me to use the latest and greatest!

What I was referring to is what is now Table 84 in the latest
working draft of the standard:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2369.pdf

The change comes from issue 461:
http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-defects.html#461

So unless we manage to get the resolution of issue 461 reverted
or changed, the next standard prevents get_date() from correctly
parsing dates formatted using %x. The one possible way out is to
interpret the sentence "An implementation may also accept
additional implementation-defined formats." as giving us the
freedom to accept "%m/%e/%y" in addition to "%m/%d/%y".

But regardless of what the next standard does or how we decide
to interpret it, the current standard does seem to require that
get_date() accept strings produced by strftime("%m/%e/%y") if
that's what %x expands to. In at least one of the locales that
caused your test case to fail, bg_BG, the %e was in fact the
reason since %x expands to "%e.%m.%Y".
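
One way to see what a given locale's %x actually produces is to
format a known date through time_put and look at the result. The
following is only a sketch: put_field() is a hypothetical helper,
not part of stdcxx or its test suite, and the output for any
locale other than the classic "C" one depends on the platform's
locale data.

```cpp
#include <cassert>
#include <ctime>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: format *tm with a single conversion specifier
// (for example 'x' or 'X') using the time_put facet of loc.
std::string put_field(const std::locale& loc, const std::tm* tm, char spec) {
    assert(tm != 0);  // this sketch requires a valid tm
    std::ostringstream out;
    out.imbue(loc);
    const std::time_put<char>& tp =
        std::use_facet<std::time_put<char> >(loc);
    // put() writes through the output iterator; ' ' is the fill character.
    tp.put(std::ostreambuf_iterator<char>(out), out, ' ', tm, spec);
    return out.str();
}
```

Calling put_field(std::locale::classic(), &tm, 'x') shows the "C"
locale rendering. Passing a named locale such as
std::locale("bg_BG") assumes that locale is installed; the
constructor throws std::runtime_error otherwise.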

So I think we may have gotten sidetracked by the discussion
of %x and %X when the problem is actually much simpler: does
%e skip leading space or not? Or, more generally, is the facet
(and strptime) required to skip leading space when parsing
numeric data? Once we have our answer to those questions we
will be able to deal with this issue: either we have a bug
in time_get or one (or more) in the time_get tests.
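
To make the question concrete, here is a small sketch that shows
where the leading space comes from and how one might probe a
given strptime(). Both function names are made up for this
example, and strptime() is assumed to be available (it is POSIX,
not ISO C or C++). Whether the probe accepts a space-padded field
is exactly the open question, so no result is claimed for that
case.

```cpp
#include <cassert>
#include <ctime>
#include <string>
#include <time.h>  // strptime() is declared here on POSIX systems

// Format a date as "%m/%e/%Y". strftime() pads a one-digit day
// with a leading space for %e, so April 5, 1906 becomes "04/ 5/1906".
std::string format_mdy(int month, int day, int year) {
    std::tm tm = std::tm();
    tm.tm_mon = month - 1;      // tm_mon counts from 0
    tm.tm_mday = day;
    tm.tm_year = year - 1900;   // tm_year counts from 1900
    char buf[32];
    const std::size_t n = std::strftime(buf, sizeof buf, "%m/%e/%Y", &tm);
    return std::string(buf, n);
}

// Hypothetical probe: true if strptime() consumes all of text using fmt.
// On success the parsed fields are copied to *out when out is non-null.
bool strptime_accepts(const char* text, const char* fmt, std::tm* out) {
    assert(text != 0 && fmt != 0);
    std::tm tm = std::tm();
    const char* end = strptime(text, fmt, &tm);
    if (end == 0 || *end != '\0')
        return false;           // parse error or trailing characters
    if (out != 0)
        *out = tm;
    return true;
}
```

Feeding format_mdy(4, 5, 1906) back into
strptime_accepts(..., "%m/%e/%Y", &tm) on each platform of
interest tells us whether that implementation skips the space
that its own strftime() wrote.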

[...]
> Now this raises a question. I saw that we had the time_get<>::get()
> extension in there. I'm wondering if the mt test should just be using
> this as all of the other get method variants end up invoking do_get() in
> the end. It would simplify the test a little bit, but it really offers
> no other benefit.

I think we should make calls to all the public functions in case
the more specific ones happen to be implemented independently of
the general one (e.g., for efficiency).
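
A rough sketch of what exercising each public function
separately could look like. This is not the actual mt test; the
GetFn enum and parse_with() are hypothetical names, and the
classic "C" locale is used only to keep the example
self-contained.

```cpp
#include <cassert>
#include <ctime>
#include <ios>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Which public std::time_get member to call (illustrative names).
enum GetFn { GetTime, GetDate, GetWeekday, GetMonthname, GetYear };

// Parse text with one public time_get function in the classic locale.
// Returns true unless the facet reported failbit; fields land in *tm.
bool parse_with(GetFn which, const std::string& text, std::tm* tm) {
    assert(tm != 0);  // this sketch requires a destination
    typedef std::istreambuf_iterator<char> Iter;
    std::istringstream in(text);
    in.imbue(std::locale::classic());
    const std::time_get<char>& tg =
        std::use_facet<std::time_get<char> >(in.getloc());
    std::ios_base::iostate err = std::ios_base::goodbit;
    Iter it(in), end;
    // Call each public member directly rather than a single general
    // entry point, in case they are implemented independently.
    switch (which) {
    case GetTime:      tg.get_time(it, end, in, err, tm);      break;
    case GetDate:      tg.get_date(it, end, in, err, tm);      break;
    case GetWeekday:   tg.get_weekday(it, end, in, err, tm);   break;
    case GetMonthname: tg.get_monthname(it, end, in, err, tm); break;
    case GetYear:      tg.get_year(it, end, in, err, tm);      break;
    }
    return (err & std::ios_base::failbit) == 0;
}
```

A test thread could then loop over the GetFn values with input
prepared for each one, for example
parse_with(GetTime, "01:02:03", &tm).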

[...]
> It isn't just %e that exhibits this behavior. Adding leading whitespace
> to all of the fields appears to work on the same platforms...
> 
>     const char* fmt = "-%H:%M:%S-%m/%e/%Y-";
>     const char* buf = "- 1: 2: 3- 4/ 5/ 1906-";

Yes, I noticed it too.

[...]
> I imagine that we'll file a bug and start eating leading whitespace in
> either case, right?

File a bug for sure. We can decide how to fix it, and mainly
*when* to fix it, later, after we get some feedback from the
POSIX and C++ committees and have had some time to digest it.

Here's the thread I started on the Austin Group list for reference:
https://www.opengroup.org/sophocles/show_archive.tpl?source=L&listname=austin-group-l&first=1&pagesize=80&searchstring=strptime%28%29+and+leading+space&zone=G

Martin
