stdcxx-dev mailing list archives

From Farid Zaripov <Farid_Zari...@epam.com>
Subject [PATCH] collate.cpp (was: RE: Localedef assertion failure on Windows)
Date Mon, 16 Apr 2007 18:35:32 GMT
 > -----Original Message-----
 > From: Martin Sebor [mailto:sebor@roguewave.com]
 > Sent: Tuesday, January 09, 2007 7:29 PM
 > To: stdcxx-dev@incubator.apache.org
 > Subject: Re: Localedef assertion failure on Windows
 >
 > Andrew Black wrote:
 > > Greetings all.
 > >
 > > When building the UTF-8 locales on windows with the debug version of
 > > the localedef utility, the localedef utility terminates with a failed
 > > assertion within the library (in __rw_debug_iter::operator*() in
 > > _iterbase.h).  Within collate.cpp, the failure occurs on line 579.
 > >
 > > A trace of the code
 >
 > It might be helpful to see the stack trace.
 >
 > > indicates that the last good iteration across this line is iteration
 > > number 56677, for the token 'UFFFD'.
 >
 > I assume this on line 23337 of UTF-8.
 >
 > > The following token (<U00010300>) fails because
 > > __rw_debug_iter::_C_is_end() returns true.  However, my reading of
 > > collate.cpp is that this condition shouldn't happen, as the
 > > termination condition of loop containing the statement in question
 > > is suppose to terminate when this condition is reached.
 > >
 > > Does this indicate a flaw in std::map or something else?
 >
 > More likely, in collate.cpp or somewhere in the rest of localedef.
 > I suspect it has to do with wchar_t being only 16 bits wide
 > on Windows and the character map containing characters (such as
 > <U00010300>) beyond that range. To fix this we'll either need
 > to replace wchar_t with a 32-bit type or ignore characters
 > that do not fit in 16 bits on Windows (and wherever else wchar_t isn't
 > 32 bits, such as AIX).

  I looked into this problem today.

  As far as I can see, when localedef processes the charmap file
(Charmap::process_chars()), Charmap::add_to_cmaps() is invoked for each
character in the CHARMAP section. There the symbol name is added to
symnames_list_, but the character is not added to the maps w_cmap_,
rw_cmap_, mb_cmap_, and rmb_cmap_, because convert_to_wc() returns the
result of convert_to_ucs(), which is false.

  Then, in Def::add_missing_values(), w_cmap.find() returns w_cmap.end()
because the character was never inserted (see above). However, that
iterator is dereferenced without being checked.
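
  To illustrate the failure mode outside of localedef, here is a minimal
sketch of the checked lookup. It is not the attached patch: the typedefs
SymList and WCharMap and the function count_mapped() are hypothetical
stand-ins, and the real container types in localedef may differ.

```cpp
#include <cstddef>
#include <list>
#include <map>
#include <string>

// Hypothetical stand-ins for symnames_list_ and w_cmap_; the actual
// localedef types may differ.
typedef std::list<std::string>         SymList;
typedef std::map<std::string, wchar_t> WCharMap;

// Counts the symbols that have a wide-character mapping, skipping any
// symbol that was recorded in the list but never inserted into the map.
inline std::size_t
count_mapped (const SymList &syms, const WCharMap &w_cmap)
{
    std::size_t n = 0;

    for (SymList::const_iterator it = syms.begin ();
         it != syms.end (); ++it) {

        const WCharMap::const_iterator pos = w_cmap.find (*it);

        // Without this check, dereferencing pos would trip the
        // debug-iterator assertion whenever the symbol is missing.
        if (pos == w_cmap.end ())
            continue;

        (void)pos->second;   // safe to dereference here
        ++n;
    }

    return n;
}
```

  With a list holding <UFFFD> and <U00010300> and a map holding only
<UFFFD>, the loop skips the second symbol instead of dereferencing the
end iterator.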

  The proposed patch is attached.

  Another question: why do we first iterate through
charmap_.get_symnames_list() and then look each symbol up in
charmap_.get_w_cmap(), instead of just iterating through
charmap_.get_w_cmap() directly?
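
  A rough sketch of that alternative, again with a hypothetical WCharMap
typedef and a made-up helper mapped_values() rather than the real
localedef code:

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the type returned by get_w_cmap().
typedef std::map<std::string, wchar_t> WCharMap;

// Collects every mapped wide character by walking the map itself, so
// no find() call (and no end() check) is needed.
inline std::vector<wchar_t>
mapped_values (const WCharMap &w_cmap)
{
    std::vector<wchar_t> out;

    for (WCharMap::const_iterator it = w_cmap.begin ();
         it != w_cmap.end (); ++it)
        out.push_back (it->second);

    return out;
}
```

  One difference to keep in mind: a std::map is traversed in key order,
while the symbol list presumably keeps the order of the charmap file, so
the two loops are only interchangeable if the processing order does not
matter.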

Farid.

