orc-dev mailing list archives

From Dain Sundstrom <d...@iq80.com>
Subject String stats requirements?
Date Tue, 06 Jun 2017 22:02:12 GMT
Is it required that the StringStatistics min and max be the actual min and max values for the
column?  I ask for two reasons.  First, I’d like to be able to “trim” values if the min or max
is very large.  Second, as a workaround for the UTF-16BE sorting problem (bug?), I’d like
to trim values at the first surrogate pair, so that the stored value is slightly smaller than the
true min or slightly larger than the true max, and is still a valid UTF-8 sequence.
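
To make the idea concrete, here is a rough sketch in Java of what such trimming could look like. It is not ORC code: the class name StringStatsBounds, the methods lowerBound and upperBound, and the maxCodePoints limit are all hypothetical names made up for illustration. The sketch assumes the bounds only need to be safe (lower bound <= true min, upper bound >= true max) under both UTF-16 code unit order and UTF-8 byte order. Whether readers would accept such non-exact bounds is exactly the open question.

```java
import java.util.Optional;

// Hypothetical helper, not part of ORC: computes "loose" min/max bounds.
final class StringStatsBounds {
    private StringStatsBounds() {}

    // Length of the longest prefix that has at most maxCodePoints characters
    // and stops before the first surrogate (i.e. before any surrogate pair).
    static int safePrefixLength(String s, int maxCodePoints) {
        int i = 0;
        while (i < s.length() && i < maxCodePoints) {
            if (Character.isSurrogate(s.charAt(i))) {
                break;
            }
            i++; // every char kept so far is a single BMP code point
        }
        return i;
    }

    // A prefix never sorts after the full string, in either ordering,
    // so the trimmed value is a safe lower bound for the true min.
    static String lowerBound(String min, int maxCodePoints) {
        return min.substring(0, safePrefixLength(min, maxCodePoints));
    }

    // Trim, then bump the last character that can be incremented so the
    // result sorts after the true max. Returns empty if no such bound exists.
    static Optional<String> upperBound(String max, int maxCodePoints) {
        int n = safePrefixLength(max, maxCodePoints);
        if (n == max.length()) {
            return Optional.of(max); // nothing was trimmed
        }
        for (int j = n - 1; j >= 0; j--) {
            char c = max.charAt(j);
            if (c == '\uFFFF') {
                continue; // cannot increment within the BMP; try the previous char
            }
            // Skip the surrogate range so the result stays valid UTF-8.
            char next = (c == '\uD7FF') ? '\uE000' : (char) (c + 1);
            return Optional.of(max.substring(0, j) + next);
        }
        return Optional.empty();
    }
}
```

For example, with a limit of 3 the value "abcdef" would yield the lower bound "abc" and the upper bound "abd". A value that starts with a surrogate pair has no usable prefix, so the sketch reports no upper bound at all rather than guessing one.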

Thoughts?

-dain
