drill-dev mailing list archives

From Yash Sharma <yash...@gmail.com>
Subject Re: OpenJDK’s java.utils.Collection.sort() Bug
Date Thu, 26 Feb 2015 12:32:29 GMT
Makes sense.
We just need to keep in mind not to use Collections.sort for sorting
actual data. As long as we don't, we should never hit this bug.
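
One way to keep this in mind for future code might be a small guard around
the call. The sketch below is only an illustration: the class BoundedSort,
its methods, and the LIMIT constant are hypothetical names and not part of
Drill. It refuses to sort lists at or above the size mentioned in the bug
report and otherwise delegates to Collections.sort.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Drill: rejects very large lists before sorting.
final class BoundedSort {
    // Assumption: use the element count quoted in the thread (67108864 = 2^26) as the cutoff.
    static final int LIMIT = 1 << 26;

    // Sorts in place using the default cutoff.
    static <T extends Comparable<? super T>> void sort(List<T> list) {
        sort(list, LIMIT);
    }

    // Sorts in place, throwing if the list has 'limit' or more elements.
    static <T extends Comparable<? super T>> void sort(List<T> list, int limit) {
        if (list.size() >= limit) {
            throw new IllegalArgumentException(
                "List of " + list.size() + " elements is too large for Collections.sort");
        }
        Collections.sort(list);
    }

    public static void main(String[] args) {
        List<String> columnIds = new ArrayList<>(Arrays.asList("c", "a", "b"));
        sort(columnIds);
        System.out.println(columnIds); // prints [a, b, c]
    }
}
```

For the small metadata lists we sort today this check would never fire; it
would only act as a tripwire if someone later pointed it at actual data.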

On Thu, Feb 26, 2015 at 4:28 PM, Steven Phillips <sphillips@maprtech.com>
wrote:

> It looks like we are using the method in 5 different places in Drill. We
> are using it to sort lists of: files, drillbit endpoints, workunits, operator
> profiles, and columnIds.
>
> I can't imagine we are ever going to need to sort millions of those. So
> probably no need to worry about this bug.
>
> But we should keep it in mind for any future code that might want to use
> it.
>
> On Thu, Feb 26, 2015 at 1:00 AM, Yash Sharma <yash360@gmail.com> wrote:
>
> > As pointed out on the Hadoop mailing list -
> >
> > OpenJDK's java.util.Collections.sort() is broken: the default TimSort
> > implementation can throw ArrayIndexOutOfBoundsException on some inputs
> > once the number of elements reaches about 67108864 (2^26).
> >
> > I wonder if we can have such a huge collection in Drill and might hit
> > this bug?
> > We do have Collections.sort used in multiple places,
> > including DrillTextRecordReader, but do we need to consider a workaround
> > for this?
> >
> > Thoughts?
> >
> > Links:
> > http://envisage-project.eu/timsort-specification-and-verification/
> >
> > https://bugs.openjdk.java.net/browse/JDK-8072909
> >
>
>
>
> --
>  Steven Phillips
>  Software Engineer
>
>  mapr.com
>
