arrow-dev mailing list archives

From Leif Walsh <leif.wa...@gmail.com>
Subject Re: SIMD support in Java
Date Thu, 25 Feb 2016 00:26:36 GMT
The JVM may be able to do popcount optimization but it's categorically bad
at other vectorization instructions.
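
For reference, the popcount case looks roughly like the sketch below. The
class name, method name, and the idea of storing a bitmap as a long[] are
assumptions made for illustration, not Arrow code. Long.bitCount is treated
as an intrinsic by HotSpot, so on CPUs with a POPCNT instruction the JIT can
usually emit it directly instead of a method call.

```java
public final class BitmapPopcount {
    private BitmapPopcount() {}

    /** Counts the set bits in a bitmap stored as 64-bit words (hypothetical layout). */
    public static long countSetBits(long[] words) {
        long total = 0;
        for (long word : words) {
            // Long.bitCount returns the number of one-bits in the word.
            // HotSpot can typically compile this call down to a single POPCNT.
            total += Long.bitCount(word);
        }
        return total;
    }

    public static void main(String[] args) {
        long[] validity = {0b1011L, -1L, 0L}; // 3 bits, 64 bits, 0 bits
        System.out.println(countSetBits(validity)); // prints 67
    }
}
```

This covers only the scalar popcount; it says nothing about wider SIMD
operations such as vectorized hashing.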
On Wed, Feb 24, 2016 at 18:30 Taro L. Saito <leo@xerial.org> wrote:

> Thanks for letting me know.
>
> If we need to embed C++ binaries (.so files) inside java,
> snappy-java's approach https://github.com/xerial/snappy-java would be
> useful,
> which bundles .so files built for several OS/CPU architectures, and loads
> one of them at run-time.
>
> Btw, the JVM is smart enough to replace Long.bitCount(long) (popcount) with
> the corresponding CPU instruction.
> No JNI is necessary for this.
>
>
> On Wed, Feb 24, 2016 at 1:11 PM, Wes McKinney <wes@cloudera.com> wrote:
>
> > I will soon need some SIMD-enabled algorithms for hashing and
> > bitmap-related stuff like popcount in the C++ implementation; we might
> > prioritize a batchy JNI interface to Arrow C++ to use for cases where
> > the JNI overhead is worth paying from the Java side.
> >
> > On Wed, Feb 24, 2016 at 11:30 AM, Jacques Nadeau <jacques@apache.org>
> > wrote:
> > > The short answer is the JVM is horrible at SIMD. It does a few
> > > optimizations when working with primitive arrays but beyond that, you're
> > > basically stuck working outside the JVM. The key for Arrow is that the
> > > overhead of stepping out of the JVM can be amortized across all records in
> > > a batch. I hope to contribute some examples of this cross use with
> > > benchmarks shortly but would love it if others also did some work here.
> > >
> > > On Wed, Feb 24, 2016 at 2:22 AM, Taro L. Saito <leo@xerial.org> wrote:
> > >
> > >> Hi,
> > >>
> > >> I have just started looking at the java code of Arrow. So far what I have
> > >> found is:
> > >>  - Code templates are used to generate efficient code for reading/writing
> > >> fixed bit-length value vectors
> > >>  - Unsafe class will be used to accelerate raw memory access within
> > >> ByteBuffer
> > >>  - ValueHolder class is used to avoid object construction costs
> > >>
> > >> msgpack-java (https://github.com/msgpack/msgpack-java) is also using
> > >> similar techniques, so these optimizations totally make sense.
> > >>
> > >> But I still cannot find where SIMD operations will work effectively.
> > >> Using JNI to invoke SIMD ops will have non-negligible overhead. So my guess
> > >> is Arrow will rely on some JVM optimization.
> > >>
> > >> Could you tell me how SIMD operations will be used in Arrow java?
> > >> Thanks in advance,
> > >> --
> > >> Taro L. Saito
> > >> http://xerial.org/leo
> > >>
> >
>
>
>
> --
> Taro L. Saito
> http://xerial.org/leo
>
-- 
Cheers,
Leif
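
On the amortization point quoted above (paying the cost of stepping out of
the JVM once per batch rather than once per record), here is a rough sketch
of the difference in call counts. Nothing in it is real JNI: kernelOne and
kernelBatch are hypothetical pure-Java stand-ins for native entry points, and
the crossings counter only simulates JVM-to-native transitions. In an actual
setup both would be declared native and backed by C++.

```java
import java.util.Arrays;

public final class BatchedCalls {
    private BatchedCalls() {}

    /** Number of simulated JVM-to-native transitions so far. */
    public static long crossings = 0;

    /** Hypothetical stand-in for a JNI entry point that handles one word per call. */
    public static int kernelOne(long word) {
        crossings++; // one boundary crossing per record
        return Long.bitCount(word);
    }

    /** Hypothetical stand-in for a JNI entry point that handles a whole batch per call. */
    public static long kernelBatch(long[] words, int offset, int length) {
        crossings++; // one boundary crossing for the entire batch
        long total = 0;
        for (int i = offset; i < offset + length; i++) {
            total += Long.bitCount(words[i]);
        }
        return total;
    }

    public static void main(String[] args) {
        long[] batch = new long[4096];
        Arrays.fill(batch, 0xFFL); // 8 set bits per word

        crossings = 0;
        long perRecordTotal = 0;
        for (long word : batch) {
            perRecordTotal += kernelOne(word);
        }
        System.out.println(perRecordTotal + " bits, " + crossings + " crossings");

        crossings = 0;
        long batchTotal = kernelBatch(batch, 0, batch.length);
        System.out.println(batchTotal + " bits, " + crossings + " crossings");
    }
}
```

Both loops compute the same total (32768 bits for this input), but the
per-record path crosses the simulated boundary 4096 times and the batched
path crosses it once. Whatever the fixed cost of a real JNI call turns out to
be, the batched interface pays it once per batch.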
