lucene-dev mailing list archives

From Dawid Weiss
Subject Re: HPPC: High Performance Primitive Collections for Java
Date Mon, 19 Apr 2010 20:11:21 GMT
> Hmmm.. can anybody compare these to fastutil?

I believe I can answer some of your questions.

1) HPPC is not directly Java Collections-compatible. It does have an
interface hierarchy, but its interfaces do not descend from the familiar
Set, Map or List. Fastutil is collections-compatible.

2) HPPC has open internals, so you can do anything you like once your
collections are created, including manipulation of internal storage
arrays, for instance. This was a design decision and goal. As with any
sharp objects, improper use may cause harm.
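
To make point 2 concrete, here is a minimal sketch of what "open
internals" can look like. The class OpenIntList and its fields buffer and
size are hypothetical names made up for this illustration, not HPPC's
actual API; the only point is that the caller may loop over the backing
array directly.

```java
import java.util.Arrays;

/** Hypothetical primitive int list with deliberately public internals (not HPPC's real API). */
final class OpenIntList {
    /** Backing storage; only the first {@code size} slots hold valid elements. */
    public int[] buffer = new int[4];
    /** Number of valid elements in {@code buffer}. */
    public int size;

    public void add(int value) {
        if (size == buffer.length) {
            // Grow by doubling; anyone holding the old array reference now sees stale storage.
            buffer = Arrays.copyOf(buffer, size * 2);
        }
        buffer[size++] = value;
    }

    /** Sums the elements by reading the backing array directly, with no iterator and no boxing. */
    static long sumDirect(OpenIntList list) {
        final int[] buf = list.buffer; // direct access to internal storage
        final int n = list.size;
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += buf[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        OpenIntList list = new OpenIntList();
        for (int i = 1; i <= 10; i++) {
            list.add(i);
        }
        System.out.println(sumDirect(list)); // prints 55
    }
}
```

The "sharp objects" caveat shows up here too: reading slots past size, or
keeping a reference to buffer across an add() that reallocates, goes
wrong silently.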

3) HPPC uses assert instead of fixed condition checks. There are no
attempts to detect misuse (fail-fast iterators, etc.).
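
The difference in point 3 can be sketched as follows. CheckedVsAsserted
and both accessor names are hypothetical and exist only for this
illustration; they are not taken from HPPC or fastutil.

```java
/** Hypothetical fixed-capacity int list contrasting the two checking styles (not a real library class). */
final class CheckedVsAsserted {
    private final int[] buffer = new int[8];
    private int size;

    public void add(int value) {
        buffer[size++] = value;
    }

    /** Assert style: the range check only runs when the JVM is started with -ea. */
    public int getAsserted(int index) {
        assert index >= 0 && index < size : "index out of range: " + index;
        return buffer[index];
    }

    /** Fixed-check style: the range check always runs, whatever the JVM flags are. */
    public int getChecked(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index out of range: " + index);
        }
        return buffer[index];
    }
}
```

With assertions disabled (the JVM default), calling getAsserted(1) on a
one-element list returns whatever sits in the unused slot instead of
failing. That is the trade-off: no per-call check in production, and no
safety net either.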

4) fastutil is more mature, has support for more data structures
(sorted trees, etc.) and was written by an excellent programmer
(Sebastiano Vigna). HPPC was created internally for use at Carrot
Search and was primarily motivated by speed; we believed that in
certain applications direct access to collections' internals should be
allowed and should be beneficial. Our micro-benchmarks show that this
is largely true if you manipulate LOTS of data. For smaller data sets
even built-in Java collections with boxed types do surprisingly well
(due to HotSpot optimizations too).

5) There are subtle differences in how HPPC is written -- I use pretty
much normal generic classes with some pseudo-intrinsics and
regexp-substituted comments. Sebastiano uses a C++ preprocessor to
generate Java classes from templates (yes, wicked).
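
As a rough illustration of the regexp-substitution idea in point 5, the
sketch below turns a generic-looking template into an int-specialized
class. The placeholder name KType, the template text and the three
substitution rules are all assumptions made for this example; it does
not reproduce HPPC's real templates, its comment-based substitutions or
its build, and it is unrelated to how fastutil generates code.

```java
/** Hypothetical sketch: specialize a generic template for a primitive type via regexp substitution. */
final class TemplateSpecializer {
    /** Template written as an ordinary generic class, with KType as the placeholder type. */
    static final String TEMPLATE =
        "public class KTypeArrayList<KType> {\n"
      + "    public KType[] buffer;\n"
      + "    public void add(KType value) { /* ... */ }\n"
      + "}\n";

    /**
     * @param primitive   the primitive type name, e.g. "int"
     * @param capitalized the form used inside class names, e.g. "Int"
     */
    static String specialize(String template, String primitive, String capitalized) {
        return template
            // 1. Drop the type parameter; the specialized class is not generic.
            .replace("<KType>", "")
            // 2. KType used as a class-name prefix (followed by an uppercase letter) becomes "Int".
            .replaceAll("KType(?=[A-Z])", capitalized)
            // 3. Every remaining standalone KType becomes the primitive type.
            .replaceAll("\\bKType\\b", primitive);
    }

    public static void main(String[] args) {
        System.out.print(specialize(TEMPLATE, "int", "Int"));
    }
}
```

The appeal of this style is that the template remains a normal generic
Java class that compiles and can be edited in an IDE before any
substitution runs.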

I look at Lucene and Solr source code and learn a LOT from folks
contributing to this project, so HPPC will hardly be any faster or
better than what Lucene already has, but if anybody finds
anything in HPPC useful, please take handfuls. I would love for this
project to eventually be merged with Mahout, but I intentionally left it
in Carrot Search labs for a little while so that the API can stabilize
(through our in-house experiments mostly).

Thanks for showing your interest!
