hbase-user mailing list archives

From Christian Schäfer <syrious3...@yahoo.de>
Subject Re: Persist to HBase with JPA using HBql-JDBC-Driver (Examples)?
Date Mon, 24 Oct 2011 21:03:14 GMT
Hi Frédéric,

Thanks for your info, and sorry for the late answer.

I'm currently waiting for a test system, which could still take a few weeks. :-(

When it's up and running, I will first compare the HBase raw API against the DataNucleus ORM
on different database sizes (20 GB - 100 GB).

If the DataNucleus ORM is much slower than the HBase API, I will test the other ones you mentioned.

Should anyone be interested in the results, I will post them here.

I'm somewhat surprised that no such benchmarks exist yet... or have I just missed them?


From: Frédéric Fondement <frederic.fondement@uha.fr>
To: user@hbase.apache.org
Sent: 10:24 Tuesday, 18 October 2011
Subject: Re: Persist to HBase with JPA using HBql-JDBC-Driver (Examples)?

On 15/10/11 23:34, Christian Schäfer wrote:
> But nevertheless I will try using DataNucleus' JPA for HBase and run some benchmarks
> to compare it with the HBase native interface ;-)

Hi there,

If you do carry out such a study, it would be great to publish the results (here?)!

What about proposing a simple application that all those guys who created an ORM (DataNucleus,
Kundera, ...) could implement and submit (to you?) for a benchmark?
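
As a rough idea of what such a shared application could look like, here is a minimal Java
sketch. Everything in it (the PersistenceAdapter interface, InMemoryAdapter, BenchHarness) is
hypothetical and not an existing API; a real entry would back the interface with the HBase raw
API or with one of the ORMs, and a serious benchmark would also need warm-up runs and realistic
data sizes.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical contract that each persistence layer would implement for the benchmark. */
interface PersistenceAdapter {
    String name();
    void put(String rowKey, String value);
    String get(String rowKey);
}

/** In-memory stand-in so the harness can run without a cluster (an assumption for illustration). */
class InMemoryAdapter implements PersistenceAdapter {
    private final Map<String, String> rows = new HashMap<>();

    public String name() { return "in-memory"; }
    public void put(String rowKey, String value) { rows.put(rowKey, value); }
    public String get(String rowKey) { return rows.get(rowKey); }
}

class BenchHarness {
    /** Writes rowCount rows, reads them back and checks them; returns elapsed nanoseconds. */
    static long run(PersistenceAdapter adapter, int rowCount) {
        long start = System.nanoTime();
        for (int i = 0; i < rowCount; i++) {
            adapter.put("row-" + i, "value-" + i);
        }
        for (int i = 0; i < rowCount; i++) {
            String expected = "value-" + i;
            if (!expected.equals(adapter.get("row-" + i))) {
                throw new IllegalStateException(adapter.name() + " lost row-" + i);
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        PersistenceAdapter adapter = new InMemoryAdapter();
        long nanos = run(adapter, 10_000);
        System.out.println(adapter.name() + ": " + nanos / 1_000_000 + " ms");
    }
}
```

Each ORM author would then only have to supply one more PersistenceAdapter implementation,
and the same run() call would time all of them on identical work.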

I'm one of those guys. We created n-orm (http://code.google.com/p/n-orm/) to separate
responsibilities in our team (functional vs. non-functional), to centralize data
management (improving separation of concerns, and thus maintainability), and to still understand
what really happens under the hood (and remain able to change platform in case of problems...).
Actually, our ORM treats POJOs as a kind of schema for the store (query-driven), so the
philosophy is to use Java objects while keeping in mind how HBase should be used,
so that we hope not to lose too many of HBase's possibilities.

I agree with Michel that the HBase API is easy, but when it comes to the details, it's really
hard to think of everything, especially when it's interleaved with functional code (scan
caching, inter-process schema management, compression, migration, error handling, new versions
of the API, new possibilities... or just learning some important new thing that has to be
integrated into the complete application!).

Still, as our application becomes more and more complex, it's inconceivable for us
to re-implement it using only the HBase raw API. As a consequence, I have no real idea
of the performance price we pay just to ease development...

Another ORM that deserves attention is https://github.com/ghelmling/meetup.beeno, which is
built on the same philosophy. Actually, we didn't choose it because it's too tightly coupled to
HBase, but I guess it must perform really well (for that very reason).

I think the real danger of ORMs is designing your schema in a domain-driven (classical) fashion
instead of a query-driven one. This danger may well be smaller when you
use raw APIs.
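
To make the query-driven idea concrete, here is a minimal sketch in plain Java. It assumes a
single query, "all events of one user in a time range", and builds the row key for exactly that
query. A TreeMap stands in for HBase's sorted row-key space; the class name, the "#" separator
and the 13-digit zero padding are illustrative assumptions, not n-orm or HBase code, and user
ids are assumed not to contain "#".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.SortedMap;
import java.util.TreeMap;

class QueryDrivenKeys {
    // Sorted by String ordering, loosely mimicking HBase's byte-sorted row keys.
    private final TreeMap<String, String> table = new TreeMap<>();

    /** Row key = userId + "#" + zero-padded timestamp, so one user's rows are contiguous. */
    static String rowKey(String userId, long timestampMillis) {
        return userId + "#" + String.format(Locale.ROOT, "%013d", timestampMillis);
    }

    void put(String userId, long timestampMillis, String payload) {
        table.put(rowKey(userId, timestampMillis), payload);
    }

    /** Range read over [from, to): the equivalent of a scan with start and stop rows. */
    List<String> scan(String userId, long fromInclusive, long toExclusive) {
        SortedMap<String, String> range =
                table.subMap(rowKey(userId, fromInclusive), rowKey(userId, toExclusive));
        return new ArrayList<>(range.values());
    }

    public static void main(String[] args) {
        QueryDrivenKeys store = new QueryDrivenKeys();
        store.put("alice", 1000L, "login");
        store.put("alice", 2000L, "search");
        store.put("bob", 1500L, "login");
        System.out.println(store.scan("alice", 0L, 3000L));
    }
}
```

A domain-driven design would more likely key rows by an event id and then need a full scan
or a secondary index to answer the same question.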

