hbase-user mailing list archives

From "nguyenhuynh.mr" <nguyenhuynh...@gmail.com>
Subject Import new content to Hbase?
Date Wed, 06 May 2009 02:26:39 GMT
Hi all!

I have an MR job that imports content into HBase. Before importing, I have
to determine which content is new (the row key in HBase is the URI), and
then import only that new content into HBase.

Assume I have a large amount of content in HBase (> 1,000,000,000 URIs) and
1,000,000 URIs to import (a mix of new URIs and URIs that already exist in
HBase). How can I find the new URIs to import?

My current solution: I check whether each URI exists in HBase to find the
new URIs. Something like:


            RowResult row = hTable.getRow(uri);
            if (row.isEmpty()) {
                // no data stored for this URI: collect it as new content
            }


With this solution, if the number of URIs is large, the time spent
connecting to HBase is large too :(
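
To make the per-URI check above concrete, here is a minimal, self-contained
sketch of it. It does not use the HBase client at all: the class
NewUriFilter, the method collectNewUris and the existsInTable predicate are
hypothetical names, and the predicate only stands in for the
hTable.getRow(uri) lookup shown earlier. It is meant to show the shape of
the problem (one lookup per candidate URI), not a tuned solution.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper, not part of HBase: keeps only the URIs that the
// supplied existence check reports as missing.
final class NewUriFilter {
    private NewUriFilter() {}

    // existsInTable is an assumed stand-in for the real lookup
    // (for example, a getRow call followed by an emptiness check).
    static List<String> collectNewUris(List<String> candidates,
                                       Predicate<String> existsInTable) {
        List<String> newUris = new ArrayList<>();
        for (String uri : candidates) {
            // One lookup per candidate: with 1,000,000 candidates this
            // means 1,000,000 round trips to the store.
            if (!existsInTable.test(uri)) {
                newUris.add(uri);
            }
        }
        return newUris;
    }
}
```

In a quick local test the predicate can simply be backed by an in-memory
set (for example, storedUris::contains); in the real job it would wrap the
HBase lookup, which is where the time goes.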

Please suggest a better solution. :)


Best regards,

