hbase-user mailing list archives

From stack <st...@duboce.net>
Subject Re: Hbase bulk import for objects with the same rowid and different columnids
Date Sat, 09 Jan 2010 17:58:12 GMT
Something is up here.  KVSR (KeyValueSortReducer) uses KeyValue.COMPARATOR, whose javadoc says:

   * Compare KeyValues.  When we compare KeyValues, we only compare the Key
   * portion.  This means two KeyValues with same Key but different Values
   * are considered the same as far as this Comparator is concerned.
   * Hosts a {@link KeyComparator}.

... where Key in the above is the
row/columnfamily/columnqualifier/timestamp/type combination.

If we're only keeping the last value added, that's odd.  It should be keeping
them all, since differing in column makes for a different key.
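
To make the expected behaviour concrete, here is a minimal, self-contained sketch. It does not use HBase: the Cell class and both comparators are hypothetical stand-ins written for illustration, and the full-key comparator is simplified (it omits the type field and compares strings rather than bytes). It only shows how a TreeSet behaves when its comparator looks at the whole key versus the row alone.

```java
import java.util.Comparator;
import java.util.TreeSet;

public class KeyOrderSketch {
    // Hypothetical stand-in for a KeyValue; this is NOT the HBase class.
    public static final class Cell {
        final String row, family, qualifier, value;
        final long timestamp;

        Cell(String row, String family, String qualifier, long timestamp, String value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Simplified full-key ordering: row, family, qualifier, then newest
    // timestamp first. The value is deliberately ignored.
    public static final Comparator<Cell> FULL_KEY = (a, b) -> {
        int c = a.row.compareTo(b.row);
        if (c != 0) return c;
        c = a.family.compareTo(b.family);
        if (c != 0) return c;
        c = a.qualifier.compareTo(b.qualifier);
        if (c != 0) return c;
        return Long.compare(b.timestamp, a.timestamp);
    };

    // What a comparator that looked only at the row would do.
    public static final Comparator<Cell> ROW_ONLY = (a, b) -> a.row.compareTo(b.row);

    // Adds four cells for one row to a TreeSet and returns how many survive.
    public static int distinct(Comparator<Cell> cmp) {
        TreeSet<Cell> set = new TreeSet<>(cmp);
        set.add(new Cell("row1", "cf", "colA", 1L, "v1"));
        set.add(new Cell("row1", "cf", "colB", 1L, "v2"));
        set.add(new Cell("row1", "cf", "colC", 1L, "v3"));
        // Same key as the first cell, different value: equal under FULL_KEY.
        set.add(new Cell("row1", "cf", "colA", 1L, "other"));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println("full key: " + distinct(FULL_KEY)); // prints "full key: 3"
        System.out.println("row only: " + distinct(ROW_ONLY)); // prints "row only: 1"
    }
}
```

With a full-key ordering, the three cells that differ only in qualifier all stay in the set; only a comparator restricted to the row would collapse them to one.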

Can you send us a sample of the KeyValues that are getting conflated?
Something is wrong.

Thanks for reporting this.
St.Ack

On Sat, Jan 9, 2010 at 9:09 AM, Ioannis Konstantinou <ikons@cslab.ntua.gr> wrote:

> Hello,
>
> I am trying to bulk upload content to hbase using the instructions provided
> at
> http://hadoop.apache.org/hbase/docs/current/api/org/apache/hadoop/hbase/mapreduce/package-summary.html#package_description
> :
> I have a mapper that reads input and emits KeyValue objects to be fed in
> the KeyValueSortReducer. The mapper emits a number of KeyValue objects for
> each row. For the same rowid, the KeyValue objects have different columnids.
>  The problem is the following: when these KeyValue objects (that have the
> same rowid but different colids in the same column family) reach the
> reducer, the TreeSet used to sort KeyValues, keeps only the KeyValue that
> gets last (it replaces all entries with the last one that reaches the
> reducer), as the KeyValue.COMPARATOR compares only the rowid !!!!!
>
> Can I use a different Comparator??? KeyValue objects of the same rowid must
> be sorted before writing them in the HFile, or this does not matter???
>
> Thank you in advance for your time.
>
>
> --
> Ioannis Konstantinou
> Research Associate, Computing Systems Laboratory
> National Technical University of Athens
> Web: http://www.cslab.ntua.gr/~ikons
>
>
