Subject: Re: Row-key design (was: GZ better than LZO?)
From: Chris Tarnas
Date: Mon, 1 Aug 2011 09:36:53 -0700
To: user@hbase.apache.org

Glad to be able to help. You don't need fixed-width fields to do range
scans, you just need delimiters that are lexicographically less than any
valid character in your fields. If your fields are printable
non-whitespace characters, then tab, ASCII 0x09, works very well. That
will guarantee correct overall sorting. For example, if vehicle_id is
your first field and device_id is your second field:

1\t1
1\t2
10\t1
10\t2
2\t1
2\t2
.
.

You can then do prefix scans for a particular vehicle_id; just be sure to
append the delimiter to the vehicle_id and use that as your prefix.

I have also used the null character (ASCII 0x00) as a delimiter, as well
as combined null delimiters with fixed-width binary fields for more
complex index-type keys.

-chris
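As a concrete illustration of that prefix scan, here is a minimal sketch
against the 0.90-era Java client (the "readings" table name and the
literal vehicle_id are made up, not from this thread):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class VehiclePrefixScan {
    public static void main(String[] args) throws Exception {
        HTable table = new HTable(HBaseConfiguration.create(), "readings");

        String vehicleId = "1";
        // Start at "<vehicle_id><TAB>" and stop just before "<vehicle_id><0x0A>",
        // the first key that can no longer belong to this vehicle.
        byte[] startRow = Bytes.toBytes(vehicleId + "\t");
        byte[] stopRow  = Bytes.toBytes(vehicleId + "\n");

        ResultScanner scanner = table.getScanner(new Scan(startRow, stopRow));
        try {
            for (Result r : scanner) {
                System.out.println(Bytes.toString(r.getRow()));
            }
        } finally {
            scanner.close();
            table.close();
        }
    }
}

Because the stop row is exclusive, this returns exactly the rows whose
key starts with "1" plus the tab delimiter, and nothing from vehicle "10".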
On Jul 31, 2011, at 11:48 PM, Steinmaurer Thomas wrote:

> Hello Chris,
>
> thanks a lot for your insights. Much appreciated!
>
> In our test there were 1000 different vehicle_ids, inserted via our
> multi-threaded client, so while they are sequential integers (basically
> the iterator value of the for loop starting the threads), they were not
> strictly inserted in that order.
>
> Regarding padding: I thought that we need some sort of fixed width
> per "element" (what's the correct term here?) in the row key to enable
> the possibility to do range scans. Our growing factor in the system is
> a growing number of vehicles which needs to be supported. While we have
> the vehicle_id at the beginning of the rowkey, you mean that moving the
> vehicle_id value to the left will give better distribution? We still
> need to have range scans though.
>
> In real life, the vehicle in the master data might not be uniquely
> identified by an integer, but by an alphanumeric serial number, so I
> guess this will make a difference then and should be included in our
> tests compared to sequential integers as part of the row key.
>
> Still, for range scans I thought we need some sort of fixed-width row
> keys, thus padding the row key data with "0".
>
> Thanks!
>
> Thomas
>
> -----Original Message-----
> From: Christopher Tarnas [mailto:cft@tarnas.org] On Behalf Of Chris Tarnas
> Sent: Freitag, 29. Juli 2011 18:49
> To: user@hbase.apache.org
> Subject: Re: GZ better than LZO?
>
> Your region distribution across the nodes is not great; for both cases
> most of your data is going to one server. Spreading the regions out
> across multiple servers would be best.
>
> How many different vehicle_ids are being used, and are they all
> sequential integers in your tests? HBase performs better when not doing
> sequential inserts. You could try reversing the vehicle ids to get
> around that (see the many discussions on the list about using reversed
> timestamps as a rowkey).
>
> Looking at your key construction I would suggest, unless your app
> requires it, not left-padding your ids with zeros but rather using a
> delimiter between the key components. That will lead to smaller keys; if
> you use a tab as your delimiter, that character falls before all other
> alphanumeric and punctuation characters (other than LF, CR, etc. -
> characters that should not be in your IDs), so the keys will sort the
> same as left-padded ones.
>
> I've had good luck with converting sequential numeric IDs to base 64
> and then reversing them - that leads to very good key distribution
> across regions and shorter keys for any given number. Another option -
> if you don't care whether your rowkeys are plaintext - is to convert
> the IDs to binary numbers and then reverse the bytes; that would be the
> most compact. If you do that you would go back to not using delimiters
> and just have fixed offsets for each component.
>
> Once you have a rowkey design you can then go ahead and create your
> tables pre-split with multiple empty regions. That should perform much
> better overall for inserts, especially when the DB is new and empty to
> start.
>
> How did the load with 4 million records perform?
>
> -chris
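A small self-contained sketch of the "convert to base 64 and then
reverse" idea mentioned above (the alphabet, class name and method names
are made up for illustration; they are not from this thread):

public class ReversedId {
    // 64 printable characters listed in ascending byte order.
    private static final char[] ALPHABET =
        "-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz"
            .toCharArray();

    // Encode a non-negative id in base 64, most significant digit first.
    static String encodeBase64(long id) {
        if (id == 0) {
            return String.valueOf(ALPHABET[0]);
        }
        StringBuilder sb = new StringBuilder();
        while (id > 0) {
            sb.append(ALPHABET[(int) (id & 63)]);
            id >>>= 6;
        }
        return sb.reverse().toString();
    }

    // Reverse the digits so consecutive ids spread across the keyspace.
    static String reversedKey(long id) {
        return new StringBuilder(encodeBase64(id)).reverse().toString();
    }

    public static void main(String[] args) {
        for (long id = 1000; id < 1005; id++) {
            System.out.println(id + " -> " + reversedKey(id));
        }
    }
}

Consecutive ids 1000..1004 come out as cE, dE, eE, fE and gE, so inserts
no longer hammer a single region, and the leading character also gives
natural split points when pre-creating regions.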
> On Jul 29, 2011, at 12:36 AM, Steinmaurer Thomas wrote:
>
>> Hi Chris!
>>
>> Your questions are somewhat hard to answer for me, because I'm not
>> really in charge of the test cluster from an administration/setup POV.
>>
>> Basically, when running:
>> http://xxx:60010/master.jsp
>>
>> I see 7 region servers, each with a "maxHeap" value of 995.
>>
>> When clicking on the different tables depending on the compression
>> type, I get the following information:
>>
>> GZ compressed table: 3 regions hosted by one region server.
>> LZO compressed table: 8 regions hosted by two region servers, where
>> the start region is hosted by one region server and all other 7
>> regions are hosted on the second region server.
>>
>> Regarding the insert pattern etc., please have a look at my reply to
>> Chiku, where I describe the test data generator and the table layout
>> a bit.
>>
>> Thanks,
>> Thomas
>>
>> -----Original Message-----
>> From: Christopher Tarnas [mailto:cft@tarnas.org] On Behalf Of Chris Tarnas
>> Sent: Donnerstag, 28. Juli 2011 19:43
>> To: user@hbase.apache.org
>> Subject: Re: GZ better than LZO?
>>
>> During the load did you add enough data to trigger a flush or
>> compaction? In our cluster that amount of data inserted would not
>> necessarily be enough to actually flush store files. Performance
>> really depends on how the table's regions are laid out, the insert
>> pattern, the number of regionservers and the amount of RAM allocated
>> to each regionserver.
>>
>> If you don't see any flushes or compactions in the log, try repeating
>> that test, then flushing the table and doing a compaction (or add more
>> data so it happens automatically) and timing everything. It would be
>> interesting to see if the GZ benefit holds up.
>>
>> -chris
>>
>> On Jul 28, 2011, at 6:31 AM, Steinmaurer Thomas wrote:
>>
>>> Hello,
>>>
>>> we ran a test client generating data into a GZ and an LZO compressed
>>> table. Equal data sets (number of rows: 1008000 and the same table
>>> schema), ~7.78 GB disk space uncompressed in HDFS. LZO is ~887 MB
>>> whereas GZ is ~444 MB, so basically half of LZO.
>>>
>>> Execution time of the data generating client was 1373 seconds into
>>> the uncompressed table, 3374 sec. into LZO and 2198 sec. into GZ. The
>>> data generation client is based on HTablePool and uses batch
>>> operations.
>>>
>>> So in our (simple) test, GZ beats LZO in both disk usage and
>>> execution time of the client. We haven't tried reads yet.
>>>
>>> Is this an expected result? I thought LZO is the recommended
>>> compression algorithm? Or does LZO outperform GZ with a growing
>>> amount of data or in read scenarios?
>>>
>>> Regards,
>>>
>>> Thomas
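For reference, a rough sketch of how the two test tables would be
created with per-family compression using the Java admin API of that
time (table and family names are made up; the Compression class moved
between packages in later HBase versions, and LZO needs the codec
installed on every region server):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.io.hfile.Compression;

public class CreateCompressedTables {
    public static void main(String[] args) throws Exception {
        HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());

        // GZ-compressed table.
        HTableDescriptor gz = new HTableDescriptor("vehicle_data_gz");
        HColumnDescriptor gzFam = new HColumnDescriptor("d");
        gzFam.setCompressionType(Compression.Algorithm.GZ);
        gz.addFamily(gzFam);
        admin.createTable(gz);

        // LZO-compressed table.
        HTableDescriptor lzo = new HTableDescriptor("vehicle_data_lzo");
        HColumnDescriptor lzoFam = new HColumnDescriptor("d");
        lzoFam.setCompressionType(Compression.Algorithm.LZO);
        lzo.addFamily(lzoFam);
        admin.createTable(lzo);

        // As suggested above, flush and major-compact before timing reads so
        // the store files are actually written in the compressed format.
        admin.flush("vehicle_data_gz");
        admin.majorCompact("vehicle_data_gz");
    }
}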