hbase-user mailing list archives

From Andrey Stepachev <oct...@gmail.com>
Subject Re: OT - Hash Code Creation
Date Wed, 16 Mar 2011 22:48:19 GMT
Try a hash table with double hashing.
Something like this
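
A rough sketch of what double hashing could look like in Java. It is illustrative only and is not the code referred to above. The class name DoubleHashingCounter, its methods, and the choice to count occurrences per key are assumptions made for the example. It uses open addressing with a power-of-two capacity and an odd probe step, so the probe sequence can reach every slot, and it keeps the load factor at or below 0.5.

```java
// Illustrative sketch only: the class and method names are hypothetical.
final class DoubleHashingCounter {
    private String[] keys;
    private long[] counts;
    private int size;

    DoubleHashingCounter(int expectedEntries) {
        // Capacity is a power of two, at least twice the expected entry count.
        int capacity = 16;
        while (capacity < expectedEntries * 2L && capacity < (1 << 30)) {
            capacity <<= 1;
        }
        keys = new String[capacity];
        counts = new long[capacity];
    }

    // Bit mixer (the 32-bit MurmurHash3 finalizer) applied on top of
    // String.hashCode() so that similar keys spread across the table.
    private static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Returns the slot holding the key, or the empty slot where it belongs.
    private static int findSlot(String key, String[] table) {
        int mask = table.length - 1;
        int h = mix(key.hashCode());
        int index = h & mask;
        // Second hash: forced odd, so it is coprime with the power-of-two
        // capacity and the probe sequence visits every slot.
        int step = (mix(h + 0x9e3779b9) | 1) & mask;
        while (table[index] != null && !table[index].equals(key)) {
            index = (index + step) & mask;
        }
        return index;
    }

    void increment(String key) {
        // Grow before the table becomes more than half full.
        if ((size + 1) * 2L > keys.length) {
            resize();
        }
        int slot = findSlot(key, keys);
        if (keys[slot] == null) {
            keys[slot] = key;
            size++;
        }
        counts[slot]++;
    }

    long get(String key) {
        int slot = findSlot(key, keys);
        return keys[slot] == null ? 0 : counts[slot];
    }

    int size() {
        return size;
    }

    private void resize() {
        String[] oldKeys = keys;
        long[] oldCounts = counts;
        String[] newKeys = new String[oldKeys.length * 2];
        long[] newCounts = new long[newKeys.length];
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldKeys[i] != null) {
                int slot = findSlot(oldKeys[i], newKeys);
                newKeys[slot] = oldKeys[i];
                newCounts[slot] = oldCounts[i];
            }
        }
        keys = newKeys;
        counts = newCounts;
    }
}
```

Sizing the table up front, for example new DoubleHashingCounter(24000000), avoids repeated rehashing while the logs are read.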

2011/3/17 Peter Haidinyak <phaidinyak@local.com>

> Hi,
>        This is a little off topic but this group seems pretty swift so I
> thought I would ask. I am aggregating a day's worth of log data which means
> I have a Map of over 24 million elements. What would be a good algorithm to
> use for generating Hash Codes for these elements that cut down on
> collisions? My application starts out reading in a log (144 logs in all) in
> about 20 seconds and by the time I reach the last log it is taking around
> 120 seconds. The extra 100 seconds have to do with Hash Table Collisions.
> I've played around with different Hashing algorithms and cut the original
> time from over 300 seconds to 120 but I know I can do better.
> The key I am using for the Map is an alpha-numeric string that is
> approximately 16 characters long with the last 4 or 5 characters being the
> most unique.
> Any ideas?
> Thanks
> -Pete
