incubator-cassandra-user mailing list archives

From Tom <fivemile...@gmail.com>
Subject Re: Why data is not even distributed.
Date Thu, 04 Oct 2012 07:33:17 GMT
Hi Andrey,

While the data values you generated might follow a truly random
distribution, your row key, the UUID, may not (it is created on the same
machines by the same software within a certain window of time).

For example, if you are using the UUID class in Java, the values are
composed of several components (such as the version bits and, for time-based
UUIDs, a timestamp), so you cannot expect a uniform distribution over the
whole key space.
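
If you want to check how your keys actually spread, a small sketch like the
one below may help. It assumes the keys come from UUID.randomUUID() as in
your message; the class name UuidKeySpread and the method
leadingByteHistogram are made up for illustration. It prints the UUID
version and counts how many generated keys fall into each of three roughly
equal slices of the key space, judged by the first byte only.

```java
import java.util.UUID;

public class UuidKeySpread {
    // Counts how many of n random UUIDs fall into each of `buckets` roughly
    // equal slices of the key space, judged by the most significant byte only.
    static int[] leadingByteHistogram(int n, int buckets) {
        int[] counts = new int[buckets];
        for (int i = 0; i < n; i++) {
            UUID uuid = UUID.randomUUID();
            // Most significant byte as an unsigned value in 0..255.
            int first = (int) (uuid.getMostSignificantBits() >>> 56);
            counts[first * buckets / 256]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        UUID sample = UUID.randomUUID();
        System.out.println("version = " + sample.version());
        int[] counts = leadingByteHistogram(100000, 3);
        for (int b = 0; b < counts.length; b++) {
            System.out.println("bucket " + b + ": " + counts[b]);
        }
    }
}
```

If the three counts come out close to each other, the skew is more likely to
come from the ring layout than from the keys themselves.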


Cheers
Tom



On Wed, Oct 3, 2012 at 5:39 PM, Andrey Ilinykh <ailinykh@gmail.com> wrote:

> Hello, everybody!
>
> I'm observing very strange behavior. I have a 3-node cluster with
> ByteOrderedPartitioner. (I run 1.1.5.)
> I created a keyspace with a replication factor of 1.
> Then I created one column family and populated it with random data.
> I use UUID as a row key, and Integer as a column name.
> Row keys were generated as
>
> UUID uuid = UUID.randomUUID();
>
> I populated about 100000 rows with 100 columns each.
>
> I would expect equal load on each node, but the result is totally
> different. This is what nodetool gives me:
>
> Address    DC           Rack   Status  State   Load       Effective-Ownership  Token
>                                                                                Token(bytes[56713727820156410577229101238628035242])
> 127.0.0.1  datacenter1  rack1  Up      Normal  27.61 MB   33.33%               Token(bytes[00])
> 127.0.0.3  datacenter1  rack1  Up      Normal  206.47 KB  33.33%               Token(bytes[0113427455640312821154458202477256070485])
> 127.0.0.2  datacenter1  rack1  Up      Normal  13.86 MB   33.33%               Token(bytes[56713727820156410577229101238628035242])
>
>
> One node (127.0.0.3) is almost empty.
> Any ideas what is wrong?
>
>
> Thank you,
>   Andrey
>
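
As a rough cross-check of the ring layout quoted above, the sketch below
estimates what share of uniformly random keys would land on each node. It is
only an approximation under stated assumptions: the tokens printed by
nodetool are treated as hex byte strings that are compared byte by byte,
each node is assumed to own the range from the previous token (exclusive) up
to its own token (inclusive), and only the first two bytes of each token are
used. The class RingShare and its methods are hypothetical names, not part
of Cassandra.

```java
public class RingShare {
    // Reads the first two hex bytes of a token as an unsigned 16-bit prefix.
    // A token shorter than two bytes is padded with zero digits on the right.
    static int prefix16(String hexToken) {
        String padded = (hexToken + "0000").substring(0, 4);
        return Integer.parseInt(padded, 16);
    }

    // Approximate share of uniformly random keys in the range (prev, token],
    // wrapping around the ring when the token does not exceed prev.
    static double share(String prevHex, String tokenHex) {
        int width = prefix16(tokenHex) - prefix16(prevHex);
        if (width <= 0) {
            width += 65536; // range wraps past the end of the key space
        }
        return width / 65536.0;
    }

    public static void main(String[] args) {
        // Tokens copied from the nodetool output, in ring order.
        String t1 = "00";                                       // 127.0.0.1
        String t3 = "0113427455640312821154458202477256070485"; // 127.0.0.3
        String t2 = "56713727820156410577229101238628035242";   // 127.0.0.2
        System.out.printf("127.0.0.1: %.2f%%%n", 100 * share(t2, t1));
        System.out.printf("127.0.0.3: %.2f%%%n", 100 * share(t1, t3));
        System.out.printf("127.0.0.2: %.2f%%%n", 100 * share(t3, t2));
    }
}
```

Under those assumptions the estimate comes out at roughly 66%, 0.4%, and 33%
for 127.0.0.1, 127.0.0.3, and 127.0.0.2, which can be compared against the
Load column in the quoted output.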
