cassandra-user mailing list archives

From Huiliang Zhang <zhl...@gmail.com>
Subject Re: cqlinputformat and retired cqlpagingingputformat creates lots of connections to query the server
Date Wed, 28 Jan 2015 07:21:59 GMT
In that case, each node will have 256/3 connections at most. Still 256
mappers. Someone please correct me if I am wrong.
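
To make the arithmetic in this thread concrete, here is a minimal Java sketch. It is not taken from the Cassandra or Hadoop code: the class `SplitMath` and its methods are hypothetical names used only for illustration. It assumes that each token range becomes exactly one split, that each mapper holds one connection, and that the extra split seen in the log below (257 for a single 256-token node) also appears on larger clusters. Under those assumptions, the number of splits grows with nodes times tokens, while the number of connections open at once is bounded by how many mappers run concurrently.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration only; not part of Cassandra or Hadoop.
final class SplitMath {

    // Assumption: one split per token range. The "+ 1" mirrors the 257 splits
    // reported for a single node with 256 tokens; applying it to larger
    // clusters is an extrapolation.
    static int totalSplits(int nodes, int tokensPerNode) {
        return nodes * tokensPerNode + 1;
    }

    // Assumption: one connection per running mapper, so the number of
    // simultaneous connections cannot exceed the number of concurrent mappers.
    static int peakConnections(int splits, int concurrentMappers) {
        return Math.min(splits, concurrentMappers);
    }

    // Simulates mappers with a fixed-size thread pool and records the highest
    // number of "connections" (tasks in flight) observed at any moment.
    static int simulatePeak(int splits, int concurrentMappers) throws InterruptedException {
        AtomicInteger open = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(concurrentMappers);
        for (int i = 0; i < splits; i++) {
            pool.submit(() -> {
                int now = open.incrementAndGet();       // "connection" opened
                peak.accumulateAndGet(now, Math::max);  // remember the maximum
                try {
                    Thread.sleep(1);                    // stand-in for reading the range
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                open.decrementAndGet();                 // "connection" closed
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        int oneNode = totalSplits(1, 256);    // 257, as in the log
        int threeNodes = totalSplits(3, 256); // 769 under the assumptions above
        System.out.println("splits, 1 node:  " + oneNode);
        System.out.println("splits, 3 nodes: " + threeNodes);
        System.out.println("bound with 8 concurrent mappers: " + peakConnections(threeNodes, 8));
        System.out.println("simulated peak with 8 mappers:   " + simulatePeak(oneNode, 8));
    }
}
```

In this model the total number of connections over the life of the job still equals the number of splits; only the number open at the same time is limited by the mapper slots available.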

On Tue, Jan 27, 2015 at 11:04 PM, Shenghua(Daniel) Wan <
wanshenghua@gmail.com> wrote:

> Hi, Huiliang,
> Great to hear from you, again!
> Imagine you have 3 nodes, replication factor=1, and the default number of
> tokens. You will have 3*256 mappers... In that case, you will soon run out
> of mappers or reach the limit.
>
>
> On Tue, Jan 27, 2015 at 10:59 PM, Huiliang Zhang <zhlntu@gmail.com> wrote:
>
>> Hi Shenghua, as I understand it, each range is assigned to a mapper. Mappers
>> do not share connections, so it takes at least 256 connections to read
>> everything. But all 256 connections should not be open at the same time unless
>> you have 256 mappers running at the same time.
>>
>> On Tue, Jan 27, 2015 at 9:34 PM, Shenghua(Daniel) Wan <
>> wanshenghua@gmail.com> wrote:
>>
>>> By default, each C* node is set with 256 tokens. On a local 1-node C*
>>> server, my Hadoop job creates 256 connections to the server. Is there any
>>> way to control this behavior, e.g. to reduce the number of connections to a
>>> pre-configured cap?
>>>
>>> I debugged the C* source code and found that the client asks for partition
>>> ranges, or virtual nodes. The server then told the client there were 257
>>> ranges, corresponding to 257 column family splits.
>>>
>>> Here is a snippet of my logs:
>>>
>>> 15/01/27 18:02:20 DEBUG hadoop.AbstractColumnFamilyInputFormat: adding
>>> ColumnFamilySplit((9121856086738887846, '-9223372036854775808] @[localhost])
>>> ...
>>> 257 splits in total.
>>>
>>> The problem is that the user might only want all the data via a "select *"
>>> like statement. It seems that 257 connections are necessary to query the
>>> rows. However, is there any way to avoid 257 concurrent
>>> connections?
>>>
>>> My C* version is 2.0.11 and I also tried CqlPagingInputFormat, which has
>>> the same behavior.
>>>
>>> Thank you.
>>>
>>> --
>>>
>>> Regards,
>>> Shenghua (Daniel) Wan
>>>
>>
>>
>
>
> --
>
> Regards,
> Shenghua (Daniel) Wan
>
