cassandra-user mailing list archives

From Kant Kodali <k...@peernova.com>
Subject Re: Why does Cassandra need to have 2B column limit? why can't we have unlimited ?
Date Sat, 15 Oct 2016 10:00:04 GMT
1) It would be great if someone could confirm that there is no limit.
2) So what is the optimal limit in terms of data size?

Finally, thanks a lot for pointing out all the operational issues!
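
As a rough sanity check on the partition-size arithmetic quoted further down in this thread (a 15GB partition holding 1KB columns), here is a small Python sketch. The 1KB column size comes from the thread; using binary units (1KB = 1024 bytes) and ignoring per-cell storage overhead are assumptions made here purely for illustration.

```python
KB = 1024
GB = 1024 ** 3

partition_bytes = 15 * GB
column_bytes = 1 * KB  # assumed payload per column; ignores per-cell overhead

# How many 1KB columns fit in one 15GB partition
columns = partition_bytes // column_bytes
print(f"{columns:,} columns in a 15GB partition")

# For comparison: raw payload size if a partition really held 2B such columns
two_billion = 2_000_000_000
size_at_limit_gb = two_billion * column_bytes / GB
print(f"about {size_at_limit_gb:,.0f} GB at 2B columns")
```

Under these assumptions, a partition would reach a size that is painful to compact or stream long before it reached 2B columns.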

On Sat, Oct 15, 2016 at 2:39 AM, DuyHai Doan <doanduyhai@gmail.com> wrote:

> "But is there still 2B columns limit on the Cassandra code?"
>
> --> I remember one of the committers saying that this 2B columns
> limitation comes from the Thrift era, where you were limited to a max of 2B
> columns returned to the client for each request. It also applies to
> the max size of each "page" of data.
>
> Since the introduction of the binary protocol and the paging feature, this
> limitation does not make sense anymore.
>
> By the way, if your partition is too wide, you'll face other operational
> issues way before reaching the 2B columns limit:
>
> - compaction taking looooong time --> heap pressure --> long GC pauses -->
> nodes flapping
> - repair & over-streaming, repair session failure in the middle that
> forces you to re-send the whole big partition --> the receiving node has a
> bunch of duplicate data --> pressure on compaction
> - bootstrapping of new nodes --> a failure to stream a partition in the
> middle forces you to re-send the whole partition from the beginning -->
> the receiving node has a bunch of duplicate data --> pressure on compaction
>
>
>
> On Sat, Oct 15, 2016 at 9:15 AM, Kant Kodali <kant@peernova.com> wrote:
>
>> Compacting 10 sstables, each of which has a 15GB partition, in what
>> duration?
>>
>> On Fri, Oct 14, 2016 at 11:45 PM, Matope Ono <matope.ono@gmail.com>
>> wrote:
>>
>>> Please disregard that part of my sentence.
>>> More correctly, maybe I should have said "He could compact 10
>>> sstables, each of which has a 15GB partition".
>>> What I wanted to say is that we can store many more rows (and columns) in a
>>> partition than before 3.6.
>>>
>>> 2016-10-15 15:34 GMT+09:00 Kant Kodali <kant@peernova.com>:
>>>
>>>> "Robert said he could treat safely 10 15GB partitions at his
>>>> presentation" This sounds like there is a row limit too, not
>>>> only columns??
>>>>
>>>> If I am reading this correctly, 10 15GB partitions means 10 partitions
>>>> (like 10 row keys, that's too small) with each partition of size 15GB
>>>> (that's like 15 million columns where each column can hold 1KB of
>>>> data).
>>>>
>>>> On Fri, Oct 14, 2016 at 11:30 PM, Kant Kodali <kant@peernova.com>
>>>> wrote:
>>>>
>>>>> "Robert said he could treat safely 10 15GB partitions at his
>>>>> presentation" This sounds like there is a row limit too, not
>>>>> only columns??
>>>>>
>>>>> If I am reading this correctly, 10 15GB partitions means 10 partitions
>>>>> (like 10 row keys, that's too small) with each partition of size 15GB
>>>>> (that's like 10 million columns where each column can hold 1KB of
>>>>> data).
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Oct 14, 2016 at 9:54 PM, Matope Ono <matope.ono@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thanks to CASSANDRA-11206, I think we can have much larger partitions
>>>>>> than before 3.6.
>>>>>> (Robert said he could treat safely 10 15GB partitions at his
>>>>>> presentation. https://www.youtube.com/watch?v=N3mGxgnUiRY)
>>>>>>
>>>>>> But is there still 2B columns limit on the Cassandra code?
>>>>>> If so, out of curiosity, I'd like to know where the bottleneck is.
>>>>>> Could anyone let me know about it?
>>>>>>
>>>>>> Thanks Yasuharu.
>>>>>>
>>>>>>
>>>>>> 2016-10-13 1:11 GMT+09:00 Edward Capriolo <edlinuxguru@gmail.com>:
>>>>>>
>>>>>>> The "2 billion column limit" is press-clipping "puffery". This
>>>>>>> statement seemingly became popular because of a highly trafficked
>>>>>>> story, in which a tech reporter embellished a statement to make a
>>>>>>> splashy article.
>>>>>>>
>>>>>>> The effect is something like this:
>>>>>>> http://www.healthnewsreview.org/2012/08/iced-tea-kidney-stones-and-the-study-that-never-existed/
>>>>>>>
>>>>>>> Iced tea does not cause kidney stones! Cassandra does not store
>>>>>>> rows with 2 billion columns! It is just not true.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Oct 12, 2016 at 4:57 AM, Kant Kodali <kant@peernova.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Well, 1) I have not sent it to the PostgreSQL mailing lists, and 2) I
>>>>>>>> thought this was an open-ended question, as it can involve ideas from
>>>>>>>> everywhere, including the Cassandra Java driver mailing lists. Sorry
>>>>>>>> if that bothered you for some reason.
>>>>>>>>
>>>>>>>> On Wed, Oct 12, 2016 at 1:41 AM, Dorian Hoxha <
>>>>>>>> dorian.hoxha@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Also, I'm not sure, but I don't think it's "cool" to write to
>>>>>>>>> multiple lists in the same message (based on the PostgreSQL
>>>>>>>>> mailing list rules).
>>>>>>>>> For example, I'm not subscribed to those, and now the messages are
>>>>>>>>> separated.
>>>>>>>>>
>>>>>>>>> On Wed, Oct 12, 2016 at 10:37 AM, Dorian Hoxha <
>>>>>>>>> dorian.hoxha@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> There are some issues working with larger partitions.
>>>>>>>>>> HBase doesn't do what you say! You also have to be careful in
>>>>>>>>>> HBase not to create large rows! But since they are globally sorted,
>>>>>>>>>> you can easily sort between them and create small rows.
>>>>>>>>>>
>>>>>>>>>> In my opinion, Cassandra people are wrong when they say
>>>>>>>>>> "globally sorted is the devil!" while all fb/google/etc actually use
>>>>>>>>>> globally sorted most of the time! You have to be careful though (just
>>>>>>>>>> like with random partitioning).
>>>>>>>>>>
>>>>>>>>>> Can you tell what rowkey1, page1, col(x) actually are? Maybe
>>>>>>>>>> there is a way.
>>>>>>>>>> The most "recent" means there's a timestamp in there?
>>>>>>>>>> On Wed, Oct 12, 2016 at 9:58 AM, Kant Kodali <kant@peernova.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi All,
>>>>>>>>>>>
>>>>>>>>>>> I understand Cassandra can have a maximum of 2B rows per
>>>>>>>>>>> partition, but in practice some people seem to suggest the magic
>>>>>>>>>>> number is 100K. Why not create another partition/rowkey
>>>>>>>>>>> automatically (whenever we reach a safe limit that we consider
>>>>>>>>>>> efficient), with an auto-increment bigint as a suffix appended to
>>>>>>>>>>> the new rowkey, so that the driver can return the new rowkey
>>>>>>>>>>> indicating that there is a new partition, and so on? Now, I
>>>>>>>>>>> understand this would involve allowing partial row key searches,
>>>>>>>>>>> which Cassandra currently doesn't do (but I believe HBase does),
>>>>>>>>>>> and thinking about token ranges and potentially many other things.
>>>>>>>>>>>
>>>>>>>>>>> My current problem is this:
>>>>>>>>>>>
>>>>>>>>>>> I have a row key followed by a bunch of columns (this is not time
>>>>>>>>>>> series data), and these columns can grow to any number. Since I
>>>>>>>>>>> have a 100K limit (or whatever the number is; say some limit), I
>>>>>>>>>>> want to break the partition into levels/pages:
>>>>>>>>>>>
>>>>>>>>>>> rowkey1, page1->col1, col2, col3......
>>>>>>>>>>> rowkey1, page2->col1, col2, col3......
>>>>>>>>>>>
>>>>>>>>>>> Now say my Cassandra db is populated with data, my application has
>>>>>>>>>>> just booted up, and I want the most recent value of a certain
>>>>>>>>>>> partition, but I don't know which page it belongs to. How do I
>>>>>>>>>>> solve this in the most efficient way that is possible in Cassandra
>>>>>>>>>>> today? I understand I can create MVs or other tables that hold
>>>>>>>>>>> auxiliary data such as the number of pages per partition, but that
>>>>>>>>>>> involves the maintenance cost of that other table, which I cannot
>>>>>>>>>>> really afford because I already have MVs and secondary indexes for
>>>>>>>>>>> other good reasons. So it would be great if someone could explain
>>>>>>>>>>> the best way possible as of today with Cassandra. By best way I
>>>>>>>>>>> mean: is it possible with one request? If yes, then how? If not,
>>>>>>>>>>> then what is the next best way to solve this?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> kant
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
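
The "rowkey, page" layout described in the original question at the bottom of this thread can be sketched in plain Python. This is only an in-memory illustration, not a Cassandra client and not an approach anyone in the thread proposed: the dict stands in for a table whose partition key would be the pair (rowkey, page), `PAGE_SIZE` stands in for the "safe limit", and `append`, `page_exists`, and `latest_page` are hypothetical helper names. It also assumes pages are filled strictly in order with no gaps, so the latest page can be found by probing page numbers (doubling, then binary search) rather than by keeping an auxiliary table.

```python
from collections import defaultdict

PAGE_SIZE = 3  # hypothetical "safe limit" per partition; tiny for demonstration

# In-memory stand-in for a table whose partition key is (rowkey, page).
table = defaultdict(list)


def append(rowkey, value, count):
    """Store the count-th value (0-based) of rowkey in the page it falls into."""
    page = count // PAGE_SIZE + 1  # pages are numbered from 1, as in the thread
    table[(rowkey, page)].append(value)
    return page


def page_exists(rowkey, page):
    """Stand-in for a single-partition read that checks whether a page has data."""
    return (rowkey, page) in table


def latest_page(rowkey):
    """Find the highest non-empty page, assuming pages fill in order with no gaps."""
    if not page_exists(rowkey, 1):
        return None
    lo, hi = 1, 2
    while page_exists(rowkey, hi):  # double until the probe misses
        lo, hi = hi, hi * 2
    while hi - lo > 1:  # binary search between the last hit and the first miss
        mid = (lo + hi) // 2
        if page_exists(rowkey, mid):
            lo = mid
        else:
            hi = mid
    return lo


for i in range(10):
    append("rowkey1", f"col{i + 1}", i)

print(latest_page("rowkey1"))
```

In this sketch the lookup is not a single request: it costs a number of probes that grows logarithmically with the page count, which is the trade-off against maintaining a separate table of page counts.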
