cassandra-user mailing list archives

From Matope Ono <matope....@gmail.com>
Subject Re: Why does Cassandra need to have 2B column limit? why can't we have unlimited ?
Date Sat, 15 Oct 2016 15:13:35 GMT
Thank you DuyHai.
I was in two minds about large partitions for my app.
I thought upgrading to 3.x would be a good and easy option. But now I'm going
to work on refactoring my data model :)
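
One common direction for such a refactoring is the paging idea from the original question quoted below: make (rowkey, page) the partition key so that no single partition grows without bound. The following is only a minimal in-memory sketch, not something proposed in this thread. `PagedStore`, `page_for`, `latest_page` and `PAGE_SIZE` are hypothetical names, the dict stands in for a real table, and `page_exists` stands in for one cheap single-partition read. Finding the newest page here takes a logarithmic number of probes, so it does not achieve the single request asked for in the original question.

```python
from collections import defaultdict

PAGE_SIZE = 100_000  # hypothetical cap on columns per (rowkey, page) partition


def page_for(seq: int, page_size: int = PAGE_SIZE) -> int:
    """Map a 0-based per-rowkey sequence number to a 1-based page number."""
    if seq < 0:
        raise ValueError("seq must be non-negative")
    return seq // page_size + 1


class PagedStore:
    """In-memory stand-in for a table whose partition key is (rowkey, page)."""

    def __init__(self, page_size: int = PAGE_SIZE):
        self.page_size = page_size
        self.partitions = defaultdict(list)  # (rowkey, page) -> list of values

    def append(self, rowkey: str, seq: int, value: str) -> None:
        """Store a value in the page that its sequence number maps to."""
        self.partitions[(rowkey, page_for(seq, self.page_size))].append(value)

    def page_exists(self, rowkey: str, page: int) -> bool:
        # Assumption: in a real cluster this would be one single-partition read.
        return (rowkey, page) in self.partitions

    def latest_page(self, rowkey: str) -> int:
        """Find the highest existing page by doubling, then binary search.

        Returns 0 if the rowkey has no pages. Assumes pages are filled
        contiguously starting at page 1.
        """
        if not self.page_exists(rowkey, 1):
            return 0
        lo, hi = 1, 2
        while self.page_exists(rowkey, hi):
            lo, hi = hi, hi * 2
        # Invariant: page lo exists, page hi does not.
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if self.page_exists(rowkey, mid):
                lo = mid
            else:
                hi = mid
        return lo


# Usage: 8 values with 3 per page land in pages 1, 2 and 3.
store = PagedStore(page_size=3)
for seq in range(8):
    store.append("rowkey1", seq, f"col{seq + 1}")
latest = store.latest_page("rowkey1")
most_recent_value = store.partitions[("rowkey1", latest)][-1]
```

The sequence number has to come from somewhere (a counter, a timestamp bucket, and so on); the sketch leaves that choice open because the thread does not settle it.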

2016-10-15 20:38 GMT+09:00 DuyHai Doan <doanduyhai@gmail.com>:

> Yes, more or less. The 100Mb is a rule of thumb. No one will blame you for
> storing 200Mb for example. The figure is just given as an example of order
> of magnitude
>
> On Sat, Oct 15, 2016 at 1:37 PM, Kant Kodali <kant@peernova.com> wrote:
>
>> you mean 100MB (MegaBytes)? Also the data in each of my columns is about
>> 1KB so in that case the optimal size is 100K columns (since 100K * 1KB =
>> 100MB) right?
>>
>> On Sat, Oct 15, 2016 at 4:26 AM, DuyHai Doan <doanduyhai@gmail.com>
>> wrote:
>>
>>> "2) so what is optimal limit in terms of data size?"
>>>
>>> --> Usual recommendations for Cassandra 2.1 are:
>>>
>>> a. max 100Mb per partition size
>>> b. or up to 10 000 000 physical columns for a partition (including
>>> clustering columns etc ...)
>>>
>>> Recently, with the work of Robert Stupp (CASSANDRA-11206) and also with
>>> the huge enhancement from Michael Kjellman (CASSANDRA-9754) it will be
>>> easier to handle huge partitions in memory, especially with a reduced memory
>>> footprint with regards to the JVM heap.
>>>
>>> However, as long as we don't have repair and streaming processes that
>>> can be "resumed" in the middle of a partition, the operational pains will
>>> still be there. Same for compaction
>>>
>>>
>>>
>>> On Sat, Oct 15, 2016 at 12:00 PM, Kant Kodali <kant@peernova.com> wrote:
>>>
>>>> 1) It will be great if someone can confirm that there is no limit
>>>> 2) so what is optimal limit in terms of data size?
>>>>
>>>> Finally, Thanks a lot for pointing out all the operational issues!
>>>>
>>>> On Sat, Oct 15, 2016 at 2:39 AM, DuyHai Doan <doanduyhai@gmail.com>
>>>> wrote:
>>>>
>>>>> "But is there still 2B columns limit on the Cassandra code?"
>>>>>
>>>>> --> I remember one of the committers saying that this 2B columns
>>>>> limitation comes from the Thrift era where you're limited to max 2B
>>>>> columns to be returned to the client for each request. It also applies to
>>>>> the max size of each "page" of data
>>>>>
>>>>> Since the introduction of the binary protocol and the paging feature,
>>>>> this limitation does not make sense anymore.
>>>>>
>>>>> By the way, if your partition is too wide, you'll face other
>>>>> operational issues way before reaching the 2B columns limit:
>>>>>
>>>>> - compaction taking looooong time --> heap pressure --> long GC pauses
>>>>> --> nodes flapping
>>>>> - repair & over-streaming, repair session failure in the middle that
>>>>> forces you to re-send the whole big partition --> the receiving node has a
>>>>> bunch of duplicate data --> pressure on compaction
>>>>> - bootstrapping of new nodes --> failure to stream a partition in the
>>>>> middle will force to re-send the whole partition from the beginning again -->
>>>>> the receiving node has a bunch of duplicate data --> pressure on compaction
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Oct 15, 2016 at 9:15 AM, Kant Kodali <kant@peernova.com>
>>>>> wrote:
>>>>>
>>>>>>  compacting 10 sstables each of them have a 15GB partition in what
>>>>>> duration?
>>>>>>
>>>>>> On Fri, Oct 14, 2016 at 11:45 PM, Matope Ono <matope.ono@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Please forget that part of my sentence.
>>>>>>> To be more correct, maybe I should have said "He could compact
>>>>>>> 10 sstables, each of which has a 15GB partition".
>>>>>>> What I wanted to say is that we can store many more rows (and columns) in
>>>>>>> a partition than before 3.6.
>>>>>>>
>>>>>>> 2016-10-15 15:34 GMT+09:00 Kant Kodali <kant@peernova.com>:
>>>>>>>
>>>>>>>> "Robert said he could treat safely 10 15GB partitions at his
>>>>>>>> presentation" This sounds like there is a row limit too,
>>>>>>>> not only columns??
>>>>>>>>
>>>>>>>> If I am reading this correctly 10 15GB partitions means 10
>>>>>>>> partitions (like 10 row keys, that's too small) with each partition of size
>>>>>>>> 15GB. (that's like 15 million columns where each column can have a data of
>>>>>>>> size 1KB).
>>>>>>>>
>>>>>>>> On Fri, Oct 14, 2016 at 11:30 PM, Kant Kodali <kant@peernova.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> "Robert said he could treat safely 10 15GB partitions at his
>>>>>>>>> presentation" This sounds like there is a row limit too,
>>>>>>>>> not only columns??
>>>>>>>>>
>>>>>>>>> If I am reading this correctly 10 15GB partitions means 10
>>>>>>>>> partitions (like 10 row keys, that's too small) with each partition of size
>>>>>>>>> 15GB. (that's like 10 million columns where each column can have a data of
>>>>>>>>> size 1KB).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Oct 14, 2016 at 9:54 PM, Matope Ono <matope.ono@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks to CASSANDRA-11206, I think we can have much larger
>>>>>>>>>> partitions than before 3.6.
>>>>>>>>>> (Robert said he could treat safely 10 15GB partitions at his
>>>>>>>>>> presentation. https://www.youtube.com/watch?v=N3mGxgnUiRY)
>>>>>>>>>>
>>>>>>>>>> But is there still 2B columns limit on the Cassandra code?
>>>>>>>>>> If so, out of curiosity, I'd like to know where the bottleneck
>>>>>>>>>> is. Could anyone let me know about it?
>>>>>>>>>>
>>>>>>>>>> Thanks Yasuharu.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2016-10-13 1:11 GMT+09:00 Edward Capriolo <edlinuxguru@gmail.com>:
>>>>>>>>>>
>>>>>>>>>>> The "2 billion column limit" is press clipping "puffery". This
>>>>>>>>>>> statement seemingly became popular because of a highly trafficked story,
>>>>>>>>>>> in which a tech reporter embellished on a statement to make a splashy
>>>>>>>>>>> article.
>>>>>>>>>>>
>>>>>>>>>>> The effect is something like this:
>>>>>>>>>>> http://www.healthnewsreview.org/2012/08/iced-tea-kidney-stones-and-the-study-that-never-existed/
>>>>>>>>>>>
>>>>>>>>>>> Iced tea does not cause kidney stones! Cassandra does not store
>>>>>>>>>>> rows with 2 billion columns! It is just not true.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Oct 12, 2016 at 4:57 AM, Kant Kodali <kant@peernova.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Well 1) I have not sent it to postgresql mailing lists 2) I
>>>>>>>>>>>> thought this is an open ended question as it can involve ideas from
>>>>>>>>>>>> everywhere including the Cassandra java driver mailing lists so sorry if
>>>>>>>>>>>> that bothered you for some reason.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Oct 12, 2016 at 1:41 AM, Dorian Hoxha <dorian.hoxha@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Also, I'm not sure, but I don't think it's "cool" to write to
>>>>>>>>>>>>> multiple lists in the same message. (based on postgresql mailing lists
>>>>>>>>>>>>> rules).
>>>>>>>>>>>>> For example, I'm not subscribed to those, and now the messages are
>>>>>>>>>>>>> separated.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Oct 12, 2016 at 10:37 AM, Dorian Hoxha <dorian.hoxha@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> There are some issues working on larger partitions.
>>>>>>>>>>>>>> Hbase doesn't do what you say! You have also to be careful
>>>>>>>>>>>>>> on hbase not to create large rows! But since they are globally-sorted, you
>>>>>>>>>>>>>> can easily sort between them and create small rows.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In my opinion, cassandra people are wrong, in that they say
>>>>>>>>>>>>>> "globally sorted is the devil!" while all fb/google/etc actually use
>>>>>>>>>>>>>> globally-sorted most of the time! You have to be careful though (just like
>>>>>>>>>>>>>> with random partition)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can you tell what rowkey1, page1, col(x) actually are ? Maybe
>>>>>>>>>>>>>> there is a way.
>>>>>>>>>>>>>> The most "recent", means there's a timestamp in there ?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Oct 12, 2016 at 9:58 AM, Kant Kodali <kant@peernova.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I understand Cassandra can have a maximum of 2B rows per
>>>>>>>>>>>>>>> partition but in practice some people seem to suggest the magic number is
>>>>>>>>>>>>>>> 100K. why not create another partition/rowkey automatically (whenever we
>>>>>>>>>>>>>>> reach a safe limit that we consider would be efficient) with auto
>>>>>>>>>>>>>>> increment bigint as a suffix appended to the new rowkey? so that the
>>>>>>>>>>>>>>> driver can return the new rowkey indicating that there is a new partition
>>>>>>>>>>>>>>> and so on...Now I understand this would involve allowing partial row key
>>>>>>>>>>>>>>> searches which currently Cassandra wouldn't do (but I believe HBASE does)
>>>>>>>>>>>>>>> and thinking about token ranges and potentially many other things..
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> My current problem is this
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have a row key followed by bunch of columns (this is not
>>>>>>>>>>>>>>> time series data)
>>>>>>>>>>>>>>> and these columns can grow to any number so since I have
>>>>>>>>>>>>>>> 100K limit (or whatever the number is. say some limit) I want to break the
>>>>>>>>>>>>>>> partition into level/pages
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> rowkey1, page1->col1, col2, col3......
>>>>>>>>>>>>>>> rowkey1, page2->col1, col2, col3......
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> now say my Cassandra db is populated with data and say my
>>>>>>>>>>>>>>> application just got booted up and I want the most recent value of a certain
>>>>>>>>>>>>>>> partition but I don't know which page it belongs to since my application
>>>>>>>>>>>>>>> just got booted up? how do I solve this in the most efficient way that is
>>>>>>>>>>>>>>> possible in Cassandra today? I understand I can create MV, other tables
>>>>>>>>>>>>>>> that can hold some auxiliary data such as number of pages per partition and
>>>>>>>>>>>>>>> so on..but that involves the maintenance cost of that other table which I
>>>>>>>>>>>>>>> cannot afford really because I have MV's, secondary indexes for other good
>>>>>>>>>>>>>>> reasons. so it would be great if someone can explain the best way possible
>>>>>>>>>>>>>>> as of today with Cassandra? By best way I mean is it possible with one
>>>>>>>>>>>>>>> request? If Yes, then how? If not, then what is the next best way to solve
>>>>>>>>>>>>>>> this?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> kant
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
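
The sizing arithmetic used in the thread above (100K columns * 1KB = 100MB, and a 15GB partition as roughly 15 million 1KB columns) can be checked with a small sketch. The function name is hypothetical, and decimal units (1KB = 1000 bytes) are assumed because that is what makes the thread's round numbers work out; real on-disk sizes also include per-cell overhead that is ignored here.

```python
KB = 1000          # decimal units, assumed to match the thread's round numbers
MB = 1000 * KB
GB = 1000 * MB


def max_cells_per_partition(target_partition_bytes: int, avg_cell_bytes: int) -> int:
    """Rough number of cells that fit under a target partition size.

    Ignores per-cell and per-row storage overhead, so treat the result as
    an order of magnitude rather than a hard limit.
    """
    if avg_cell_bytes <= 0:
        raise ValueError("avg_cell_bytes must be positive")
    return target_partition_bytes // avg_cell_bytes


# The 100MB rule of thumb with ~1KB per column.
cells_at_100mb = max_cells_per_partition(100 * MB, 1 * KB)   # 100_000

# A 15GB partition with ~1KB per column.
cells_at_15gb = max_cells_per_partition(15 * GB, 1 * KB)     # 15_000_000
```

As noted in the thread, the 100MB figure is an order of magnitude, not a hard cutoff, so the result is best read the same way.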
