cassandra-commits mailing list archives

From Bartłomiej Romański (JIRA) <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-6621) STCS fallback is probably not optimal in some scenarios
Date Sun, 26 Jan 2014 16:26:40 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bartłomiej Romański updated CASSANDRA-6621:
-------------------------------------------

    Description: 
The initial discussion started in (closed) CASSANDRA-5371. I've rewritten my last comment
here...

After streaming (e.g. during bootstrap) Cassandra places all sstables at L0. At the end of
the process we end up with a huge number of sstables at the lowest level.

Currently, Cassandra falls back to STCS until the number of sstables at L0 drops back to a
reasonable level (32 or something).
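
To make the rule concrete, here is a minimal sketch of the fallback decision as I understand
it. The threshold of 32 is just the "32 or something" figure from above, not a value checked
against the Cassandra source, and the names `MAX_COMPACTING_L0` and `pick_l0_strategy` are
hypothetical.

```python
# Assumed threshold; the real value and its name in Cassandra may differ.
MAX_COMPACTING_L0 = 32


def pick_l0_strategy(l0_sstable_count: int, threshold: int = MAX_COMPACTING_L0) -> str:
    """Return the strategy this simplified model applies to L0.

    Above the threshold the model falls back to STCS; otherwise it
    lets LCS handle L0 as usual.
    """
    return "STCS" if l0_sstable_count > threshold else "LCS"


if __name__ == "__main__":
    # Right after a bootstrap L0 may hold thousands of streamed sstables.
    print(pick_l0_strategy(5000))
    # Once L0 has been worked down, LCS takes over again.
    print(pick_l0_strategy(10))
```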

I'm not sure if falling back to STCS is the best way to handle this particular situation.
I've read the comment in the code and I understand why it is a good thing to do if we have too
many sstables at L0 as a result of too many random inserts. We have a lot of sstables and each
of them covers the whole ring, so there's simply no better option.

However, after a bootstrap the situation looks a bit different. The loaded sstables already
have very small ranges! We just have to tidy up a bit and everything should be OK. STCS ignores
that completely, and after a while we have somewhat fewer sstables, but each of them covers the
whole ring instead of just a small part. I believe that in this case letting LCS do the job
is a better option than allowing STCS to mix everything up first.

Is there a way to disable STCS fallback? I'd like to test that scenario in practice during
our next bootstrap...

Does Cassandra really have to put streamed sstables at L0? The only thing we have to ensure
is that sstables at any given level do not overlap. If we stream different regions from different
nodes, how can we get any overlaps?
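
The non-overlap condition can be sketched as a simple check over token ranges. This is only
an illustration under simplifying assumptions: each sstable is modelled as an inclusive
(first_token, last_token) pair of integers, ring wraparound is ignored, and the helper names
`has_overlap` and `merged_range` are hypothetical, not Cassandra APIs. The second helper shows
why compacting many narrow sstables together produces one wide sstable.

```python
from typing import List, Tuple

# Simplified model of an sstable: inclusive (first_token, last_token).
TokenRange = Tuple[int, int]


def has_overlap(ranges: List[TokenRange]) -> bool:
    """Return True if any two inclusive ranges share at least one token."""
    ordered = sorted(ranges)
    # After sorting by start token, any overlapping pair implies that
    # some adjacent pair overlaps, so checking neighbours is enough.
    return any(prev[1] >= cur[0] for prev, cur in zip(ordered, ordered[1:]))


def merged_range(ranges: List[TokenRange]) -> TokenRange:
    """Range covered by a single sstable built from all the inputs."""
    return (min(r[0] for r in ranges), max(r[1] for r in ranges))


if __name__ == "__main__":
    # Streamed sstables covering disjoint regions.
    streamed = [(0, 99), (100, 199), (200, 299)]
    print(has_overlap(streamed))
    # Sstables produced by random inserts, each spanning most of the ring.
    random_writes = [(0, 299), (5, 290)]
    print(has_overlap(random_writes))
    # Compacting the narrow streamed sstables together yields one wide range.
    print(merged_range(streamed))
```

In this model the streamed sstables could sit side by side in a level without overlapping,
while a single size-tiered merge of them covers the whole span.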



  was:
The initial discussion started in (closed) CASSANDRA-5371. I've rewritten my last comment
here...

After streaming (e.g. during bootstrap) Cassandra places all sstables at L0. At the end of
the process we end up with a huge number of sstables at the lowest level.

Currently, Cassandra falls back to STCS until the number of sstables at L0 drops back to a
reasonable level (32 or something).

I'm not sure if falling back to STCS is the best way to handle this particular situation.
I've read the comment in the code and I understand why it is a good thing to do if we have too
many sstables at L0 as a result of too many random inserts. We have a lot of sstables and each
of them covers the whole ring, so there's simply no better option.

However, after a bootstrap the situation looks a bit different. The loaded sstables already
have very small ranges! We just have to tidy up a bit and everything should be OK. STCS ignores
that completely, and after a while we have somewhat fewer sstables, but each of them covers the
whole ring instead of just a small part. I believe that in this case letting LCS do the job
is a better option than allowing STCS to mix everything up first.

Is there a way to disable STCS fallback? I'd like to test that scenario in practice during
the our next bootstrap...

Does Cassandra really have to put streamed sstables at L0? The only thing we have to ensure
is that sstables at any given level do not overlap. If we stream different regions from different
nodes, how can we get any overlaps?




> STCS fallback is probably not optimal in some scenarios
> -------------------------------------------------------
>
>                 Key: CASSANDRA-6621
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6621
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Bartłomiej Romański



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
