cassandra-commits mailing list archives

From "Pavel Yaskevich (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-4305) CF serialization failure when working with custom secondary indices.
Date Wed, 06 Jun 2012 11:22:23 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290082#comment-13290082
] 

Pavel Yaskevich edited comment on CASSANDRA-4305 at 6/6/12 11:21 AM:
---------------------------------------------------------------------

bq. Granted, it's an implementation detail, but the more I think about it, the more I think
it's one index implementers should be aware of and which makes sense if you think about it:
if you mutate a RM post-apply, you're either going to duplicate part of the mutation as Sylvain
says, or index data that didn't make it to the commitlog, either of which is Bad.

I disagree: once an RM has been sent for processing, its state should already be persisted, so
there is no reason not to allow users to mutate their own objects. For example, one may want to
populate the CF with missing columns to build a whole document (to store in an external
component like Solr) containing all indexed columns (old + new). Instead of copying everything
(CF + columns + old columns), just the missing columns could be fetched from the DB and added
to the existing CF while secondary indexes are processed. Otherwise, removing the need to make
one copy in the CL would instead force multiple copies to be made while secondary indices are
processed.
                
                  
> CF serialization failure when working with custom secondary indices.
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-4305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.10
>            Reporter: Pavel Yaskevich
>              Labels: datastax_qa
>         Attachments: CASSANDRA-4305.patch
>
>
> The assertion below was triggered when a client was adding new rows to Solr-backed secondary
> indices (1000-row batch without any timeout).
> {noformat}
> ERROR [COMMIT-LOG-WRITER] 2012-05-30 16:39:02,896 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
> java.lang.AssertionError: Final buffer length 176 to accomodate data size of 123 (predicted 87) for RowMutation(keyspace='solrTest1338395932411', key='6b6579383039', modifications=[ColumnFamily(cf1 [long:false:8@1338395942384024,stringId:false:13@1338395940586003,])])
>         at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
>         at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
>         at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:600)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> After investigation it became clear that this happened because we were holding RowMutation
> instances queued for addition to the CommitLog until the actual "write" moment, which is redundant.
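
The size mismatch in the assertion (predicted 87, actual 123) is consistent with a mutation changing between the moment it is queued and the moment it is serialized. The sketch below illustrates that failure mode and the eager alternative of serializing at enqueue time. It is not Cassandra code: `CommitLogQueueSketch`, `Mutation`, `deferred()` and `eager()` are hypothetical names, and the wire format (a 4-byte length followed by UTF-8 bytes per column) is an assumption made only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class CommitLogQueueSketch {

    /** Hypothetical stand-in for RowMutation: a mutable list of column names. */
    static final class Mutation {
        private final List<String> columns = new ArrayList<>();

        void addColumn(String name) {
            columns.add(name);
        }

        /** Bytes needed for the current contents: 4-byte length + UTF-8 bytes per column. */
        int serializedSize() {
            int size = 0;
            for (String c : columns) {
                size += Integer.BYTES + c.getBytes(StandardCharsets.UTF_8).length;
            }
            return size;
        }

        byte[] serialize() {
            ByteBuffer buf = ByteBuffer.allocate(serializedSize());
            for (String c : columns) {
                byte[] bytes = c.getBytes(StandardCharsets.UTF_8);
                buf.putInt(bytes.length).put(bytes);
            }
            return buf.array();
        }
    }

    /** Deferred: the size is recorded at enqueue time, the bytes are produced at write time. */
    static int[] deferred() {
        Mutation m = new Mutation();
        m.addColumn("stringId");
        int predicted = m.serializedSize(); // recorded when the mutation is queued
        m.addColumn("long");                // index code mutates the queued object
        int actual = m.serialize().length;  // what the writer thread later produces
        return new int[] { predicted, actual };
    }

    /** Eager: serialize at enqueue time and queue only the resulting bytes. */
    static int[] eager() {
        Mutation m = new Mutation();
        m.addColumn("stringId");
        byte[] queued = m.serialize();      // snapshot taken when the mutation is queued
        int predicted = queued.length;
        m.addColumn("long");                // later mutation cannot change the snapshot
        return new int[] { predicted, queued.length };
    }

    public static void main(String[] args) {
        int[] d = deferred();
        int[] e = eager();
        System.out.println("deferred: predicted=" + d[0] + " actual=" + d[1]);
        System.out.println("eager:    predicted=" + e[0] + " actual=" + e[1]);
    }
}
```

In the deferred variant the two numbers diverge as soon as the queued object is touched. In the eager variant the queue holds only bytes, so the caller remains free to keep mutating its own object.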
