Date: Mon, 27 Jun 2011 13:55:48 +0000 (UTC)
From: "Sylvain Lebresne (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Message-ID: <1395190898.43818.1309182948628.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1138214103.37004.1308929988452.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (CASSANDRA-2823) NPE during range slices with rowrepairs

     [
https://issues.apache.org/jira/browse/CASSANDRA-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-2823:
----------------------------------------

    Attachment: 2823.patch

I think the problem is with the call to removeDeleted in resolveSuperset() (which is fairly new). Basically, the code is fine with resolved being null as long as this means that all the versions are null. But the removeDeleted call makes it possible to get a null result from removeDeleted even if the versions are not null, for instance if a row tombstone expires between the time it was returned by the node and the time it is resolved by the coordinator.

Attaching a patch that skips the maybeScheduleRepairs() call if resolved == null, since in that case there is nothing to repair: the tombstones are now expired.

> NPE during range slices with rowrepairs
> ---------------------------------------
>
>                 Key: CASSANDRA-2823
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2823
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.8.2
>         Environment: This is a trunk build with 2521 and 2433
> I somewhat doubt that is related however.
>            Reporter: Terje Marthinussen
>            Assignee: Sylvain Lebresne
>         Attachments: 2823.patch
>
>
> Doing some heavy testing of relatively fast feeding (5000+ mutations/sec) + repair on all nodes + range slices.
> Then occasionally killing a node here and there and restarting it.
> Triggers the following NPE
>  ERROR [pool-2-thread-3] 2011-06-24 20:56:27,289 Cassandra.java (line 3210) Internal error processing get_range_slices
> java.lang.NullPointerException
> 	at org.apache.cassandra.service.RowRepairResolver.maybeScheduleRepairs(RowRepairResolver.java:109)
> 	at org.apache.cassandra.service.RangeSliceResponseResolver$2.getReduced(RangeSliceResponseResolver.java:112)
> 	at org.apache.cassandra.service.RangeSliceResponseResolver$2.getReduced(RangeSliceResponseResolver.java:83)
> 	at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:161)
> 	at org.apache.cassandra.utils.MergeIterator.computeNext(MergeIterator.java:88)
> 	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
> 	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
> 	at org.apache.cassandra.service.RangeSliceResponseResolver.resolve(RangeSliceResponseResolver.java:120)
> 	at org.apache.cassandra.service.RangeSliceResponseResolver.resolve(RangeSliceResponseResolver.java:43)
> Looking at the code in getReduced:
> {noformat}
>                 ColumnFamily resolved = versions.size() > 1
>                                       ? RowRepairResolver.resolveSuperset(versions)
>                                       : versions.get(0);
> {noformat}
> It seems that resolved becomes null when this happens and versions.size() is larger than 1.
> RowRepairResolver.resolveSuperset() does actually return null if it cannot resolve anything, so there is definitely a case here which can occur and is not handled.
> It may also be an interesting question whether it is guaranteed that
>                     versions.add(current.left.cf);
> can never add a null?
> Jonathan suggested on IRC that maybe
> {noformat}
>                 ColumnFamily resolved = versions.size() > 1
>                                       ? RowRepairResolver.resolveSuperset(versions)
>                                       : versions.get(0);
>                 if (resolved == null)
>                       return new Row(key, resolved);
> {noformat}
> could be a fix.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
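
For readers who want to see the failure mode from Sylvain's comment in isolation, here is a minimal sketch. It is not the Cassandra code and not the contents of 2823.patch: the Version class, the "now" parameter and the scheduledRepairs list are hypothetical stand-ins, and only the names resolveSuperset, maybeScheduleRepairs and getReduced are borrowed from the report. It assumes that resolution drops a row whose tombstone has expired, so the resolved result can be null even though every replica returned a non-null version, and it guards the repair call accordingly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NullGuardSketch {
    // Hypothetical stand-in for one replica's version of a row:
    // all we track is when its row tombstone expires.
    static final class Version {
        final long tombstoneExpiresAt;
        Version(long tombstoneExpiresAt) { this.tombstoneExpiresAt = tombstoneExpiresAt; }
    }

    // Records the keys for which a repair would have been scheduled.
    static final List<String> scheduledRepairs = new ArrayList<>();

    // Stand-in for resolveSuperset(): picks the newest version, then applies a
    // removeDeleted-like step that yields null once the tombstone has expired.
    static Version resolveSuperset(List<Version> versions, long now) {
        Version newest = null;
        for (Version v : versions) {
            if (v == null) continue;
            if (newest == null || v.tombstoneExpiresAt > newest.tombstoneExpiresAt) newest = v;
        }
        if (newest == null) return null;                          // all versions were null
        return newest.tombstoneExpiresAt <= now ? null : newest;  // expired: nothing left
    }

    // Stand-in for maybeScheduleRepairs(): dereferences resolved, so a null
    // resolved value throws NullPointerException, as in the reported trace.
    static void maybeScheduleRepairs(String key, Version resolved, List<Version> versions) {
        long target = resolved.tombstoneExpiresAt;
        for (Version v : versions) {
            if (v == null || v.tombstoneExpiresAt != target) scheduledRepairs.add(key);
        }
    }

    // Guarded variant of the getReduced logic: skip repair when resolved is null.
    static Version getReduced(String key, List<Version> versions, long now) {
        Version resolved = versions.size() > 1
                         ? resolveSuperset(versions, now)
                         : versions.get(0);
        if (resolved != null)
            maybeScheduleRepairs(key, resolved, versions);
        return resolved;
    }

    public static void main(String[] args) {
        // Both replicas returned a tombstone that expires at t=100; the
        // coordinator resolves at t=200, after expiry.
        List<Version> versions = Arrays.asList(new Version(100L), new Version(100L));
        Version resolved = getReduced("key1", versions, 200L);
        System.out.println("resolved is null: " + (resolved == null));
        System.out.println("repairs scheduled: " + scheduledRepairs.size());
    }
}
```

Whether to skip the repair call (as the attached patch is described as doing) or return early from getReduced (as suggested on IRC) is a design choice the sketch does not settle; both avoid dereferencing the null result.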