cassandra-commits mailing list archives

From "Stuart Gunter (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8185) Removing the last item from a collection removes the entire row
Date Mon, 27 Oct 2014 08:35:35 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184951#comment-14184951 ]

Stuart Gunter commented on CASSANDRA-8185:
------------------------------------------

Thanks for your response, Aleksey.

However, I disagree that this is the same issue. A TTL being set on all columns of a row
during an UPDATE is not what I'm concerned about here. Regardless of whether it was
intentional, the semantics of the UPDATE statement changed from 2.0.5 to 2.0.6, and that
change has a breaking effect on applications. I don't think this is something that should
happen in a patch release.

Conceptually, the change means that {{SET(X) - X = null}} rather than
{{SET(X) - X = the empty set}}. This is inconsistent with what the programmer expects, and
that expectation is a reasonable one.

The implication of this change is that the primary key has no inherent value and is merely
a pointer to some other valuable data (contained in other columns of the row). Here's an example:

Let's say I have a web crawler that tags URLs with a set of values. In 2.0.5, it would
be possible for me to have a row in my table that contains a URL without any tags. When I
upgrade to 2.0.6, this is no longer possible. Unless a URL has _at least one_ tag, I cannot
even store it (proven by another test added to the repo in the description). This completely
devalues the data within the PK and relegates it to nothing more than a meaningless pointer.
But in this scenario, the URL itself _is a value_. Data that would be retained in 2.0.5 will
be lost in 2.0.6!
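
To make the difference concrete, here is a toy Python model of the two behaviours as described
in this issue. It is only a sketch, not Cassandra code: the class names, the {{crawl}} helper
and the example URL are all made up for illustration, and the "table" is just a dict keyed by
URL.

```python
# Toy model only. None of these names exist in Cassandra; they are
# hypothetical and merely illustrate the two behaviours described above.

class RetainEmptyRow:
    """Behaviour described for 2.0.5: SET(X) - X = the empty set."""

    def __init__(self):
        self.rows = {}  # URL (primary key) -> set of tags

    def add_tag(self, url, tag):
        self.rows.setdefault(url, set()).add(tag)

    def remove_tag(self, url, tag):
        if url in self.rows:
            self.rows[url].discard(tag)

    def urls(self):
        return sorted(self.rows)


class DropEmptyRow(RetainEmptyRow):
    """Behaviour described for 2.0.6: SET(X) - X = null, so the row goes too."""

    def remove_tag(self, url, tag):
        super().remove_tag(url, tag)
        # An empty collection is treated like null; with no other non-PK
        # column holding a value, the whole row disappears.
        if url in self.rows and not self.rows[url]:
            del self.rows[url]


def crawl(table):
    """Tag a URL once, then remove that single tag again."""
    url = "http://example.com/"
    table.add_tag(url, "news")
    table.remove_tag(url, "news")
    return table.urls()


if __name__ == "__main__":
    print("retain-empty-row model:", crawl(RetainEmptyRow()))
    print("drop-empty-row model:  ", crawl(DropEmptyRow()))
```

In the first model the URL is still listed after its last tag is removed; in the second it is
gone, which is the data loss I'm describing.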

I understand that you see similarities between this issue and CASSANDRA-6782, but I'm concerned
that they're not the same thing and that this should be investigated further. Would it be
possible to get another opinion on this?

> Removing the last item from a collection removes the entire row
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-8185
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8185
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Mac OS X Yosemite
> JDK 8u25
>            Reporter: Stuart Gunter
>            Priority: Minor
>
> When removing the last item from a list or set (and I assume the same applies to maps),
> the entire row is deleted rather than just that last item.
> I've only tested this in cases where the collection column is the only non-PK column,
> but it is definitely a change between v2.0.5 and v2.0.6. I've looked through the 2.0.6 release
> notes and issues and can't find any item that might describe this, so I assume it's not an
> intended change. If such an issue exists, I've probably just missed it, so apologies in advance.
> I've created a very simple project to reproduce the issue. If you clone the repo and
> run it in two different modes (the README explains how to do this with preconfigured Maven
> profiles), you will see the tests that pass on v2.0.5 and fail on v2.0.6.
> Repo: https://github.com/stuartgunter/cassandra-2.0.6-bug



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
