On Thu, Nov 7, 2013 at 7:01 PM, Krishna Chaitanya <bnsk1990rulz@gmail.com> wrote:

Check whether it's an issue with permissions or broken links.



I don't think permissions are an issue. You might be on to something regarding the links.

I've been seeing this on 4 nodes, configured identically.

Here's what I think the problem may be (possibly a combination of the two):

1. I have symlinked the data directories. This confuses Cassandra in some way, causing it to create multiple files. Does Cassandra care whether a data directory is a symlink to somewhere else? Would that cause an issue?

lrwxrwxrwx    1 root root     6 Oct 30 18:37 data01 -> /data1 # [1]

Evidence for:
a. Somehow it's creating duplicate hard links.
b. It is unlikely that other Cassandra users have set up their directories like this, which would explain why such a serious bug hasn't been reported already.
c. Also, my other cluster is nearly identical (same OS, JVM, 6 drives, and Cassandra/RHQ, with similar hardware) and is not seeing the same issues, although that is a two-node cluster.
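
To test point (a) outside of Cassandra, here is a minimal sketch of how I could check whether two paths are really the same file on disk. It assumes a POSIX filesystem that supports hard links; the class name and file names are made up for illustration and have nothing to do with Cassandra's own code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class HardLinkCheck {

    /** True if both paths refer to the same underlying file (a hard link, or reached via a symlink). */
    public static boolean sameFile(Path a, Path b) {
        try {
            return Files.isSameFile(a, b);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a throwaway layout and returns {hardLinkMatches, unrelatedFileMatches}. */
    public static boolean[] demo() {
        try {
            // Hypothetical file names, created in a temp directory.
            Path base = Files.createTempDirectory("hardlink-demo");
            Path original = Files.createFile(base.resolve("original.db"));
            Path hardLink = Files.createLink(base.resolve("linked.db"), original);
            Path other = Files.createFile(base.resolve("other.db"));
            return new boolean[] { sameFile(original, hardLink), sameFile(original, other) };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("hard link is same file: " + r[0]);
        System.out.println("unrelated file is same file: " + r[1]);
    }
}
```

Pointing sameFile() at two suspicious files in the data directories would show whether they are genuinely duplicate links or just similarly named files.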

If I were to grep through the source, I would look for places where the path Java sees, maybe via File.getCanonicalFile() (which resolves symlinks, unlike File.getAbsoluteFile()), doesn't match the path recorded for another file. In other words, it would be a Cassandra bug based on some assumption about the JVM.
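
To illustrate the kind of mismatch I mean, here is a small sketch, assuming a POSIX system where creating symlinks is allowed. The directory names mirror my data01 -> /data1 link but live in a temp directory, and the class and file names are hypothetical. This is not taken from Cassandra's code; it only shows how the two JDK calls differ.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SymlinkPathCheck {

    /** True if the paths match textually once made absolute (symlinks are NOT followed). */
    public static boolean sameAbsolute(File a, File b) {
        return a.getAbsoluteFile().equals(b.getAbsoluteFile());
    }

    /** True if the paths match after symlinks are resolved. */
    public static boolean sameCanonical(File a, File b) {
        try {
            return a.getCanonicalFile().equals(b.getCanonicalFile());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a real directory plus a symlink to it; returns {absoluteMatch, canonicalMatch}. */
    public static boolean[] demo() {
        try {
            Path base = Files.createTempDirectory("symlink-demo");
            Path real = Files.createDirectory(base.resolve("data1"));
            Path link = Files.createSymbolicLink(base.resolve("data01"), real);
            Files.createFile(real.resolve("sstable.db")); // hypothetical file name

            File viaReal = real.resolve("sstable.db").toFile();
            File viaLink = link.resolve("sstable.db").toFile();
            return new boolean[] { sameAbsolute(viaReal, viaLink), sameCanonical(viaReal, viaLink) };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("absolute match:  " + r[0]);
        System.out.println("canonical match: " + r[1]);
    }
}
```

The same file reached through the link and through the real directory compares as different by absolute path but equal by canonical path, so code that mixes the two forms could treat one file as two.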


2. When I created the cluster, I had a single data directory for each node. I then added 5 more. Somehow Cassandra mis-remembers where the data was put, causing all sorts of issues. How does Cassandra decide where to put its data and where to read it from? What happens when additional data directories are added? There could be a bug in the code.

Evidence for:
a. Somehow it's looking for data in the wrong directory. It also seems unlikely a user would create a cluster, then add 5 more drives.

# [1] The links are set up because the mount points didn't match my Puppet setup, which sets up my directory permissions. So I added the links to compensate.