We're in exactly the same boat. We are waiting on DataStax Enterprise to see if it can ease the pain of SOLR schemas.
In the meantime, I just submitted a native REST layer for Cassandra.
(Hopefully, it will get integrated soon. Vote it up ;)
With a simple REST layer, I'm making the case that we can use Cassandra just like CouchDB, so we don't have to deploy both.
Extending that assertion, I think I could enhance the REST layer to provide a stream of changes just like CouchDB does. Elastic Search could tap into that stream as a river. Just like this…
That combination would be pretty powerful. If we can't get that set up, we may fall back to an AOPish strategy as well.
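To make the changes-stream idea concrete, here is a rough sketch in Python. Everything in it is hypothetical: `ChangeFeed` is an in-memory stand-in for whatever the REST layer might expose, `pull_once` plays the role of the river, and a plain dict stands in for the search index. It only borrows the CouchDB convention of numbered changes that a consumer polls with a `since` value; it is not the actual REST layer or a real river.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ChangeFeed:
    """Hypothetical in-memory stand-in for a changes stream on the REST layer."""
    _log: List[dict] = field(default_factory=list)

    def record(self, row_key: str, columns: Optional[Dict[str, Any]]) -> int:
        """Append a change; columns=None marks a deleted row. Returns its sequence number."""
        seq = len(self._log) + 1
        self._log.append({"seq": seq, "id": row_key, "doc": columns})
        return seq

    def changes(self, since: int = 0) -> dict:
        """Return every change with a sequence number greater than `since`."""
        results = [c for c in self._log if c["seq"] > since]
        last_seq = results[-1]["seq"] if results else since
        return {"results": results, "last_seq": last_seq}


def pull_once(feed: ChangeFeed, index: Dict[str, Any], since: int) -> int:
    """Apply all changes after `since` to the index and return the new checkpoint."""
    batch = feed.changes(since)
    for change in batch["results"]:
        if change["doc"] is None:
            index.pop(change["id"], None)        # row deleted: drop it from the index
        else:
            index[change["id"]] = change["doc"]  # row written: (re)index its columns
    return batch["last_seq"]
```

The consumer only has to remember the last sequence number it saw, so it can stop and resume without re-reading the whole store.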
Definitely let me know where you end up. I'll share our findings as well.
---- Brian O'Neill
Lead Architect, Software Development
Health Market Science | 2700 Horizon Drive | King of Prussia, PA 19406
At the moment we are only prototyping, so we haven't bridged the two at all. We had planned on creating a write-through operation that would let us filter the calls (AOP perhaps?) and manage the indexing as we store the data in Cassandra.
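As a rough illustration of that write-through idea, here is a small Python sketch that uses a decorator as the interception point, which is about as close to AOP as Python gets out of the box. All names are hypothetical, and two dicts stand in for Cassandra and the search index; a real version would call the actual clients there.

```python
import functools
from typing import Any, Callable, Dict

# Hypothetical stand-ins for the two systems.
cassandra_rows: Dict[str, Dict[str, Any]] = {}
search_index: Dict[str, Dict[str, Any]] = {}


def indexed(index_fn: Callable[[str, Dict[str, Any]], None],
            should_index: Callable[[str, Dict[str, Any]], bool] = lambda key, cols: True):
    """Wrap a store function so each successful write is also passed to index_fn."""
    def decorator(store_fn):
        @functools.wraps(store_fn)
        def wrapper(key: str, columns: Dict[str, Any]):
            result = store_fn(key, columns)   # write to the primary store first
            if should_index(key, columns):    # filter which calls get indexed
                index_fn(key, columns)
            return result
        return wrapper
    return decorator


def index_document(key: str, columns: Dict[str, Any]) -> None:
    search_index[key] = dict(columns)


@indexed(index_document, should_index=lambda key, cols: not key.startswith("tmp:"))
def store_row(key: str, columns: Dict[str, Any]) -> None:
    cassandra_rows[key] = dict(columns)
```

Calling `store_row("user:1", {"name": "a"})` writes to both dicts, while a key starting with `tmp:` is stored but not indexed. Note that the sketch does nothing about a failed index call after a successful store, which a real write-through would have to handle.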
We are still trying to work out whether to go the elastic search route, as DataStax will be releasing DataStax Enterprise 2.0 early next year with Solr built in. As you said, the index schemas seem to be difficult to deal with. I really don't want to have to configure Solr; the no-schema approach sounds much faster to get up and running.
On Tue, Oct 18, 2011 at 6:14 AM, Brian O'Neill <email@example.com>
We've been looking at elastic search as well. Presently we have SOLR in place, but it is cumbersome dealing with SOLR schemas when indexing information out of Cassandra (since you can't anticipate all the columns ahead of time).
What are you using as your bridge between Cassandra and ES? Are you developing a Cassandra river?
On Mon, Oct 17, 2011 at 5:29 PM, Anthony Ikeda <firstname.lastname@example.org>
I've already posted to the elasticsearch groups and thought it prudent to also ask here.
We are looking at using elastic search to index the data that we currently store in Cassandra. I was wondering if there are any concerns with running elastic search on the same nodes that we use for Cassandra? We have a ring of 6 nodes (2 DCs, each with 3 nodes). I was thinking of installing elastic search on 2 nodes in each datacentre, maybe all three. The only reason I'd use the same infrastructure is that we already have the distributed visibility in place.
Has anyone else taken this approach? Pros? Cons?
Lead Architect, Health Market Science (http://healthmarketscience.com)