cassandra-user mailing list archives

From Andres de la Peña <adelap...@stratio.com>
Subject Re: Lucene index plugin for Apache Cassandra
Date Sat, 13 Jun 2015 12:19:31 GMT
Thanks for showing interest.

Faceting is not yet supported, but it is on our roadmap. Our goal is to bring
as many Lucene features as possible to Cassandra.

2015-06-12 18:21 GMT+02:00 Mohammed Guller <mohammed@glassbeam.com>:

>  The plugin looks cool. Thank you for open sourcing it.
>
>
>
> Does it support faceting and other Solr functionality?
>
>
>
> Mohammed
>
>
>
> *From:* Andres de la Peña [mailto:adelapena@stratio.com]
> *Sent:* Friday, June 12, 2015 3:43 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Lucene index plugin for Apache Cassandra
>
>
>
> I really appreciate your interest.
>
>
>
> Well, the first recommendation is not to use it unless you need it,
> because a properly denormalized Cassandra model is almost always preferable
> to indexing. Lucene indexing is a good option when there is no viable
> denormalization alternative. This is the case for range queries over
> multiple dimensions, full-text search or maybe complex boolean predicates.
> It's also appropriate for Spark/Hadoop jobs that map over a small fraction
> of the total number of rows in a table, if you can pay the cost of
> indexing.
>
>
>
> Lucene indexes run inside C*, so users should closely monitor memory
> usage. It's also a good idea to put the Lucene directory files on
> a separate disk from those used by C* itself. Additionally, you should
> expect the write throughput of indexed tables to be appreciably reduced,
> maybe to a few thousand rows per second.
>
>
>
> It's really hard to estimate the amount of resources needed by the index
> because of the great variety of indexing and querying options that Lucene
> offers, so the only thing we can suggest is to find the optimal setup for
> your use case empirically.
>
>
>
> 2015-06-12 12:00 GMT+02:00 Carlos Rolo <rolo@pythian.com>:
>
> Seems like an interesting tool!
>
> What operational recommendations would you make to users of this tool
> (Extra hardware capacity, extra metrics to monitor, etc)?
>
>
>     Regards,
>
>
>
> Carlos Juzarte Rolo
>
> Cassandra Consultant
>
>
>
> Pythian - Love your data
>
>
>
> rolo@pythian | Twitter: cjrolo | Linkedin: *linkedin.com/in/carlosjuzarterolo
> <http://linkedin.com/in/carlosjuzarterolo>*
>
> Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
>
> www.pythian.com
>
>
>
> On Fri, Jun 12, 2015 at 11:07 AM, Andres de la Peña <adelapena@stratio.com>
> wrote:
>
> Unfortunately, we haven't published any benchmarks yet, but we plan
> to do so as soon as possible. However, you can expect behavior similar
> to that of Elasticsearch or Solr, with some overhead due to the
> need to index both Cassandra's row key and the partition's token.
> You can also take a look at this presentation
> <http://planetcassandra.org/video-presentations/vp/cassandra-summit-europe-2014/vd/stratio-advanced-search-and-top-k-queries-in-cassandra/>
> to see how cluster distribution is done.
>
>
>
> 2015-06-12 0:45 GMT+02:00 Ben Bromhead <ben@instaclustr.com>:
>
> Looks awesome, do you have any examples/benchmarks of using these indexes
> for various cluster sizes e.g. 20 nodes, 60 nodes, 100s+?
>
>
>
> On 10 June 2015 at 09:08, Andres de la Peña <adelapena@stratio.com> wrote:
>
> Hi all,
>
>
>
> With the release of Cassandra 2.1.6, Stratio is glad to present its open
> source Lucene-based implementation of C* secondary indexes
> <https://github.com/Stratio/cassandra-lucene-index> as a plugin that can
> be attached to Apache Cassandra. Until now, the Lucene index was
> distributed inside a fork of Apache Cassandra, with all the difficulties
> that implies. As of now, the fork is discontinued and new users should use
> the recently created plugin, which maintains all the features of Stratio
> Cassandra <https://github.com/Stratio/stratio-cassandra>.
>
>
>
> Stratio's Lucene index extends Cassandra’s functionality to provide near
> real-time distributed search engine capabilities similar to those of
> Elasticsearch or Solr, including full-text search, free multivariable
> search, relevance queries and field-based sorting. Each node indexes its
> own data, so high availability and scalability are guaranteed.
>
>
>
> We hope this will be useful to the Apache Cassandra community.
>
>
>
> Regards,
>
>
>
> --
>
>
>   Andrés de la Peña
>
>
>
> <http://www.stratio.com/>
> Avenida de Europa, 26. Ática 5. 3ª Planta
>
> 28224 Pozuelo de Alarcón, Madrid
>
> Tel: +34 91 352 59 42 // *@stratiobd <https://twitter.com/StratioBD>*
>
>
>
>
>
> --
>
> Ben Bromhead
>
> Instaclustr | www.instaclustr.com | @instaclustr
> <http://twitter.com/instaclustr> | (650) 284 9692
>



-- 

Andrés de la Peña


<http://www.stratio.com/>
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón, Madrid
Tel: +34 91 352 59 42 // *@stratiobd <https://twitter.com/StratioBD>*
