cassandra-user mailing list archives

From Marcelo Elias Del Valle
Subject Best way to do a multi_get using CQL
Date Fri, 20 Jun 2014 00:56:25 GMT
I was taking a look at the Cassandra anti-patterns list:

Among them is:

"SELECT ... IN or index lookups

SELECT ... IN and index lookups (formerly secondary indexes) should be
avoided except for specific scenarios. See *When not to use IN* in SELECT
and *When not to use an index* in Indexing in
*CQL for Cassandra 2.0*."

And looking at the SELECT doc, I saw:
"When *not* to use IN

The recommendations about when not to use an index
apply to using IN in the WHERE clause. Under most conditions, using IN in
the WHERE clause is not recommended. Using IN can degrade performance
because usually many nodes must be queried. For example, in a single, local
data center cluster having 30 nodes, a replication factor of 3, and a
consistency level of LOCAL_QUORUM, a single key query goes out to two
nodes, but if the query uses the IN condition, the number of nodes being
queried are most likely even higher, up to 20 nodes depending on where the
keys fall in the token range."
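
The arithmetic in the quoted passage can be checked with a small sketch. It assumes a quorum is floor(RF / 2) + 1 replicas and that, in the worst case, every key in the IN list lands on a disjoint set of replicas. The function names are hypothetical, and the count of 10 keys is an assumption chosen only because it reproduces the quoted figure of 20; the docs do not say how many keys the example has.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must answer for a (LOCAL_)QUORUM read."""
    return replication_factor // 2 + 1


def max_replicas_contacted(num_keys: int, replication_factor: int,
                           cluster_size: int) -> int:
    """Upper bound on replicas queried, assuming each key falls on a
    disjoint replica set (worst case); capped by the cluster size."""
    return min(num_keys * quorum(replication_factor), cluster_size)


# Cluster from the quoted docs: 30 nodes, RF 3, LOCAL_QUORUM.
single_key = max_replicas_contacted(1, 3, 30)
# Assumption: an IN list of 10 keys, none sharing replicas.
ten_keys = max_replicas_contacted(10, 3, 30)

print(single_key, ten_keys)
```

This counts only the replicas needed to satisfy the consistency level. The real number for a given query depends on where the keys fall in the token range, as the docs note.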

In my system, I have a column family called "entity_lookup":

CREATE KEYSPACE Identification1
  WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy',
  'DC1' : 3 };
USE Identification1;

CREATE TABLE entity_lookup (
  name varchar,
  value varchar,
  entity_id uuid,
  PRIMARY KEY ((name, value), entity_id));

And I use the following select to query it:

SELECT entity_id FROM entity_lookup WHERE name=%s and value in(%s)

Is this an anti-pattern?

If not using SELECT IN, which other way would you recommend for lookups like
that? I have several values I would like to search for in Cassandra, and they
might not be in the same partition, as above.
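
For illustration, one possible alternative to IN is to issue one single-partition query per value and let them run concurrently. The sketch below assumes the DataStax Python driver, whose `Session.execute_async` returns a future with a `result()` method. The function name `lookup_entity_ids` is hypothetical, and whether this actually performs better than IN on a given cluster is not something the quoted docs establish.

```python
from typing import Any, Iterable, List

# One partition per query: both parts of the partition key are fixed.
QUERY = "SELECT entity_id FROM entity_lookup WHERE name=%s AND value=%s"


def lookup_entity_ids(session: Any, name: str,
                      values: Iterable[str]) -> List[Any]:
    """Run one single-partition SELECT per value, all in flight at once.

    `session` is assumed to behave like cassandra.cluster.Session.
    """
    # Start every query without waiting for the previous one to finish.
    futures = [session.execute_async(QUERY, (name, value))
               for value in values]

    entity_ids = []
    for future in futures:
        # result() blocks until this particular query has completed.
        for row in future.result():
            entity_ids.append(row.entity_id)
    return entity_ids
```

Usage would look like `lookup_entity_ids(session, "email", ["a@x", "b@x"])`, with `session` obtained from `Cluster(...).connect("identification1")`. The driver also ships `cassandra.concurrent.execute_concurrent_with_args`, which covers the same pattern with a bounded concurrency level.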

Is Cassandra the wrong tool for lookups like that?

Best regards,
Marcelo Valle.
