Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Reply-To: user@cassandra.apache.org
MIME-Version: 1.0
In-Reply-To: <8845493D-8CAF-4DA9-A0B9-B7C96797D2E2@thelastpickle.com>
References: <8845493D-8CAF-4DA9-A0B9-B7C96797D2E2@thelastpickle.com>
Date: Fri, 1 Jun 2012 16:50:26 +0530
Subject: Re: How can we use composite indexes and secondary indexes together
From: Vivek Mishra
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=047d7b33d92402a3d904c1675fed

--047d7b33d92402a3d904c1675fed
Content-Type: text/plain; charset=ISO-8859-1

Have a look at Kundera (https://github.com/impetus-opensource/Kundera). It
does provide some sort of support for this (using Lucene) and allows you to
deal with associations the JPA way.

-Vivek

On Fri, Jun 1, 2012 at 6:54 AM, aaron morton <aaron@thelastpickle.com> wrote:

> If you want to do arbitrary complex online / realtime queries look at
> DataStax Enterprise, or https://github.com/tjake/Solandra or straight Solr.
>
> Alternatively denormalise the model to materialise the results when you
> insert so your query is a straight lookup. Or do some client side filtering
> / aggregation.
>
> If you want to do the queries offline, you can use Pig or Hive with Hadoop
> over Cassandra. The Apache Cassandra distro includes the Pig support, Hive
> is coming (I think) and there are Hadoop interfaces. You can also look at
> DataStax Enterprise.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 31/05/2012, at 11:07 PM, Nury Redjepow wrote:
>
> We want to use Cassandra to store complex data. But we can't figure out
> how to organize indexes.
>
> Our table (column family) looks like this:
>
> Users = { RandomId int, Firstname varchar, Lastname varchar, Age int,
> Country int, ChildCount int }
>
> In our queries we have mandatory fields (Firstname,Lastname,Age) and extra
> search options (Country,ChildCount). How do we organize indexes to make this
> kind of query fast?
>
> First I thought it would be natural to make a composite index on
> (Firstname,Lastname,Age) and add separate secondary indexes on the remaining
> fields (Country and ChildCount). But I can't insert rows into the table after
> creating the secondary indexes. And also, I can't query the table.
>
> I'm using Cassandra 1.1.0, and cqlsh with the --cql3 option.
>
> Any other suggestions to solve our problem (complex queries with mandatory
> and additional options) are welcome.
> The main point is, how can we join data in Cassandra? If I make a few index
> column families, do I need to intersect the values to get rows that pass all
> search criteria? Or should I use something based on Hadoop (Pig,Hive) to
> make such queries?
>
> Respectfully, Nury

--047d7b33d92402a3d904c1675fed--
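
The "denormalise, then filter on the client" suggestion quoted in this thread can be sketched roughly as follows. This is only an illustration under stated assumptions: a plain Python dict stands in for a lookup column family keyed by the mandatory fields (Firstname, Lastname, Age), no Cassandra client is involved, and the names `users_by_name_age`, `insert_user` and `find_users` as well as the sample rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for a denormalised lookup column family.
# Row key: the mandatory search fields; value: the user rows stored under it.
users_by_name_age = defaultdict(list)


def insert_user(random_id, firstname, lastname, age, country, child_count):
    """Write the user under a key built from the mandatory fields."""
    row = {
        "RandomId": random_id,
        "Firstname": firstname,
        "Lastname": lastname,
        "Age": age,
        "Country": country,
        "ChildCount": child_count,
    }
    users_by_name_age[(firstname, lastname, age)].append(row)


def find_users(firstname, lastname, age, country=None, child_count=None):
    """Straight lookup on the mandatory fields, then client-side filtering
    on whichever optional fields were supplied."""
    candidates = users_by_name_age.get((firstname, lastname, age), [])
    return [
        row for row in candidates
        if (country is None or row["Country"] == country)
        and (child_count is None or row["ChildCount"] == child_count)
    ]


# Made-up sample data, for illustration only.
insert_user(1, "Ann", "Lee", 30, country=7, child_count=2)
insert_user(2, "Ann", "Lee", 30, country=9, child_count=0)
insert_user(3, "Bob", "Ray", 41, country=7, child_count=1)

print(find_users("Ann", "Lee", 30))             # mandatory fields only
print(find_users("Ann", "Lee", 30, country=7))  # plus one optional filter
```

The idea is that the mandatory fields narrow the search to a single key, so the optional filters only run over the few rows stored under that key. Whether this holds up in practice depends on how many rows share the same (Firstname, Lastname, Age) combination.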