cassandra-commits mailing list archives

From Andrés de la Peña (JIRA) <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7575) Custom 2i validation
Date Thu, 24 Jul 2014 21:44:39 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073690#comment-14073690 ]

Andrés de la Peña commented on CASSANDRA-7575:
----------------------------------------------

The Lucene/Solr index is created with CQL's CREATE CUSTOM INDEX statement:

{code}
CREATE CUSTOM INDEX IF NOT EXISTS users_index 
ON tweets (lucene) 
USING '<custom_index_class>'
WITH OPTIONS = {<indexing_options_and_schema>};
{code}  

With your approach, would we need to create the UDF beforehand, in addition to the 2i? How would we
connect the UDF to the secondary index?

I'm aware that the special column is a hack, but it does the job, and it is flexible enough that
it is being used successfully for a variety of queries and indexing systems.

In addition, shipping UDFs independently of the indexes would be a nice feature :)

> Custom 2i validation
> --------------------
>
>                 Key: CASSANDRA-7575
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7575
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: API
>            Reporter: Andrés de la Peña
>            Priority: Minor
>              Labels: 2i, cql3, secondaryIndex, secondary_index, select
>         Attachments: 2i_validation.patch
>
>
> There are several projects using custom secondary indexes as an extension point to integrate C* with other systems such as Solr or Lucene. The usual approach is to embed third-party indexing queries in CQL clauses.
> For example, [DSE Search|http://www.datastax.com/what-we-offer/products-services/datastax-enterprise] embeds Solr syntax this way:
> {code}
> SELECT title FROM solr WHERE solr_query='title:natio*';
> {code}
> The [Stratio platform|https://github.com/Stratio/stratio-cassandra] embeds custom JSON syntax for searching in Lucene indexes:
> {code}
> SELECT * FROM tweets WHERE lucene='{
>     filter : {
>         type: "range",
>         field: "time",
>         lower: "2014/04/25",
>         upper: "2014/04/1"
>     },
>     query  : {
>         type: "phrase", 
>         field: "body", 
>         values: ["big", "data"]
>     },
>     sort  : {fields: [ {field:"time", reverse:true} ] }
> }';
> {code}
> Tuplejump [Stargate|http://tuplejump.github.io/stargate/] also uses Stratio's open source JSON syntax:
> {code}
> SELECT name,company FROM PERSON WHERE stargate ='{
>     filter: {
>         type: "range",
>         field: "company",
>         lower: "a",
>         upper: "p"
>     },
>     sort:{
>        fields: [{field:"name",reverse:true}]
>     }
> }';
> {code}
> These syntaxes are validated by the corresponding 2i implementation. This validation is done behind the StorageProxy command distribution, so, as far as I know, there is no way to give rich feedback about syntax errors to CQL users.
> I'm uploading a patch with some changes that try to improve this. I propose adding an empty validation method to SecondaryIndexSearcher that can be overridden by custom 2i implementations:
> {code}
> public void validate(List<IndexExpression> clause) {}
> {code}
> And call it from SelectStatement#getRangeCommand:
> {code}
> ColumnFamilyStore cfs = Keyspace.open(keyspace()).getColumnFamilyStore(columnFamily());
> for (SecondaryIndexSearcher searcher : cfs.indexManager.getIndexSearchersForQuery(expressions))
> {
>     try
>     {
>         searcher.validate(expressions);
>     }
>     catch (RuntimeException e)
>     {
>         String exceptionMessage = e.getMessage();
>         if (exceptionMessage != null && !exceptionMessage.trim().isEmpty())
>             throw new InvalidRequestException("Invalid index expression: " + exceptionMessage);
>         else
>             throw new InvalidRequestException("Invalid index expression");
>     }
> }
> {code}
> In this way C* allows custom 2i implementations to give feedback about syntax errors.
> We are currently using these changes in a fork with no problems.
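
To illustrate how a custom 2i might use the validate hook proposed above, here is a minimal, self-contained sketch. It does not use the real Cassandra classes: Expression, Searcher, InvalidRequestException, LuceneLikeSearcher and validateAll are all hypothetical stand-ins invented for this example, and the "looks like a JSON object" check is only a placeholder for whatever syntax check a real index would run.

```java
import java.util.Arrays;
import java.util.List;

public class ValidationSketch {

    /** Hypothetical stand-in for Cassandra's IndexExpression: column name plus raw value. */
    static final class Expression {
        final String column;
        final String value;

        Expression(String column, String value) {
            this.column = column;
            this.value = value;
        }
    }

    /** Stand-in for Cassandra's InvalidRequestException (not the real class). */
    static final class InvalidRequestException extends Exception {
        InvalidRequestException(String message) {
            super(message);
        }
    }

    /** Stand-in for SecondaryIndexSearcher, reduced to the proposed no-op hook. */
    static class Searcher {
        public void validate(List<Expression> clause) {}
    }

    /** Hypothetical custom searcher: only checks that the 'lucene' value looks like a JSON object. */
    static final class LuceneLikeSearcher extends Searcher {
        @Override
        public void validate(List<Expression> clause) {
            for (Expression expression : clause) {
                if (!"lucene".equals(expression.column))
                    continue;
                String value = expression.value == null ? "" : expression.value.trim();
                if (!value.startsWith("{") || !value.endsWith("}"))
                    throw new IllegalArgumentException("expected a JSON object for column 'lucene'");
            }
        }
    }

    /** Mirrors the loop from the patch: any RuntimeException becomes a client-visible error. */
    static void validateAll(List<? extends Searcher> searchers, List<Expression> clause)
            throws InvalidRequestException {
        for (Searcher searcher : searchers) {
            try {
                searcher.validate(clause);
            } catch (RuntimeException e) {
                String message = e.getMessage();
                if (message != null && !message.trim().isEmpty())
                    throw new InvalidRequestException("Invalid index expression: " + message);
                throw new InvalidRequestException("Invalid index expression");
            }
        }
    }

    public static void main(String[] args) {
        List<LuceneLikeSearcher> searchers = Arrays.asList(new LuceneLikeSearcher());
        try {
            // A Solr-style string is rejected by this JSON-only searcher.
            validateAll(searchers, Arrays.asList(new Expression("lucene", "title:natio*")));
            System.out.println("accepted");
        } catch (InvalidRequestException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is only the shape of the contract: the searcher throws an unchecked exception with a human-readable message, and the caller translates it into the error type that reaches the CQL client before any command is distributed.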



--
This message was sent by Atlassian JIRA
(v6.2#6252)
