From notifications-return-91891-archive-asf-public=cust-asf.ponee.io@asterixdb.apache.org  Thu Mar 15 00:47:04 2018
Return-Path: <notifications-return-91891-archive-asf-public=cust-asf.ponee.io@asterixdb.apache.org>
X-Original-To: archive-asf-public@cust-asf.ponee.io
Delivered-To: archive-asf-public@cust-asf.ponee.io
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
	by mx-eu-01.ponee.io (Postfix) with SMTP id DDD51180654
	for <archive-asf-public@cust-asf.ponee.io>; Thu, 15 Mar 2018 00:47:03 +0100 (CET)
Received: (qmail 39494 invoked by uid 500); 14 Mar 2018 23:47:03 -0000
Mailing-List: contact notifications-help@asterixdb.apache.org; run by ezmlm
Precedence: bulk
List-Help: <mailto:notifications-help@asterixdb.apache.org>
List-Unsubscribe: <mailto:notifications-unsubscribe@asterixdb.apache.org>
List-Post: <mailto:notifications@asterixdb.apache.org>
List-Id: <notifications.asterixdb.apache.org>
Reply-To: dev@asterixdb.apache.org
Delivered-To: mailing list notifications@asterixdb.apache.org
Received: (qmail 39485 invoked by uid 99); 14 Mar 2018 23:47:02 -0000
Received: from pnap-us-west-generic-nat.apache.org (HELO spamd4-us-west.apache.org) (209.188.14.142)
    by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 14 Mar 2018 23:47:02 +0000
Received: from localhost (localhost [127.0.0.1])
	by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id 8CA10C0430
	for <notifications@asterixdb.apache.org>; Wed, 14 Mar 2018 23:47:02 +0000 (UTC)
X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org
X-Spam-Flag: NO
X-Spam-Score: -109.511
X-Spam-Level:
X-Spam-Status: No, score=-109.511 tagged_above=-999 required=6.31
	tests=[ENV_AND_HDR_SPF_MATCH=-0.5, KAM_ASCII_DIVIDERS=0.8,
	RCVD_IN_DNSWL_MED=-2.3, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01,
	USER_IN_DEF_SPF_WL=-7.5, USER_IN_WHITELIST=-100] autolearn=disabled
Received: from mx1-lw-us.apache.org ([10.40.0.8])
	by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024)
	with ESMTP id MhurnEOPimr5 for <notifications@asterixdb.apache.org>;
	Wed, 14 Mar 2018 23:47:01 +0000 (UTC)
Received: from mailrelay1-us-west.apache.org (mailrelay1-us-west.apache.org [209.188.14.139])
	by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTP id 0FFC75F238
	for <notifications@asterixdb.incubator.apache.org>; Wed, 14 Mar 2018 23:47:01 +0000 (UTC)
Received: from jira-lw-us.apache.org (unknown [207.244.88.139])
	by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id 8E8ADE0114
	for <notifications@asterixdb.incubator.apache.org>; Wed, 14 Mar 2018 23:47:00 +0000 (UTC)
Received: from jira-lw-us.apache.org (localhost [127.0.0.1])
	by jira-lw-us.apache.org (ASF Mail Server at jira-lw-us.apache.org) with ESMTP id 4FE5D21490
	for <notifications@asterixdb.incubator.apache.org>; Wed, 14 Mar 2018 23:47:00 +0000 (UTC)
Date: Wed, 14 Mar 2018 23:47:00 +0000 (UTC)
From: "Till (JIRA)" <jira@apache.org>
To: notifications@asterixdb.incubator.apache.org
Message-ID: <JIRA.13127608.1514508540000.22671.1521071220326@Atlassian.JIRA>
In-Reply-To: <JIRA.13127608.1514508540000@Atlassian.JIRA>
References: <JIRA.13127608.1514508540000@Atlassian.JIRA> <JIRA.13127608.1514508540888@jira-lw-us.apache.org>
Subject: [jira] [Updated] (ASTERIXDB-2215) Filter is not properly applied
 for a secondary inverted index search
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394


     [ https://issues.apache.org/jira/browse/ASTERIXDB-2215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Till updated ASTERIXDB-2215:
----------------------------
    Component/s: IDX - Indexes
                 COMP - Compiler

> Filter is not properly applied for a secondary inverted index search
> --------------------------------------------------------------------
>
>                 Key: ASTERIXDB-2215
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2215
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: COMP - Compiler, IDX - Indexes
>            Reporter: Taewoo Kim
>            Priority: Major
>              Labels: triaged
>
> Based on the way of writing predicate conditions on a field with filter, the generated plan does not correctly show min and max value of a filter.
> {code}
> drop dataverse twitter if exists;
> create dataverse twitter if not exists;
> use dataverse twitter;
> create type typeUser if not exists as open {
>     id: int64,
>     name: string,
>     screen_name : string,
>     profile_image_url : string,
>     lang : string,
>     location: string,
>     create_at: date,
>     description: string,
>     followers_count: int32,
>     friends_count: int32,
>     statues_count: int64
> };
> create type typePlace if not exists as open{
>     country : string,
>     country_code : string,
>     full_name : string,
>     id : string,
>     name : string,
>     place_type : string,
>     bounding_box : rectangle
> };
> create type typeGeoTag if not exists as open {
>     stateID: int32,
>     stateName: string,
>     countyID: int32,
>     countyName: string,
>     cityID: int32?,
>     cityName: string?
> };
> create type typeTweet if not exists as open {
>     create_at : datetime,
>     id: int64,
>     "text": string,
>     in_reply_to_status : int64,
>     in_reply_to_user : int64,
>     favorite_count : int64,
>     coordinate: point?,
>     retweet_count : int64,
>     lang : string,
>     is_retweet: boolean,
>     hashtags : {{ string }} ?,
>     user_mentions : {{ int64 }} ? ,
>     user : typeUser,
>     place : typePlace?,
>     geo_tag: typeGeoTag
> };
> create dataset ds_tweet(typeTweet) if not exists primary key id with filter on create_at;
> {code}
> For the following query, the logical plan shows empty min[] and two variables in max[] when doing an inverted-index search. 
> {code}
> USE twitter;
> SELECT spatial_cell(get_points(place.bounding_box)[0], create_point(0.0,0.0),1.0,1.0) AS cell, count(*) AS cnt FROM ds_tweet
> WHERE ftcontains(text, ['trump'], {'mode':'any'}) AND place.bounding_box IS NOT unknown 
> AND datetime('2017-02-25T00:00:00') <= create_at AND  create_at < datetime('2017-02-26T00:00:00')
> GROUP BY cell;
> {code}
> Exact predicates on the filter
> {code}
> datetime('2017-02-25T00:00:00') <= create_at AND  create_at < datetime('2017-02-26T00:00:00')
> {code}
> {code}
> unnest-map [$$64, $$69, $$70] <- index-search("text_idx", 2, "twitter", "ds_tweet", FALSE, FALSE, 5, null, 21, TRUE, 1, $$63) with filter on min:[] max:[$$67, $$68]
>                                         -- SINGLE_PARTITION_INVERTED_INDEX_SEARCH  |PARTITIONED|
>                                           exchange
>                                           -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>                                             assign [$$67, $$68, $$63] <- [datetime: { 2017-02-26T00:00:00.000Z }, datetime: { 2017-02-25T00:00:00.000Z }, array: [ "trump" ]]
>                                             -- ASSIGN  |PARTITIONED|
>                                               empty-tuple-source
>                                               -- EMPTY_TUPLE_SOURCE  |PARTITIONED|
> {code}
> However, for the following query(just switched the location of datetime and create_at at the end of the predicates), it shows another incorrect plan.
> {code}
> SELECT spatial_cell(get_points(place.bounding_box)[0], create_point(0.0,0.0),1.0,1.0) AS cell, count(*) AS cnt FROM ds_tweet
> WHERE ftcontains(text, ['trump'], {'mode':'any'}) AND place.bounding_box IS NOT unknown 
> AND datetime('2017-02-25T00:00:00') <= create_at AND  datetime('2017-02-26T00:00:00') > create_at
> GROUP BY cell;
> {code}
> Exact predicates on the filter:
> {code}
> datetime('2017-02-25T00:00:00') <= create_at AND  datetime('2017-02-26T00:00:00') > create_at
> {code}
> {code}
> unnest-map [$$64, $$69, $$70] <- index-search("text_idx", 2, "twitter", "ds_tweet", FALSE, FALSE, 5, null, 21, TRUE, 1, $$63) with filter on min:[$$67] max:[$$68]
>                                         -- SINGLE_PARTITION_INVERTED_INDEX_SEARCH  |PARTITIONED|
>                                           exchange
>                                           -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>                                             assign [$$67, $$68, $$63] <- [datetime: { 2017-02-26T00:00:00.000Z }, datetime: { 2017-02-25T00:00:00.000Z }, array: [ "trump" ]]
>                                             -- ASSIGN  |PARTITIONED|
>                                               empty-tuple-source
>                                               -- EMPTY_TUPLE_SOURCE  |PARTITIONED|
> {code}


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)