asterixdb-dev mailing list archives

From Taewoo Kim <wangs...@gmail.com>
Subject Re: Function name change: contains() -> string-contains()
Date Thu, 15 Sep 2016 23:26:17 GMT
@Till: we could add parameters for whether the given search is an AND or OR
search, which stop list to use, and/or which stemming method to apply. For
example, if we use ftcontains(), then it might look like:

1) where ftcontains($o.title, "hello"): find $o where the title field
contains hello.
2) where ftcontains($o.title, ["hello","hi"], any): find $o where the title
field contains hello *and/or* hi.
3) where ftcontains($o.title, ["hello","hi"], all): find $o where the title
field contains both hello *and* hi.
4) where ftcontains($o.title, ["hello","hi"], all, defaultstoplist): find
$o where the title field contains both hello *and* hi. Also apply the
default stoplist to the search. The default stop list contains a number
of common English words that can be filtered out.
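
To make the intended semantics of the four forms concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not AsterixDB code: the tokenizer, the contents of DEFAULT_STOPLIST, the default mode of "any", and the choice to return False when every term is a stop word are all hypothetical.

```python
import re

# Hypothetical stand-in for the "default stop list"; the real contents
# are not given in this thread.
DEFAULT_STOPLIST = frozenset({"a", "an", "and", "the", "of", "in", "to"})


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ftcontains(field, terms, mode="any", stoplist=None):
    """Return True if `field` matches `terms` under the given mode.

    mode="any" is a disjunctive (OR) search, mode="all" is a
    conjunctive (AND) search. Words in `stoplist` are ignored on
    both the field side and the query side.
    """
    if isinstance(terms, str):
        terms = [terms]  # form 1: a single search word
    stop = frozenset(stoplist) if stoplist else frozenset()
    field_tokens = set(tokenize(field)) - stop
    wanted = [t.lower() for t in terms if t.lower() not in stop]
    if not wanted:
        # Assumption: a query made only of stop words matches nothing.
        return False
    if mode == "all":
        return all(t in field_tokens for t in wanted)
    if mode == "any":
        return any(t in field_tokens for t in wanted)
    raise ValueError(f"unknown mode: {mode!r}")
```

With this sketch, ftcontains("hi there", ["hello", "hi"], "any") matches, while the same call with "all" does not, because "hello" is missing from the field.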

The issue here is that the position of each parameter must be observed
(e.g., the third one indicates whether we do a disjunctive or conjunctive
search, and the fourth one tells us which stop list we use). So, if we have
three such parameters, how to specify or omit each of them becomes a challenge.
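
The positional problem can be sketched in Python as follows. This is a hypothetical illustration, not a proposal for the actual AQL syntax: the option names (mode, stoplist, stemming), their order, and their defaults are assumptions made for the example.

```python
# Assumed option order and defaults, for illustration only.
OPTION_ORDER = ("mode", "stoplist", "stemming")
DEFAULTS = {"mode": "any", "stoplist": None, "stemming": None}


def resolve_options(*positional, **named):
    """Merge positional and named search options into one dict.

    Positional values are bound strictly by position, so a later
    option cannot be given without also giving the earlier ones.
    Named values can be given in any combination.
    """
    if len(positional) > len(OPTION_ORDER):
        raise TypeError("too many positional options")
    given = dict(zip(OPTION_ORDER, positional))
    for name, value in named.items():
        if name not in DEFAULTS:
            raise TypeError(f"unknown option: {name}")
        if name in given:
            raise TypeError(f"option given twice: {name}")
        given[name] = value
    options = dict(DEFAULTS)
    options.update(given)
    return options


# Positional: the stop list can only be reached by also passing the mode.
print(resolve_options("all", "defaultstoplist"))

# Positional with the mode omitted: the stop list is misread as the mode.
print(resolve_options("defaultstoplist"))

# Named: the mode can be omitted while the stop list is still specified.
print(resolve_options(stoplist="defaultstoplist"))
```

The second call shows the challenge: with purely positional parameters, there is no way to skip the mode and still name a stop list unless some placeholder or named form is introduced.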



Best,
Taewoo

On Thu, Sep 15, 2016 at 4:12 PM, Till Westmann <tillw@apache.org> wrote:

> Makes sense to me (especially as I always think about this specific one as
> "ftcontains" :) ).
>
> Another thing you mentioned is about the parameters that will get added in the
> future. Could you provide an example for this?
>
> Cheers,
> Till
>
> On 15 Sep 2016, at 15:37, Taewoo Kim wrote:
>
>> Maybe we could come up with a function form - *ftcontains*(). Here, ft is
>> an abbreviation for full-text. This function replaces "contains text" in
>> XQuery spec. An example might be:
>>
>> XQuery spec: where $o.title contains text "hello"
>> AQL: where ftcontains($o.title, "hello")
>>
>> Best,
>> Taewoo
>>
>> On Thu, Sep 15, 2016 at 3:18 PM, Taewoo Kim <wangsaeu@gmail.com> wrote:
>>
>> @Till: Got it. I agree with your opinion. The issue here for the full-text
>>> search is that many function parameters that control the behavior of
>>> full-text search will be added in the future. Maybe this is not an
>>> issue? :-)
>>>
>>> Best,
>>> Taewoo
>>>
>>> On Thu, Sep 15, 2016 at 3:11 PM, Till Westmann <tillw@apache.org> wrote:
>>>
>>> Hi,
>>>>
>>>> I think that our challenge here is that XQuery is very liberal in the
>>>> introduction of new keywords, as the grammar is keyword free. However, they
>>>> often use combinations of words ("contains" "text") to disambiguate.
>>>> AQL, on the other hand, is not keyword free, and so each time we introduce a
>>>> new one, we create a backwards compatibility problem. It seems that for AQL,
>>>> using a function-based syntax would create fewer problems.
>>>>
>>>> Cheers,
>>>> Till
>>>>
>>>> On 2 Mar 2016, at 18:25, Taewoo Kim wrote:
>>>>
>>>> Hello All,
>>>>
>>>>>
>>>>> I would like to suggest renaming an existing function. I am currently
>>>>> working on Full Text Search features. The XQuery Full-text search spec [1]
>>>>> states that for a full-text search, the syntax is
>>>>> *RangeExpr ( "contains" "text" FTSelection FTIgnoreOption? )?*.
>>>>> As you see, we are going to use "contains text something". And we already
>>>>> have a contains() function [2] that does a substring match. So, in order to
>>>>> remove possible ambiguities between the two features, *contains()* will be
>>>>> renamed to *string-contains()* when I merge my index-only branch to the
>>>>> master, if there is no strong opinion on this. I will send another note as
>>>>> my merge progresses. Thank you.
>>>>>
>>>>> [1] https://www.w3.org/TR/xpath-full-text-10/#doc-xquery10-FTContainsExpr
>>>>>
>>>>> [2] https://asterix-jenkins.ics.uci.edu/job/asterix-test-full/site/asterix-doc/aql/functions.html#StringFunctions
>>>>>
>>>>> Best,
>>>>> Taewoo
>>>>>
>>>>>
>>>>
>>>
