asterixdb-dev mailing list archives

From "Till Westmann" <ti...@apache.org>
Subject Re: Function name change: contains() -> string-contains()
Date Fri, 16 Sep 2016 00:41:03 GMT
One option to express those parameters would be to pass in a
(compile-time constant) record/object. E.g.

     where ftcontains($o.title, ["hello","hi"],
                      { "combine": "and", "stop list": "default" })

That way we could have named optional parameters (please ignore the
ugliness of my chosen parameters), which avoid the problem of dealing
with positions.
We do have a nested datamodel, so we could put it to good use here :)

Does this make sense?

Cheers,
Till
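
The options-record idea above can be sketched in Python. This is only a toy illustration of named optional parameters, not AsterixDB code: the function body, the default of "or" for "combine", the stop-word set, and the whitespace tokenization are all assumptions made for the example.

```python
# Hypothetical sketch: none of this is AsterixDB's implementation.
# Illustrative stop words only; the real default stop list is not given in the thread.
DEFAULT_STOP_LIST = frozenset({"a", "an", "the", "of", "and", "or"})

# Option names taken from the example record in the message above.
ALLOWED_OPTIONS = {"combine", "stop list"}


def ftcontains(text, terms, options=None):
    """Toy token match whose behavior is controlled by a record of named options."""
    options = options or {}
    unknown = set(options) - ALLOWED_OPTIONS
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")

    # Assumed defaults: disjunctive search, no stop list.
    opts = {"combine": "or", "stop list": None}
    opts.update(options)
    if opts["combine"] not in ("and", "or"):
        raise ValueError("combine must be 'and' or 'or'")
    if opts["stop list"] not in (None, "default"):
        raise ValueError("stop list must be 'default' or omitted")

    if isinstance(terms, str):
        terms = [terms]
    stop = DEFAULT_STOP_LIST if opts["stop list"] == "default" else frozenset()

    # Naive whitespace tokenization; punctuation and stemming are ignored here.
    tokens = {t for t in text.lower().split() if t not in stop}
    wanted = [t.lower() for t in terms if t.lower() not in stop]
    if not wanted:
        return False

    combine = all if opts["combine"] == "and" else any
    return combine(t in tokens for t in wanted)
```

Because each option is looked up by name, a caller can pass {"stop list": "default"} without also having to supply "combine", which is the positional problem the record is meant to avoid.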

On 15 Sep 2016, at 16:26, Taewoo Kim wrote:

> @Till: we can add whether the given search is an AND/OR search, the stop
> list, and/or the stemming method. For example, if we use ftcontains(),
> then it might look like:
>
> 1) where ftcontains($o.title, "hello"): find $o where the title field
> contains hello.
> 2) where ftcontains($o.title, ["hello","hi"], any): find $o where the
> title field contains hello *and/or* hi.
> 3) where ftcontains($o.title, ["hello","hi"], all): find $o where the
> title field contains both hello *and* hi.
> 4) where ftcontains($o.title, ["hello","hi"], all, defaultstoplist):
> find $o where the title field contains both hello *and* hi. Also apply
> the default stop list to the search. The default stop list contains a
> number of common English words that can be filtered out.
>
> The issue here is that the position of each parameter must be observed
> (e.g., the third one indicates whether we do a disjunctive or conjunctive
> search; the fourth one tells us which stop list we use). So, if we have
> three parameters, how to specify or omit them becomes a challenge.
>
> Best,
> Taewoo
>
> On Thu, Sep 15, 2016 at 4:12 PM, Till Westmann <tillw@apache.org> wrote:
>
>> Makes sense to me (especially as I always think about this specific one
>> as "ftcontains" :) ).
>>
>> Another thing you mentioned is about the parameters that will get added
>> in the future. Could you provide an example for this?
>>
>> Cheers,
>> Till
>>
>> On 15 Sep 2016, at 15:37, Taewoo Kim wrote:
>>
>>> Maybe we could come up with a function form: *ftcontains*(). Here, ft
>>> is an abbreviation for full-text. This function replaces "contains
>>> text" in the XQuery spec. An example might be:
>>>
>>> XQuery spec: where $o.title contains text "hello"
>>> AQL: where ftcontains($o.title, "hello")
>>>
>>> Best,
>>> Taewoo
>>>
>>> On Thu, Sep 15, 2016 at 3:18 PM, Taewoo Kim <wangsaeu@gmail.com> wrote:
>>>
>>>> @Till: Got it. I agree with your opinion. The issue here for the
>>>> full-text search is that many function parameters that control the
>>>> behavior of full-text search will be added in the future. Maybe this
>>>> is not the issue? :-)
>>>>
>>>> Best,
>>>> Taewoo
>>>>
>>>> On Thu, Sep 15, 2016 at 3:11 PM, Till Westmann <tillw@apache.org> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I think that our challenge here is that XQuery is very liberal in the
>>>>> introduction of new keywords, as the grammar is keyword free. However,
>>>>> they often use combinations of words ("contains" "text") to
>>>>> disambiguate. AQL, on the other hand, is not keyword free, and so each
>>>>> time we introduce a new one, we create a backwards compatibility
>>>>> problem. It seems that for AQL using a function-based syntax would
>>>>> create fewer problems.
>>>>>
>>>>> Cheers,
>>>>> Till
>>>>>
>>>>> On 2 Mar 2016, at 18:25, Taewoo Kim wrote:
>>>>>
>>>>>> Hello All,
>>>>>>
>>>>>> I would like to suggest renaming an existing function. I am currently
>>>>>> working on Full Text Search features. The XQuery Full-text search
>>>>>> spec [1] states that for a full-text search, the syntax is
>>>>>> *RangeExpr ( "contains" "text" FTSelection FTIgnoreOption? )?*.
>>>>>> As you see, we are going to use "contains text something". And we
>>>>>> already have a contains() function [2] that does a substring match.
>>>>>> So, in order to remove possible ambiguities between the two
>>>>>> features, *contains()* will be renamed to *string-contains()* when I
>>>>>> merge my index-only branch to the master if there is no strong
>>>>>> opinion on this. I will send another note as my merge progresses.
>>>>>> Thank you.
>>>>>>
>>>>>> [1] https://www.w3.org/TR/xpath-full-text-10/#doc-xquery10-FTContainsExpr
>>>>>>
>>>>>> [2] https://asterix-jenkins.ics.uci.edu/job/asterix-test-full/site/asterix-doc/aql/functions.html#StringFunctions
>>>>>>
>>>>>> Best,
>>>>>> Taewoo
