asterixdb-dev mailing list archives

From "Till Westmann" <ti...@apache.org>
Subject Re: Function name change: contains() -> string-contains()
Date Fri, 16 Sep 2016 00:58:01 GMT
And as it turns out, we already have some infrastructure to translate a
constant record constructor expression into a record in 
LangRecordParseUtil.
So supporting that wouldn’t be too painful.
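
To illustrate why a record of named options sidesteps the positional
problem, here is a minimal sketch in Python. It is not AsterixDB code and
does not use LangRecordParseUtil; the function name resolve_ft_options, the
option names, the allowed values, and the defaults are all assumptions made
for illustration.

```python
# Hypothetical option names, allowed values, and defaults (illustration only).
ALLOWED = {
    "combine": ("and", "or"),
    "stop list": (None, "default"),
}
DEFAULTS = {"combine": "or", "stop list": None}


def resolve_ft_options(options=None):
    """Merge a constant options record with the defaults.

    Any subset of options may be given, in any order; unknown names or
    values are rejected up front, as a compile-time check might do.
    """
    resolved = dict(DEFAULTS)
    for name, value in (options or {}).items():
        if name not in ALLOWED:
            raise ValueError(f"unknown ftcontains option: {name!r}")
        if value not in ALLOWED[name]:
            raise ValueError(f"invalid value {value!r} for option {name!r}")
        resolved[name] = value
    return resolved


# Only the stop list is given; "combine" falls back to its default.
print(resolve_ft_options({"stop list": "default"}))
```

With positional parameters, setting only the stop list would still require
a value in the third position; with a record, the omitted option simply
takes its default.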

Cheers,
Till

On 15 Sep 2016, at 17:41, Till Westmann wrote:

> One option to express those parameters would be to pass in a (compile-time
> constant) record/object. E.g.
>
>     where ftcontains($o.title, ["hello","hi"],
>                      { "combine": "and", "stop list": "default" })
>
> That way we could have named optional parameters (please ignore the
> ugliness of my chosen parameters), which avoids the problem of dealing
> with positions.
> We do have a nested data model, so we could put it to good use here :)
>
> Does this make sense?
>
> Cheers,
> Till
>
> On 15 Sep 2016, at 16:26, Taewoo Kim wrote:
>
>> @Till: we can add whether the given search is an AND/OR search, the stop
>> list, and/or the stemming method. For example, if we use ftcontains(),
>> then it might look like:
>>
>> 1) where ftcontains($o.title, "hello"): find $o where the title field
>> contains hello.
>> 2) where ftcontains($o.title, ["hello","hi"], any): find $o where the title
>> field contains hello *and/or* hi.
>> 3) where ftcontains($o.title, ["hello","hi"], all): find $o where the title
>> field contains both hello *and* hi.
>> 4) where ftcontains($o.title, ["hello","hi"], all, defaultstoplist): find
>> $o where the title field contains both hello *and* hi. Also apply the
>> default stop list to the search. The default stop list contains a number
>> of common English words that can be filtered out.
>>
>> The issue here is that the position of each parameter has to be observed
>> (e.g., the third one indicates whether we do a disjunctive or conjunctive
>> search; the fourth one tells us which stop list we use). So, if we have
>> three parameters, how to specify or omit each of them becomes a challenge.
>>
>> Best,
>> Taewoo
>>
>> On Thu, Sep 15, 2016 at 4:12 PM, Till Westmann <tillw@apache.org> wrote:
>>
>>> Makes sense to me (especially as I always think about this specific one
>>> as "ftcontains" :) ).
>>>
>>> Another thing you mentioned is about the parameters that will get added
>>> in the future. Could you provide an example for this?
>>>
>>> Cheers,
>>> Till
>>>
>>> On 15 Sep 2016, at 15:37, Taewoo Kim wrote:
>>>
>>>> Maybe we could come up with a function form - *ftcontains*(). Here, ft is
>>>> an abbreviation for full-text. This function replaces "contains text" in
>>>> the XQuery spec. An example might be:
>>>>
>>>> XQuery spec: where $o.title contains text "hello"
>>>> AQL: where ftcontains($o.title, "hello")
>>>>
>>>> Best,
>>>> Taewoo
>>>>
>>>> On Thu, Sep 15, 2016 at 3:18 PM, Taewoo Kim <wangsaeu@gmail.com> wrote:
>>>>
>>>>> @Till: Got it. I agree with you. The issue here for the full-text
>>>>> search is that many function parameters that control the behavior of
>>>>> full-text search will be added in the future. Maybe this is not the
>>>>> issue? :-)
>>>>>
>>>>> Best,
>>>>> Taewoo
>>>>>
>>>>> On Thu, Sep 15, 2016 at 3:11 PM, Till Westmann <tillw@apache.org> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I think that our challenge here is that XQuery is very liberal in the
>>>>>> introduction of new keywords, as the grammar is keyword free. However,
>>>>>> they often use combinations of words ("contains" "text") to disambiguate.
>>>>>> AQL, on the other hand, is not keyword free, and so each time we
>>>>>> introduce a new one, we create a backwards compatibility problem. It
>>>>>> seems that for AQL using a function-based syntax would create fewer
>>>>>> problems.
>>>>>>
>>>>>> Cheers,
>>>>>> Till
>>>>>>
>>>>>> On 2 Mar 2016, at 18:25, Taewoo Kim wrote:
>>>>>>
>>>>>>> Hello All,
>>>>>>>
>>>>>>> I would like to suggest a current function name change. I am currently
>>>>>>> working on Full Text Search features. XQuery Full-text search spec [1]
>>>>>>> states that for a full-text search, the syntax is *RangeExpr (
>>>>>>> "contains" "text" FTSelection FTIgnoreOption? )?*. As you see, we are
>>>>>>> going to use "contains text something". And we already have a
>>>>>>> contains() function [2] that does a substring match. So, in order to
>>>>>>> remove possible ambiguities between the two features, *contains()* will
>>>>>>> be renamed to *string-contains()* when I merge my index-only branch to
>>>>>>> the master if there is no strong opinion on this. Thank you. I will
>>>>>>> send another note as my merge progresses. Thank you.
>>>>>>>
>>>>>>> [1] https://www.w3.org/TR/xpath-full-text-10/#doc-xquery10-FTContainsExpr
>>>>>>>
>>>>>>> [2] https://asterix-jenkins.ics.uci.edu/job/asterix-test-full/site/asterix-doc/aql/functions.html#StringFunctions
>>>>>>>
>>>>>>> Best,
>>>>>>> Taewoo
>>>>>>>
>>>>>>>
>>>>>>
>>>>>



