asterixdb-dev mailing list archives

From Taewoo Kim <wangs...@gmail.com>
Subject Re: Function name change: contains() -> string-contains()
Date Fri, 16 Sep 2016 00:58:50 GMT
@Till: I see. Thanks for the suggestion. It's clearer now.

Best,
Taewoo

On Thu, Sep 15, 2016 at 5:58 PM, Till Westmann <tillw@apache.org> wrote:

> And as it turns out, we already have some infrastructure to translate a
> constant record constructor expression into a record in
> LangRecordParseUtil.
> So supporting that wouldn’t be too painful.
>
> Cheers,
> Till
>
>
> On 15 Sep 2016, at 17:41, Till Westmann wrote:
>
>> One option to express those parameters would be to pass in a (compile-time
>> constant) record/object. E.g.
>>
>>     where ftcontains($o.title, ["hello","hi"],
>>                      { "combine": "and", "stop list": "default" })
>>
>> That way we could have named optional parameters (please ignore the
>> ugliness of my chosen parameters) which avoid the problem of dealing with
>> positions. We do have a nested datamodel, so we could put it to good use
>> here :)
>>
>> Does this make sense?
>>
>> Cheers,
>> Till
>>
>> On 15 Sep 2016, at 16:26, Taewoo Kim wrote:
>>
>>> @Till: we can add parameters for whether the given search is an AND/OR
>>> search, the stop list, and/or the stemming method. For example, if we use
>>> ftcontains(), then it might look like:
>>>
>>> 1) where ftcontains($o.title, "hello"): find $o where the title field
>>> contains hello.
>>> 2) where ftcontains($o.title, ["hello","hi"], any): find $o where the
>>> title field contains hello *and/or* hi.
>>> 3) where ftcontains($o.title, ["hello","hi"], all): find $o where the
>>> title field contains both hello *and* hi.
>>> 4) where ftcontains($o.title, ["hello","hi"], all, defaultstoplist): find
>>> $o where the title field contains both hello *and* hi. Also apply the
>>> default stoplist to the search. The default stop list contains a number
>>> of common English words that can be filtered out.
>>>
>>> The issue here is that the position of each parameter should be observed
>>> (e.g., the third one indicates whether we do disjunctive/conjunctive
>>> search. The fourth one tells us which stop list we use). So, if we have
>>> three parameters, how to specify/omit these becomes a challenge.
>>>
>>> Best,
>>> Taewoo
>>>
>>> On Thu, Sep 15, 2016 at 4:12 PM, Till Westmann <tillw@apache.org> wrote:
>>>
>>>> Makes sense to me (especially as I always think about this specific one
>>>> as "ftcontains" :) ).
>>>>
>>>> Another thing you mentioned is about the parameters that will get added
>>>> in the future. Could you provide an example for this?
>>>>
>>>> Cheers,
>>>> Till
>>>>
>>>> On 15 Sep 2016, at 15:37, Taewoo Kim wrote:
>>>>
>>>>> Maybe we could come up with a function form - *ftcontains*(). Here, ft
>>>>> is an abbreviation for full-text. This function replaces "contains text"
>>>>> in the XQuery spec. An example might be:
>>>>>
>>>>> XQuery spec: where $o.title contains text "hello"
>>>>> AQL: where ftcontains($o.title, "hello")
>>>>>
>>>>> Best,
>>>>> Taewoo
>>>>>
>>>>> On Thu, Sep 15, 2016 at 3:18 PM, Taewoo Kim <wangsaeu@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> @Till: Got it. I agree with your opinion. The issue here for
>>>>>> full-text search is that many function parameters that control the
>>>>>> behavior of full-text search will be added in the future. Maybe this
>>>>>> is not the issue? :-)
>>>>>>
>>>>>> Best,
>>>>>> Taewoo
>>>>>>
>>>>>> On Thu, Sep 15, 2016 at 3:11 PM, Till Westmann <tillw@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I think that our challenge here is that XQuery is very liberal in
>>>>>>> the introduction of new keywords, as the grammar is keyword free.
>>>>>>> However, they often use combinations of words ("contains" "text") to
>>>>>>> disambiguate. AQL, on the other hand, is not keyword free, and so each
>>>>>>> time we introduce a new one, we create a backwards compatibility
>>>>>>> problem. It seems that for AQL using a function-based syntax would
>>>>>>> create fewer problems.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Till
>>>>>>>
>>>>>>> On 2 Mar 2016, at 18:25, Taewoo Kim wrote:
>>>>>>>
>>>>>>>> Hello All,
>>>>>>>>
>>>>>>>> I would like to suggest a name change for a current function. I am
>>>>>>>> currently working on Full Text Search features. The XQuery Full-text
>>>>>>>> search spec [1] states that for a full-text search, the syntax is
>>>>>>>> *RangeExpr ( "contains" "text" FTSelection FTIgnoreOption? )?*. As you
>>>>>>>> see, we are going to use "contains text something". And we already
>>>>>>>> have a contains() function [2] that does a substring match. So, in
>>>>>>>> order to remove possible ambiguities between the two features,
>>>>>>>> *contains()* will be renamed to *string-contains()* when I merge my
>>>>>>>> index-only branch to the master if there is no strong opinion on
>>>>>>>> this. I will send another note as my merge progresses. Thank you.
>>>>>>>>
>>>>>>>> [1] https://www.w3.org/TR/xpath-full-text-10/#doc-xquery10-FTContainsExpr
>>>>>>>>
>>>>>>>> [2] https://asterix-jenkins.ics.uci.edu/job/asterix-test-full/site/asterix-doc/aql/functions.html#StringFunctions
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Taewoo
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>
>
>
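Illustrative sketch (not from the thread, and not AsterixDB code): the
difference between the substring match of contains()/string-contains() and a
token-based ftcontains() that takes its optional parameters as a single
record, as Till suggests above, might look roughly like this in Python. The
option keys "combine" and "stop list" are taken from Till's example; the
tokenizer, the tiny stop list, the "or" value and default, and the handling of
unknown keys are all assumptions made only for this illustration.

```python
import re

# Hypothetical stop list for illustration; not AsterixDB's default stop list.
DEFAULT_STOP_LIST = frozenset({"a", "an", "and", "of", "the", "to"})


def string_contains(text, needle):
    """Substring match, as the thread describes the existing contains()."""
    return needle in text


def ftcontains(text, terms, options=None):
    """Token-based match with named optional parameters passed as a record.

    Assumed option keys: "combine" ("and" or "or") and "stop list"
    (None or "default"). Omitted keys fall back to the defaults below.
    """
    opts = {"combine": "or", "stop list": None}  # assumed defaults
    given = dict(options or {})
    unknown = set(given) - set(opts)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    opts.update(given)
    if opts["combine"] not in ("and", "or"):
        raise ValueError("combine must be 'and' or 'or'")

    if isinstance(terms, str):
        terms = [terms]
    stop = DEFAULT_STOP_LIST if opts["stop list"] == "default" else frozenset()

    # Naive tokenizer (an assumption): lowercase runs of word characters.
    tokens = set(re.findall(r"\w+", text.lower())) - stop
    wanted = [t.lower() for t in terms if t.lower() not in stop]
    if not wanted:
        return False

    match = all if opts["combine"] == "and" else any
    return match(t in tokens for t in wanted)
```

Because the options travel in one record, a caller can set the stop list
without also spelling out the combine mode, e.g.
ftcontains(title, ["hello", "hi"], {"stop list": "default"}), which is the
positional problem Taewoo raises for the four-argument form.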
