asterixdb-dev mailing list archives

From abdullah alamoudi <bamou...@gmail.com>
Subject Re: Feeds UDF
Date Wed, 09 Dec 2015 14:59:11 GMT
The only problem I see is the Halloween problem in case of a self join,
hence the need for materialization (not sure if it is possible in this case
but definitely possible in general). Other than that, I don't think there
is any problem.

Cheers,
Abdullah
On Dec 8, 2015 11:51 PM, "Mike Carey" <dtabass@gmail.com> wrote:

> (I am still completely not seeing a problem here.)
>
> On 12/8/15 10:20 PM, abdullah alamoudi wrote:
>
>> The plan is to mostly use Upsert in the future since we can do some
>> optimizations with it that we can't do with an insert.
>> We should also support deletes and probably allow a mix of the three
>> operations within the same feed. This is a work in progress right now,
>> but before I go far, I am stabilizing some other parts of the feeds.
>>
>> Cheers,
>> Abdullah.
>>
>>
>> Amoudi, Abdullah.
>>
>> On Tue, Dec 8, 2015 at 10:11 PM, Ildar Absalyamov <ildar.absalyamov@gmail.com> wrote:
>>
>>> Abdullah,
>>>
>>> OK, now I see what problems it will cause.
>>> Kinda related question: could the feed implement “upsert” semantics, that
>>> you’ve been working on, instead of “insert” semantics?
>>>
>>> On Dec 8, 2015, at 21:52, abdullah alamoudi <bamousaa@gmail.com> wrote:
>>>>
>>>> I think that we probably should restrict feed-applied functions somehow
>>>> (needs further thought and discussion) and I know for sure that we don't.
>>>> As for the case you present, I would imagine that it could be allowed
>>>> theoretically but I think everyone sees why it should be disallowed.
>>>>
>>>> One thing to keep in mind is that we introduce a materialize if the dataset
>>>> was part of an insert pipeline. Now think about how this would work with a
>>>> continuous feed. One choice would be that the feed will materialize all
>>>> records to be inserted and once the feed stops, it would start inserting
>>>> them but I still think we should not allow it.
>>>>
>>>> My 2c,
>>>> Any opposing argument?
>>>>
>>>>
>>>> Amoudi, Abdullah.
>>>>
>>>> On Tue, Dec 8, 2015 at 6:28 PM, Ildar Absalyamov <ildar.absalyamov@gmail.com> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> As a part of feed ingestion we do allow preprocessing incoming data
>>>>> with AQL UDFs.
>>>>> I was wondering if we somehow restrict the kind of UDFs that could be
>>>>> used? Do we allow joins in these UDFs? Especially joins with the same
>>>>> dataset that is used for intake. For example:
>>>>>
>>>>> create type TweetType as open {
>>>>>   id: string,
>>>>>   username : string,
>>>>>   location : string,
>>>>>   text : string,
>>>>>   timestamp : string
>>>>> }
>>>>> create dataset Tweets(TweetType)
>>>>> primary key id;
>>>>> create function feed_processor($x) {
>>>>> for $y in dataset Tweets
>>>>> // self-join with Tweets dataset on some predicate($x, $y)
>>>>> return $y
>>>>> }
>>>>> create feed TweetFeed
>>>>> apply function feed_processor;
>>>>>
>>>>> The query above fails at runtime, but I was wondering whether it
>>>>> could work in theory at all.
>>>>>
>>>>> Best regards,
>>>>> Ildar
>>>>>
>>>>>
>>> Best regards,
>>> Ildar
>>>
>>>
>>>
>
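
To make the Halloween problem mentioned in this thread concrete, here is a minimal sketch in Python. It is not AsterixDB code: the list `tweets`, the functions `feed_processor`, `ingest_pipelined`, and `ingest_materialized`, and the derived-id scheme are all hypothetical stand-ins chosen for illustration. It assumes a UDF that scans the same dataset the feed inserts into, and it shows why a pipelined scan can keep seeing its own inserts while a materialized one cannot.

```python
from itertools import islice


def feed_processor(x, tweets):
    """Hypothetical stand-in for the UDF: lazily scan the live dataset and
    yield one derived record per stored tweet by the same user as x."""
    for y in tweets:  # live scan: sees records appended during iteration
        if y["username"] == x["username"]:
            yield {"id": y["id"] + "'", "username": y["username"]}


def ingest_pipelined(x, tweets, limit):
    """Insert each result as soon as it is produced (no materialization).
    The scan then reads its own inserts, so it would never finish;
    `limit` is a safety cap added only for this demonstration."""
    inserted = 0
    for rec in islice(feed_processor(x, tweets), limit):
        tweets.append(rec)
        inserted += 1
    return inserted


def ingest_materialized(x, tweets):
    """Finish the scan first (materialize), then insert the results."""
    pending = list(feed_processor(x, tweets))
    tweets.extend(pending)
    return len(pending)
```

With a single stored tweet by the same user, the pipelined version keeps producing records until it hits the cap, while the materialized version inserts exactly one. The catch discussed above still applies: materializing needs the input to end, which a continuous feed does not do on its own.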
