asterixdb-dev mailing list archives

From abdullah alamoudi <bamou...@gmail.com>
Subject Re: MultiTransactionJobletEventListenerFactory
Date Thu, 16 Nov 2017 23:10:21 GMT
We are using multiple transactions in a single job in the case of feeds, and I think that this is
the correct way.
Having a single job for a feed that feeds into multiple datasets is a good thing, since job
resources and feed resources are consolidated.

Here are some points:
- We can't use the same transaction id to feed multiple datasets. The only other option is
to have multiple jobs, each feeding a different dataset.
- Having multiple jobs (in addition to the extra memory and CPU used) would then
force us either to read data from external sources multiple times, parse records multiple
times, etc.,
  or to synchronize the different jobs with the feed source within
AsterixDB. IMO, this is far more complicated than having multiple transactions within a single
job, and the costs far outweigh the benefits.
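
To make the trade-off concrete, here is a rough sketch of the single-job approach. It is purely illustrative and is not AsterixDB or Hyracks code: the class MultiTxnFeedJob and all of its methods are hypothetical names, and the "parsing" is a stand-in. The point is only that each record is read and parsed once, then written to every dataset under that dataset's own transaction id.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types exist in AsterixDB or Hyracks.
final class MultiTxnFeedJob {
    // One transaction id per target dataset; ids are never shared.
    private final Map<String, Long> txnIdByDataset = new HashMap<>();
    // Records written under each transaction id.
    private final Map<Long, List<String>> writesByTxn = new HashMap<>();
    // Counts how many times a source record was parsed.
    private int parseCount = 0;

    MultiTxnFeedJob(List<String> datasets, long firstTxnId) {
        long next = firstTxnId;
        for (String dataset : datasets) {
            txnIdByDataset.put(dataset, next);
            writesByTxn.put(next, new ArrayList<>());
            next++;
        }
    }

    // Parse the record once, then route it to every dataset,
    // each write going through that dataset's own transaction.
    void ingest(String rawRecord) {
        String parsed = rawRecord.trim(); // stand-in for real parsing
        parseCount++;
        for (long txnId : txnIdByDataset.values()) {
            writesByTxn.get(txnId).add(parsed);
        }
    }

    long txnIdFor(String dataset) {
        return txnIdByDataset.get(dataset);
    }

    List<String> writesFor(String dataset) {
        return writesByTxn.get(txnIdByDataset.get(dataset));
    }

    int parseCount() {
        return parseCount;
    }
}
```

With one job per dataset instead, the same record would have to be read and parsed once per job, or the jobs would need to coordinate around a shared feed source.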

P.S.
We are also using this for bucket connections in Couchbase Analytics.

> On Nov 16, 2017, at 2:57 PM, Till Westmann <tillw@apache.org> wrote:
> 
> If there are a number of issues with supporting multiple transaction ids
> and no clear benefits/use-cases, I’d vote for simplification :)
> Also, code that’s not being used has a tendency to "rot" and so I think
> that its usefulness might be limited by the time we’d find a use for
> this functionality.
> 
> My 2c,
> Till
> 
> On 16 Nov 2017, at 13:57, Xikui Wang wrote:
> 
>> I'm separating the connections into different jobs in some of my
>> experiments... but that was intended to be used for the experimental
>> settings (i.e., not for master now)...
>> 
>> I think the interesting question here is whether we want to allow one
>> Hyracks job to carry multiple transactions. I personally think that should
>> be allowed as the transaction and job are two separate concepts, but I
>> couldn't find such use cases other than the feeds. Does anyone have a good
>> example of this?
>> 
>> Another question is, if we do allow multiple transactions in a single
>> Hyracks job, how do we enable the commit runtime to obtain the correct TXN id
>> without having that embedded as part of the job specification?
>> 
>> Best,
>> Xikui
>> 
>> On Thu, Nov 16, 2017 at 1:01 PM, abdullah alamoudi <bamousaa@gmail.com>
>> wrote:
>> 
>>> I am curious as to how feed will work without this?
>>> 
>>> ~Abdullah.
>>>> On Nov 16, 2017, at 12:43 PM, Steven Jacobs <sjaco002@ucr.edu> wrote:
>>>> 
>>>> Hi all,
>>>> We currently have MultiTransactionJobletEventListenerFactory, which allows
>>>> for one Hyracks job to run multiple Asterix transactions together.
>>>> 
>>>> This class is only used by feeds, and feeds are in the process of changing to
>>>> no longer need this feature. As part of the work in pre-deploying job
>>>> specifications to be used by multiple Hyracks jobs, I've been working on
>>>> removing the transaction id from the job specifications, as we use a new
>>>> transaction for each invocation of a deployed job.
>>>> 
>>>> There is currently no clear way to remove the transaction id from the job
>>>> spec and keep the option for MultiTransactionJobletEventListenerFactory.
>>>> 
>>>> The question for the group is, do we see a need to maintain this class that
>>>> will no longer be used by any current code? Or, in other words, is there a
>>>> strong possibility that in the future we will want multiple transactions to
>>>> share a single Hyracks job, meaning that it is worth figuring out how to
>>>> maintain this class?
>>>> 
>>>> Steven
>>> 
>>> 

