gobblin-user mailing list archives

From Vicky Kak <vicky....@gmail.com>
Subject Re: Gobblin As Service Questions
Date Wed, 26 Jul 2017 13:21:32 GMT

I cannot yet see my last mail to user@gobblin.incubator.apache.org here.

Shouldn't it show up without much lag? I sent it almost 35 minutes ago.


On Wed, Jul 26, 2017 at 6:15 PM, Vicky Kak <vicky.kak@gmail.com> wrote:

> Hi,
> I spent more time looking at the code details and have the following to
> share.
> I see GobblinServiceManager (the bootstrap class for the Gobblin service)
> doing the following:
> 1) Initialising the TopologyCatalog, FlowCatalog, Helix, ServiceScheduler,
> EmbeddedLiServer and finally the Orchestrator/TopologySpecFactory.
> 2) The FlowConfigClient seems to create the FlowConfig, and then the
> FlowSpec via FlowConfigResource (through the REST endpoint).
> 3) The JobSpec gets added to the FlowCatalog, after which the Orchestrator
> pushes the JobSpec to Kafka via SimpleKafkaStepExecutionProducer.
> I have been looking for code that uses SimpleKafkaStepExecutionConsumer,
> but could not find how it is hooked into a running Gobblin instance.
> Here is how, as I understand it, the Gobblin service will invoke the jobs
> on the slaves (Gobblin instances):
> 1) We should have the REST endpoint information so that we can send the
> JobSpec via FlowConfigClient or via an HTTP GET (REST call; I have not yet
> tried this). I don't see a way to get the port when the REST server is
> started.
> 2) The JobSpec is passed to Kafka by the Gobblin service, via the
> Orchestrator and SimpleKafkaStepExecutionProducer.
> 3) There could be multiple Gobblin instances listening to Kafka using
> SimpleKafkaStepExecutionConsumer, and all of them should receive the
> JobSpecs. The one instance that matches the JobSpec should trigger the job.
> The Gobblin service acts as a master and provides the REST endpoint to
> read/create the JobSpecs, which then get triggered on the slaves (the
> Gobblin instances).
> I have not yet been able to run the flow because I am hitting build issues
> when building Gobblin from master; the tests are failing right now.
> Can someone from the development team validate whether I am on the right
> track in terms of understanding the implementation and flows?
> I have more questions, which I will post after I confirm that I am not
> missing anything.
> Thanks,
> Vicky
> On Tue, Jul 25, 2017 at 5:03 PM, Vicky Kak <vicky.kak@gmail.com> wrote:
>> To my surprise, after I looked at the code and referred to the
>> presentation that Shrishanka had sent, my ignorance about Gobblin as a
>> Service was removed.
>> Gobblin as a Service: it is a global orchestrator that helps in
>> submitting logical flow specifications, which are then compiled into
>> physical pipelines.
>> We have been triggering Gobblin jobs through a REST endpoint, which we
>> do by implementing a custom service as explained here:
>> https://groups.google.com/forum/#!topic/gobblin-users/kHrWh6lfGJM
>> I have the following questions:
>> 1) What is the use case for Gobblin as a Service? I don't see the
>> Orchestrator's REST endpoint port being configurable. If we have to add
>> a FlowSpec from a different machine, we need to know the Orchestrator's
>> host and port details; how do we get them?
>> 2) Does creating a FlowSpec create a new job deployment, which can also
>> be done by copying the corresponding .pull or .job file in the Gobblin
>> distribution?
>> 3) Since the master.out log gets created when starting a service, I
>> assume there could be a way to add more Orchestrators to the master that
>> is started. However, I am not sure how to do that; can this be clarified?
>> Please note that I have been looking at older code; the git log is as
>> follows.
>> ***********************************************************************************************
>> commit 755da9160cd91ea5ebcc752603ce1bffb74a75a1 (HEAD -> master, origin/master, origin/HEAD)
>> Author: Kuai Yu <yukuai518@gmail.com>
>> Date:   Tue Apr 11 19:10:53 2017 -0700
>> ***********************************************************************************************
>> Thanks,
>> Vicky
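
The dispatch flow described in the first quoted message (the service publishes a JobSpec, every Gobblin instance receives it, and only the matching instance triggers the job) can be sketched roughly as below. This is only an in-memory illustration, not Gobblin code: the classes Spec, Topic and Instance are hypothetical stand-ins for the JobSpec, the Kafka topic and a Gobblin instance, and matching on an executor id is an assumption, since the thread does not say how an instance decides that a JobSpec is meant for it.

```java
import java.util.ArrayList;
import java.util.List;

public class DispatchSketch {

    /** Hypothetical stand-in for a JobSpec: a job name plus the executor it targets. */
    static final class Spec {
        final String jobName;
        final String targetExecutor; // assumed matching key, not taken from Gobblin

        Spec(String jobName, String targetExecutor) {
            this.jobName = jobName;
            this.targetExecutor = targetExecutor;
        }
    }

    /** In-memory stand-in for the Kafka topic: every subscriber sees every spec. */
    static final class Topic {
        private final List<Instance> subscribers = new ArrayList<>();

        void subscribe(Instance instance) {
            subscribers.add(instance);
        }

        void publish(Spec spec) {
            for (Instance instance : subscribers) {
                instance.onSpec(spec);
            }
        }
    }

    /** Stand-in for one Gobblin instance consuming specs from the topic. */
    static final class Instance {
        final String executorId;
        final List<String> launched = new ArrayList<>();

        Instance(String executorId) {
            this.executorId = executorId;
        }

        /** Receives every spec, but only records a launch for specs aimed at this instance. */
        void onSpec(Spec spec) {
            if (executorId.equals(spec.targetExecutor)) {
                launched.add(spec.jobName);
            }
        }
    }

    public static void main(String[] args) {
        Topic topic = new Topic();
        Instance a = new Instance("executor-a");
        Instance b = new Instance("executor-b");
        topic.subscribe(a);
        topic.subscribe(b);

        // The "service" side: publish one spec aimed at executor-b.
        topic.publish(new Spec("ingest-orders", "executor-b"));

        System.out.println("executor-a launched " + a.launched);
        System.out.println("executor-b launched " + b.launched);
    }
}
```

Running main prints an empty list for executor-a and [ingest-orders] for executor-b. With a real Kafka topic, every instance would see every record only if each instance consumed under its own consumer group; instances sharing one group would split the records between them.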
