gobblin-dev mailing list archives

From "Jay Sen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
Date Fri, 03 May 2019 00:36:00 GMT

    [ https://issues.apache.org/jira/browse/GOBBLIN-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832130#comment-16832130 ]

Jay Sen commented on GOBBLIN-707:

Sure, that would be even better, but it will require further refactoring of the Java
classes for statestore-checker and others to bring them under the {{Alias}} and make them go
through {{GobblinCli}}. I will do that once you confirm the following syntax for that case:
gobblin --help
gobblin.sh cli <cli-command> <params>
gobblin.sh service <execution-mode> <start|stop|status>

Argument Options:
<cli-command> admin, jobs, statestore-check, statestore-clean, historystore-manager
<execution-mode> standalone, cluster-master, cluster-worker, aws, yarn, mapreduce, service-manager.

--cluster-name Name of the cluster to be used by helix & other services. ( default: gobblin_cluster).
--conf-dir <path-of-conf-dir> Gobblin config path. Default is '$GOBBLIN_HOME/conf/<exe-mode-name>'.
--log4j-conf <path-of-log4j-file> default is '$GOBBLIN_HOME/conf/<exe-mode-name>/log4j.properties'.
--jvmopts <jvm or gc options> String containing JVM flags to include, in addition to
"-Xmx1g -Xms512m".
--jars <csv list of extra jars> Comma-separated list of extra jars to put on the CLASSPATH.
--enable-gc-logs enables gc logs & dumps.
--show-classpath prints gobblin runtime classpath.
--jt <resource manager URL> Only for mapreduce mode: Job submission URL, if not set,
taken from ${HADOOP_HOME}/conf.
--fs <file system URL> Only for mapreduce mode: Target file system, if not set, taken
from ${HADOOP_HOME}/conf.
--help Display this help.
--verbose Display full command used to start the process.
Gobblin Version: 0.15.0
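
To make the proposed syntax concrete, here is a minimal sketch of how the top-level argument check could look. It is an illustration under assumptions, not the actual gobblin.sh: the function names (contains, default_conf_dir, parse_command) and the one-line summary it prints are hypothetical. Only the command names, the execution modes, and the --conf-dir default are taken from the help text above.

```shell
# Sketch only: validates the two proposed call shapes and resolves the
# default config dir. Names and output format are hypothetical.

CLI_COMMANDS="admin jobs statestore-check statestore-clean historystore-manager"
EXECUTION_MODES="standalone cluster-master cluster-worker aws yarn mapreduce service-manager"

# Succeeds if $1 appears as a whole word in the space-separated list $2.
contains() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints the default config dir for execution mode $1,
# following the --conf-dir description in the help text.
default_conf_dir() {
  echo "${GOBBLIN_HOME}/conf/$1"
}

# Parses "<cli|service> <name> [action]" and prints a one-line summary,
# or prints an error to stderr and returns 1.
parse_command() {
  case "$1" in
    cli)
      contains "$2" "$CLI_COMMANDS" || { echo "unknown cli command: $2" >&2; return 1; }
      echo "cli $2"
      ;;
    service)
      contains "$2" "$EXECUTION_MODES" || { echo "unknown execution mode: $2" >&2; return 1; }
      case "$3" in
        start|stop|status) ;;
        *) echo "unknown action: $3" >&2; return 1 ;;
      esac
      echo "service $2 $3 conf=$(default_conf_dir "$2")"
      ;;
    *)
      echo "usage: gobblin.sh <cli|service> ..." >&2
      return 1
      ;;
  esac
}
```

For example, with GOBBLIN_HOME=/opt/gobblin, "parse_command service yarn start" prints "service yarn start conf=/opt/gobblin/conf/yarn", while "parse_command service yarn restart" fails with an error on stderr.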

By the way, all the removed scripts have been incorporated into the gobblin.sh changes above in
one way or another; I will double-check that anyway.


> combine & standardize all gobblin scripts into one master script & restructure
configs accordingly
> --------------------------------------------------------------------------------------------------
>                 Key: GOBBLIN-707
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-707
>             Project: Apache Gobblin
>          Issue Type: Improvement
>            Reporter: Jay Sen
>            Priority: Major
>          Time Spent: 5h 40m
>  Remaining Estimate: 0h
> Gobblin supports multiple execution modes (CLI, standalone, cluster-master, cluster-worker,
AWS, YARN, MR) and various command-line utilities to run CLI and admin commands. There is
an individual script for each of them.
> Having individual scripts introduces a lot of issues:
>  # all scripts handle Gobblin variables and user parameters differently, and the handling is
highly inconsistent across the various Gobblin scripts
>  # functionality around start, stop, status checking, and PID handling, among many
other things, varies widely with each script's implementation.
>  # features like GC & JVM params, log4j file selection, classpath calculation, etc.
exist in some Gobblin scripts but not all, adding to an inconsistent user experience.
>  # maintaining a total of 13 scripts would be too much effort.
> Also, all the Gobblin scripts share a lot of common code to handle params, start and stop services,
status checks, PID handling, etc. Combining all the scripts into one not only makes maintenance
easier but also brings clarity and consistency.
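
The shared start, stop, status, and PID handling mentioned above could be factored into a few functions along these lines. This is only a sketch under assumptions: PID_DIR, the PID file naming, and the function names are hypothetical and not taken from the existing scripts, and a real script would send output to a log file rather than /dev/null.

```shell
# Sketch only: PID-file based start/stop/status shared by every mode.
# PID_DIR and all function names are hypothetical.

PID_DIR="${PID_DIR:-/tmp}"

# Prints the PID file path for mode $1.
pid_file() {
  echo "$PID_DIR/gobblin-$1.pid"
}

# Succeeds if the PID file for mode $1 exists and names a live process.
is_running() {
  [ -f "$(pid_file "$1")" ] && kill -0 "$(cat "$(pid_file "$1")")" 2>/dev/null
}

# Starts the command given after the mode name in the background,
# unless that mode is already running, and records its PID.
start_service() {
  mode="$1"
  shift
  if is_running "$mode"; then
    echo "$mode is already running"
    return 1
  fi
  "$@" >/dev/null 2>&1 &
  echo $! > "$(pid_file "$mode")"
  echo "$mode started"
}

# Stops mode $1 if it is running and always clears its PID file.
stop_service() {
  if is_running "$1"; then
    kill "$(cat "$(pid_file "$1")")"
    echo "$1 stopped"
  else
    echo "$1 is not running"
  fi
  rm -f "$(pid_file "$1")"
}

# Reports whether mode $1 is running.
status_service() {
  if is_running "$1"; then
    echo "$1 is running"
  else
    echo "$1 is not running"
  fi
}
```

With one implementation of these three actions, every execution mode would report status and handle stale PID files the same way.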
> Solution:
> 1. There can be one gobblin.sh script to handle all gobblin commands and deployment options,
as per the following signature. NOTE: This
> {{gobblin.sh  <command> <params>}}
>  {{gobblin.sh  <execution-mode> <start|stop|status>}}
> {{commands values: admin, cli, statestore-check, statestore-clean, historystore-manager}}
>  {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, service}}
> With the above change, the following become valid commands.
> {code:java}
> # all under GobblinCli class
> gobblin run listQuickApps -> gobblin cli run listQuickApps
> gobblin run <quick-app-name> -> gobblin cli run <quick-app-name>
> # class: JobStateToJsonConverter
> statestore-checker.sh <args> -> gobblin statestore-checker <args>
> # class: StateStoreCleaner
> statestore-clean.sh <args> -> gobblin statestore-clean <args>
> # class: DatabaseJobHistoryStoreSchemaManager
> historystore-manager.sh <args> -> gobblin historystore-manager <args>
> # class: Cli
> gobblin-admin.sh <args>   -> gobblin admin <args>
> # all gobblin deployment modes
> gobblin-cluster-master.sh   -> gobblin cluster-master start|stop|status
> gobblin-cluster-worker.sh   -> gobblin cluster-master start|stop|status
> gobblin-compaction.sh       -> gobblin cluster-master start|stop|status
> gobblin-env.sh              -> gobblin cluster-master start|stop|status
> gobblin-mapreduce.sh        -> gobblin cluster-master start|stop|status
> gobblin-service.sh          -> gobblin cluster-master start|stop|status
> gobblin-standalone.sh       -> gobblin cluster-master start|stop|status
> gobblin-yarn.sh             -> gobblin cluster-master start|stop|status
> {code}
> 2. Also, configs need to be structured and deduplicated accordingly to make it clear which
config will be picked up for which execution mode.
>  {color:#FF0000}
>  NOTE: this refactoring to gobblin.sh changes the way all gobblin commands were run

This message was sent by Atlassian JIRA
