[ https://issues.apache.org/jira/browse/SQOOP-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14346963#comment-14346963
]
Veena Basavaraj commented on SQOOP-1803:
----------------------------------------
Thank you [~jarcec], very nice summary of the options. We did discuss #2 (counters) in SQOOP-1804,
but as you have laid out, it has limitations on what type of state can be stored by the mappers/reducers.
The reason for using a persistent store such as #1 or #3 is that it allows us to create intermediate
state across the parallel tasks. I will lay out this case much better with a code patch soon.
Having said that, at this point #4 can be a simple alternative too, and we can consider a more
elaborate solution as an alternative via a config parameter.
> JobManager and Execution Engine changes: Support for a injecting and pulling out configs
and job output in connectors
> ----------------------------------------------------------------------------------------------------------------------
>
> Key: SQOOP-1803
> URL: https://issues.apache.org/jira/browse/SQOOP-1803
> Project: Sqoop
> Issue Type: Sub-task
> Reporter: Veena Basavaraj
> Assignee: Veena Basavaraj
> Fix For: 1.99.6
>
>
> The details are in the design wiki; as the implementation proceeds, more discussion can
happen here.
> https://cwiki.apache.org/confluence/display/SQOOP/Delta+Fetch+And+Merge+Design#DeltaFetchAndMergeDesign-Howtogetoutputfromconnectortosqoop?
> The goal is to dynamically inject an IncrementalConfig instance into the FromJobConfiguration.
The current MFromConfig and MToConfig can already hold a list of configs, and a strong sentiment
was expressed to keep it as a list, so why not actually make use of it for the first time and
group the incremental-related configs in one config object?
> This task will prepare the FromJobConfiguration from the job config data, and the ExtractorContext
with the relevant values from the previous job run.
> This task will prepare the ToJobConfiguration from the job config data, and the LoaderContext
with the relevant values from the previous job run, if any.
> We will use DistributedCache to get state information out of the Extractor and Loader
and finally persist it into the Sqoop repository (depending on SQOOP-1804) once the OutputCommitter
commit is called.
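To make the last point of the description concrete, here is a rough sketch of the flow: parallel tasks report their state, and a single commit step merges it and persists it for the next run. This is only an illustration under stated assumptions, not the Sqoop implementation. The class and method names are hypothetical, the state is assumed to be a numeric high-water mark, and plain in-memory maps stand in for both DistributedCache and the Sqoop repository.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; none of these names are Sqoop or Hadoop classes.
final class IncrementalStateSketch {
    // Stand-in for the side channel the parallel tasks write to
    // (DistributedCache in the description above).
    private final Map<String, Long> perTaskMax = new ConcurrentHashMap<>();

    // Stand-in for the Sqoop repository that survives across job runs.
    private final Map<String, Long> repository = new HashMap<>();

    // Called by each extractor/loader task; keeps the largest value per task.
    void report(String taskId, long maxValueSeen) {
        perTaskMax.merge(taskId, maxValueSeen, Long::max);
    }

    // Called once, at the point where the output committer would commit:
    // merge the per-task state and persist the result for the next run.
    void commit(String jobName) {
        if (perTaskMax.isEmpty()) {
            return; // nothing reported, keep the previously persisted value
        }
        long merged = Collections.max(perTaskMax.values());
        repository.merge(jobName, merged, Long::max);
        perTaskMax.clear();
    }

    // Value the next run would read back; null if no run has committed yet.
    Long lastValue(String jobName) {
        return repository.get(jobName);
    }
}
```

Nothing becomes visible to the next run until commit is called, which mirrors the idea of persisting only after the committer has run.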
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)