pig-dev mailing list archives

From "Chris Leonard (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-3770) Enhance DBStorage to make it more flexible (should batch statements?, rollback on job failure, support command line arguments etc.)
Date Thu, 04 Feb 2016 00:49:39 GMT

    [ https://issues.apache.org/jira/browse/PIG-3770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131467#comment-15131467
] 

Chris Leonard commented on PIG-3770:
------------------------------------

Is there any update on this, or has it gained any traction yet? We are trying to enable a
data transfer from Hadoop using Pig's STORE ... USING org.apache.pig.piggybank.storage.DBStorage
command, and it generates dozens of threads on the target MSSQL Server, all of which are bad
database citizens. Each begins one transaction and then issues singleton updates, every one of
which acquires locks and holds them until our server runs out of locks. Not good!
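
For illustration, here is a minimal sketch of what batched statements with bounded transactions
could look like in plain JDBC. It is not DBStorage's actual code: the class name BatchedInsert,
the method insertAll, and the batchSize parameter are all hypothetical, and only standard
java.sql calls are used.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

/** Hypothetical helper, not part of DBStorage: batches inserts and commits in chunks. */
public class BatchedInsert {

    /**
     * Inserts every row using JDBC batching and commits after each full batch,
     * so locks are released periodically instead of accumulating in one transaction.
     * Returns the number of commits issued.
     */
    public static int insertAll(Connection conn, String sql,
                                List<Object[]> rows, int batchSize) throws SQLException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int commits = 0;
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int pending = 0;
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    ps.setObject(i + 1, row[i]); // JDBC parameters are 1-based
                }
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch(); // send the whole batch at once
                    conn.commit();     // end the transaction and release its locks
                    commits++;
                    pending = 0;
                }
            }
            if (pending > 0) { // flush the final partial batch
                ps.executeBatch();
                conn.commit();
                commits++;
            }
        } catch (SQLException e) {
            conn.rollback(); // undoes only the batch that was not yet committed
            throw e;
        }
        return commits;
    }
}
```

One trade-off in this sketch: committing per batch means a later failure cannot roll back
batches that were already committed, so it would have to be reconciled with the rollback
behaviour discussed in the issue.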

> Enhance DBStorage to make it more flexible (should batch statements?, rollback on job failure, support command line arguments etc.)
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-3770
>                 URL: https://issues.apache.org/jira/browse/PIG-3770
>             Project: Pig
>          Issue Type: Improvement
>          Components: piggybank
>            Reporter: Nezih Yigitbasi
>            Assignee: Nezih Yigitbasi
>
> First of all, the TestDBStorage unit test is *broken*. It doesn't even run the DBStorage
> store logic. I debugged it and added logging, and found that putNext is never called. The
> unit test doesn't fail because the verification loop at the end of the testWriteToDB
> method, which traverses the result set, never executes: the result set is empty (since the
> DBStorage store logic is not called at all), so no verification is done. (If the loop did
> run, the test would fail, because the verification logic is also broken: the orders in
> expNames, expRations, and expDates do not even match.) This has to be fixed.
> I propose to improve DBStorage with the following changes:
> - fix the problems with the unit test described above so that it works, and make it more
> comprehensive (the unit test currently inserts only three records)
> - use command line options in the constructor like other Pig store functions (PigStorage,
> HBaseStorage, etc.) to make DBStorage more flexible. With this change it would be easy to
> implement PIG-3597.
> - DBStorage supports rollbacks on task failures, but *not* on job failures. This is a
> nice-to-have feature that has been requested before; see PIG-1891.
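
The silently passing test described in the quoted issue comes down to a verification loop
that does nothing when the result set is empty. A minimal sketch of a stricter check follows,
assuming the rows have already been read into lists. The class RowVerifier and its verify
method are hypothetical and are not taken from TestDBStorage. Keeping each expected record
together in one row also avoids the mismatched ordering that separate expected arrays allow.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical verification helper; names are illustrative, not from TestDBStorage. */
public class RowVerifier {

    /**
     * Compares actual rows with expected rows and fails if the counts differ,
     * so an empty result cannot pass silently.
     */
    public static void verify(List<String[]> expected, List<String[]> actual) {
        // Check the row count first: a loop over an empty result would verify nothing.
        if (actual.size() != expected.size()) {
            throw new AssertionError("expected " + expected.size()
                    + " rows but found " + actual.size());
        }
        for (int i = 0; i < expected.size(); i++) {
            if (!Arrays.equals(expected.get(i), actual.get(i))) {
                throw new AssertionError("row " + i + " differs: expected "
                        + Arrays.toString(expected.get(i)) + " but found "
                        + Arrays.toString(actual.get(i)));
            }
        }
    }
}
```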



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
