Date: Sat, 21 Jul 2012 00:07:35 +0000 (UTC)
From: "Eli Reisman (JIRA)"
To: pig-dev@hadoop.apache.org
Subject: [jira] [Updated] (PIG-1891) Enable StoreFunc to make intelligent decision based on job success or failure

     [ https://issues.apache.org/jira/browse/PIG-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Reisman updated PIG-1891:
-----------------------------
    Attachment: PIG-1891-1.patch

> Enable StoreFunc to make intelligent decision based on job success or failure
> -----------------------------------------------------------------------------
>
>                 Key: PIG-1891
>                 URL: https://issues.apache.org/jira/browse/PIG-1891
>             Project: Pig
>          Issue Type: New Feature
>    Affects Versions: 0.10.0
>            Reporter: Alex Rovner
>            Priority: Minor
>              Labels: patch
>         Attachments: PIG-1891-1.patch
>
>
> We are in the process of using Pig for various data processing and component integration tasks. Here is where we feel Pig storage funcs fall short:
> They are not aware of whether the overall job has succeeded. This creates a problem for storage funcs that need to "upload" results into another system:
> DB, FTP, another file system, etc.
> I looked at the DBStorage in the piggybank (http://svn.apache.org/viewvc/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/DBStorage.java?view=markup) and what I see is essentially a mechanism which, for each task, does the following:
> 1. Creates a record writer (in this case, opens a connection to the DB).
> 2. Opens a transaction.
> 3. Writes records into a batch.
> 4. Executes commit or rollback depending on whether the task was successful.
> While this approach works great at the task level, it does not work at all at the job level.
> If certain tasks succeed but the overall job fails, partial records are going to get uploaded into the DB.
> Any ideas on a workaround?
> Our current workaround is fairly ugly: we created a Java wrapper that launches Pig jobs and then uploads to the DBs once the Pig job is successful. While the approach works, it's not really integrated into Pig.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
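
The partial-upload problem described in the issue can be illustrated with a small simulation. This is only a sketch under stated assumptions: it does not use Pig, Hadoop, or JDBC, and `TaskLevelCommitDemo`, `FakeTable`, `runTask`, and `runJob` are hypothetical names introduced here, with an in-memory list standing in for the database. Each task commits or rolls back on its own, so a failed job can still leave the rows of its successful tasks behind.

```java
import java.util.ArrayList;
import java.util.List;

/** Simulates per-task commit/rollback; all names here are hypothetical. */
class TaskLevelCommitDemo {

    /** In-memory stand-in for the target database table (an assumption). */
    static final class FakeTable {
        final List<String> rows = new ArrayList<>();
    }

    /** One task: buffer records in a batch, then commit or roll back. */
    static boolean runTask(FakeTable table, List<String> records, boolean taskSucceeds) {
        // Steps 1-3: "open" a writer and transaction, write records into a batch.
        List<String> batch = new ArrayList<>(records);
        if (taskSucceeds) {
            table.rows.addAll(batch); // Step 4: commit makes the batch visible.
            return true;
        }
        return false;                 // Step 4: rollback discards the batch.
    }

    /** Runs every task; the job as a whole succeeds only if all tasks do. */
    static boolean runJob(FakeTable table, List<List<String>> taskInputs, int failingTask) {
        boolean jobSucceeded = true;
        for (int i = 0; i < taskInputs.size(); i++) {
            // Non-short-circuit &= so each task runs regardless of earlier failures.
            jobSucceeded &= runTask(table, taskInputs.get(i), i != failingTask);
        }
        return jobSucceeded;
    }

    public static void main(String[] args) {
        FakeTable table = new FakeTable();
        List<List<String>> inputs =
                List.of(List.of("a", "b"), List.of("c", "d"), List.of("e"));
        boolean ok = runJob(table, inputs, 1); // the second task fails
        System.out.println("job succeeded: " + ok);
        System.out.println("rows left in table: " + table.rows);
    }
}
```

In this run the job is reported as failed, yet the rows committed by the first and third tasks remain in the table, which is the job-level inconsistency the reporter describes.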