Date: Sat, 19 Oct 2013 07:10:42 +0000 (UTC)
From: "Claus Ibsen (JIRA)"
To: issues@camel.apache.org
Reply-To: dev@camel.apache.org
Subject: [jira] [Commented] (CAMEL-6867) camel-hdfs - HdfsProducer filename collisions when Producer instance recreated

    [ https://issues.apache.org/jira/browse/CAMEL-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13799819#comment-13799819 ]

Claus Ibsen commented on CAMEL-6867:
------------------------------------

2.10 is no longer supported, so 2.11 onwards is fine.

> camel-hdfs - HdfsProducer filename collisions when Producer instance recreated
> ------------------------------------------------------------------------------
>
>                 Key: CAMEL-6867
>                 URL: https://issues.apache.org/jira/browse/CAMEL-6867
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-hdfs
>            Reporter: Ben O'Day
>            Assignee: Ben O'Day
>             Fix For: 2.13.0
>
>
> The HdfsProducer uses an instance variable (long splitNum) that is incremented to create unique output filenames in a given directory (seg0, seg1, etc.).
> If the Producer instance is recreated (producer cache limit exceeded, server restart, etc.), the splitNum variable is reset to 0. As a result, files are overwritten in overwrite=true mode, and "The file already exists" errors are thrown in overwrite=false mode.
> We should switch to using a timestamp or some other unique generator to prevent filename collisions regardless of the Producer instance lifecycle for the same hdfs directory URL...

--
This message was sent by Atlassian JIRA
(v6.1#6144)
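
For illustration only, here is a minimal sketch of the collision described in the issue and of one possible "unique generator". It is not the change committed for CAMEL-6867 and does not use any camel-hdfs API. The class `SegmentNames` and its methods `counterOnly` and `next` are hypothetical names, and the sketch assumes that a random per-instance token followed by a counter is an acceptable filename. A timestamp could be used as the token instead, but two instances created in the same millisecond could then still collide.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not part of camel-hdfs.
final class SegmentNames {
    // Per-instance counter; like splitNum, it starts at 0 for every new instance.
    private final AtomicLong counter = new AtomicLong();

    // Random token chosen once per instance, so a recreated instance
    // produces names under a different prefix (assumption: UUID is acceptable).
    private final String instanceId = UUID.randomUUID().toString();

    // Naming scheme as described in the issue: seg0, seg1, ...
    static String counterOnly(long splitNum) {
        return "seg" + splitNum;
    }

    // Sketch of a collision-resistant name: seg-<instance token>-<counter>.
    String next() {
        return "seg-" + instanceId + "-" + counter.getAndIncrement();
    }

    public static void main(String[] args) {
        // Two "producer instances", each starting its counter at 0,
        // generate the same first name under the counter-only scheme.
        System.out.println(counterOnly(0).equals(counterOnly(0)));

        // With a per-instance token, the first names differ.
        SegmentNames first = new SegmentNames();
        SegmentNames recreated = new SegmentNames();
        System.out.println(first.next().equals(recreated.next()));
    }
}
```

The counter is kept so that names from a single instance stay ordered, while the token separates instances that write to the same hdfs directory.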