From: Chris Mattmann
Reply-To: dev@oodt.apache.org
Date: Fri, 21 Oct 2016 12:55:20 -0700
Subject: Re: Doing a file move in the LocalDataTransferer
Message-ID: <0F7354C7-23D3-4A16-861C-F618508ABBAB@gmail.com>

The rationale is to avoid making the staging area a moving target while building
controlled-access storage. Creating a copy is always safer for preservation and
provenance than making everything a moving target.

I would strongly caution against doing a .moveFile() unless you add facilities
in LocalDataTransferer to:

1. Make it configurable (off by default), maybe something like
   org.apache.oodt.cas.filemgr.datatransferer.local.move, settable in
   filemgr.properties
2. Preserve (in metadata) the original file location
3. Add locking to the addMetadata and addReferences facilities, since they may
   be called in different control flows

I would like to see a design that handles the above, plus unit tests, before
+1’ing such a change.

On 10/21/16, 11:35 AM, "Tom Barber" wrote:

    Hello folks

    I'm asking here just in case someone knows a reason why this is a bad idea:

    We have a bunch of large files on a slow NFS mount which we're ingesting
    in bulk. It's using the LocalDataTransferer to do the ingestion move, and
    in that code all the move calls are really file copies.

    As you'll all know, in doing a copy the drive is actually writing bits to
    disk, whereas doing a move is just moving the file pointer.

    Is there a reason why the moveFile method actually uses
    FileUtils.copyFile() before I try FileUtils.moveFile()?

    Because 1 min for a copy vs 0.06 seconds for a move is far preferable.

    Thanks

    Tom
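
To make the three conditions in the reply above concrete (configurable and off by default, original location preserved in metadata, locked metadata updates), here is a minimal sketch. It is not the OODT LocalDataTransferer: the class name MoveOrCopyTransfer, the OriginalLocation key, and the in-memory metadata map are hypothetical, the property name is the one floated in the reply and is read here as a plain JVM system property, and java.nio.file.Files stands in for FileUtils so the example has no external dependencies.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch only; not the actual OODT LocalDataTransferer. */
class MoveOrCopyTransfer {
    // Property name suggested in the thread; assumed to default to "off".
    static final String MOVE_PROPERTY =
        "org.apache.oodt.cas.filemgr.datatransferer.local.move";

    // Hypothetical metadata key for the file's pre-ingest location.
    static final String ORIGINAL_LOCATION_KEY = "OriginalLocation";

    // One lock guards every metadata update, since updates may arrive
    // from different control flows.
    private final Object metadataLock = new Object();
    private final Map<String, String> metadata = new HashMap<>();

    /** True only when the property is explicitly set to "true". */
    static boolean moveEnabled() {
        return Boolean.getBoolean(MOVE_PROPERTY);
    }

    /** Copies by default, moves only when enabled; records the source path. */
    Path transfer(Path src, Path dest) throws IOException {
        // Capture the original location before the file can disappear.
        String original = src.toAbsolutePath().toString();
        if (moveEnabled()) {
            Files.move(src, dest, StandardCopyOption.REPLACE_EXISTING);
        } else {
            Files.copy(src, dest, StandardCopyOption.REPLACE_EXISTING);
        }
        synchronized (metadataLock) {
            metadata.put(ORIGINAL_LOCATION_KEY, original);
        }
        return dest;
    }

    /** Reads the recorded original location under the same lock. */
    String originalLocation() {
        synchronized (metadataLock) {
            return metadata.get(ORIGINAL_LOCATION_KEY);
        }
    }
}
```

One design point worth keeping in mind: a move is only a cheap rename when source and destination sit on the same file store. When they do not (for example, an NFS staging mount and a local archive), Files.move falls back to copying and then deleting the source, so the speed-up would not apply in that case.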