From: Huagen peng
To: users@nifi.apache.org
Subject: Re: Wildcard character in the Command Argument field of the ExecuteStreamCommand processor
Date: Tue, 31 May 2016 15:08:55 -0400

Thank you for your suggestion, Andy and Lee.
I am aware of the flow using ListFile-FetchFile-HashContent. I didn't go that route because the ListFile processor does not accept input from an upstream processor. I have an upstream processor, from which I know the directory I want to work with. I ended up passing the directory name into the ExecuteStreamCommand processor to get ALL the files under the directory. After that I use SplitText and ExtractText to filter the files with the desired file extension, and then I use FetchFile and HashContent to finish what I want to do.

If ListFile allowed upstream input, it would have made my data flow much simpler. The same goes for the ListSFTP processor.
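The filtering step described above can be sketched outside NiFi in a few lines of Python. This is only an illustration: the function name `filter_listing` and the sample listing are hypothetical, and the regex is one possible way to keep names with a given extension, not the expression used in the actual flow.

```python
import posixpath
import re


def filter_listing(directory, listing_text, extension=".gz"):
    """Return full paths for the listed names that end with `extension`.

    Roughly mirrors the SplitText + ExtractText step: one line per file
    name, then a regex decides which names to keep.
    """
    # Escape the extension so its dot is matched literally.
    pattern = re.compile(r".+" + re.escape(extension))
    paths = []
    for line in listing_text.splitlines():
        name = line.strip()
        if name and pattern.fullmatch(name):
            paths.append(posixpath.join(directory, name))
    return paths


if __name__ == "__main__":
    # Hypothetical listing, as a command such as `ls` might print it.
    sample = "a.gz\nnotes.txt\nb.gz\n"
    for path in filter_listing("/tmp/transfer/16-05-22_00", sample):
        print(path)
```

The directory is passed in as a parameter because, in the flow described, it is only known once the upstream processor has run.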

Huagen

On May 31, 2016, at 2:56 PM, Lee Laim <lee.laim@gmail.com> wrote:

Huagen,

I had a similar workflow and eventually replaced ExecuteStreamCommand(md5sum) with HashContent.

Using ListFile->FetchFile->HashContent, the resultant hash is placed into the flowfile under the attribute ${hash.value}.
This processor offers ~40 algorithms to choose from, including MD5. Compared to the ExecuteStreamCommand, the HashContent processor offers a bit more in error-handling and lineage traceability in this specific case.
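As a rough analogy for what a content-hashing step does, here is a small Python sketch. It is not NiFi code: the function `hash_content` and the plain dictionary standing in for flowfile attributes are assumptions made for illustration; only the attribute name `hash.value` comes from the message above.

```python
import hashlib
import io


def hash_content(stream, attributes, algorithm="md5", chunk_size=65536):
    """Hash a binary stream in chunks and return attributes plus the digest.

    The digest is stored under "hash.value"; the input dictionary is
    left unmodified and a copy is returned.
    """
    digest = hashlib.new(algorithm)
    # Read fixed-size chunks so large content never has to fit in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    updated = dict(attributes)
    updated["hash.value"] = digest.hexdigest()
    return updated


if __name__ == "__main__":
    # Hypothetical attributes and content for a single flowfile.
    flowfile_attributes = {"filename": "example.gz"}
    result = hash_content(io.BytesIO(b"example bytes"), flowfile_attributes)
    print(result["hash.value"])
```

Passing the algorithm by name is a loose counterpart to choosing one from the processor's list.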

Thanks,
-Lee


On Tue, May 31, 2016 at 11:24 AM, Andy LoPresto <alopresto@apache.org> wrote:
Huagen,

The ExecuteStreamCommand is used to run a command against the contents of an incoming flowfile. For example, you could have a ListFile processor listing all .gz files in the directory and passing them to the ExecuteStreamCommand processor to generate the MD5 hash of each. In this case, you would not need a wildcard character in the command.

The configuration for the processors is as follows:

ListFile:
-Input directory: <the directory where the files are located>
-File Filter: [^\.].*\.gz

ExecuteStreamCommand:
= -Command arguments: ${filename}
-Command path: md5
-Working Directory: <the directory where the files are located>
-Output Destination Attribute: md5hash

Notes:
-I am using "md5" rather than "md5sum" as I am on Mac OS X.
-You could use the "-n" flag for "md5" to suppress extraneous output
-You could use "${absolute.path}/${filename}" as the command arguments, in which case you would not need to set the working directory
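The last note, about passing either the bare filename with a working directory or the full path without one, can be illustrated with a short Python sketch. Everything here is a stand-in: `HASH_COMMAND` is a child Python process used in place of md5/md5sum so the example runs anywhere, and the dictionary keys merely imitate the `absolute.path` and `filename` attributes mentioned above.

```python
import hashlib
import os
import subprocess
import sys
import tempfile

# Stand-in for `md5`/`md5sum`: a child Python process that prints the
# MD5 digest of the file named in its first argument.
HASH_COMMAND = [
    sys.executable,
    "-c",
    "import hashlib, sys; "
    "print(hashlib.md5(open(sys.argv[1], 'rb').read()).hexdigest())",
]


def run_with_working_directory(attributes):
    """Pass only the filename and rely on the working directory."""
    result = subprocess.run(
        HASH_COMMAND + [attributes["filename"]],
        cwd=attributes["absolute.path"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def run_with_absolute_path(attributes):
    """Pass the full path, so no working directory is needed."""
    full_path = os.path.join(attributes["absolute.path"], attributes["filename"])
    result = subprocess.run(
        HASH_COMMAND + [full_path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def demo():
    """Hash one temporary file both ways and return the two digests."""
    with tempfile.TemporaryDirectory() as directory:
        with open(os.path.join(directory, "sample.gz"), "wb") as handle:
            handle.write(b"example bytes")
        attributes = {"absolute.path": directory, "filename": "sample.gz"}
        return (run_with_working_directory(attributes),
                run_with_absolute_path(attributes))
```

Both variants should produce the same digest; the choice only affects whether the working directory has to be configured.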
 
Andy LoPresto
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

On May 31, 2016, at 7:02 AM, Huagen peng <huagen.peng@gmail.com> wrote:

Hi, I would like to run an md5sum command on all the *.gz files under a certain directory. However, I keep getting this error:
md5sum: stat '/tmp/transfer/16-05-22_00/*.gz': No such file or directory

I tried quoting the * wildcard character and adding a . (dot) or / in front, to no avail. Can I do something like this with the ExecuteStreamCommand processor?

Thanks.
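The error message shows that md5sum received the literal string '/tmp/transfer/16-05-22_00/*.gz', which suggests the command was launched without a shell, since expanding * into matching filenames is normally the shell's job. The Python sketch below reproduces that behaviour in isolation. It assumes nothing about NiFi internals; the helper names `argument_seen_by_child` and `expand_pattern` are made up for the example.

```python
import glob
import os
import subprocess
import sys
import tempfile


def argument_seen_by_child(argument):
    """Run a child process without a shell and return the argument it received."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", argument],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def expand_pattern(pattern):
    """Do the expansion a shell would normally do before running the command."""
    return sorted(glob.glob(pattern))


def demo():
    """Return the pattern, what the child saw, and the expanded file names."""
    with tempfile.TemporaryDirectory() as directory:
        for name in ("a.gz", "b.gz", "notes.txt"):
            open(os.path.join(directory, name), "wb").close()
        pattern = os.path.join(directory, "*.gz")
        literal = argument_seen_by_child(pattern)
        expanded = [os.path.basename(p) for p in expand_pattern(pattern)]
        return pattern, literal, expanded
```

The child process sees the pattern unchanged, while explicit expansion yields the individual .gz files. Listing the files first and running the command once per file avoids the need for a wildcard altogether.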


