apex-users mailing list archives

From "Mukkamula, Suryavamshivardhan (CWM-NR)" <suryavamshivardhan.mukkam...@rbc.com>
Subject RE: Multiple directories
Date Wed, 25 May 2016 15:17:30 GMT
Hello Ram/Team,

My requirement is to read input feeds from different locations on HDFS and parse those files
using XML configuration files (each input feed has a configuration file that defines the
fields in that feed).

My approach: I would like to define a mapping file that contains, for each feed, the feed
identifier, the feed location, and the configuration file location. I would read this mapping
file at initial load within the setup() method and use it to define my DirectoryScan.acceptFiles.
My challenge is that when I read the files, I need to parse each line according to the
configuration file of the feed it belongs to. How do I know which file a line came from? If I
knew this, I could read the corresponding configuration file before parsing the line.

Please let me know how I can handle this.
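
To make the mapping idea concrete, here is a minimal sketch in plain Java of the lookup I have
in mind. Everything in it is an assumption for illustration: the class name FeedMapping, the
comma-separated line format "feedId,feedDir,configPath", and matching a file to a feed by its
directory prefix. It does not use any Apex API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper: maps an input file path back to the feed it belongs to. */
final class FeedMapping {

    /** One row of the mapping file: feed id, feed directory, config file location. */
    static final class Entry {
        final String feedId;
        final String feedDir;
        final String configPath;

        Entry(String feedId, String feedDir, String configPath) {
            this.feedId = feedId;
            // Force a trailing slash so "/data/feed1" does not match "/data/feed10/x".
            this.feedDir = feedDir.endsWith("/") ? feedDir : feedDir + "/";
            this.configPath = configPath;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    /** Parses lines of the assumed form "feedId,feedDir,configPath". */
    static FeedMapping parse(List<String> lines) {
        FeedMapping mapping = new FeedMapping();
        for (String line : lines) {
            String trimmed = line.trim();
            // Skip blank lines and '#' comments.
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            String[] parts = trimmed.split("\\s*,\\s*");
            if (parts.length != 3) {
                throw new IllegalArgumentException("Bad mapping line: " + line);
            }
            mapping.entries.add(new Entry(parts[0], parts[1], parts[2]));
        }
        return mapping;
    }

    /** Returns the entry whose directory is the longest prefix of filePath, if any. */
    Optional<Entry> forFile(String filePath) {
        Entry best = null;
        for (Entry e : entries) {
            boolean matches = filePath.startsWith(e.feedDir);
            if (matches && (best == null || e.feedDir.length() > best.feedDir.length())) {
                best = e;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

If the lookup ran once each time a new file is opened, every line read from that file could be
parsed with the matching configuration. What I am missing is how to find out, inside the
operator, which file the current line comes from.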

Regards,
Surya Vamshi

From: Munagala Ramanath [mailto:ram@datatorrent.com]
Sent: 2016, May, 24 5:49 PM
To: Mukkamula, Suryavamshivardhan (CWM-NR)
Subject: Multiple directories

One way of addressing the issue is to use some sort of external tool (like a script) to
copy all the input files to a common directory (making sure that the file names are
unique to prevent one file from overwriting another) before the Apex application starts.

The Apex application then starts and processes files from this directory.
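
As a rough sketch of such a copy step, the Java below stages files from several feed
directories into one common directory. It is only an illustration under assumptions: the class
name FeedStager and the "feedId__fileName" naming scheme are made up here, and it uses
java.nio.file on a local file system; for HDFS the same logic would need the Hadoop file system
API or a shell script instead.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical pre-processing step: gathers feed files into one directory. */
final class FeedStager {

    /**
     * Copies every regular file found directly in each feed directory into
     * target, prefixing the file name with the feed id to keep names unique.
     */
    static List<Path> stage(Map<String, Path> feedDirs, Path target) throws IOException {
        Files.createDirectories(target);
        List<Path> copied = new ArrayList<>();
        for (Map.Entry<String, Path> feed : feedDirs.entrySet()) {
            try (DirectoryStream<Path> files = Files.newDirectoryStream(feed.getValue())) {
                for (Path src : files) {
                    if (!Files.isRegularFile(src)) {
                        continue; // ignore subdirectories
                    }
                    Path dest = target.resolve(feed.getKey() + "__" + src.getFileName());
                    // Without REPLACE_EXISTING, Files.copy throws
                    // FileAlreadyExistsException instead of overwriting.
                    Files.copy(src, dest);
                    copied.add(dest);
                }
            }
        }
        return copied;
    }
}
```

With a prefix like this, the feed identifier also stays recoverable from the file name after
the files have been merged into one directory.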

If you set the partition count of the file input operator to N, it will create N partitions,
and the files will be automatically distributed among the partitions. The partitions will work
in parallel.

Ram