apex-users mailing list archives

From Thomas Weise <thomas.we...@gmail.com>
Subject Re: Multiple directories
Date Sat, 18 Jun 2016 04:03:19 GMT
Please check in the Apex application master log (container 1) how much
memory it is requesting. If that's the correct figure and you still end up
with a larger container, the problem could be the minimum container size in
the YARN scheduler configuration.
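
For illustration (a sketch, not taken from this thread): if yarn-site.xml on
the cluster sets a large minimum allocation, every container request is
rounded up to it, so even a 500MB request yields a 4GB container:

    <!-- yarn-site.xml (hypothetical value) -->
    <property>
      <name>yarn.scheduler.minimum-allocation-mb</name>
      <value>4096</value>
    </property>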


On Fri, Jun 17, 2016 at 12:58 PM, Mukkamula, Suryavamshivardhan (CWM-NR) <
suryavamshivardhan.mukkamula@rbc.com> wrote:

> Hi Ram,
>
>
>
> I tried that option of adding the memory properties to a file under site/conf
> and selecting it during the launch, but no luck. The same works with my local
> sandbox setup.
>
>
>
> Is there any other way that I can understand the reason?
>
>
>
> Regards,
>
> Surya Vamshi
>
>
>
> From: Munagala Ramanath [mailto:ram@datatorrent.com]
> Sent: 2016, June, 17 3:06 PM
> To: users@apex.apache.org
> Subject: Re: Multiple directories
>
>
>
> Please take a look at the section entitled "Properties source precedence"
> at
>
> http://docs.datatorrent.com/application_packages/
>
>
>
> It looks like the setting in dt-site.xml on the cluster is overriding your
> application-defined values.
>
> If you add the properties to a file under site/conf in your application and
> then select it during launch, those values should take effect.
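>
> For example (a minimal sketch; "fileReader" is a placeholder operator name),
> a file under site/conf selected at launch could pin the operator memory:
>
>     <configuration>
>       <property>
>         <name>dt.operator.fileReader.attr.MEMORY_MB</name>
>         <value>500</value>
>       </property>
>     </configuration>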
>
>
>
> For signalling EOF, another option is to use a separate control port to send
> the EOF marker, which could just be the string "EOF", for example.
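>
> A minimal sketch of that idea (the names are illustrative, not from your
> code): an extra input port on the file writer whose tuples carry the marker:
>
>     // control port; receiving the string "EOF" triggers finalization
>     public final transient DefaultInputPort<String> control = new DefaultInputPort<String>()
>     {
>       @Override
>       public void process(String marker)
>       {
>         if ("EOF".equals(marker)) {
>           // currentFileName: hypothetical field tracking the file being written
>           requestFinalize(currentFileName);
>         }
>       }
>     };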
>
>
>
>
>
> Ram
>
>
>
> On Fri, Jun 17, 2016 at 11:48 AM, Mukkamula, Suryavamshivardhan (CWM-NR) <
> suryavamshivardhan.mukkamula@rbc.com> wrote:
>
> Hi,
>
>
>
> I found a way for the 2nd question.
>
>
>
> 1)      I tried setting the operator memory to 500MB, but the operator still
> takes 4GB by default; I do not understand why. (In my sandbox setup the
> memory was set correctly to 500MB, but not on the enterprise dev cluster.)
>
>
>
> 2)      I am trying to send a null object from the file reader when EOF is
> reached so that the file writer can call the requestFinalize() method. Somehow
> I could not figure out how to send the EOF; I tried as below but had no luck.
> (Solution: it seems that if readEntity() returns null, emitTuple() is not
> called, so I managed to emit the null object from readEntity() itself.)
>
>
>
>
>
> Regards,
>
> Surya Vamshi
>
>
>
> From: Mukkamula, Suryavamshivardhan (CWM-NR) [mailto:suryavamshivardhan.mukkamula@rbc.com]
> Sent: 2016, June, 17 12:20 PM
> To: users@apex.apache.org
> Subject: RE: Multiple directories
>
>
>
> Hi,
>
>
>
> Can you please help me understand the issues below?
>
>
>
> 1)      I tried setting the operator memory to 500MB, but the operator still
> takes 4GB by default; I do not understand why. (In my sandbox setup the
> memory was set correctly to 500MB, but not on the enterprise dev cluster.)
>
> 2)      I am trying to send a null object from the file reader when EOF is
> reached so that the file writer can call the requestFinalize() method. Somehow
> I could not figure out how to send the EOF; I tried as below but had no luck.
>
>
>
> ################### File Reader ####################################################
>
> @Override
> protected String readEntity() throws IOException {
>     // try to read a line
>     final String line = br.readLine();
>     if (null != line) { // normal case
>         LOG.debug("readEntity: line = {}", line);
>         return line;
>     }
>     // end-of-file (control tuple sent in closeFile())
>     LOG.info("readEntity: EOF for {}", filePath);
>     return null;
> }
>
> @Override
> protected void emit(String line) {
>     // parsing logic here: parse the line as per the input configuration and
>     // create the output line as per the output configuration
>     if (line == null) {
>         output.emit(new KeyValue<String, String>(getFileName(), null));
>         return; // without this return, the null line would also fall through to parseTuple() below
>     }
>     KeyValue<String, String> tuple = new KeyValue<String, String>();
>     tuple.key = getFileName();
>     tuple.value = line;
>     KeyValue<String, String> newTuple = parseTuple(tuple);
>     output.emit(newTuple);
> }
>
> ###################### File Writer ######################################################
>
> public class FileOutputOperator extends AbstractFileOutputOperator<KeyValue<String, String>> {
>     private static final Logger LOG = LoggerFactory.getLogger(FileOutputOperator.class);
>     private List<String> filesToFinalize = new ArrayList<>();
>
>     @Override
>     public void setup(Context.OperatorContext context) {
>         super.setup(context);
>         finalizeFiles();
>     }
>
>     @Override
>     protected byte[] getBytesForTuple(KeyValue<String, String> tuple) {
>         if (tuple.value == null) {
>             LOG.info("File to finalize {}", tuple.key);
>             filesToFinalize.add(tuple.key);
>             return new byte[0];
>         } else {
>             return tuple.value.getBytes();
>         }
>     }
>
>     @Override
>     protected String getFileName(KeyValue<String, String> tuple) {
>         return tuple.key;
>     }
>
>     @Override
>     public void endWindow() {
>         super.endWindow();
>         finalizeFiles();
>     }
>
>     private void finalizeFiles() {
>         LOG.info("Files to finalize {}", filesToFinalize.toArray());
>         Iterator<String> fileIt = filesToFinalize.iterator();
>         while (fileIt.hasNext()) {
>             requestFinalize(fileIt.next());
>             fileIt.remove();
>         }
>     }
> }
>
> ##################################################################################################
>
>
>
>
>
>
>
> Regards,
>
> Surya Vamshi
>
>
>
> From: Mukkamula, Suryavamshivardhan (CWM-NR) [mailto:suryavamshivardhan.mukkamula@rbc.com]
> Sent: 2016, June, 17 9:11 AM
> To: users@apex.apache.org
> Subject: RE: Multiple directories
>
>
>
> Hi Ram/Raja,
>
>
>
> The HBase dependency was pulling older Hadoop jars into my classpath. I
> removed the HBase dependency, which I don't need for now, and the issue got
> resolved.
>
>
>
> Thank you for your help.
>
>
>
> Regards,
>
> Surya Vamshi
>
>
>
> From: Raja.Aravapalli [mailto:Raja.Aravapalli@target.com]
> Sent: 2016, June, 17 7:06 AM
> To: users@apex.apache.org
> Subject: Re: Multiple directories
>
>
>
>
>
> I also faced a similar problem with Hadoop jars when I used HBase jars in
> pom.xml!! It could be because the version of the Hadoop jars your apa is
> holding is different from the ones on the cluster!!
>
>
>
>
>
> What I did to solve this:
>
>
>
> I included the scope "provided" in the Maven pom.xml for the hbase jars, and
> then provided the hbase jars to the application package during submission
> using "-libjars" with the launch command, which solved my Invalid ContainerId
> problem!!
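>
> For example (a sketch of that approach; the paths are placeholders):
>
>     <!-- pom.xml: keep the HBase jars out of the .apa -->
>     <dependency>
>       <groupId>org.apache.hbase</groupId>
>       <artifactId>hbase-client</artifactId>
>       <version>1.1.2</version>
>       <scope>provided</scope>
>     </dependency>
>
>     # then supply the jars at submission time:
>     launch -libjars /path/to/hbase-client-1.1.2.jar,/path/to/hbase-common-1.1.2.jar myapp.apa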
>
>
>
> You can type "launch help" to see the usage details.
>
>
>
>
>
> Regards,
>
> Raja.
>
>
>
> From: Munagala Ramanath <ram@datatorrent.com>
> Reply-To: "users@apex.apache.org" <users@apex.apache.org>
> Date: Thursday, June 16, 2016 at 4:10 PM
> To: "users@apex.apache.org" <users@apex.apache.org>
> Subject: Re: Multiple directories
>
>
>
> Those 6 hadoop jars are definitely a problem.
>
>
>
> I didn't see the output of "mvn dependency:tree"; could you post that?
> It will show you why these hadoop jars are being pulled in.
>
>
>
> Also, please refer to the section "Hadoop dependencies conflicts" in the
> troubleshooting guide:
>
> http://docs.datatorrent.com/troubleshooting/
>
>
>
> Ram
>
>
>
> On Thu, Jun 16, 2016 at 1:56 PM, Mukkamula, Suryavamshivardhan (CWM-NR) <
> suryavamshivardhan.mukkamula@rbc.com> wrote:
>
> Hi Ram,
>
>
>
> Below are the details.
>
>
>
>
>
> 0 Wed Jun 15 16:34:24 EDT 2016 META-INF/
>
>    358 Wed Jun 15 16:34:22 EDT 2016 META-INF/MANIFEST.MF
>
>      0 Wed Jun 15 16:34:24 EDT 2016 app/
>
> 52967 Wed Jun 15 16:34:22 EDT 2016 app/countrynamescan-1.0-SNAPSHOT.jar
>
>      0 Wed Jun 15 16:34:22 EDT 2016 lib/
>
> 62983 Wed Jun 15 16:34:22 EDT 2016 lib/activation-1.1.jar
>
> 1143233 Wed Jun 15 16:34:22 EDT 2016 lib/activemq-client-5.8.0.jar
>
>   4467 Wed Jun 15 16:34:22 EDT 2016 lib/aopalliance-1.0.jar
>
> 44925 Wed Jun 15 16:34:22 EDT 2016 lib/apacheds-i18n-2.0.0-M15.jar
>
> 691479 Wed Jun 15 16:34:22 EDT 2016 lib/apacheds-kerberos-codec-2.0.0-M15.jar
>
> 16560 Wed Jun 15 16:34:22 EDT 2016 lib/api-asn1-api-1.0.0-M20.jar
>
> 79912 Wed Jun 15 16:34:22 EDT 2016 lib/api-util-1.0.0-M20.jar
>
> 43033 Wed Jun 15 16:34:22 EDT 2016 lib/asm-3.1.jar
>
> 303139 Wed Jun 15 16:34:22 EDT 2016 lib/avro-1.7.4.jar
>
> 232019 Wed Jun 15 16:34:22 EDT 2016 lib/commons-beanutils-1.8.3.jar
>
> 41123 Wed Jun 15 16:34:22 EDT 2016 lib/commons-cli-1.2.jar
>
> 284184 Wed Jun 15 16:34:22 EDT 2016 lib/commons-codec-1.10.jar
>
> 575389 Wed Jun 15 16:34:22 EDT 2016 lib/commons-collections-3.2.1.jar
>
> 30595 Wed Jun 15 16:34:22 EDT 2016 lib/commons-compiler-2.7.8.jar
>
> 241367 Wed Jun 15 16:34:22 EDT 2016 lib/commons-compress-1.4.1.jar
>
> 298829 Wed Jun 15 16:34:22 EDT 2016 lib/commons-configuration-1.6.jar
>
> 143602 Wed Jun 15 16:34:22 EDT 2016 lib/commons-digester-1.8.jar
>
> 112341 Wed Jun 15 16:34:22 EDT 2016 lib/commons-el-1.0.jar
>
> 305001 Wed Jun 15 16:34:22 EDT 2016 lib/commons-httpclient-3.1.jar
>
> 185140 Wed Jun 15 16:34:22 EDT 2016 lib/commons-io-2.4.jar
>
> 284220 Wed Jun 15 16:34:22 EDT 2016 lib/commons-lang-2.6.jar
>
> 315805 Wed Jun 15 16:34:22 EDT 2016 lib/commons-lang3-3.1.jar
>
> 61829 Wed Jun 15 16:34:22 EDT 2016 lib/commons-logging-1.2.jar
>
> 1599627 Wed Jun 15 16:34:22 EDT 2016 lib/commons-math3-3.1.1.jar
>
> 273370 Wed Jun 15 16:34:22 EDT 2016 lib/commons-net-3.1.jar
>
> 3608597 Wed Jun 15 16:34:22 EDT 2016 lib/db2jcc-123.jar
>
> 313898 Wed Jun 15 16:34:22 EDT 2016 lib/dom4j-1.6.1.jar
>
> 17138265 Wed Jun 15 16:34:22 EDT 2016 lib/fastutil-6.6.4.jar
>
> 15322 Wed Jun 15 16:34:22 EDT 2016 lib/findbugs-annotations-1.3.9-1.jar
>
> 84946 Wed Jun 15 16:34:22 EDT 2016 lib/flatpack-3.4.2.jar
>
> 20220 Wed Jun 15 16:34:22 EDT 2016 lib/geronimo-j2ee-management_1.1_spec-1.0.1.jar
>
> 32359 Wed Jun 15 16:34:22 EDT 2016 lib/geronimo-jms_1.1_spec-1.1.1.jar
>
> 21817 Wed Jun 15 16:34:22 EDT 2016 lib/gmbal-api-only-3.0.0-b023.jar
>
> 690573 Wed Jun 15 16:34:22 EDT 2016 lib/grizzly-framework-2.1.2.jar
>
> 253086 Wed Jun 15 16:34:22 EDT 2016 lib/grizzly-http-2.1.2.jar
>
> 198255 Wed Jun 15 16:34:22 EDT 2016 lib/grizzly-http-server-2.1.2.jar
>
> 336904 Wed Jun 15 16:34:22 EDT 2016 lib/grizzly-http-servlet-2.1.2.jar
>
>   8114 Wed Jun 15 16:34:22 EDT 2016 lib/grizzly-rcm-2.1.2.jar
>
> 1795932 Wed Jun 15 16:34:22 EDT 2016 lib/guava-12.0.1.jar
>
> 710492 Wed Jun 15 16:34:22 EDT 2016 lib/guice-3.0.jar
>
> 65012 Wed Jun 15 16:34:22 EDT 2016 lib/guice-servlet-3.0.jar
>
> 16778 Wed Jun 15 16:34:22 EDT 2016 lib/hadoop-annotations-2.2.0.jar
>
> 52449 Wed Jun 15 16:34:22 EDT 2016 lib/hadoop-auth-2.5.1.jar
>
> 2962685 Wed Jun 15 16:34:22 EDT 2016 lib/hadoop-common-2.5.1.jar
>
> 1498368 Wed Jun 15 16:34:22 EDT 2016 lib/hadoop-mapreduce-client-core-2.5.1.jar
>
> 1158936 Wed Jun 15 16:34:22 EDT 2016 lib/hadoop-yarn-api-2.2.0.jar
>
> 1301627 Wed Jun 15 16:34:22 EDT 2016 lib/hadoop-yarn-common-2.2.0.jar
>
> 50139 Wed Jun 15 16:34:22 EDT 2016 lib/hawtbuf-1.9.jar
>
> 20780 Wed Jun 15 16:34:22 EDT 2016 lib/hbase-annotations-1.1.2.jar
>
> 1249004 Wed Jun 15 16:34:22 EDT 2016 lib/hbase-client-1.1.2.jar
>
> 530078 Wed Jun 15 16:34:22 EDT 2016 lib/hbase-common-1.1.2.jar
>
> 4201685 Wed Jun 15 16:34:22 EDT 2016 lib/hbase-protocol-1.1.2.jar
>
> 1475955 Wed Jun 15 16:34:22 EDT 2016 lib/htrace-core-3.1.0-incubating.jar
>
> 590533 Wed Jun 15 16:34:22 EDT 2016 lib/httpclient-4.3.5.jar
>
> 282269 Wed Jun 15 16:34:22 EDT 2016 lib/httpcore-4.3.2.jar
>
> 228286 Wed Jun 15 16:34:22 EDT 2016 lib/jackson-core-asl-1.9.2.jar
>
> 765648 Wed Jun 15 16:34:22 EDT 2016 lib/jackson-mapper-asl-1.9.2.jar
>
>   2497 Wed Jun 15 16:34:22 EDT 2016 lib/javax.inject-1.jar
>
> 521991 Wed Jun 15 16:34:22 EDT 2016 lib/javax.mail-1.5.0.jar
>
> 83945 Wed Jun 15 16:34:22 EDT 2016 lib/javax.servlet-3.1.jar
>
> 85353 Wed Jun 15 16:34:22 EDT 2016 lib/javax.servlet-api-3.0.1.jar
>
> 105134 Wed Jun 15 16:34:22 EDT 2016 lib/jaxb-api-2.2.2.jar
>
> 890168 Wed Jun 15 16:34:22 EDT 2016 lib/jaxb-impl-2.2.3-1.jar
>
> 1291164 Wed Jun 15 16:34:22 EDT 2016 lib/jcodings-1.0.8.jar
>
> 151304 Wed Jun 15 16:34:22 EDT 2016 lib/jdom-1.1.3.jar
>
> 130458 Wed Jun 15 16:34:22 EDT 2016 lib/jersey-client-1.9.jar
>
> 458739 Wed Jun 15 16:34:22 EDT 2016 lib/jersey-core-1.9.jar
>
> 17542 Wed Jun 15 16:34:22 EDT 2016 lib/jersey-grizzly2-1.9.jar
>
> 14786 Wed Jun 15 16:34:22 EDT 2016 lib/jersey-guice-1.9.jar
>
> 147952 Wed Jun 15 16:34:22 EDT 2016 lib/jersey-json-1.9.jar
>
> 713089 Wed Jun 15 16:34:22 EDT 2016 lib/jersey-server-1.9.jar
>
> 28100 Wed Jun 15 16:34:22 EDT 2016 lib/jersey-test-framework-core-1.9.jar
>
> 12976 Wed Jun 15 16:34:22 EDT 2016 lib/jersey-test-framework-grizzly2-1.9.jar
>
> 67758 Wed Jun 15 16:34:22 EDT 2016 lib/jettison-1.1.jar
>
> 21144 Wed Jun 15 16:34:22 EDT 2016 lib/jetty-continuation-8.1.10.v20130312.jar
>
> 95709 Wed Jun 15 16:34:22 EDT 2016 lib/jetty-http-8.1.10.v20130312.jar
>
> 103622 Wed Jun 15 16:34:22 EDT 2016 lib/jetty-io-8.1.10.v20130312.jar
>
> 89691 Wed Jun 15 16:34:22 EDT 2016 lib/jetty-security-8.1.10.v20130312.jar
>
> 347020 Wed Jun 15 16:34:22 EDT 2016 lib/jetty-server-8.1.10.v20130312.jar
>
> 101052 Wed Jun 15 16:34:22 EDT 2016 lib/jetty-servlet-8.1.10.v20130312.jar
>
> 177131 Wed Jun 15 16:34:22 EDT 2016 lib/jetty-util-6.1.26.jar
>
> 284903 Wed Jun 15 16:34:22 EDT 2016 lib/jetty-util-8.1.10.v20130312.jar
>
> 125928 Wed Jun 15 16:34:22 EDT 2016 lib/jetty-websocket-8.1.10.v20130312.jar
>
> 39117 Wed Jun 15 16:34:22 EDT 2016 lib/jms-api-1.1-rev-1.jar
>
> 187292 Wed Jun 15 16:34:22 EDT 2016 lib/joni-2.1.2.jar
>
> 280205 Wed Jun 15 16:34:22 EDT 2016 lib/jsch-0.1.53.jar
>
> 33015 Wed Jun 15 16:34:22 EDT 2016 lib/jsr305-1.3.9.jar
>
> 489884 Wed Jun 15 16:34:22 EDT 2016 lib/log4j-1.2.17.jar
>
> 565387 Wed Jun 15 16:34:22 EDT 2016 lib/malhar-contrib-3.2.0-incubating.jar
>
> 1062000 Wed Jun 15 16:34:22 EDT 2016 lib/malhar-library-3.1.1.jar
>
> 42212 Wed Jun 15 16:34:22 EDT 2016 lib/management-api-3.0.0-b012.jar
>
>   8798 Wed Jun 15 16:34:22 EDT 2016 lib/named-regexp-0.2.3.jar
>
> 1206119 Wed Jun 15 16:34:22 EDT 2016 lib/netty-3.6.6.Final.jar
>
> 1779991 Wed Jun 15 16:34:22 EDT 2016 lib/netty-all-4.0.23.Final.jar
>
> 29555 Wed Jun 15 16:34:22 EDT 2016 lib/paranamer-2.3.jar
>
> 1869113 Wed Jun 15 16:34:22 EDT 2016 lib/poi-3.9.jar
>
> 936648 Wed Jun 15 16:34:22 EDT 2016 lib/poi-ooxml-3.9.jar
>
> 4802621 Wed Jun 15 16:34:22 EDT 2016 lib/poi-ooxml-schemas-3.9.jar
>
> 533455 Wed Jun 15 16:34:22 EDT 2016 lib/protobuf-java-2.5.0.jar
>
> 26084 Wed Jun 15 16:34:22 EDT 2016 lib/slf4j-api-1.7.5.jar
>
>   9943 Wed Jun 15 16:34:22 EDT 2016 lib/slf4j-log4j12-1.7.19.jar
>
> 995968 Wed Jun 15 16:34:22 EDT 2016 lib/snappy-java-1.0.4.1.jar
>
> 23346 Wed Jun 15 16:34:22 EDT 2016 lib/stax-api-1.0-2.jar
>
> 26514 Wed Jun 15 16:34:22 EDT 2016 lib/stax-api-1.0.1.jar
>
>   2455 Wed Jun 15 16:34:22 EDT 2016 lib/tdgssconfig-14.00.00.21.jar
>
> 991265 Wed Jun 15 16:34:22 EDT 2016 lib/terajdbc4-14.00.00.21.jar
>
> 758309 Wed Jun 15 16:34:22 EDT 2016 lib/vertica-jdbc-7.2.1-0.jar
>
> 109318 Wed Jun 15 16:34:22 EDT 2016 lib/xml-apis-1.0.b2.jar
>
> 2666695 Wed Jun 15 16:34:22 EDT 2016 lib/xmlbeans-2.3.0.jar
>
> 15010 Wed Jun 15 16:34:22 EDT 2016 lib/xmlenc-0.52.jar
>
> 94672 Wed Jun 15 16:34:22 EDT 2016 lib/xz-1.0.jar
>
> 792964 Wed Jun 15 16:34:22 EDT 2016 lib/zookeeper-3.4.6.jar
>
>      0 Wed Jun 15 16:34:28 EDT 2016 conf/
>
>    334 Mon Apr 04 11:18:00 EDT 2016 conf/my-app-conf1.xml
>
>   3432 Wed Jun 15 16:22:20 EDT 2016 META-INF/properties.xml
>
>
>
> <?xml version="1.0" encoding="UTF-8"?>
> <project xmlns="http://maven.apache.org/POM/4.0.0"
>          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
>   <modelVersion>4.0.0</modelVersion>
>   <groupId>com.rbc.aml.cnscan</groupId>
>   <version>1.0-SNAPSHOT</version>
>   <artifactId>countrynamescan</artifactId>
>   <packaging>jar</packaging>
>
>   <!-- change these to the appropriate values -->
>   <name>countrynamescan</name>
>   <description>Country and Name Scan project</description>
>
>   <properties>
>     <!-- change this if you desire to use a different version of DataTorrent -->
>     <datatorrent.version>3.1.1</datatorrent.version>
>     <datatorrent.apppackage.classpath>lib/*.jar</datatorrent.apppackage.classpath>
>   </properties>
>
>   <!-- repository to provide the DataTorrent artifacts -->
>   <!-- <repositories>
>     <repository>
>       <snapshots>
>         <enabled>false</enabled>
>       </snapshots>
>       <id>Datatorrent-Releases</id>
>       <name>DataTorrent Release Repository</name>
>       <url>https://www.datatorrent.com/maven/content/repositories/releases/</url>
>     </repository>
>     <repository>
>       <releases>
>         <enabled>false</enabled>
>       </releases>
>       <id>DataTorrent-Snapshots</id>
>       <name>DataTorrent Early Access Program Snapshot Repository</name>
>       <url>https://www.datatorrent.com/maven/content/repositories/snapshots/</url>
>     </repository>
>   </repositories> -->
>
>   <build>
>     <plugins>
>       <plugin>
>         <groupId>org.apache.maven.plugins</groupId>
>         <artifactId>maven-eclipse-plugin</artifactId>
>         <version>2.9</version>
>         <configuration>
>           <downloadSources>true</downloadSources>
>         </configuration>
>       </plugin>
>       <plugin>
>         <artifactId>maven-compiler-plugin</artifactId>
>         <version>3.3</version>
>         <configuration>
>           <encoding>UTF-8</encoding>
>           <source>1.7</source>
>           <target>1.7</target>
>           <debug>true</debug>
>           <optimize>false</optimize>
>           <showDeprecation>true</showDeprecation>
>           <showWarnings>true</showWarnings>
>         </configuration>
>       </plugin>
>       <plugin>
>         <artifactId>maven-dependency-plugin</artifactId>
>         <version>2.8</version>
>         <executions>
>           <execution>
>             <id>copy-dependencies</id>
>             <phase>prepare-package</phase>
>             <goals>
>               <goal>copy-dependencies</goal>
>             </goals>
>             <configuration>
>               <outputDirectory>target/deps</outputDirectory>
>               <includeScope>runtime</includeScope>
>             </configuration>
>           </execution>
>         </executions>
>       </plugin>
>
>       <plugin>
>         <artifactId>maven-assembly-plugin</artifactId>
>         <executions>
>           <execution>
>             <id>app-package-assembly</id>
>             <phase>package</phase>
>             <goals>
>               <goal>single</goal>
>             </goals>
>             <configuration>
>               <finalName>${project.artifactId}-${project.version}-apexapp</finalName>
>               <appendAssemblyId>false</appendAssemblyId>
>               <descriptors>
>                 <descriptor>src/assemble/appPackage.xml</descriptor>
>               </descriptors>
>               <archiverConfig>
>                 <defaultDirectoryMode>0755</defaultDirectoryMode>
>               </archiverConfig>
>               <archive>
>                 <manifestEntries>
>                   <Class-Path>${datatorrent.apppackage.classpath}</Class-Path>
>                   <DT-Engine-Version>${datatorrent.version}</DT-Engine-Version>
>                   <DT-App-Package-Name>${project.artifactId}</DT-App-Package-Name>
>                   <DT-App-Package-Version>${project.version}</DT-App-Package-Version>
>                   <DT-App-Package-Display-Name>${project.name}</DT-App-Package-Display-Name>
>                   <DT-App-Package-Description>${project.description}</DT-App-Package-Description>
>                 </manifestEntries>
>               </archive>
>             </configuration>
>           </execution>
>         </executions>
>       </plugin>
>
>       <plugin>
>         <artifactId>maven-antrun-plugin</artifactId>
>         <version>1.7</version>
>         <executions>
>           <execution>
>             <phase>package</phase>
>             <configuration>
>               <target>
>                 <move file="${project.build.directory}/${project.artifactId}-${project.version}-apexapp.jar"
>                       tofile="${project.build.directory}/${project.artifactId}-${project.version}.apa" />
>               </target>
>             </configuration>
>             <goals>
>               <goal>run</goal>
>             </goals>
>           </execution>
>           <execution>
>             <!-- create resource directory for xml javadoc -->
>             <id>createJavadocDirectory</id>
>             <phase>generate-resources</phase>
>             <configuration>
>               <tasks>
>                 <delete dir="${project.build.directory}/generated-resources/xml-javadoc"/>
>                 <mkdir dir="${project.build.directory}/generated-resources/xml-javadoc"/>
>               </tasks>
>             </configuration>
>             <goals>
>               <goal>run</goal>
>             </goals>
>           </execution>
>         </executions>
>       </plugin>
>
>       <plugin>
>         <groupId>org.codehaus.mojo</groupId>
>         <artifactId>build-helper-maven-plugin</artifactId>
>         <version>1.9.1</version>
>         <executions>
>           <execution>
>             <id>attach-artifacts</id>
>             <phase>package</phase>
>             <goals>
>               <goal>attach-artifact</goal>
>             </goals>
>             <configuration>
>               <artifacts>
>                 <artifact>
>                   <file>target/${project.artifactId}-${project.version}.apa</file>
>                   <type>apa</type>
>                 </artifact>
>               </artifacts>
>               <skipAttach>false</skipAttach>
>             </configuration>
>           </execution>
>         </executions>
>       </plugin>
>
>       <!-- generate javadoc -->
>       <plugin>
>         <groupId>org.apache.maven.plugins</groupId>
>         <artifactId>maven-javadoc-plugin</artifactId>
>         <executions>
>           <!-- generate xml javadoc -->
>           <execution>
>             <id>xml-doclet</id>
>             <phase>generate-resources</phase>
>             <goals>
>               <goal>javadoc</goal>
>             </goals>
>             <configuration>
>               <doclet>com.github.markusbernhardt.xmldoclet.XmlDoclet</doclet>
>               <additionalparam>-d ${project.build.directory}/generated-resources/xml-javadoc
>                 -filename ${project.artifactId}-${project.version}-javadoc.xml</additionalparam>
>               <useStandardDocletOptions>false</useStandardDocletOptions>
>               <docletArtifact>
>                 <groupId>com.github.markusbernhardt</groupId>
>                 <artifactId>xml-doclet</artifactId>
>                 <version>1.0.4</version>
>               </docletArtifact>
>             </configuration>
>           </execution>
>         </executions>
>       </plugin>
>       <!-- Transform xml javadoc to stripped down version containing only
>            class/interface comments and tags -->
>       <plugin>
>         <groupId>org.codehaus.mojo</groupId>
>         <artifactId>xml-maven-plugin</artifactId>
>         <version>1.0</version>
>         <executions>
>           <execution>
>             <id>transform-xmljavadoc</id>
>             <phase>generate-resources</phase>
>             <goals>
>               <goal>transform</goal>
>             </goals>
>           </execution>
>         </executions>
>         <configuration>
>           <transformationSets>
>             <transformationSet>
>               <dir>${project.build.directory}/generated-resources/xml-javadoc</dir>
>               <includes>
>                 <include>${project.artifactId}-${project.version}-javadoc.xml</include>
>               </includes>
>               <stylesheet>XmlJavadocCommentsExtractor.xsl</stylesheet>
>               <outputDir>${project.build.directory}/generated-resources/xml-javadoc</outputDir>
>             </transformationSet>
>           </transformationSets>
>         </configuration>
>       </plugin>
>       <!-- copy xml javadoc to class jar -->
>       <plugin>
>         <artifactId>maven-resources-plugin</artifactId>
>         <version>2.6</version>
>         <executions>
>           <execution>
>             <id>copy-resources</id>
>             <phase>process-resources</phase>
>             <goals>
>               <goal>copy-resources</goal>
>             </goals>
>             <configuration>
>               <outputDirectory>${basedir}/target/classes</outputDirectory>
>               <resources>
>                 <resource>
>                   <directory>${project.build.directory}/generated-resources/xml-javadoc</directory>
>                   <includes>
>                     <include>${project.artifactId}-${project.version}-javadoc.xml</include>
>                   </includes>
>                   <filtering>true</filtering>
>                 </resource>
>               </resources>
>             </configuration>
>           </execution>
>         </executions>
>       </plugin>
>
>     </plugins>
>
>     <pluginManagement>
>       <plugins>
>         <!-- This plugin's configuration is used to store Eclipse m2e settings
>              only. It has no influence on the Maven build itself. -->
>         <plugin>
>           <groupId>org.eclipse.m2e</groupId>
>           <artifactId>lifecycle-mapping</artifactId>
>           <version>1.0.0</version>
>           <configuration>
>             <lifecycleMappingMetadata>
>               <pluginExecutions>
>                 <pluginExecution>
>                   <pluginExecutionFilter>
>                     <groupId>org.codehaus.mojo</groupId>
>                     <artifactId>xml-maven-plugin</artifactId>
>                     <versionRange>[1.0,)</versionRange>
>                     <goals>
>                       <goal>transform</goal>
>                     </goals>
>                   </pluginExecutionFilter>
>                   <action>
>                     <ignore></ignore>
>                   </action>
>                 </pluginExecution>
>                 <pluginExecution>
>                   <pluginExecutionFilter>
>                     <groupId></groupId>
>                     <artifactId></artifactId>
>                     <versionRange>[,)</versionRange>
>                     <goals>
>                       <goal></goal>
>                     </goals>
>                   </pluginExecutionFilter>
>                   <action>
>                     <ignore></ignore>
>                   </action>
>                 </pluginExecution>
>               </pluginExecutions>
>             </lifecycleMappingMetadata>
>           </configuration>
>         </plugin>
>       </plugins>
>     </pluginManagement>
>   </build>
>
>   <dependencies>
>     <!-- add your dependencies here -->
>     <dependency>
>       <groupId>com.datatorrent</groupId>
>       <artifactId>malhar-library</artifactId>
>       <version>${datatorrent.version}</version>
>       <!-- If you know that your application does not need transitive dependencies
>            pulled in by malhar-library, uncomment the following to reduce the size
>            of your app package. -->
>       <!-- <exclusions> <exclusion> <groupId>*</groupId> <artifactId>*</artifactId>
>            </exclusion> </exclusions> -->
>     </dependency>
>     <dependency>
>       <groupId>com.datatorrent</groupId>
>       <artifactId>dt-common</artifactId>
>       <version>${datatorrent.version}</version>
>       <scope>provided</scope>
>     </dependency>
>     <dependency>
>       <groupId>com.datatorrent</groupId>
>       <artifactId>malhar-library</artifactId>
>       <version>${datatorrent.version}</version>
>       <!-- If you know that your application does not need transitive dependencies
>            pulled in by malhar-library, uncomment the following to reduce the size
>            of your app package. -->
>       <!-- <exclusions> <exclusion> <groupId>*</groupId> <artifactId>*</artifactId>
>            </exclusion> </exclusions> -->
>     </dependency>
>     <dependency>
>       <groupId>com.datatorrent</groupId>
>       <artifactId>dt-engine</artifactId>
>       <version>${datatorrent.version}</version>
>       <scope>test</scope>
>     </dependency>
>     <dependency>
>       <groupId>com.datatorrent</groupId>
>       <artifactId>dt-common</artifactId>
>       <version>${datatorrent.version}</version>
>       <scope>provided</scope>
>     </dependency>
>     <dependency>
>       <groupId>com.teradata.jdbc</groupId>
>       <artifactId>terajdbc4</artifactId>
>       <version>14.00.00.21</version>
>     </dependency>
>     <dependency>
>       <groupId>com.teradata.jdbc</groupId>
>       <artifactId>tdgssconfig</artifactId>
>       <version>14.00.00.21</version>
>     </dependency>
>     <dependency>
>       <groupId>com.ibm.db2</groupId>
>       <artifactId>db2jcc</artifactId>
>       <version>123</version>
>     </dependency>
>     <dependency>
>       <groupId>jdk.tools</groupId>
>       <artifactId>jdk.tools</artifactId>
>       <version>1.7</version>
>       <scope>system</scope>
>       <systemPath>C:/Program Files/Java/jdk1.7.0_79/lib/tools.jar</systemPath>
>     </dependency>
>     <dependency>
>       <groupId>org.apache.apex</groupId>
>       <artifactId>malhar-contrib</artifactId>
>       <version>3.2.0-incubating</version>
>       <!--<scope>provided</scope> -->
>       <exclusions>
>         <exclusion>
>           <groupId>*</groupId>
>           <artifactId>*</artifactId>
>         </exclusion>
>       </exclusions>
>     </dependency>
>     <dependency>
>       <groupId>junit</groupId>
>       <artifactId>junit</artifactId>
>       <version>4.10</version>
>       <scope>test</scope>
>     </dependency>
>     <dependency>
>       <groupId>com.vertica</groupId>
>       <artifactId>vertica-jdbc</artifactId>
>       <version>7.2.1-0</version>
>     </dependency>
>     <dependency>
>       <groupId>org.apache.hbase</groupId>
>       <artifactId>hbase-client</artifactId>
>       <version>1.1.2</version>
>     </dependency>
>     <dependency>
>       <groupId>org.slf4j</groupId>
>       <artifactId>slf4j-log4j12</artifactId>
>       <version>1.7.19</version>
>     </dependency>
>     <dependency>
>       <groupId>com.datatorrent</groupId>
>       <artifactId>dt-engine</artifactId>
>       <version>${datatorrent.version}</version>
>       <scope>test</scope>
>     </dependency>
>     <dependency>
>       <groupId>net.sf.flatpack</groupId>
>       <artifactId>flatpack</artifactId>
>       <version>3.4.2</version>
>     </dependency>
>     <dependency>
>       <groupId>org.jdom</groupId>
>       <artifactId>jdom</artifactId>
>       <version>1.1.3</version>
>     </dependency>
>     <dependency>
>       <groupId>org.apache.poi</groupId>
>       <artifactId>poi-ooxml</artifactId>
>       <version>3.9</version>
>     </dependency>
>     <dependency>
>       <groupId>org.apache.xmlbeans</groupId>
>       <artifactId>xmlbeans</artifactId>
>       <version>2.3.0</version>
>     </dependency>
>     <dependency>
>       <groupId>dom4j</groupId>
>       <artifactId>dom4j</artifactId>
>       <version>1.6.1</version>
>     </dependency>
>     <dependency>
>       <groupId>javax.xml.stream</groupId>
>       <artifactId>stax-api</artifactId>
>       <version>1.0-2</version>
>     </dependency>
>     <dependency>
>       <groupId>org.apache.poi</groupId>
>       <artifactId>poi</artifactId>
>       <version>3.9</version>
>     </dependency>
>     <dependency>
>       <groupId>org.apache.poi</groupId>
>       <artifactId>poi-ooxml-schemas</artifactId>
>       <version>3.9</version>
>     </dependency>
>     <dependency>
>       <groupId>com.jcraft</groupId>
>       <artifactId>jsch</artifactId>
>       <version>0.1.53</version>
>     </dependency>
>     <dependency>
>       <groupId>com.jcraft</groupId>
>       <artifactId>jsch</artifactId>
>       <version>0.1.53</version>
>     </dependency>
>   </dependencies>
>
> </project>
>
>
>
> Regards,
>
> Surya Vamshi
>
>
>
> From: Munagala Ramanath [mailto:ram@datatorrent.com]
> Sent: 2016, June, 16 4:37 PM
> To: users@apex.apache.org
> Subject: Re: Multiple directories
>
>
>
> It looks like you may be including old Hadoop jars in your apa package since
> the stack trace shows ConverterUtils.toContainerId calling
> ConverterUtils.toApplicationAttemptId, but recent versions don't have that
> call sequence. In 2.7.1 (which is what your cluster has) the function looks
> like this:
>
>   public static ContainerId toContainerId(String containerIdStr) {
>     return ContainerId.fromString(containerIdStr);
>   }
>
>
>
> Could you post the output of "jar tvf {your-apa-file}" as well as
> "mvn dependency:tree"?
>
>
>
> Ram
>
>
>
> On Thu, Jun 16, 2016 at 12:38 PM, Mukkamula, Suryavamshivardhan (CWM-NR) <
> suryavamshivardhan.mukkamula@rbc.com> wrote:
>
> Hi Ram,
>
>
>
> Below is the information.
>
>
>
>
>
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100   712    0   712    0     0   3807      0 --:--:-- --:--:-- --:--:--  3807
>
> {
>     "clusterInfo": {
>         "haState": "ACTIVE",
>         "haZooKeeperConnectionState": "CONNECTED",
>         "hadoopBuildVersion": "2.7.1.2.3.2.0-2950 from 5cc60e0003e33aa98205f18bccaeaf36cb193c1c by jenkins source checksum 69a3bf8c667267c2c252a54fbbf23d",
>         "hadoopVersion": "2.7.1.2.3.2.0-2950",
>         "hadoopVersionBuiltOn": "2015-09-30T18:08Z",
>         "id": 1465495186350,
>         "resourceManagerBuildVersion": "2.7.1.2.3.2.0-2950 from 5cc60e0003e33aa98205f18bccaeaf36cb193c1c by jenkins source checksum 48db4b572827c2e9c2da66982d147626",
>         "resourceManagerVersion": "2.7.1.2.3.2.0-2950",
>         "resourceManagerVersionBuiltOn": "2015-09-30T18:20Z",
>         "rmStateStoreName": "org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore",
>         "startedOn": 1465495186350,
>         "state": "STARTED"
>     }
> }
>
>
>
> Regards,
>
> Surya Vamshi
>
>
>
> From: Munagala Ramanath [mailto:ram@datatorrent.com]
> Sent: 2016, June, 16 2:57 PM
> To: users@apex.apache.org
> Subject: Re: Multiple directories
>
>
>
> Can you ssh to one of the cluster nodes? If so, can you run this command and
> show the output (where {rm} is the host:port running the resource manager,
> aka YARN):
>
> curl http://{rm}/ws/v1/cluster | python -mjson.tool
>
>
>
> Ram
>
> ps. You can determine the node running YARN with:
>
> hdfs getconf -confKey yarn.resourcemanager.webapp.address
>
> hdfs getconf -confKey yarn.resourcemanager.webapp.https.address
>
>
>
>
>
>
>
> On Thu, Jun 16, 2016 at 11:15 AM, Mukkamula, Suryavamshivardhan (CWM-NR) <
> suryavamshivardhan.mukkamula@rbc.com> wrote:
>
> Hi,
>
>
>
> I am facing a weird issue and the logs are not clear to me!!
>
> I have created an apa file which works fine within my local sandbox, but I am
> facing problems when I upload it to the enterprise Hadoop cluster using the
> DT Console.
>
> Below is the error message from the yarn logs. Please help me understand the
> issue.
>
>
>
> ###################### Error Logs
> ########################################################
>
>
>
> Log Type: AppMaster.stderr
>
> Log Upload Time: Thu Jun 16 14:07:46 -0400 2016
>
> Log Length: 1259
>
> SLF4J: Class path contains multiple SLF4J bindings.
>
> SLF4J: Found binding in
> [jar:file:/grid/06/hadoop/yarn/local/usercache/mukkamula/appcache/application_1465495186350_2224/filecache/36/slf4j-log4j12-1.7.19.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>
> SLF4J: Found binding in
> [jar:file:/usr/hdp/2.3.2.0-2950/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
>
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>
> Exception in thread "main" java.lang.IllegalArgumentException: Invalid
> ContainerId: container_e35_1465495186350_2224_01_000001
>
>         at
> org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:182)
>
>         at
> com.datatorrent.stram.StreamingAppMaster.main(StreamingAppMaster.java:90)
>
> Caused by: java.lang.NumberFormatException: For input string: "e35"
>
>         at
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>
>         at java.lang.Long.parseLong(Long.java:441)
>
>         at java.lang.Long.parseLong(Long.java:483)
>
>         at
> org.apache.hadoop.yarn.util.ConverterUtils.toApplicationAttemptId(ConverterUtils.java:137)
>
>         at
> org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:177)
>
>         ... 1 more
>
>
>
> Log Type: AppMaster.stdout
>
> Log Upload Time: Thu Jun 16 14:07:46 -0400 2016
>
> Log Length: 0
>
>
>
> Log Type: dt.log
>
> Log Upload Time: Thu Jun 16 14:07:46 -0400 2016
>
> Log Length: 29715
>
> Showing 4096 bytes of 29715 total. Click here
> <http://guedlpdhdp001.saifg.rbc.com:19888/jobhistory/logs/guedlpdhdp012.saifg.rbc.com:45454/container_e35_1465495186350_2224_01_000001/container_e35_1465495186350_2224_01_000001/mukkamula/dt.log/?start=0> for
> the full log.
>
> 56m -Xloggc:/var/log/hadoop/yarn/gc.log-201606140038 -verbose:gc
> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xms4096m
> -Xmx4096m -Dhadoop.security.logger=INFO,DRFAS
> -Dhdfs.audit.logger=INFO,DRFAAUDIT
>
> SHLVL=3
>
> HADOOP_SSH_OPTS=-o ConnectTimeout=5 -o SendEnv=HADOOP_CONF_DIR
>
> HADOOP_USER_NAME=datatorrent/gueulvahal003.saifg.rbc.com@SAIFG.RBC.COM
>
> HADOOP_NAMENODE_OPTS=-server -XX:ParallelGCThreads=8
> -XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hadoop/yarn/hs_err_pid%p.log
> -XX:NewSize=200m -XX:MaxNewSize=200m -XX:PermSize=128m -XX:MaxPermSize=256m
> -Xloggc:/var/log/hadoop/yarn/gc.log-201606140038 -verbose:gc
> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xms8192m
> -Xmx8192m -Dhadoop.security.logger=INFO,DRFAS
> -Dhdfs.audit.logger=INFO,DRFAAUDIT
> -XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node"
> -Dorg.mortbay.jetty.Request.maxFormContentSize=-1 -server
> -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC
> -XX:ErrorFile=/var/log/hadoop/yarn/hs_err_pid%p.log -XX:NewSize=200m
> -XX:MaxNewSize=200m -XX:PermSize=128m -XX:MaxPermSize=256m
> -Xloggc:/var/log/hadoop/yarn/gc.log-201606140038 -verbose:gc
> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xms8192m
> -Xmx8192m -Dhadoop.security.logger=INFO,DRFAS
> -Dhdfs.audit.logger=INFO,DRFAAUDIT
> -XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node"
> -Dorg.mortbay.jetty.Request.maxFormContentSize=-1
>
> HADOOP_IDENT_STRING=yarn
>
> HADOOP_MAPRED_LOG_DIR=/var/log/hadoop-mapreduce/yarn
>
> NM_HOST=guedlpdhdp012.saifg.rbc.com
>
> XFILESEARCHPATH=/usr/dt/app-defaults/%L/Dt
>
> HADOOP_SECURE_DN_LOG_DIR=/var/log/hadoop/hdfs
>
> YARN_HISTORYSERVER_HEAPSIZE=1024
>
> JVM_PID=2638
>
> YARN_PID_DIR=/var/run/hadoop-yarn/yarn
>
> HADOOP_HOME_WARN_SUPPRESS=1
>
> NM_PORT=45454
>
> LOGNAME=mukkamula
>
> YARN_CONF_DIR=/usr/hdp/current/hadoop-client/conf
>
> HADOOP_YARN_USER=yarn
>
> QTDIR=/usr/lib64/qt-3.3
>
> _=/usr/lib/jvm/java-1.7.0/bin/java
>
> MSM_PRODUCT=MSM
>
> HADOOP_HOME=/usr/hdp/2.3.2.0-2950/hadoop
>
> MALLOC_ARENA_MAX=4
>
> HADOOP_OPTS=-Dhdp.version=2.3.2.0-2950 -Djava.net.preferIPv4Stack=true
> -Dhdp.version= -Djava.net.preferIPv4Stack=true
> -Dhadoop.log.dir=/var/log/hadoop/yarn -Dhadoop.log.file=hadoop.log
> -Dhadoop.home.dir=/usr/hdp/2.3.2.0-2950/hadoop -Dhadoop.id.str=yarn
> -Dhadoop.root.logger=INFO,console
> -Djava.library.path=:/usr/hdp/2.3.2.0-2950/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.3.2.0-2950/hadoop/lib/native
> -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true
> -Dhdp.version=2.3.2.0-2950 -Dhadoop.log.dir=/var/log/hadoop/yarn
> -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/hdp/2.3.2.0-2950/hadoop
> -Dhadoop.id.str=yarn -Dhadoop.root.logger=INFO,console
> -Djava.library.path=:/usr/hdp/2.3.2.0-2950/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.3.2.0-2950/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:/usr/hdp/2.3.2.0-2950/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.3.2.0-2950/hadoop/lib/native
> -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true
>
> SHELL=/bin/bash
>
> YARN_ROOT_LOGGER=INFO,EWMA,RFA
>
>
> HADOOP_TOKEN_FILE_LOCATION=/grid/11/hadoop/yarn/local/usercache/mukkamula/appcache/application_1465495186350_2224/container_e35_1465495186350_2224_01_000001/container_tokens
>
>
> CLASSPATH=./*:/usr/hdp/current/hadoop-client/conf:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*
>
> HADOOP_MAPRED_PID_DIR=/var/run/hadoop-mapreduce/yarn
>
> YARN_NODEMANAGER_HEAPSIZE=1024
>
> QTINC=/usr/lib64/qt-3.3/include
>
> USER=mukkamula
>
> HADOOP_CLIENT_OPTS=-Xmx2048m -XX:MaxPermSize=512m -Xmx2048m
> -XX:MaxPermSize=512m
>
> CONTAINER_ID=container_e35_1465495186350_2224_01_000001
>
> HADOOP_SECURE_DN_PID_DIR=/var/run/hadoop/hdfs
>
> HISTCONTROL=ignoredups
>
> HOME=/home/
>
> HADOOP_NAMENODE_INIT_HEAPSIZE=-Xms8192m
>
> MSM_HOME=/usr/local/MegaRAID Storage Manager
>
> LESSOPEN=||/usr/bin/lesspipe.sh %s
>
> LANG=en_US.UTF-8
>
> YARN_NICENESS=0
>
> YARN_IDENT_STRING=yarn
>
> HADOOP_MAPRED_HOME=/usr/hdp/2.3.2.0-2950/hadoop-mapreduce
>
>
>
>
>
> Regards,
>
> Surya Vamshi
>
>
>
> From: Mukkamula, Suryavamshivardhan (CWM-NR)
> Sent: 2016, June, 16 8:58 AM
> To: users@apex.apache.org
> Subject: RE: Multiple directories
>
>
>
> Thank you for the inputs.
>
>
>
> Regards,
>
> Surya Vamshi
>
> From: Thomas Weise [mailto:thomas.weise@gmail.com]
> Sent: 2016, June, 15 5:08 PM
> To: users@apex.apache.org
> Subject: Re: Multiple directories
>
>
>
>
>
> On Wed, Jun 15, 2016 at 1:55 PM, Mukkamula, Suryavamshivardhan (CWM-NR) <
> suryavamshivardhan.mukkamula@rbc.com> wrote:
>
> Hi Ram/Team,
>
>
>
> I could create an operator which reads multiple directories and parses the
> each file with respect to an individual configuration file and generates
> output file to different directories.
>
>
>
> However I have some questions regarding the design.
>
>
>
> -> We have 120 directories to scan on HDFS. If we use parallel partitioning
> with operator memory around 250MB, it might take around 30GB of RAM for this
> operator's processing. Are these figures going to create any problem in
> production?
>
>
>
> You can benchmark this with a single partition. If the downstream
> operators can keep up with the rate at which the file reader emits, then
> the memory consumption should be minimal. Keep in mind though that the
> container memory is not just heap space for the operator, but also memory
> the JVM requires to run and the memory that the buffer server consumes. You
> see the allocated memory in the UI if you use the DT community edition
> (container list in the physical plan).
>
>
>
> -> Should I use a scheduler for running the batch job, or define the next
> scan time and keep the DT job running continuously? If I run the DT job
> continuously, I assume memory will be continuously utilized by the DT job and
> not available to other resources on the cluster; please clarify.
>
> It is possible to set this up elastically also, so that when there is no
> input available, the number of reader partitions are reduced and the memory
> given back (Apex supports dynamic scaling).
>
>
>
>
>
> Regards,
>
> Surya Vamshi
>
>
>
> From: Munagala Ramanath [mailto:ram@datatorrent.com]
> Sent: 2016, June, 05 10:24 PM
> To: users@apex.apache.org
> Subject: Re: Multiple directories
>
>
>
> Some sample code to monitor multiple directories is now available at:
>
>
> https://github.com/DataTorrent/examples/tree/master/tutorials/fileIO-multiDir
>
>
>
> It shows how to use a custom implementation of definePartitions() to create
>
> multiple partitions of the file input operator and group them
>
> into "slices" where each slice monitors a single directory.
>
>
>
> Ram
>
>
>
> On Wed, May 25, 2016 at 9:55 AM, Munagala Ramanath <ram@datatorrent.com>
> wrote:
>
> I'm hoping to have a sample sometime next week.
>
>
>
> Ram
>
>
>
> On Wed, May 25, 2016 at 9:30 AM, Mukkamula, Suryavamshivardhan (CWM-NR) <
> suryavamshivardhan.mukkamula@rbc.com> wrote:
>
> Thank you so much, Ram, for your advice. Option (a) would be ideal for
> my requirement.
>
>
>
> Do you have sample code for partitioning where each partition is set up
> with its own configuration?
>
>
>
> Regards,
>
> Surya Vamshi
>
>
>
> *From:* Munagala Ramanath [mailto:ram@datatorrent.com]
> *Sent:* 2016, May, 25 12:11 PM
> *To:* users@apex.apache.org
> *Subject:* Re: Multiple directories
>
>
>
> You have 2 options: (a) AbstractFileInputOperator (b)
> FileSplitter/BlockReader
>
>
>
> For (a), each partition (i.e. replica of the operator) can scan only a
> single directory, so if you have 100
>
> directories, you can simply start with 100 partitions; since each
> partition is scanning its own directory
>
> you don't need to worry about which files the lines came from. This
> approach, however, needs a custom
>
> definePartitions() implementation in your subclass to assign the
> appropriate directory and XML parsing
>
> config file to each partition; it also needs adequate cluster resources to
> be able to spin up the required
>
> number of partitions.
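>
> (A minimal sketch of such a definePartitions() override, assuming a
> subclass of AbstractFileInputOperator<String> named MultiDirReader; the
> directories/configFiles lists and setConfigFile() are hypothetical:)
>
> @Override
> public Collection<Partition<AbstractFileInputOperator<String>>>
> definePartitions(
>     Collection<Partition<AbstractFileInputOperator<String>>> partitions,
>     PartitioningContext context)
> {
>   List<Partition<AbstractFileInputOperator<String>>> result = new ArrayList<>();
>   Kryo kryo = new Kryo();
>   MultiDirReader prototype =
>       (MultiDirReader)partitions.iterator().next().getPartitionedInstance();
>   for (int i = 0; i < directories.size(); i++) {
>     // deep-copy the prototype so each partition carries independent state
>     MultiDirReader op = kryo.copy(prototype);
>     op.setDirectory(directories.get(i));   // directory for this slice
>     op.setConfigFile(configFiles.get(i));  // matching XML parser config
>     result.add(new DefaultPartition<AbstractFileInputOperator<String>>(op));
>   }
>   return result;
> }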
>
>
>
> For (b), there is some documentation in the Operators section at
> http://docs.datatorrent.com/ including
>
> sample code. These operators support scanning multiple directories out of
> the box but have more
>
> elaborate configuration options. Check this out and see if it works in
> your use case.
>
>
>
> Ram
>
>
>
> On Wed, May 25, 2016 at 8:17 AM, Mukkamula, Suryavamshivardhan (CWM-NR) <
> suryavamshivardhan.mukkamula@rbc.com> wrote:
>
> Hello Ram/Team,
>
>
>
> My requirement is to read input feeds from different locations on HDFS
> and parse those files by reading XML configuration files (each input
> feed has a configuration file that defines the fields inside the feed).
>
>
>
> My approach: I would like to define a mapping file that contains each
> feed's identifier, feed location, and configuration file location. I
> would read this mapping file at initial load within the setup() method
> and define my DirectoryScan.acceptFiles accordingly. My challenge is
> that when I read the files, I must parse each line using the matching
> configuration file. How do I know which file a line came from? If I
> know that, I can read the corresponding configuration file before
> parsing the line.
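>
> (One way to track which file the current lines come from — a sketch,
> assuming AbstractFileInputOperator, whose openFile(Path) hook runs each
> time the scanner starts a new file; loadConfigFor() is a hypothetical
> helper:)
>
> @Override
> protected InputStream openFile(Path path) throws IOException {
>     // remember the file being read and switch to its parser config
>     currentFile = path.getName();
>     currentConfig = loadConfigFor(currentFile);
>     return super.openFile(path);
> }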
>
>
>
> Please let me know how do I handle this.
>
>
>
> Regards,
>
> Surya Vamshi
>
>
>
> *From:* Munagala Ramanath [mailto:ram@datatorrent.com]
> *Sent:* 2016, May, 24 5:49 PM
> *To:* Mukkamula, Suryavamshivardhan (CWM-NR)
> *Subject:* Multiple directories
>
>
>
> One way of addressing the issue is to use some sort of external tool (like
> a script) to
>
> copy all the input files to a common directory (making sure that the file
> names are
>
> unique to prevent one file from overwriting another) before the Apex
> application starts.
>
>
>
> The Apex application then starts and processes files from this directory.
>
>
>
> If you set the partition count of the file input operator to N, it will
> create N partitions and
>
> the files will be automatically distributed among the partitions. The
> partitions will work
>
> in parallel.
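>
> (A minimal sketch of this setup, assuming the FileLineInputOperator
> subclass of AbstractFileInputOperator and its partitionCount property;
> the names and paths are placeholders:)
>
> FileLineInputOperator in =
>     dag.addOperator("read", new FileLineInputOperator());
> in.setDirectory("/data/common/input");  // the common directory
> in.setPartitionCount(4);                // N = 4 parallel reader partitions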
>
>
>
> Ram
>
> _______________________________________________________________________
>
> This [email] may be privileged and/or confidential, and the sender does
> not waive any related rights and obligations. Any distribution, use or
> copying of this [email] or the information it contains by other than an
> intended recipient is unauthorized. If you received this [email] in error,
> please advise the sender (by return [email] or otherwise) immediately. You
> have consented to receive the attached electronically at the above-noted
> address; please retain a copy of this confirmation for future reference.
