falcon-dev mailing list archives

From "Venkatesh Seetharam (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FALCON-169) multiple "/" in target for replication for multi target feed
Date Tue, 05 Nov 2013 18:29:18 GMT

    [ https://issues.apache.org/jira/browse/FALCON-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814099#comment-13814099 ]

Venkatesh Seetharam commented on FALCON-169:
--------------------------------------------

Thanks [~shwethags] and [~samarthg] for looking into this. However, I don't see how the path
can contain consecutive '/' characters, since the code normalizes it as below:

{code}
        private void propagateFileSystemCopyProperties(String pathsWithPartitions,
                                                       Map<String, String> props) throws FalconException {
            String parts = pathsWithPartitions.replaceAll("//+", "/");
            parts = StringUtils.stripEnd(parts, "/");
            props.put("sourceRelativePaths", parts);

            props.put("distcpSourcePaths", "${coord:dataIn('input')}");
            props.put("distcpTargetPaths", "${coord:dataOut('output')}");
        }
{code}
The value of sourceRelativePaths is then substituted into falcon.include.path in the workflow:
{code}
            <main-class>org.apache.falcon.replication.FeedReplicator</main-class>
            <arg>-Dfalcon.include.path=${sourceRelativePaths}</arg>
{code}
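
For reference, here is a minimal standalone sketch of what that normalization does to a path with a doubled slash. The class name and sample inputs are made up for illustration, and StringUtils.stripEnd is replaced by a plain-Java loop so the snippet has no dependencies; it is not the Falcon code itself.

```java
public class PathNormalizationSketch {
    // Mirrors the two steps shown above: collapse runs of '/' into one,
    // then strip any trailing '/'.
    static String normalize(String pathsWithPartitions) {
        String parts = pathsWithPartitions.replaceAll("//+", "/");
        // Plain-Java stand-in for StringUtils.stripEnd(parts, "/")
        int end = parts.length();
        while (end > 0 && parts.charAt(end - 1) == '/') {
            end--;
        }
        return parts.substring(0, end);
    }

    public static void main(String[] args) {
        // Hypothetical relative path with a doubled and a trailing slash
        System.out.println(normalize("2012/10/01/12/10//ua3/"));
        // The regex is not scheme-aware: "hdfs://" is collapsed as well
        System.out.println(normalize("hdfs://host:54310/a//b"));
    }
}
```

This prints 2012/10/01/12/10/ua3 and hdfs:/host:54310/a/b. So any string that passes through these two lines should come out without consecutive slashes, on the assumption that the input is a relative path rather than a full URI.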

I must be missing something here.

> multiple "/" in target for replication for multi target feed 
> -------------------------------------------------------------
>
>                 Key: FALCON-169
>                 URL: https://issues.apache.org/jira/browse/FALCON-169
>             Project: Falcon
>          Issue Type: Bug
>          Components: replication
>         Environment: QA
>            Reporter: Samarth Gupta
>            Assignee: Venkatesh Seetharam
>
> Multiple "/" characters are appended to the target dir before the partition expression postfix is concatenated.

> For example, while running the single-source multi-target test, the following value is passed to DistCp, as seen in the tasktracker logs:
> ** for patch from FALCON-163
> {quote} 
> -Dfalcon.include.path=hdfs://gs1001.grid.corp.inmobi.com:54310/localDC/rc/billing/2012/10/01/12/10//ua3
> At the bottom of the logs it can be seen:
> 2013-11-05 06:33:20,219 INFO  - Inclusion pattern = hdfs://gs1001.grid.corp.inmobi.com:54310/localDC/rc/billing/2012/10/01/12/10//ua3 (FilteredCopyListing:59)
> 2013-11-05 06:33:20,219 INFO  - Regex pattern = (hdfs://gs1001\.grid\.corp\.inmobi\.com:54310/localDC/rc/billing/2012/10/01/12/10//ua3/)|(hdfs://gs1001\.grid\.corp\.inmobi\.com:54310/localDC/rc/billing/2012/10/01/12/10//ua3$) (FilteredCopyListing:60)
> 2013-11-05 06:33:20,460 INFO  - Number of paths considered for copy: 0 (CustomReplicator:57)
> 2013-11-05 06:33:20,461 INFO  - Number of bytes considered for copy: 0 (Actual number of bytes copied depends on whether any files are skipped or overwritten.) (CustomReplicator:58)
> 2013-11-05 06:33:21,212 INFO  - DistCp job-id: job_201310290719_0445 (DistCp:146)
> 2013-11-05 06:33:21,213 INFO  - DistCp job may be tracked at: http://ivoryqa-1.corp.inmobi.com:50030/jobdetails.jsp?jobid=job_201310290719_0445 (DistCp:147)
> 2013-11-05 06:33:21,213 INFO  - To cancel, run the following command:	hadoop job -kill job_201310290719_0445 (DistCp:148)
> 2013-11-05 06:33:21,213 INFO  - Running job: job_201310290719_0445 (JobClient:1315)
> 2013-11-05 06:33:22,216 INFO  -  map 0% reduce 0% (JobClient:1328)
> 2013-11-05 06:33:33,244 INFO  - Job complete: job_201310290719_0445 (JobClient:1383)
> 2013-11-05 06:33:33,252 INFO  - Counters: 4 (JobClient:589)
> 2013-11-05 06:33:33,252 INFO  -   Job Counters  (JobClient:591)
> 2013-11-05 06:33:33,253 INFO  -     SLOTS_MILLIS_MAPS=5822 (JobClient:593)
> 2013-11-05 06:33:33,253 INFO  -     Total time spent by all reduces waiting after reserving slots (ms)=0 (JobClient:593)
> 2013-11-05 06:33:33,254 INFO  -     Total time spent by all maps waiting after reserving slots (ms)=0 (JobClient:593)
> 2013-11-05 06:33:33,255 INFO  -     SLOTS_MILLIS_REDUCES=0 (JobClient:593)
> 2013-11-05 06:33:33,307 INFO  - No files present in path: hdfs://ivoryqa-1.corp.inmobi.com:8020/localDC/rc/billing/ua2/2012/10/01/12/10/ua3 (FeedReplicator:146)
> 2013-11-05 06:33:33,308 INFO  - Completed DistCp (FeedReplicator:77)
> {quote}
> Whereas if the same test is run on the current code from trunk, the following value appears in the tasktracker logs:
> {quote}
> -Dfalcon.include.path=hdfs://gs1001.grid.corp.inmobi.com:54310/localDC/rc/billing/2012/10/01/12/10/ua3
> {quote}
> and replication is successful.
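
The "Regex pattern" log line in the report suggests that the include path is turned into a pattern of the form (path/)|(path$), with dots escaped. The sketch below illustrates why a doubled slash in that path would leave nothing to copy. It assumes the pattern is applied with a find-style match against each candidate source path; the actual FilteredCopyListing logic is not shown in this thread, and the class name, host, and file name are hypothetical.

```java
import java.util.regex.Pattern;

public class InclusionPatternSketch {
    // Builds a pattern shaped like the logged one: (path/)|(path$), dots escaped.
    static Pattern toPattern(String includePath) {
        String escaped = includePath.replace(".", "\\.");
        return Pattern.compile("(" + escaped + "/)|(" + escaped + "$)");
    }

    // Assumption: a candidate is included when the pattern is found in it.
    static boolean included(String includePath, String candidate) {
        return toPattern(includePath).matcher(candidate).find();
    }

    public static void main(String[] args) {
        // Hypothetical file as a listing would report it, with single slashes
        String file = "hdfs://nn.example.com:54310/data/2012/10/01/12/10/ua3/part-00000";
        // Include path with a doubled slash before the partition
        System.out.println(included("hdfs://nn.example.com:54310/data/2012/10/01/12/10//ua3", file));
        // Include path with a single slash
        System.out.println(included("hdfs://nn.example.com:54310/data/2012/10/01/12/10/ua3", file));
    }
}
```

This prints false and then true. Under the stated assumption, that is consistent with "Number of paths considered for copy: 0" in the failing run and with replication succeeding on trunk.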



--
This message was sent by Atlassian JIRA
(v6.1#6144)
