pig-dev mailing list archives

From "Rohini Palaniswamy (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-3796) PigStats output bytes written not collected for relative paths
Date Tue, 04 Mar 2014 22:11:43 GMT

     [ https://issues.apache.org/jira/browse/PIG-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-3796:
------------------------------------

    Description: 
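PIG-2924 added support for custom stats readers. But the FileBasedOutputSizeReader only handles
locations that pass the check below, so a relative path (no leading "/" and no scheme) is
skipped and its output bytes are not collected: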

{code}
public static boolean isHDFSFileOrLocalOrS3N(String uri) {
    if (uri == null) {
        return false;
    }
    if (uri.startsWith("/") || uri.matches("[A-Za-z]:.*") || uri.startsWith("hdfs:")
            || uri.startsWith("viewfs:") || uri.startsWith("file:") || uri.startsWith("s3n:")) {
        return true;
    }
    return false;
}
{code}
It would be better to replace this check with UriUtil.hasFileSystemImpl, which will automatically
filter out locations such as hbase://. That still does not cover storers like HCatStorer, whose
location has no scheme. I will also write a default stats reader that checks for known
StoreFuncInterface implementations that are not file based, such as HCatStorer; more standard
ones can be added later. AccumuloStorage should not be a problem, as its location uses the
accumulo:// scheme.
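
To make the proposal concrete, here is a minimal sketch, not the actual patch. It compares the existing prefix check with a scheme-based check plus a list of storers known not to be file based. The class name, the supportsOutputSize helper, and both sets are hypothetical. The scheme set only stands in for whatever UriUtil.hasFileSystemImpl would look up from the Hadoop configuration, whose real signature is not reproduced here. Windows drive-letter paths are ignored for brevity.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class OutputSizeCheckSketch {

    // ASSUMPTION: stand-in for "schemes that have a FileSystem implementation configured".
    static final Set<String> FILE_SYSTEM_SCHEMES =
            new HashSet<>(Arrays.asList("hdfs", "viewfs", "file", "s3n"));

    // ASSUMPTION: illustrative list of storers whose output is not file based.
    static final Set<String> NON_FILE_STORERS =
            new HashSet<>(Arrays.asList("org.apache.hive.hcatalog.pig.HCatStorer"));

    // The prefix check quoted in the description, condensed to a single return.
    static boolean isHDFSFileOrLocalOrS3N(String uri) {
        if (uri == null) {
            return false;
        }
        return uri.startsWith("/") || uri.matches("[A-Za-z]:.*") || uri.startsWith("hdfs:")
                || uri.startsWith("viewfs:") || uri.startsWith("file:") || uri.startsWith("s3n:");
    }

    // Hypothetical replacement: reject known non-file storers first, then decide by scheme.
    static boolean supportsOutputSize(String storeFuncClass, String location) {
        if (location == null || NON_FILE_STORERS.contains(storeFuncClass)) {
            return false;
        }
        String scheme;
        try {
            scheme = new URI(location).getScheme();
        } catch (URISyntaxException e) {
            return false;
        }
        // No scheme means the path resolves against the default file system,
        // which covers both absolute and relative paths.
        return scheme == null || FILE_SYSTEM_SCHEMES.contains(scheme);
    }

    public static void main(String[] args) {
        String[][] cases = {
            {"org.apache.pig.builtin.PigStorage", "output/part-00000"},
            {"org.apache.pig.builtin.PigStorage", "hdfs://nn/user/pig/out"},
            {"org.apache.pig.backend.hadoop.hbase.HBaseStorage", "hbase://mytable"},
            {"org.apache.hive.hcatalog.pig.HCatStorer", "default.mytable"},
        };
        for (String[] c : cases) {
            System.out.println(c[1] + " prefixCheck=" + isHDFSFileOrLocalOrS3N(c[1])
                    + " sketch=" + supportsOutputSize(c[0], c[1]));
        }
    }
}
```

In this sketch the relative path fails the prefix check but passes the scheme-based one, while hbase://mytable is rejected by both. A scheme-less HCatStorer location would look like a relative path to the scheme test alone, which is why the storer list is consulted first.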


> PigStats output bytes written not collected for relative paths
> --------------------------------------------------------------
>
>                 Key: PIG-3796
>                 URL: https://issues.apache.org/jira/browse/PIG-3796
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy

--
This message was sent by Atlassian JIRA
(v6.2#6252)
