hadoop-mapreduce-issues mailing list archives

From "Harsh J (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-5121) Problem with field separator in FieldSelectionHelper
Date Thu, 19 Mar 2015 06:50:39 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-5121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Harsh J updated MAPREDUCE-5121:
-------------------------------
    Affects Version/s:     (was: 1.0.4)
                       2.6.0
               Status: Open  (was: Patch Available)

Thanks Kai! The problem sounds legitimate, but I doubt the provided patch would entirely
prevent it. If metacharacters need to be supported in the separator, I'd imagine the XML config
serialisation would be the first thing to break. We'd need to base64-encode the field separator
value to even allow that.
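
As a rough illustration of the base64 idea above, the sketch below encodes a separator before it
is stored as a config value and decodes it when read back. It is not taken from the patch or from
Hadoop; the class and method names are hypothetical, and it only assumes java.util.Base64 (Java 8+).

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of Hadoop: shows how a separator could be
// made safe for a text-based config format by base64-encoding it.
class SeparatorCodec {
    // Encode the raw separator so that control or markup characters
    // never appear literally in the serialised configuration.
    static String encode(String separator) {
        return Base64.getEncoder()
                .encodeToString(separator.getBytes(StandardCharsets.UTF_8));
    }

    // Decode the stored value back into the original separator string.
    static String decode(String stored) {
        return new String(Base64.getDecoder().decode(stored), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String separator = "\u0001";          // a control character as separator
        String stored = encode(separator);    // value that would go into the config
        System.out.println(stored);
        System.out.println(decode(stored).equals(separator));
    }
}
```

The config would then hold only characters from the base64 alphabet, whatever the separator is.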

Could you update the patch with a test case illustrating the issue, if you still plan to work
on it?

> Problem with field separator in FieldSelectionHelper
> ----------------------------------------------------
>
>                 Key: MAPREDUCE-5121
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5121
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Kai Wei
>            Priority: Trivial
>         Attachments: MAPREDUCE-5121.patch
>
>
> I found that org.apache.hadoop.mapreduce.lib.fieldsel.FieldSelectionHelper and the corresponding
> old API, org.apache.hadoop.mapred.lib.FieldSelectionMapReduce, take the user-specified separator
> string as a regular expression in String.split(), but also use it as a plain string in StringBuffer.append().
> This is a problem if the separator string contains a regex metacharacter. I suggest taking the separator
> literally by calling Pattern.quote(separator), or else using a separate property to specify the
> separator that is added to the output.
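
The mismatch described in the report can be reproduced with plain JDK calls. The sketch below is
not the Hadoop code; the class and method names are hypothetical, and "." is used only as an
example of a separator that is also a regex metacharacter.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Hadoop.
class SeparatorDemo {
    // Splits using the separator as a regular expression, as the report describes.
    static String[] splitAsRegex(String record, String separator) {
        return record.split(separator);
    }

    // Splits using the separator literally, as the report suggests.
    static String[] splitLiteral(String record, String separator) {
        return record.split(Pattern.quote(separator));
    }

    // Joins fields with the separator as a plain string (like repeated append() calls).
    static String join(String[] fields, String separator) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(fields[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String record = "a.b.c";
        String separator = ".";
        // As a regex, "." matches every character, so no fields survive the split.
        System.out.println(splitAsRegex(record, separator).length);
        // Quoted, the separator is matched literally and three fields come back.
        System.out.println(splitLiteral(record, separator).length);
        // Joining the literal split with the same separator restores the record.
        System.out.println(join(splitLiteral(record, separator), separator));
    }
}
```

With the quoted pattern, splitting and re-joining use the same literal separator, so the two operations are consistent.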



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
