hadoop-common-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5307) Fix null value handling in StringUtils#arrayToString() and #getStrings()
Date Fri, 13 Mar 2009 14:41:50 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12681732#action_12681732
] 

Enis Soztutar commented on HADOOP-5307:
---------------------------------------

bq. These read as mutually exclusive statements. If escaping "__null__" is backwards-incompatible,
then escaping null is, too. bq. That any escape sequence needs a way to represent itself literally
isn't controversial. Further, the change is to a public utility class: even if we defined
"incompatible" statistically, the frequency of either case is unknown since its domain is
undefined.
Of course the patch is technically backwards-incompatible, what I say is that, the frequency
is expected to be so small that it is negligible.

bq. I'm a little confused about why StringUtils needs its own escape for a null reference.
Converting an array of String with a null reference to a String with a special symbol for
null is odd. It's cleaner to throw from StringUtils if it gets null and require callers to
handle escaping.
passing an array, possibly containing null values seems to me generic enough to be introduced
in StringUtils rather than a custom solution in the context(DBOutputFormat).

bq. HADOOP-4955 probably should not have been committed to 0.19.1. It reads as an improvement,
not a fix to a regression.
The API indicates to use the supplied column names, when in fact it did not, so 4955 is a
bug fix, fixing the expected behavior from the API, rather than an improvement, introducing
a new API. 

bq. What was committed didn't match the patch posted here. There was a conflict in the imports
for StringUtils.
The patch is committed w/o changes to import statements. Committing a patch with minor changes
is not uncommon. However I should have stated that. 
bq. One of the cases added in testArrayToStringToArray (nullArr3) doesn't have a corresponding
test.
Will fix

At this point I think we can continue with the patch in three ways : 
1. Introduce new String[] -> String, and String -> String[] array methods handling possible
null values. 
2. Mark the patch incompatible and commit this one
3. introduce a new API to set only the number of columns(not the names) in the DBConfiguration.
However this does not fix the passing null values in configuration problem, if there is such.


What would you suggest? 


> Fix null value handling in StringUtils#arrayToString() and #getStrings()
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-5307
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5307
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.21.0
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>         Attachments: h5307_v1.patch
>
>
> StringUtils#arrayToString() converts String array to a String of comma separated elements.
If the String array includes null values, these are recovered as "null" (literal) from getStrings()
method, which eventually causes configuration issues. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message