hadoop-common-dev mailing list archives

From "Bo Adler (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-3680) Streaming "slowmatch" documentation
Date Tue, 01 Jul 2008 22:20:44 GMT
Streaming "slowmatch" documentation

                 Key: HADOOP-3680
                 URL: https://issues.apache.org/jira/browse/HADOOP-3680
             Project: Hadoop Core
          Issue Type: Bug
          Components: contrib/streaming
    Affects Versions: 0.17.0
            Reporter: Bo Adler
            Priority: Trivial

The documentation for the Streaming module does not mention the "slowmatch" parameter,
which checks for CDATA sections while looking for XML records.

An important point is that "slowmatch=true" violates the principle of least surprise: the
"begin" and "end" parameters become regular expressions instead of exact strings.  This is
probably a useful feature, but it should definitely be documented, since users will be tempted
to use the XML record reader on not-strictly-XML files, where the "begin" and "end" patterns
may require escaping.
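
To illustrate why escaping matters, here is a minimal sketch of the general difference
between exact-string and regular-expression matching. It uses Python's "re" module purely
for illustration; it is not the Streaming code itself, and the "[record]" marker and the
sample text are hypothetical. Hadoop is written in Java, whose regex syntax treats these
metacharacters similarly, but the exact behavior of "slowmatch" should be checked against
the source.

```python
import re

# Hypothetical input from a not-strictly-XML file with bracketed markers.
text = "header\n[record]\nfirst payload\n[/record]\n"
begin = "[record]"  # hypothetical "begin" marker

# Exact-string semantics: locate the literal marker.
exact_pos = text.find(begin)

# Regex semantics without escaping: "[record]" is a character class
# that matches any single one of the letters r, e, c, o, d.
unescaped = re.search(begin, text)

# Regex semantics with escaping: the marker is matched literally again.
escaped = re.search(re.escape(begin), text)

print(exact_pos, unescaped.start(), escaped.start())
```

In this sketch the unescaped pattern matches the "e" in "header" rather than the marker,
so a record would start in the wrong place; the escaped pattern agrees with the
exact-string search.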

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
