[ https://issues.apache.org/jira/browse/HADOOP-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536013 ]
Raghu Angadi commented on HADOOP-2071:
--------------------------------------
After a little more discussion, it looks like using BufferedInputStream can also get rid of
the problem with seeking back, because we always seek within what we have recently
read. So we would replace seek() with {{ reset(); skip(); }}.
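
A minimal sketch of the {{ reset(); skip(); }} idea, not the actual patch. It assumes a mark() was set before the look-ahead read and that the read stays within the mark's read limit. The class and method names (MarkResetSeekSketch, skipFully, readAheadThenSeekBack) are hypothetical, and a ByteArrayInputStream stands in for the DFS stream.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of Hadoop.
public class MarkResetSeekSketch {

    // InputStream.skip() may skip fewer bytes than requested, so loop until done.
    static void skipFully(InputStream in, long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped <= 0) {
                // No progress from skip(); consume one byte to detect end of stream.
                if (in.read() < 0) {
                    throw new EOFException("stream ended before skip completed");
                }
                skipped = 1;
            }
            n -= skipped;
        }
    }

    // Marks the current position, reads ahead up to lookAhead bytes, then
    // "seeks back" to (mark + offset) with reset() followed by skip().
    // Returns the byte found at that position, or -1 at end of stream.
    static int readAheadThenSeekBack(InputStream raw, int lookAhead, long offset)
            throws IOException {
        // BufferedInputStream supports mark/reset even if the wrapped stream does not.
        BufferedInputStream in = new BufferedInputStream(raw);
        in.mark(lookAhead);              // remember the position before reading ahead
        byte[] buf = new byte[lookAhead];
        in.read(buf, 0, lookAhead);      // look-ahead read, stays within the read limit
        in.reset();                      // back to the marked position
        skipFully(in, offset);           // forward to the target offset
        return in.read();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "<a><rec>x</rec>".getBytes(StandardCharsets.US_ASCII);
        int b = readAheadThenSeekBack(new ByteArrayInputStream(data), 10, 4);
        System.out.println((char) b);
    }
}
```

This only works while the target position lies between the mark and the furthest byte read so far, and while no more than the mark's read limit has been consumed, which matches the "seeking within what we have recently read" condition above.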
> StreamXmlRecordReader throws java.io.IOException: Mark/reset exception in hadoop 0.14
> -------------------------------------------------------------------------------------
>
> Key: HADOOP-2071
> URL: https://issues.apache.org/jira/browse/HADOOP-2071
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/streaming
> Affects Versions: 0.14.3
> Reporter: lohit vijayarenu
> Assignee: lohit vijayarenu
> Attachments: HADOOP-2071-1.patch
>
>
> In Hadoop 0.14, streaming jobs that use -inputreader StreamXmlRecordReader throw
> java.io.IOException: Mark/reset not supported.
> This looks related to HADOOP-2067 (https://issues.apache.org/jira/browse/HADOOP-2067).
> <stack trace>
> Caused by: java.io.IOException: Mark/reset not supported
> at org.apache.hadoop.dfs.DFSClient$DFSInputStream.reset(DFSClient.java:1353)
> at java.io.FilterInputStream.reset(FilterInputStream.java:200)
> at org.apache.hadoop.streaming.StreamXmlRecordReader.fastReadUntilMatch(StreamXmlRecordReader.java:289)
> at org.apache.hadoop.streaming.StreamXmlRecordReader.readUntilMatchBegin(StreamXmlRecordReader.java:118)
> at org.apache.hadoop.streaming.StreamXmlRecordReader.seekNextRecordBoundary(StreamXmlRecordReader.java:111)
> at org.apache.hadoop.streaming.StreamXmlRecordReader.init(StreamXmlRecordReader.java:73)
> at org.apache.hadoop.streaming.StreamXmlRecordReader.<init>(StreamXmlRecordReader.java:63)
> </stack trace>
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.