hadoop-common-issues mailing list archives

From "Peter Bacsko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15822) zstd compressor can fail with a small output buffer
Date Mon, 08 Oct 2018 15:21:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16642000#comment-16642000 ]

Peter Bacsko commented on HADOOP-15822:
---------------------------------------

I reproduced the problem. This is what happens when the sort buffer is set to 2047 MiB.

{noformat}
...
2018-10-08 08:15:04,126 INFO [main] org.apache.hadoop.mapred.MapTask: Spilling map output
2018-10-08 08:15:04,126 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 1267927860; bufend = 2082571562; bufvoid = 2146435072
2018-10-08 08:15:04,126 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 316981960(1267927840); kvend = 91355880(365423520); length = 225626081/134152192
2018-10-08 08:15:04,126 INFO [main] org.apache.hadoop.mapred.MapTask: (EQUATOR) -1997752227 kvi 37170708(148682832)
2018-10-08 08:16:24,712 INFO [SpillThread] org.apache.hadoop.mapred.MapTask: Finished spill 20
2018-10-08 08:16:24,712 INFO [main] org.apache.hadoop.mapred.MapTask: (RESET) equator -1997752227 kv 37170708(148682832) kvi 37170708(148682832)
2018-10-08 08:16:24,713 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output
2018-10-08 08:16:24,713 INFO [main] org.apache.hadoop.mapred.MapTask: (RESET) equator -1997752227 kv 37170708(148682832) kvi 37170708(148682832)
2018-10-08 08:16:24,727 INFO [main] org.apache.hadoop.mapred.Merger: Merging 21 sorted segments
2018-10-08 08:16:24,735 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.zst]
2018-10-08 08:16:24,736 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.zst]
2018-10-08 08:16:24,738 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.zst]
2018-10-08 08:16:24,739 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.zst]
2018-10-08 08:16:24,741 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.zst]
2018-10-08 08:16:24,742 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.zst]
2018-10-08 08:16:24,743 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.zst]
2018-10-08 08:16:24,744 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.zst]
2018-10-08 08:16:24,745 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.zst]
2018-10-08 08:16:24,746 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.zst]
2018-10-08 08:16:24,748 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.zst]
2018-10-08 08:16:24,749 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.zst]
2018-10-08 08:16:24,750 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.zst]
2018-10-08 08:16:24,752 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.zst]
2018-10-08 08:16:24,753 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.zst]
2018-10-08 08:16:24,754 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.zst]
2018-10-08 08:16:24,755 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.zst]
2018-10-08 08:16:24,756 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.zst]
2018-10-08 08:16:24,757 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.zst]
2018-10-08 08:16:24,769 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.zst]
2018-10-08 08:16:24,770 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 21 segments left of total size: 35310116 bytes
2018-10-08 08:16:30,104 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.ArrayIndexOutOfBoundsException
	at java.lang.System.arraycopy(Native Method)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1469)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1365)
	at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
	at org.apache.hadoop.io.WritableUtils.writeVLong(WritableUtils.java:273)
	at org.apache.hadoop.io.WritableUtils.writeVInt(WritableUtils.java:253)
	at org.apache.hadoop.io.Text.write(Text.java:330)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:98)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:82)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1163)
	at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:727)
	at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
	at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:47)
	at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:36)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1726)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
{noformat}

[~jlowe] do you think it's related? Or is it something different, maybe MR-specific?

> zstd compressor can fail with a small output buffer
> ---------------------------------------------------
>
>                 Key: HADOOP-15822
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15822
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.9.0, 3.0.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Major
>         Attachments: HADOOP-15822.001.patch, HADOOP-15822.002.patch
>
>
> TestZStandardCompressorDecompressor fails a couple of tests on my machine with the latest
> zstd library (1.3.5).  Compression can fail to successfully finalize the stream when a small
> output buffer is used, resulting in a "failed to init" error, and decompression with a direct
> buffer can fail with an invalid src size error.
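
For context on the failure mode quoted above: finalizing a compression stream when the caller only supplies a tiny output buffer is the generic drive loop every block compressor must survive, since finish() may need many calls to drain. A minimal sketch using the JDK's java.util.zip.Deflater (an analogy only, not the Hadoop zstd codec; the class and method names below are hypothetical) shows that loop:

```java
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class SmallBufferRoundTrip {

    // Compress `input` while handing the compressor only a `chunk`-byte
    // output buffer per call; after finish(), deflate() must be called
    // repeatedly until finished() reports the stream fully drained.
    static byte[] compress(byte[] input, int chunk) {
        Deflater def = new Deflater();
        def.setInput(input);
        def.finish();
        byte[] out = new byte[0];
        byte[] buf = new byte[chunk];
        while (!def.finished()) {
            int n = def.deflate(buf);               // may produce 0..chunk bytes
            int old = out.length;
            out = Arrays.copyOf(out, old + n);
            System.arraycopy(buf, 0, out, old, n);  // append this chunk
        }
        def.end();
        return out;
    }

    // Inflate back into a buffer of the known original length.
    static byte[] decompress(byte[] packed, int originalLen) throws DataFormatException {
        Inflater inf = new Inflater();
        inf.setInput(packed);
        byte[] out = new byte[originalLen];
        int off = 0;
        while (!inf.finished()) {
            off += inf.inflate(out, off, out.length - off);
        }
        inf.end();
        return out;
    }

    public static void main(String[] args) throws Exception {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 100; i++) sb.append("zstd small output buffer repro ");
        byte[] data = sb.toString().getBytes();
        byte[] packed = compress(data, 1);          // 1-byte output buffer: worst case
        byte[] round = decompress(packed, data.length);
        if (!Arrays.equals(data, round)) throw new AssertionError("round trip failed");
        System.out.println("round trip ok, compressed size = " + packed.length);
    }
}
```

The bug report suggests the zstd codec's equivalent of this finish/drain loop broke when the per-call output buffer was small; the sketch is only the caller-side contract, not the native-codec fix.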



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

