hadoop-pig-dev mailing list archives

From "Utkarsh Srivastava (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-36) FindBugs: Method ignores results of InputStream.skip()
Date Wed, 28 Nov 2007 04:42:43 GMT

    [ https://issues.apache.org/jira/browse/PIG-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12546105 ]

Utkarsh Srivastava commented on PIG-36:

RandomSampleLoader is only meant to load a few random rows of the input, so it does not care
whether skip() actually skipped the amount requested. In fact, we want every disk seek to
yield a sample (the bigger our sample size, the better our quantile accuracy). So if we call
skip() and it skips less than the requested amount, we still want to take a sample from the
current position before moving on.
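
A rough sketch of that behaviour, not the actual RandomSampleLoader code: the class name BestEffortSampler, the line-per-row format, and the step that drops the partial row after a skip are all assumptions made for illustration. The point is only that the return value of skip() is deliberately ignored and a row is read from wherever the stream ends up.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sampler: one newline-terminated row per sample. */
public final class BestEffortSampler {
    private BestEffortSampler() {}

    public static List<String> sample(InputStream in, long skipInterval, int maxSamples)
            throws IOException {
        List<String> samples = new ArrayList<>();
        while (samples.size() < maxSamples) {
            String row = readLine(in);
            if (row == null) {
                break; // end of input
            }
            samples.add(row);
            // Best-effort skip: a short skip is fine, we sample from wherever we land.
            in.skip(skipInterval);
            // Assumption: we may have landed mid-row, so drop the rest of that row.
            readLine(in);
        }
        return samples;
    }

    /** Reads bytes up to the next '\n'; returns null if the stream is already at EOF. */
    private static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        boolean sawAny = false;
        int b;
        while ((b = in.read()) != -1) {
            sawAny = true;
            if (b == '\n') {
                break;
            }
            buf.write(b);
        }
        return sawAny ? new String(buf.toByteArray(), StandardCharsets.UTF_8) : null;
    }
}
```

A skip that falls short only changes which row gets sampled next, not whether a sample is taken, which is why the result is not checked here.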

> FindBugs: Method ignores results of InputStream.skip()
> ------------------------------------------------------
>                 Key: PIG-36
>                 URL: https://issues.apache.org/jira/browse/PIG-36
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Patrick Hunt
> InputStreams don't always skip as much as they are asked to skip, need to do this in
> a loop:
> 		if (toSkip > 0)
> 			in.skip(toSkip);
> 		return t;
> Severity and Description: M B RR: org.apache.pig.impl.builtin.RandomSampleLoader.getNext() ignores result of org.apache.pig.impl.io.BufferedPositionedInputStream.skip(long)
> Path: pig-apache/src/org/apache/pig/impl/builtin
> Resource: RandomSampleLoader.java
> Location: line 49
> Creation Time: 1196213971062
> Id: 22891
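
For callers that do need the full skip, the loop the report asks for could look roughly like the following. This is a generic sketch against java.io.InputStream, not a patch for RandomSampleLoader; the class and method names (SkipFully, skipFully) are made up for illustration. Because skip() may return 0 without being at end of stream, the sketch falls back to reading a single byte to tell a short skip from EOF.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper: keeps calling skip() until n bytes are skipped or EOF is hit. */
public final class SkipFully {
    private SkipFully() {}

    /**
     * Returns the number of bytes actually skipped. This is less than n
     * only if the end of the stream was reached first.
     */
    public static long skipFully(InputStream in, long n) throws IOException {
        long remaining = n;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
                continue;
            }
            // skip() made no progress: read one byte to distinguish EOF from a short skip.
            if (in.read() == -1) {
                break; // end of stream
            }
            remaining--;
        }
        return n - remaining;
    }
}
```

In the snippet quoted above, the call would become something like `SkipFully.skipFully(in, toSkip)` in place of `in.skip(toSkip)`, if an exact skip were actually wanted there.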

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
