lucene-dev mailing list archives

From "Doron Cohen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-2980) Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text)
Date Tue, 22 Mar 2011 22:20:05 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009869#comment-13009869 ]

Doron Cohen commented on LUCENE-2980:
-------------------------------------

Thanks Shai!

I fixed the super class and the assert as suggested.

The nocommits stand for a larger problem. I expected a trivial fix for this bug: just
lower-case the extension in ContentSource before consulting the map. However, the test
failed, and I found that this is because the input stream returned by
CompressorStreamFactory.createCompressorInputStream() does not close its underlying stream
when it is exhausted or when its close() method is called.
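
A minimal sketch of the trivial fix described above, assuming a hypothetical extension-to-type
map. The names ExtensionLookup, FileType and detect, and the set of extensions, are
illustrative only and are not taken from the patch:

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-in for the extension lookup in ContentSource. */
class ExtensionLookup {
    enum FileType { GZIP, BZIP2, TEXT }

    // Assumed mapping; keys are stored in lower case, including the dot.
    private static final Map<String, FileType> TYPES = new HashMap<>();
    static {
        TYPES.put(".gz", FileType.GZIP);
        TYPES.put(".bz2", FileType.BZIP2);
    }

    /** Detects the file type from the suffix, ignoring letter case. */
    static FileType detect(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return FileType.TEXT; // no extension: treat as plain text
        }
        // Lower-case the extension before consulting the map.
        // Locale.ROOT keeps the result independent of the default locale.
        String ext = fileName.substring(dot).toLowerCase(Locale.ROOT);
        FileType type = TYPES.get(ext);
        return type == null ? FileType.TEXT : type;
    }
}
```

With this lookup, detect("file.gz") and detect("file.GZ") both resolve to GZIP, and unknown
suffixes fall back to TEXT.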


I opened COMPRESS-127 for this.

As a workaround, ContentSource now returns a wrapper around the input stream created
by the CsFactory. The wrapper delegates all methods to that stream, and its close() additionally
closes the underlying stream. This fix is required for the extension letter-case tests to pass,
but it also fixes a more serious problem: leaking file handles in ContentSource.
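
The wrapper idea can be sketched with plain JDK classes. This is not the code from the patch:
ClosingWrapperStream, TrackingStream, LeakyStream and WrapperDemo are hypothetical names, and
LeakyStream only imitates the behavior reported in COMPRESS-127 rather than using Commons
Compress itself:

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical wrapper: delegates reads, closes both streams on close(). */
class ClosingWrapperStream extends FilterInputStream {
    private final InputStream underlying;

    ClosingWrapperStream(InputStream decompressing, InputStream underlying) {
        super(decompressing); // FilterInputStream delegates all reads to this stream
        this.underlying = underlying;
    }

    @Override
    public void close() throws IOException {
        try {
            super.close(); // close the decompressing stream
        } finally {
            underlying.close(); // always release the raw stream as well
        }
    }
}

/** Stand-in for a raw file stream that records whether close() was called. */
class TrackingStream extends ByteArrayInputStream {
    boolean closed = false;

    TrackingStream(byte[] data) {
        super(data);
    }

    @Override
    public void close() throws IOException {
        closed = true;
        super.close();
    }
}

/** Stand-in for a decompressing stream that never closes its source. */
class LeakyStream extends FilterInputStream {
    LeakyStream(InputStream in) {
        super(in);
    }

    @Override
    public void close() {
        // Intentionally does not close 'in', imitating the reported bug.
    }
}

class WrapperDemo {
    /** Returns true if closing the wrapper also closed the raw stream. */
    static boolean closesUnderlying() {
        TrackingStream raw = new TrackingStream(new byte[] {1, 2, 3});
        InputStream wrapped = new ClosingWrapperStream(new LeakyStream(raw), raw);
        try {
            wrapped.read();
            wrapped.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return raw.closed;
    }
}
```

Closing the raw stream in a finally block means the file handle is released even if closing
the decompressing stream throws.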

As Solr also uses Commons Compress, I searched it for references to
CompressorStreamFactory.createCompressorInputStream() but found none, so it seems Solr is
not affected by COMPRESS-127.

> Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text)
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2980
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2980
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: contrib/benchmark
>            Reporter: Doron Cohen
>            Assignee: Doron Cohen
>            Priority: Minor
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-2980.patch
>
>
> file.gz is correctly handled as gzip, but file.GZ is handled as text, which is wrong.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

