commons-issues mailing list archives

From "Sebb (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IO-288) Supply a ReversedLinesFileReader
Date Sat, 12 Nov 2011 02:10:51 GMT

    [ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148939#comment-13148939
] 

Sebb commented on IO-288:
-------------------------

BufferedReader.readLine() accepts LF, CRLF and CR as line terminators; perhaps this class
should too?
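
For reference, a minimal sketch of the readLine() behaviour mentioned above. The class and
method names (LineTerminatorDemo, readAll) are made up for illustration and are not part of
the attached code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of Commons IO.
class LineTerminatorDemo {

    /** Splits text into lines exactly as BufferedReader.readLine() does. */
    static List<String> readAll(String text) {
        List<String> lines = new ArrayList<String>();
        BufferedReader reader = new BufferedReader(new StringReader(text));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
            reader.close();
        } catch (IOException e) {
            // A StringReader does not fail in practice; rethrow unchecked to keep the sketch short.
            throw new IllegalStateException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // One input mixing LF, CRLF and CR terminators.
        System.out.println(readAll("a\nb\r\nc\rd")); // prints [a, b, c, d]
    }
}
```

A reversed reader that is meant to behave like BufferedReader would presumably need to return
the same four lines, in the opposite order, for this input.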

What about multi-byte encodings?
Can these ever have CR or LF as part of a multi-byte character?
Won't the processing fail in such cases?
If so, is that a restriction, or can it be fixed?
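
To make the question concrete, here is a small sketch that checks whether the encoded form
of a string contains a raw 0x0A or 0x0D byte. The class and method names are hypothetical.
It only illustrates two encodings: in UTF-8 every byte of a multi-byte sequence has the high
bit set, so it can never equal CR or LF, whereas in UTF-16 a character such as U+010A is
encoded with a 0x0A byte.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, only to illustrate the multi-byte question.
class NewlineByteCheck {

    /** Returns true if the encoded form of s contains a raw LF (0x0A) or CR (0x0D) byte. */
    static boolean containsNewlineByte(String s, Charset charset) {
        for (byte b : s.getBytes(charset)) {
            if (b == 0x0A || b == 0x0D) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String s = "\u010A"; // LATIN CAPITAL LETTER C WITH DOT ABOVE, not a line terminator

        // UTF-8 encodes it as 0xC4 0x8A: no CR/LF byte.
        System.out.println(containsNewlineByte(s, StandardCharsets.UTF_8));    // prints false

        // UTF-16BE encodes it as 0x01 0x0A: the second byte looks like LF.
        System.out.println(containsNewlineByte(s, StandardCharsets.UTF_16BE)); // prints true
    }
}
```

So a byte-level search for 0x0A would be safe for UTF-8 but would split this character in
UTF-16. Other multi-byte encodings would need to be checked individually.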

I'm wondering whether it would be possible to use BufferedReader.readLine() to scan forward
through the buffer, saving up lines as it goes, and then return the lines in reverse order.
This would use a bit more memory, but would re-use the readLine processing which is well-tested.
It can also allow for the encoding, though there is still a potential issue where a buffer
happens to start in the middle of a multi-byte character. It might be possible to check the
first few bytes of the buffer and adjust the start offset if necessary.
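
A rough sketch of that idea, assuming UTF-8 and a block that has already been read into a
byte array. All names here (ReverseBlockSketch, alignToUtf8Start, linesInReverse) are
invented for illustration; this is not the attached implementation. Note that the first
line returned for a block may be only the tail of a line that starts in the previous block,
which a real implementation would have to stitch together.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch, UTF-8 only.
class ReverseBlockSketch {

    /** Skips UTF-8 continuation bytes (10xxxxxx) so decoding starts on a character boundary. */
    static int alignToUtf8Start(byte[] block, int offset) {
        while (offset < block.length && (block[offset] & 0xC0) == 0x80) {
            offset++;
        }
        return offset;
    }

    /** Scans the block forward with readLine(), then returns the lines last-to-first. */
    static List<String> linesInReverse(byte[] block, int offset) {
        int start = alignToUtf8Start(block, offset);
        Deque<String> stack = new ArrayDeque<String>();
        try {
            BufferedReader reader = new BufferedReader(new InputStreamReader(
                    new ByteArrayInputStream(block, start, block.length - start),
                    StandardCharsets.UTF_8));
            String line;
            while ((line = reader.readLine()) != null) {
                stack.push(line); // push puts the newest line at the head
            }
            reader.close();
        } catch (IOException e) {
            // An in-memory stream does not fail in practice; rethrow unchecked for brevity.
            throw new IllegalStateException(e);
        }
        return new ArrayList<String>(stack); // iterates from the head: last line first
    }

    public static void main(String[] args) {
        // "\u00e9" is two bytes in UTF-8; offset 1 lands in the middle of it.
        byte[] block = "\u00e91\r\nline2\rline3\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(linesInReverse(block, 1)); // prints [line3, line2, 1]
    }
}
```

The extra memory is bounded by the block size, since only the lines of one block are held
at a time.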

We don't use @author tags in code; contributors are credited on the website instead.
Also, source files should carry the Apache License (AL) header, please.
                
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader.zip
>
>
> I needed to analyse a log file today and I was looking for a ReversedLinesFileReader:
> a class that behaves exactly like BufferedReader except that it goes from bottom to top
> when readLine() is called. I didn't find it in IOUtils and the internet didn't help a lot
> either, e.g. http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is fairly
> inefficient - the log files I'm analysing are huge and it is not a good idea to load the
> whole content into memory.
> So I ended up writing an implementation myself using little memory and the class
> RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader =
>     new ReversedLinesFileReader(myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while ((line = reversedLinesFileReader.readLine()) != null) {
>   ... // use the line
>   if (enoughLinesSeen) {
>     break;
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

