lucene-java-user mailing list archives

From Chris Lamprecht <clampre...@gmail.com>
Subject Re: Zip Files
Date Tue, 01 Mar 2005 19:44:13 GMT
Luke,

Look at the javadocs for java.io.ByteArrayInputStream - it wraps a
byte array and makes it accessible as an InputStream.  Pass that to
java.util.zip.ZipInputStream (java.util.zip.ZipFile only opens files
on disk) and you should be able to read and parse all contents of the
zip file in memory.

http://java.sun.com/j2se/1.4.2/docs/api/java/io/ByteArrayInputStream.html
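
For what it's worth, here is a minimal sketch of that idea, assuming the
zip is already held in memory as a byte array. The class name
InMemoryZipReader and the method readEntries are made up for illustration,
and the code uses newer Java syntax (generics, try-with-resources) than the
1.4.2 docs linked above. Each entry's bytes could then be handed to
whatever parser fits its file type.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper: reads a zip archive entirely from memory.
class InMemoryZipReader {

    /** Returns entry name -> raw bytes for every file entry in the archive. */
    static Map<String, byte[]> readEntries(byte[] zipBytes) throws IOException {
        Map<String, byte[]> contents = new LinkedHashMap<>();
        // Wrap the byte array so ZipInputStream can treat it like any other stream.
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directories carry no content to parse
                }
                // read() returns -1 at the end of the current entry, not the whole zip.
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[4096];
                int n;
                while ((n = zis.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                contents.put(entry.getName(), out.toByteArray());
            }
        }
        return contents;
    }
}
```

Nothing touches the disk here, so there is no temp directory to clean up.
The trade-off is that the whole archive and its unpacked entries sit in
memory at once.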


On Tue, 1 Mar 2005 12:39:17 -0500, Luke Shannon
<lshannon@futurebrand.com> wrote:
> Thanks Ernesto.
> 
> I'm struggling with how I can work with an array of bytes instead of a
> Java File.
> 
> It would be easier to unzip the zip to a temp directory, parse the files and
> then delete the directory. But this would greatly slow indexing and use up
> disk space.
> 
> Luke
> 
> ----- Original Message -----
> From: "Ernesto De Santis" <ernesto.desantis@colaborativa.net>
> To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> Sent: Tuesday, March 01, 2005 10:48 AM
> Subject: Re: Zip Files
> 
> > Hello
> >
> > First, you need a parser for each file type: PDF, txt, Word, etc.
> > Then use a Java API to iterate over the zip content, see:
> >
> > http://java.sun.com/j2se/1.4.2/docs/api/java/util/zip/ZipInputStream.html
> >
> > Use the getNextEntry() method.
> >
> > A little example:
> >
> > ZipInputStream zis = new ZipInputStream(fileInputStream);
> > ZipEntry zipEntry;
> > while ((zipEntry = zis.getNextEntry()) != null) {
> >     // use zipEntry to get the name, etc.
> >     // pick the proper parser for the current entry
> >     // run that parser on zis (the ZipInputStream)
> > }
> >
> > good luck
> > Ernesto
> >
> > Luke Shannon wrote:
> >
> > >Hello;
> > >
> > >Does anyone have any ideas on how to index the contents within zip files?
> > >
> > >Thanks,
> > >
> > >Luke
> > >
> > >
> > >---------------------------------------------------------------------
> > >To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > >For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> > >
> > >
> > >
> > >
> >
> > --
> > Ernesto De Santis - Colaborativa.net
> > Córdoba 1147 Piso 6 Oficinas 3 y 4
> > (S2000AWO) Rosario, SF, Argentina.
> >
> >
> >
> >
> 
> 
>


