lucene-dev mailing list archives

From "Divya S. Jesuraj" <>
Subject RE: Powerpoint search using Lucene
Date Thu, 29 Jul 2004 12:49:31 GMT
The second link does things a bit differently than one would expect.

It creates multiple files ("1.txt", "2.txt", and so on), extracts the text but
keeps it only in "1.txt", and doesn't save the name of the original PowerPoint
file, so a search hit can't link back to it.

What would be ideal is to extract the PowerPoint text into an object
{String?} and create a Lucene Doc from it that gets added to the index...

I have been playing with the idea of using the code by Mr. Koundinya and
somehow storing those contents in a String object, which would then get added
as "content" to the Lucene Doc. The file name (.ppt) and path would get added
too... will let you folks know how it goes...
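
The plan above can be sketched in plain Java. This is only an illustration
under stated assumptions, not Lucene or POI code: the class PptIndexSketch and
the methods toFields and search are hypothetical names, the map stands in for
a Lucene Document, and the "filename" and "path" keys are assumed field names
(only "content" is named in this thread). The extracted text is taken as an
already-built String, since the extraction code itself is not shown here.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Plain-Java stand-in for the idea in this thread; this is NOT the Lucene API.
final class PptIndexSketch {

    // Builds the fields for one .ppt file. The extracted text arrives as a
    // String from whatever PowerPoint extraction code is used (assumed).
    static Map<String, String> toFields(File ppt, String extractedText) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("content", extractedText);  // the searchable slide text
        fields.put("filename", ppt.getName()); // kept so a hit can name the file
        fields.put("path", ppt.getPath());     // kept so a hit can link to it
        return fields;
    }

    // Naive case-insensitive substring match on "content".
    // Returns the stored path of every matching document.
    static List<String> search(List<Map<String, String>> index, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Map<String, String> doc : index) {
            if (doc.get("content").toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(doc.get("path"));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Map<String, String>> index = new ArrayList<>();
        // Placeholder text; real text would come from the extraction step.
        index.add(toFields(new File("slides", "q3.ppt"), "Quarterly results and roadmap"));
        // Prints a one-element list holding the stored path of q3.ppt.
        System.out.println(search(index, "roadmap"));
    }
}
```

The point of the sketch is that one entry is built per .ppt file and the path
travels with the text, which is what the "1.txt" approach loses. With Lucene
itself, the map would be a Document handed to an IndexWriter.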


-----Original Message-----
From: Stephane James Vaucher [] 
Sent: Wednesday, July 28, 2004 11:41 PM
To: Lucene Developers List
Subject: Re: Powerpoint search using Lucene

I haven't, I've found a few links though...

I just saw this on the POI list. I can't confirm whether it works or not (if
you try it, can you tell us?)

This is a reference to some code that I found works on some ppts:


On Wed, 28 Jul 2004, Divya S. Jesuraj wrote:

> Hello,
> I am a VERY new Java Programmer and have now been thrust into development
> using Lucene. I was able to figure out parsing/indexing of MS Word, MS
> Excel, RTF, text files, and PDFs with a lot of reading and using POI & PDF
> Sandbox. However, I haven't been able to do anything with PPTs [or HTML
> files; that is the least of my worries]...
> I am indexing a directory on my machine and have a user interface with a
> JSP. Has anyone figured out how to get a Powerpoint search to work? I
> searched the forums but I can't find anything that would help my
> Some sample code would be appreciated.
> Thank you.
> ~Divya Jesuraj
> Technical Summer Intern 2004
> MITRE Corporation
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
