From: "Divya S.
Jesuraj"
To: "'Lucene Developers List'"
Subject: RE: Powerpoint search using Lucene
Date: Thu, 29 Jul 2004 08:49:31 -0400
Organization: The MITRE Corporation

The second link does things a bit differently than one would expect. It
creates multiple files ("1.txt", "2.txt", and so on), extracts the text and
keeps it only in "1.txt", and doesn't save the name of the initial PowerPoint
file, so it can't link back to it when you search for it.

What would be ideal is to extract the PowerPoint text into an object
{String?} and create a Lucene Doc that would add it to the index...

I have been playing with the idea of using the code by Mr. Koundinya and
somehow storing those contents in a String object, which then gets added as
"content" to the Lucene Doc. The file name (.ppt) and path would get added
too... will let you folks know how it goes...

~Divya

-----Original Message-----
From: Stephane James Vaucher [mailto:vauchers@cirano.qc.ca]
Sent: Wednesday, July 28, 2004 11:41 PM
To: Lucene Developers List
Subject: Re: Powerpoint search using Lucene

I haven't, I've found a few links though...

I just saw this on the poi list. I can't confirm if it works or not (if you
try it, can you tell us)
http://www.mail-archive.com/poi-user@jakarta.apache.org/msg04782.html

This is a reference to some code that I found works on some ppts:
http://nagoya.apache.org/eyebrowse/ReadMsg?listName=poi-dev@jakarta.apache.org&msgNo=4326

sv

On Wed, 28 Jul 2004, Divya S. Jesuraj wrote:

> Hello,
>
> I am a VERY new Java Programmer and have now been thrust into development
> using Lucene.
> I was able to figure out parsing/indexing of MS Word, MS
> Excel, RTF, Text files, and PDFs with a lot of reading and using Poi & PDF
> Sandbox. I however haven't been able to do anything with PPTs [or htmls -
> that is the least of my worries]...
>
> I am indexing a directory on my machine and have a user interface with a
> JSP. Has anyone figured out how to get a Powerpoint search to work? I
> searched the forums but I can't find anything that would help my situation.
> Some sample code would be appreciated.
>
> Thank you.
>
> ~Divya Jesuraj
> Technical Summer Intern 2004
> MITRE Corporation
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-dev-help@jakarta.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
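
The approach described in the reply at the top of this message (pull the slide
text into a String, then index it together with the file name and path so a
hit can point back to the original .ppt) might look roughly like the sketch
below. It is only an illustration in plain Java: PptIndexSketch,
SlideTextExtractor, addFile and search are hypothetical names, not Lucene or
POI classes, and the tiny in-memory index merely stands in for a Lucene index.
It also assumes '/' as the path separator.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Plain-Java stand-in for the idea in the reply: one record per .ppt file,
// holding the extracted text plus the file name and path.
// None of these types are Lucene or POI classes; all names are hypothetical.
class PptIndexSketch {

    /** Hypothetical extractor: returns all slide text of a file as one String. */
    interface SlideTextExtractor {
        String extract(String path);
    }

    // Every indexed record, as field name -> value ("content", "filename", "path").
    private final List<Map<String, String>> docs = new ArrayList<>();
    // Inverted index: lower-cased word -> positions in docs that contain it.
    private final Map<String, List<Integer>> postings = new HashMap<>();

    /** Extracts the text into a String and stores it with the name and path. */
    void addFile(String path, SlideTextExtractor extractor) {
        String content = extractor.extract(path);
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("content", content);
        // Assumes '/' separators; the whole path is used if there is none.
        doc.put("filename", path.substring(path.lastIndexOf('/') + 1));
        doc.put("path", path);
        int id = docs.size();
        docs.add(doc);
        for (String word : content.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (word.isEmpty()) {
                continue;
            }
            List<Integer> ids = postings.computeIfAbsent(word, k -> new ArrayList<>());
            // Record each document at most once per word.
            if (ids.isEmpty() || ids.get(ids.size() - 1) != id) {
                ids.add(id);
            }
        }
    }

    /** Returns the paths of the original files whose text contains the word. */
    List<String> search(String word) {
        List<String> paths = new ArrayList<>();
        List<Integer> ids = postings.getOrDefault(
                word.toLowerCase(Locale.ROOT), Collections.<Integer>emptyList());
        for (int id : ids) {
            paths.add(docs.get(id).get("path"));
        }
        return paths;
    }
}
```

In a real setup the extractor would presumably wrap whatever PowerPoint
parsing code ends up working, and the map of fields would become a Lucene
document; the point of the sketch is only that keeping "filename" and "path"
next to "content" lets a search result link back to the source file, which is
what the "1.txt" approach loses.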