From: Gong Li
Date: Sat, 19 Feb 2011 02:29:59 +0800
Subject: Lucene: If I have a picture, table, or something else in the PDF
To: java-user@lucene.apache.org

Hi,

I am developing a local PDF search engine using the PDFBox and Lucene APIs. I need to show the user the PDF page that contains the keywords (highlighting them would be great) and sort the results by relevance (Lucene's default). How can I do this?

Also, if a PDF page contains pictures, how can that page be displayed to the user after I index and search only the extracted text?

Thanks