From: Otis Gospodnetic
To: solr-user@lucene.apache.org
Subject: Re: Retrieving Documents
Date: Fri, 16 Dec 2011 20:51:38 -0800 (PST)

Hi Dan,

1) Are you looking for http://wiki.apache.org/solr/HighlightingParameters#hl.fragsize ?

2) Hundreds of words in a field should not be a problem for highlighting. But it sounds like this long field may contain content that corresponds to N different pages in a publication, and you would like to tell the searcher which page the match was on, not just that a match was somewhere in that big piece of text. One way to deal with that is to break your document into N smaller documents - one document for each page.

Otis
----

Performance Monitoring SaaS for Solr - http://sematext.com/spm/solr-performance-monitoring/index.html


>________________________________
> From: Dan McGinn-Combs
>To: solr-user@lucene.apache.org
>Sent: Friday, December 16, 2011 4:33 PM
>Subject: Retrieving Documents
>
>I've been doing a fair amount of reading and experimenting with Solr
>lately. I find that it does a good job of indexing very structured
>documents. However, the application I have in mind is built around
>long EPUB documents.
>
>Of course, I found the Extract components useful for indexing the
>EPUBs. However, I would like to be able to
>
>* Size the "highlight" portion of text around the query parameters
>(i.e. show 20 or 30 words) and
>
>* Retrieve a location within the document so I can display that "page"
>from the EPUB.
>
>What is common practice for these? I notice that if I have a list of
>(short) text segments in fields, they are stored without too much fuss
>and are retrievable. However, I'm talking about a field of potentially
>hundreds of words.
>
>Thanks for any pointers,
>Dan
>
>--
>Dan McGinn-Combs
>dgcombs@gmail.com
>Peachtree City, Georgia USA
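
As a rough illustration of the two suggestions in the reply (a fragment size for highlighting, and one Solr document per page), here is a minimal Python sketch. The base URL, the field names `text`, `book_id` and `page`, the ID scheme, and the helper names are all assumptions made for illustration; only the `hl`, `hl.fl`, `hl.fragsize` and `hl.snippets` request parameters come from Solr's highlighting documentation. Note that `hl.fragsize` is measured in characters, so "20 or 30 words" is approximated here as roughly 200 characters.

```python
from urllib.parse import urlencode


def highlight_params(query, field="text", fragsize=200, snippets=1):
    """Build request parameters for a highlighted Solr query.

    `field` is a hypothetical field name; hl.fragsize is in characters.
    """
    return {
        "q": query,
        "hl": "true",             # turn highlighting on
        "hl.fl": field,           # field(s) to highlight
        "hl.fragsize": fragsize,  # approximate snippet size, in characters
        "hl.snippets": snippets,  # max snippets per field
    }


def select_url(base_url, params):
    """Join an assumed Solr base URL with the encoded query string."""
    return base_url.rstrip("/") + "/select?" + urlencode(params)


def page_documents(book_id, pages):
    """Turn one long publication into one document per page.

    `pages` is a list of page texts, already extracted from the EPUB.
    The id scheme and field names are illustrative, not a Solr convention.
    """
    docs = []
    for number, text in enumerate(pages, start=1):
        docs.append({
            "id": f"{book_id}-p{number}",  # unique per page
            "book_id": book_id,            # lets results be grouped by book
            "page": number,                # location to show the searcher
            "text": text,
        })
    return docs


if __name__ == "__main__":
    url = select_url("http://localhost:8983/solr",
                     highlight_params("text:whale"))
    print(url)
    for doc in page_documents("moby-dick", ["Call me Ishmael.", "Some years ago"]):
        print(doc["id"], doc["page"])
```

With per-page documents, each hit carries its own `page` value, so the application can open the EPUB at that location rather than only knowing that the match is somewhere in the book.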