From: Grant Ingersoll
To: solr-dev@lucene.apache.org
Subject: Re: structure preservation for solr
Date: Sun, 9 Aug 2009 07:17:13 -0400
Message-Id: <9DEDAA93-2829-4BF2-B02A-AD4582602134@apache.org>
In-Reply-To: <72218f020908090046o25bb5573s6ee6924d316489da@mail.gmail.com>

On Aug 9, 2009, at 3:46 AM, swamynathan wrote:

> Hi,
> I'm Swamynathan, a computer science engineering student at Jaya Engg
> College, which is under Anna University, Chennai, India.
> As part of my curriculum in the final year, I need to do a project.
> I spoke with some Solr users and programmers and found out that all
> content that is indexed is stored as plain text and the structure is
> not preserved (headings, bold, and underlined text all have the same
> preference).
> I was thinking I could make some modification in the parser, or write
> code that would hook in and add preference to headings, bold, italics,
> etc., for my final year project.

Can you expand on this a little bit? What would this structure bring to Solr? Is it for search? Or just for storage? Lucene does have the capability to store raw bytes, but Solr doesn't really have a mechanism to take advantage of this (except maybe Solr Cell).