lucene-general mailing list archives

From Prakash Dubey <prakashdube...@gmail.com>
Subject Re: Indexing Text File By Sections In Lucene
Date Thu, 04 Sep 2014 06:50:53 GMT
Hello Sunil,

You can use XML markup to delimit the sections of the text file, and then
store each section in its own *Field* when you build the Lucene document.
The content is then indexed section by section, and you can query whichever
section you need.
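
As a rough illustration of the splitting step, here is a minimal sketch in
plain Java (no Lucene dependency). It assumes, purely for the example, that
sections in the Tika output start with a numbered heading line such as
"1. Introduction"; the SectionSplitter class, the Section type and the
HEADING pattern are hypothetical names, and the heading rule would need to be
adapted to the real file (or replaced by XML parsing if you add markup).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class SectionSplitter {
    // Assumed heading convention (hypothetical): "1. Introduction", "2.3 Methods".
    // This is a heuristic; any line starting with a number and a space also matches.
    static final Pattern HEADING = Pattern.compile("^\\d+(\\.\\d+)*\\.?\\s+\\S.*$");

    // One section of the extracted text: its heading line and the text under it.
    static final class Section {
        final String title;
        final String body;

        Section(String title, String body) {
            this.title = title;
            this.body = body;
        }
    }

    // Splits the text into sections at every line that looks like a heading.
    static List<Section> split(String text) {
        List<Section> sections = new ArrayList<>();
        String title = ""; // text before the first heading gets an empty title
        StringBuilder body = new StringBuilder();
        for (String line : text.split("\\r?\\n")) {
            if (HEADING.matcher(line.trim()).matches()) {
                flush(sections, title, body);
                title = line.trim();
                body.setLength(0);
            } else {
                body.append(line).append('\n');
            }
        }
        flush(sections, title, body);
        return sections;
    }

    // Stores the section collected so far, skipping a completely empty one.
    private static void flush(List<Section> out, String title, StringBuilder body) {
        String trimmed = body.toString().trim();
        if (!title.isEmpty() || !trimmed.isEmpty()) {
            out.add(new Section(title, trimmed));
        }
    }

    public static void main(String[] args) {
        String text = "preface\n1. Introduction\nLucene indexes documents.\n"
                + "2. Setup\nInstall Tika.\n";
        for (Section s : split(text)) {
            System.out.println("[" + s.title + "] " + s.body);
        }
    }
}
```

Each Section could then be added to the index as its own Lucene Document, for
example with the title in one Field and the body in another, so that a query
can match a single section instead of the whole file.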

Hope this helps.

Thank you
Prakash Kumar Dubey


On Thu, Sep 4, 2014 at 11:39 AM, sunilragidi <sunilragidi@gmail.com> wrote:

> Hi, I have a requirement in which I have to index a text file using Lucene.
>
> The text file data is from a PDF file. I have used Tika to extract text
> from the PDF and put it into the text file.
>
> I want to index the text file in the following way.
>
>     1. I don't want to index the whole text file content.
>     2. I don't want to index sentence by sentence.
>     3. Instead, I want to index the text file by sections. (The text file
>        is huge.)
>
> How can I do this? Any help would be greatly appreciated.
>
> --Sunil
