lucene-java-user mailing list archives

From Maisnam Ns <maisnam...@gmail.com>
Subject Top 10 words
Date Fri, 13 Feb 2015 16:43:22 GMT
Hi,

Can someone help me with this use case:

1. I have to search for a string, and let's say the search engine (it is not
Lucene) finds this string in 100,000 documents. I need to find the top 10
words occurring in these 100,000 documents. As the documents are large, how
can I further index them and find the top 10 words?

1. I am thinking of using Lucene's RAMDirectory, or in-memory indexing, to
find the 10 most frequently occurring words.
2. Is this the right approach? Indexing and writing to disk would be
almost overkill, and a user can search any number of times.
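
To make the use case concrete, here is a rough sketch of the result I am
after, written in plain Java with no Lucene calls at all. It assumes the
matched documents are already available as strings; the class name TopWords,
the method topWords, and the naive tokenizer are only illustrative and not
taken from any library. It counts every word and keeps the k most frequent in
a small min-heap. Rescanning 100,000 large documents like this on every
search is probably too slow, which is why I am asking about indexing.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

public class TopWords {

    // Hypothetical helper: count every word in the matched documents,
    // then return the k most frequent words, most frequent first.
    public static List<Map.Entry<String, Integer>> topWords(Iterable<String> docs, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (String doc : docs) {
            // Naive tokenizer (an assumption): lowercase, then split on
            // anything that is not a letter or a digit.
            for (String token : doc.toLowerCase().split("[^\\p{L}\\p{Nd}]+")) {
                if (!token.isEmpty()) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }

        // Min-heap bounded to k entries: the least frequent of the
        // current top k sits at the head and is evicted first.
        Comparator<Map.Entry<String, Integer>> byCount = Map.Entry.comparingByValue();
        PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>(byCount);
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            heap.offer(entry);
            if (heap.size() > k) {
                heap.poll();
            }
        }

        // Sort the survivors by descending count.
        List<Map.Entry<String, Integer>> result = new ArrayList<>(heap);
        result.sort(byCount.reversed());
        return result;
    }
}
```

Calling TopWords.topWords(docs, 10) would give the top 10 words. Words with
equal counts come back in no particular order, and there is no stop-word
filtering, so common words such as "the" would dominate unless removed.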

Thanks in advance.
