Hi,

I have a Lucene index that contains a collection of documents. For an input string segment (s)
and a specific document in the index (d), I want to get the number of terms that s and d
have in common. After searching the web, I found that I can extract all terms of d using
"IndexReader.getTermVector", and that I can analyze the input segment (s) with the same analyzer
that was used for indexing. With both term sets in hand, I can find the common terms between them.
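
To make the approach concrete, here is a minimal sketch of the intersection step in plain Java. It is not Lucene code: the `analyze` method is a hypothetical stand-in for running the real analyzer (it only lowercases and splits on non-alphanumeric characters), and `docTerms` is assumed to already hold the terms read from the term vector of d.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class CommonTerms {

    // Hypothetical stand-in for a Lucene analyzer (assumption, not the real API):
    // lowercases the text and splits it on runs of non-letter, non-digit characters.
    public static Set<String> analyze(String text) {
        Set<String> terms = new HashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Counts the distinct terms of the segment that also occur in the document.
    // docTerms is assumed to have been loaded from the term vector of d beforehand.
    public static int countCommonTerms(String segment, Set<String> docTerms) {
        int common = 0;
        for (String term : analyze(segment)) {
            if (docTerms.contains(term)) {
                common++;
            }
        }
        return common;
    }

    public static void main(String[] args) {
        // Here the document terms are simulated by analyzing a string.
        Set<String> docTerms = analyze("Lucene is a search library written in Java");
        // Common terms are "a", "java" and "search".
        System.out.println(countCommonTerms("A Java search engine", docTerms)); // prints 3
    }
}
```

With a hash set, the lookup itself is cheap (one probe per distinct term of s), so the cost is dominated by loading the terms of d.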
I think this approach is too expensive, and I'm looking for a faster solution. Specifically, I want
to find the common terms between two documents at minimum cost.

Would you please help me with this task? Thanks.