lucy-user mailing list archives

From "Gupta, Rajiv" <>
Subject [lucy-user] Doc id from hits and remove redundant documents
Date Wed, 23 Nov 2016 14:33:41 GMT

I have 2 questions.

1. Which field do I use to get the document ID from the hits?
  my $hits = $searcher->hits(
      query      => $query_parsed,
      num_wanted => -1, # -1 equivalent to all results
  );
  while (my $hit = $hits->next()) {
      print "Document id: " . $hit->{???};
  }

2. While inserting records, how can I avoid inserting duplicates?
Somehow, in my process, the same file is reopened multiple times, and each time indexing
starts from the beginning of the file. Initially a few documents were added and the file was
closed. Some time later, more content was added to the file and I reopened it; now the same
set of documents is added again along with the new content. Instead, I want only the newly
added content to be indexed. I cannot use truncate, because that would also affect the
documents from other files that are present in the same folder.

Thanks much!

Rajiv Gupta
