lucenenet-user mailing list archives

From Bryan Rojo <BryanR...@elliottelectric.com>
Subject SERIOUS issues with PerFieldAnalyzerWrapper in 4.8
Date Sun, 15 Jul 2018 01:06:07 GMT
Hi,

Not necessarily a bug, but this may be worth noting for people who use
PerFieldAnalyzerWrapper the way I do.

PerFieldAnalyzerWrapper has been "improved" in 4.8 and now uses a PER_FIELD_REUSE_STRATEGY,
which means that the tokenized fields are stored in a dictionary. If you have multiple
fields with the same name in your document, you will only be able to index the very first
one that makes it into that dictionary.
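
To make the behaviour described above concrete, here is a toy model in Python. It is not Lucene.NET code and does not use any Lucene API. The function names and the whitespace tokenizer are hypothetical, and the "first one wins" rule is an assumption taken from the description in this message, not from the library source.

```python
# Toy model only: NOT Lucene.NET code. It mimics the behaviour described
# in this message, where a cache keyed by field name keeps only the first entry.

def index_first_wins(fields):
    """Hypothetical indexer: keeps tokens only for the first field of each name."""
    cache = {}
    for name, text in fields:
        if name not in cache:  # later fields with the same name are skipped
            cache[name] = text.lower().split()
    return cache


def index_all(fields):
    """Expected behaviour: tokens from every same-named field are kept."""
    index = {}
    for name, text in fields:
        index.setdefault(name, []).extend(text.lower().split())
    return index


# A document with two fields named "tag" and one named "title".
doc = [("tag", "red apple"), ("tag", "green pear"), ("title", "Fruit list")]

lost = index_first_wins(doc)
kept = index_all(doc)
```

In this model, `lost["tag"]` holds only the tokens of the first "tag" field, while `kept["tag"]` holds the tokens of both. That difference is the kind of term loss this message warns about.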

The problem is that you can potentially lose thousands of terms from your index,
which could make your search results very low quality.

BEWARE.

