lucene-java-user mailing list archives

From Shivashankar Maddanimath <shivashankar.maddanim...@yahoo.in>
Subject Can we configure analyzers to not exclude specific characters
Date Wed, 28 Jan 2015 07:00:45 GMT
Hi,

I am using Lucene's standard analyzer and UAX29URLEmailTokenizer. These analyzers drop some
characters such as "+" (so I can't search for "C++"). Is there any way to configure an
analyzer to keep specific characters when tokenizing?

Regards,
Shiv
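
To make the request concrete, here is a minimal plain-Java sketch of the behaviour being asked for: a tokenizer that keeps letters and digits plus a caller-supplied set of extra characters such as "+". It does not use Lucene at all. The class name ExtraCharTokenizer and the extraChars parameter are hypothetical, chosen only for illustration. In Lucene the same rule would presumably live in a custom tokenizer (for example a CharTokenizer subclass overriding isTokenChar), which is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

public class ExtraCharTokenizer {

    /**
     * Splits text into lowercase tokens. A token is a run of letters, digits,
     * and any character listed in extraChars; everything else is a separator.
     */
    public static List<String> tokenize(String text, String extraChars) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c) || extraChars.indexOf(c) >= 0) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // Separator reached: close the token collected so far.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "+" and "#" are kept, so "C++" and "C#" survive as tokens.
        System.out.println(tokenize("I can't search C++ or C#.", "+#"));
    }
}
```

The same rule has to be applied at both index time and query time, otherwise a query for "c++" would not match the indexed token.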

-----Original Message-----
From: "Luis A Lastras" <lastrasl@us.ibm.com>
Sent: 25-01-2015 08:05 AM
To: "java-user@lucene.apache.org" <java-user@lucene.apache.org>
Subject: Absolute term position in scoring

Is it possible to incorporate the position of a matching term (say, as measured from the top
of the document) into Lucene's scoring function? The scenario is this: if the documents tend
to talk about the most important material at the beginning, then we would like to give
preference to documents that mention a term close to the top.

Thanks,

Luis
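
As an illustration of the kind of preference described above, here is a minimal plain-Java sketch of a position-based multiplier. It is not Lucene code and says nothing about how Lucene computes scores. The class name PositionBoost, the halfLife parameter, and the decay formula halfLife / (halfLife + firstPosition) are all assumptions made for the example. How such a factor would be combined with the normal relevance score is left open.

```java
public class PositionBoost {

    /**
     * Returns a multiplier in (0, 1] for a document whose first match of the
     * term is at token position firstPosition (0 = top of the document).
     * The multiplier is 1.0 at position 0 and drops to 0.5 at position halfLife.
     */
    public static double boost(int firstPosition, double halfLife) {
        if (firstPosition < 0 || halfLife <= 0) {
            throw new IllegalArgumentException("position must be >= 0 and halfLife > 0");
        }
        return halfLife / (halfLife + firstPosition);
    }

    public static void main(String[] args) {
        // A match at the very top keeps its full weight; later matches are discounted.
        System.out.println(boost(0, 50.0));
        System.out.println(boost(50, 50.0));
        System.out.println(boost(150, 50.0));
    }
}
```

A smaller halfLife favours the top of the document more sharply; a larger one makes the preference gentler.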

Luis A Lastras, Ph.D.
Research Staff Member & Manager, Concept Analytics, IBM Watson
Member of the IBM Academy of Technology
IBM Master Inventor
email: lastrasl@us.ibm.com | Tel: 914-945-3613 | Cell: 914-382-1879
address:  1101 Kitchawan Rd, Office 28-132, Yorktown Heights, NY, 10598