incubator-bloodhound-commits mailing list archives

From "Apache Bloodhound" <bloodhound-...@incubator.apache.org>
Subject Re: [Apache Bloodhound] #389: Strip wiki formatting from the Bloodhound Search results
Date Mon, 18 Feb 2013 09:09:11 GMT
#389: Strip wiki formatting from the Bloodhound Search results
-------------------------+-------------------------------------------------
  Reporter:  andrej      |      Owner:  andrej
      Type:  enhancement |     Status:  assigned
  Priority:  major       |  Milestone:  Release 5
 Component:  search      |    Version:
Resolution:              |   Keywords:  search bep-0004 bhsearch bep-0004-beta
-------------------------+-------------------------------------------------

Comment (by andrej):

 The primary source for indexing is the DB. If we need more data from the
 wiki markup, we can just reindex the DB and add more fields. As an
 alternative, we can store the complete wiki fields (stored, not indexed) but
 index and search the stripped version.

 I suggest we proceed with index-time stripping and change this if we see
 any drawbacks. We can re-index when new features require it. What do you
 think?
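
 To make the proposal concrete, here is a minimal sketch of index-time
 stripping that still keeps the raw wiki text as a stored, non-indexed value.
 It is plain Python rather than the actual bhsearch code: `strip_wiki`,
 `TinyIndex` and the pattern list are hypothetical names for illustration,
 and the patterns cover only a small part of the wiki syntax.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately incomplete set of wiki markup patterns.
_WIKI_PATTERNS = [
    (re.compile(r"\{\{\{|\}\}\}"), " "),                         # code block delimiters
    (re.compile(r"\[\[[^\]]*\]\]"), " "),                        # macros such as [[BR]]
    (re.compile(r"\[\S+\s+([^\]]+)\]"), r"\1"),                  # [target label] -> label
    (re.compile(r"^\s*=+\s*|\s*=+\s*$", re.MULTILINE), " "),     # heading markers
    (re.compile(r"'{2,5}|\|\||`"), " "),                         # bold/italic, table cells
]


def strip_wiki(text):
    """Return text with the markup listed above removed (assumed rules)."""
    for pattern, replacement in _WIKI_PATTERNS:
        text = pattern.sub(replacement, text)
    return " ".join(text.split())


class TinyIndex:
    """Toy index: searches the stripped text, stores the raw wiki text."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids
        self.stored = {}                  # document id -> raw wiki text

    def add(self, doc_id, wiki_text):
        self.stored[doc_id] = wiki_text                 # stored, not indexed
        for term in strip_wiki(wiki_text).lower().split():
            self.postings[term].add(doc_id)             # indexed, stripped

    def search(self, term):
        return sorted(self.postings.get(term.lower(), ()))


index = TinyIndex()
index.add(1, "== Search ==\n'''bold''' text in a ||cell||")
print(index.search("bold"))   # matches the stripped term
print(index.stored[1])        # raw markup is still available
```

 Because the raw text is kept alongside the index, switching to a different
 stripping rule later only needs a re-index, not a change to the DB.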

 Replying to [comment:9 olemis]:
 > Replying to [comment:4 jdreimann]:
 > > Wouldn't this mean that we lose the information provided by wiki
 > > formatting to rank results later? For example, if a word appears styled
 > > as a heading via wiki formatting, it probably has a higher score than a
 > > word that appears in a cell in a table (again via wiki formatting).
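
 One possible way to keep that ranking signal while still stripping markup is
 to put heading text into a field of its own at index time. The sketch below
 only illustrates the idea: `split_fields`, `score` and the boost value of 2.0
 are assumptions, not part of bhsearch, and the scoring is a plain term count
 rather than a real ranking function.

```python
import re

# Assumed heading syntax: a line wrapped in one or more "=" characters.
_HEADING = re.compile(r"^\s*=+\s*(.*?)\s*=+\s*$", re.MULTILINE)


def split_fields(wiki_text):
    """Split wiki text into (heading_terms, body_terms), markup removed."""
    headings = _HEADING.findall(wiki_text)
    body = _HEADING.sub(" ", wiki_text)
    heading_terms = re.findall(r"\w+", " ".join(headings).lower())
    body_terms = re.findall(r"\w+", body.lower())
    return heading_terms, body_terms


def score(term, wiki_text, heading_boost=2.0):
    """Count matches, weighting heading matches by an arbitrary boost."""
    heading_terms, body_terms = split_fields(wiki_text)
    term = term.lower()
    return heading_terms.count(term) * heading_boost + body_terms.count(term)


print(score("search", "= Search =\nsome text"))   # match in a heading
print(score("search", "||search||cell||"))        # match in a table cell
```

 With separate fields, the stripped text stays clean for matching and the
 heading weight can be tuned, or dropped, by re-indexing.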

-- 
Ticket URL: <https://issues.apache.org/bloodhound/ticket/389#comment:10>
Apache Bloodhound <https://issues.apache.org/bloodhound/>
The Apache Bloodhound (incubating) issue tracker
