incubator-bloodhound-dev mailing list archives

From Olemis Lang <ole...@gmail.com>
Subject Re: Need advice: stripping of wiki syntax from Bloodhound Search results
Date Thu, 14 Feb 2013 17:54:30 GMT
On 2/14/13, Matevž Bradač <matevzb@gmail.com> wrote:
> On 14. Feb, 2013, at 15:33, Gary Martin wrote:
>> On 14/02/13 12:18, Andrej Golcov wrote:
>
>>> HI,
>>>

:)

>>> Branko suggested stripping wiki syntax from Bloodhound Search results,
>>> which is IMHO quite a reasonable suggestion. That feature will give us
>>> better search scoring and better highlighting.
>>>
>>> I have one question regarding the implementation of this feature.
>>> As far as I can see, existing formatters (e.g. trac.wiki.formatter.*
>>> classes) provide wiki-to-HTML formatting but not wiki-to-stripped-text
>>> formatting. Did I miss something?
>>>

Well, now I see that the Trac search handler emits formatted text (i.e.
it highlights search keywords using `.searchword#` classes) for this
purpose. I also noticed that we don't highlight those, as we removed
the Trac CSS in the theme plugin.

>>> One possibility that I see is to convert wiki to HTML and
>>> then convert HTML to text. That does not look like the optimal
>>> solution.
>>>
>>> Any alternatives, ideas?

Could you figure out how it does such a thing?

>>
>> I should find out more about how the formatters work! My first thought
>> would be to look at creating a new formatter that strips out syntax but I
>> am not sure how big a job that will be.
>>

It should be similar to the link extraction formatter, but instead of
processing links and ignoring everything else, just process the text and
ignore everything else.
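
A rough sketch of that "process text, ignore everything else" idea, using
plain regular expressions rather than Trac's formatter classes. The rule
table and the `strip_wiki` name are hypothetical and cover only a handful of
wiki constructs; a real formatter would presumably reuse the wiki parser's
own rules instead.

```python
import re

# Hypothetical, simplified rules for a few Trac wiki constructs.
# This is NOT Trac's formatter API, only an illustration of
# "keep the text, drop the markup".
_RULES = [
    # {{{code}}} -> code
    (re.compile(r"\{\{\{(.*?)\}\}\}", re.S), r"\1"),
    # [[Macro(...)]] -> dropped entirely
    (re.compile(r"\[\[[^\]]*\]\]"), ""),
    # [target label] -> label
    (re.compile(r"\[\S+\s+([^\]]+)\]"), r"\1"),
    # ''italic'' and '''bold''' quote runs -> dropped
    (re.compile(r"'{2,5}"), ""),
    # = Heading = -> Heading
    (re.compile(r"^[ \t]*=+[ \t]*(.*?)[ \t]*=+[ \t]*$", re.M), r"\1"),
]


def strip_wiki(text):
    """Return text with the markup matched by _RULES removed."""
    for pattern, replacement in _RULES:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    print(strip_wiki("= Title =\nSome '''bold''' text and [wiki:Page a link]."))
```

The rules run in order, so code blocks and macros are handled before the
looser link and quote patterns get a chance to touch them.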

[...]
>
> Since the formatters return a Markup, perhaps a quick workaround would be
> to use something like Markup's stripentities() and striptags() to get the
> "raw" text back?
>

AFAICT this should work too.
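
To make the workaround concrete, here is a minimal standard-library
stand-in for the "render to HTML, then strip tags and entities" route. It
does not call Genshi; `html_to_text` is a hypothetical helper, and the tag
regex is deliberately naive, so treat it as an approximation of what
Markup's striptags() and stripentities() would give on formatter output.

```python
import re
from html import unescape

# Naive tag matcher: anything between '<' and the next '>'.
_TAG_RE = re.compile(r"<[^>]+>")


def html_to_text(rendered):
    """Drop tags from already-rendered HTML, then decode entities."""
    without_tags = _TAG_RE.sub("", rendered)
    return unescape(without_tags)


if __name__ == "__main__":
    print(html_to_text("<p>Fix <strong>search</strong> &amp; scoring</p>"))
```

Stripping tags before decoding entities matters: an escaped `&lt;b&gt;` in
the rendered output is literal text, and decoding first would make it look
like a tag and remove it.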

-- 
Regards,

Olemis.
