Subject: Re: Need advice: stripping of wiki syntax from Bloodhound Search results
From: Matevž Bradač
Date: Thu, 14 Feb 2013 15:54:35 +0100
To: bloodhound-dev@incubator.apache.org

Since the formatters return a Markup, perhaps a quick workaround would be to use something like Markup's stripentities() and striptags() to get the "raw" text back?

--
matevz

On 14. Feb, 2013, at 15:33, Gary Martin wrote:

> On 14/02/13 12:18, Andrej Golcov wrote:
>> Hi,
>>
>> Branko suggested stripping wiki syntax from Bloodhound Search results,
>> which is IMHO quite a reasonable suggestion. That feature will give us
>> better search scoring and better highlighting.
>>
>> I have one question regarding the implementation of this feature.
>> As far as I can see, existing formatters (e.g. trac.wiki.formatter.*
>> classes) provide wiki to html formatting but not wiki to stripped
>> text. Did I miss something?
>>
>> One possibility that I see is to convert wiki to html and
>> then convert html to text. That does not look like the most optimal
>> solution.
>>
>> Any alternatives, ideas?
>>
>> Regards, Andrej
>
> I should find out more about how the formatters work! My first thought would be to look at creating a new formatter that strips out syntax but I am not sure how big a job that will be.
>
> I don't mind seeing a sub-optimal solution, particularly if it is likely to be quick enough and quick to create. I think that the double conversion would be giving us the correct results - the rendered html must be considered what the user will want to be able to search, right?
>
> Cheers,
> Gary
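
For illustration, the second half of the double conversion discussed in this thread (rendered html back to plain text) could look roughly like the sketch below. It is only a stand-in: it uses the Python 3 standard library rather than the Markup methods suggested above, and the names html_to_text and _TextExtractor are hypothetical, not part of Trac or Bloodhound. It assumes the wiki text has already been rendered to an html string by one of the existing formatters.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Hypothetical helper: collect only the text nodes of an HTML fragment."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; before
        # the text reaches handle_data().
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Drop tags, decode entities and collapse runs of whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return re.sub(r"\s+", " ", "".join(parser.parts)).strip()


rendered = '<p>Search <strong>results</strong> &amp; <a href="/wiki/X">links</a></p>'
print(html_to_text(rendered))
```

Note that text from adjacent block elements is joined without a separator here, so a real indexer would probably want to insert whitespace at block boundaries before scoring or highlighting.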