lucene-solr-user mailing list archives

From Alexey Serba <ase...@gmail.com>
Subject Re: DIH render html entities
Date Wed, 01 Jun 2011 23:24:51 GMT
Maybe HTMLStripTransformer is what you are looking for.

* http://wiki.apache.org/solr/DataImportHandler#HTMLStripTransformer
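
A minimal sketch of how that might look in data-config.xml, going by the wiki page above. The data source settings, table name, and column names are hypothetical placeholders, not taken from your setup. The transformer is declared on the entity and switched on per field with stripHTML="true", so only that field is affected. As far as I know the underlying HTML stripper also replaces entities such as &amp;amp; with the characters they stand for, but that is worth verifying on your Solr version.

```xml
<!-- Sketch only: data source, table, and column names are hypothetical -->
<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/exampledb"
              user="solr" password="secret"/>
  <document>
    <entity name="article"
            query="SELECT id, body FROM article"
            transformer="HTMLStripTransformer">
      <field column="id" name="id"/>
      <!-- stripHTML="true" applies the transformer to this column only -->
      <field column="body" name="body" stripHTML="true"/>
    </entity>
  </document>
</dataConfig>
```

Fields without stripHTML="true" pass through unchanged, so other columns keep their markup.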

On Tue, May 31, 2011 at 5:35 PM, Erick Erickson <erickerickson@gmail.com> wrote:
> Convert them to what? Individual fields in your docs? Text?
>
> If the former, you might get some joy from the XPathEntityProcessor.
> If you want to just strip the markup and index all the content you
> might get some joy from the various *html* analyzers listed here:
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
>
> Best
> Erick
>
> On Fri, May 27, 2011 at 5:19 AM, anass talby <anass.talby@gmail.com> wrote:
>> Sorry, my question was not clear.
>> When I get data from the database, some fields contain HTML special characters,
>> and what I want to do is just convert them automatically.
>>
>> On Fri, May 27, 2011 at 1:00 PM, Gora Mohanty <gora@mimirtech.com> wrote:
>>
>>> On Fri, May 27, 2011 at 3:50 PM, anass talby <anass.talby@gmail.com>
>>> wrote:
>>> > Is there any way to render html entities in DIH for a specific field?
>>> [...]
>>>
>>> This does not make too much sense: what do you mean by
>>> "rendering HTML entities"? DIH just indexes, so where would
>>> it render HTML to, even if it could?
>>>
>>> Please take a look at http://wiki.apache.org/solr/UsingMailingLists
>>>
>>> Regards,
>>> Gora
>>>
>>
>>
>>
>> --
>>       Anass
>>
>
