lucene-solr-user mailing list archives

From "Benson Margulies" <bimargul...@gmail.com>
Subject Re: Entity extraction?
Date Mon, 27 Oct 2008 17:55:02 GMT
Extractors are exactly as good as the data you have to train or
configure them with. An open source extractor platform may still
require you to come up with a rather large heap of data from
somewhere.

Not all the vendors of extractors lose money.

How useful named entity extraction (NEE) is for search is an ongoing
question that depends on what sort of data you are working with and
what sort of precision challenges most concern you.
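
To make the first point concrete, here is a minimal sketch of a
dictionary-based extractor feeding facet fields at index time. It is an
illustration only: the gazetteer contents, the "_facet" field naming, and
both function names are hypothetical, and nothing here is a Solr, FAST, or
LingPipe API. Real extractors are usually statistical, but the dependence
on the data you supply is the same.

```python
import re
from collections import defaultdict

# Hypothetical gazetteer (an assumption for this sketch): the extractor
# can only recognize names that someone has already put in this data.
GAZETTEER = {
    "person": ["Ada Lovelace"],
    "organization": ["Netflix", "Xerox", "Verity"],
    "place": ["Chicago"],
}


def extract_entities(text, gazetteer=GAZETTEER):
    """Return {entity_type: sorted list of gazetteer names found in text}."""
    found = defaultdict(set)
    for entity_type, names in gazetteer.items():
        for name in names:
            # Whole-word, case-insensitive match on the literal name.
            pattern = r"\b" + re.escape(name) + r"\b"
            if re.search(pattern, text, re.IGNORECASE):
                found[entity_type].add(name)
    return {etype: sorted(names) for etype, names in found.items()}


def to_facet_fields(doc_id, text):
    """Build a document dict with one multi-valued field per entity type.

    The "<type>_facet" field names are hypothetical; an index would need
    matching multi-valued fields defined for faceting to work on them.
    """
    doc = {"id": doc_id, "text": text}
    for entity_type, names in extract_entities(text).items():
        doc[entity_type + "_facet"] = names
    return doc


doc = to_facet_fields(
    "1", "Verity and Xerox both sold extractors; Acme Corp did too."
)
print(doc)
```

In this sketch "Acme Corp" is silently missed because it is not in the
gazetteer, which is the data problem in miniature.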


On Mon, Oct 27, 2008 at 12:34 PM, Walter Underwood
<wunderwood@netflix.com> wrote:
> Verity sold a lot of features based on "we might need it at some point."
> Very few people deployed the advanced features. They just didn't need them.
>
> wunder
>
> On 10/27/08 9:27 AM, "Charlie Jackson" <Charlie.Jackson@cision.com> wrote:
>
>> Yeah, when they first mentioned it, my initial thought was "cool, but we don't
>> need it." However, some of the higher ups in the company are saying we might
>> want it at some point, so I've been asked to look into it. I'll be sure to let
>> them know about the flaws in the concept, thanks for that info.
>>
>> ____________________________________________
>> Charlie Jackson
>> Charlie.Jackson@cision.com
>>
>>
>> -----Original Message-----
>> From: Walter Underwood [mailto:wunderwood@netflix.com]
>> Sent: Monday, October 27, 2008 11:17 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Entity extraction?
>>
>> The vendor mentioned entity extraction, but that doesn't mean you need it.
>> Entity extraction is a pretty specific technology, and it has been a
>> money-losing product at many companies for many years, going back to
>> Xerox ThingFinder well over ten years ago.
>>
>> My guess is that very few people really need entity extraction.
>>
>> Using EE for automatic taxonomy generation is even harder to get right.
>> At best, that is a way to get a starter set of categories that you can
>> edit. You will not get a production quality taxonomy automatically.
>>
>> wunder
>>
>> On 10/27/08 8:31 AM, "Charlie Jackson" <Charlie.Jackson@cision.com> wrote:
>>
>>> True, though I may be able to convince the powers that be that it's worth the
>>> investment.
>>>
>>> There are a number of open source or free tools listed on the Wikipedia entry
>>> for entity extraction
>>> (http://en.wikipedia.org/wiki/Named_entity_recognition#Open_source_or_free)
>>> --
>>> does anyone have any experience with any of these?
>>>
>>> ____________________________________________
>>> Charlie Jackson
>>> 312-873-6537
>>> Charlie.Jackson@cision.com
>>>
>>> -----Original Message-----
>>> From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
>>> Sent: Monday, October 27, 2008 10:23 AM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Entity extraction?
>>>
>>> For the record, LingPipe is not free.  It's good, but it's not free.
>>>
>>>
>>> Otis
>>> --
>>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>>
>>>
>>>
>>> ----- Original Message ----
>>>> From: Rafael Rossini <rafael.rossini@gmail.com>
>>>> To: solr-user@lucene.apache.org
>>>> Sent: Friday, October 24, 2008 6:08:14 PM
>>>> Subject: Re: Entity extraction?
>>>>
>>>> Solr can do a simple facet search like FAST, but the entity extraction
>>>> demands other technologies. I do not know how FAST does it but at the company
>>>> I'm working at (www.cortex-intelligence.com), we use a mix of statistical
>>>> and language-specific tasks to recognize and categorize entities in the
>>>> text. Ling Pipe is another tool (free) that does that too. In case you would
>>>> like to see a simple demo: http://www.cortex-intelligence.com/tech/
>>>>
>>>> Rossini
>>>>
>>>>
>>>> On Fri, Oct 24, 2008 at 6:18 PM, Charlie Jackson wrote:
>>>>
>>>>> During a recent sales pitch to my company by FAST, they mentioned entity
>>>>> extraction. I'd never heard of it before, but they described it as
>>>>> basically recognizing people/places/things in documents being indexed
>>>>> and then being able to do faceting on this data at query time. Does
>>>>> anything like this already exist in SOLR? If not, I'm not opposed to
>>>>> developing it myself, but I could use some pointers on where to start.
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> - Charlie