lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Conrad <ccon...@vasoftware.com>
Subject Re: SF.net search system
Date Tue, 05 Jul 2005 22:27:20 GMT

On Jun 29, 2005, at 4:37 PM, Chris Lu wrote:

> How is your crawler is done?
> I saw SF.net searches several types of documents, like "People",
> "Freshmeet.net", "Site Doc". Are they all from database?

We don't crawl per se, we use triggers in the database to spool  
changes to a table which is then processed in batches.  The searches  
which are searching data stored on SourceForge.net all work that  
way.  The Freshmeat.net search, on the other hand, actually posts the  
search to Freshmeat.net.  We have no control over their search  
system, though, that search type is provided as a convenience for users.

>
> A little bit marketing here:
> I am working on an off-the-shelf product called DBSight. It's
> basically Database+Lucene+Query Display. It can do most of the things
> you mentioned(no offense to your great work). And it is scalable to
> enterprise level database. Basically it can be attached to any
> database and create a search engine rapidly.
>
> It can even support subscriber(s)+publisher mode. So instead of a
> powerful machine, you can use several ordinary computer to create a
> search farm.
>
> Configurations, inlucding Analyzers, are configurable through web UI.
>
> check out this demo:
> http://search.dbsight.com

I took a quick look at your tool and it seems pretty robust.  Perhaps  
I'll have time to fully evaluate it at some point.

Thanks,
--Chris Conrad
SourceForge.net Engineer

>
> Chris Lu
>
> On 6/29/05, David Spencer <dave-lucene-user@tropo.com> wrote:
>
>> Chris Conrad wrote:
>>
>>> I know I've been asked before for a description of how   
>>> SourceForge.net
>>> is using Lucene.  I wrote a blog entry about it and  thought people
>>> might be interested in seeing at a high level how it  was designed.
>>> Take a look at http://blog.dev.sf.net.  Any comments  are welcome.
>>>
>>
>> Thanks for the writeup and nice to see that you guys are using  
>> Lucene.
>>
>> It would be interesting to know what Analyzer you're using, and as  
>> your
>> blog entry says you're having some wierd problem, well, I suspect the
>> most common source of strange Lucene behavior is that the "wrong"
>> Analyzer is used.
>>
>> Also, out of curiosity: what's the peak load the search system gets?
>>
>> thx,
>>   Dave
>>
>>
>>>
>>> --Chris Conrad
>>> SourceForge.net Engineer
>>>
>>>
>>> -------------------------------------------------------------------- 
>>> -
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message