lucene-general mailing list archives

From "Murat Yakici" <>
Subject Re: [VOTE] Make the Open Relevance Project (ORP) and official Lucene subproject
Date Fri, 29 May 2009 20:39:53 GMT

I hope it's not too late to vote and my vote counts.

Some comments on the collection side. Why not use a US/EU patent collection?
I guess it is freely available, or am I wrong? Or at least it could be
licensed under a less restrictive licence from somewhere? It is not the
biggest, but it may be a good one to have.

Some reasons to have such a collection (if it can be acquired), which might
spark some ideas:

1) Technical: content statistics (term distributions etc.) are completely
different from those of any other collection. It may require specific
parsers and tokenizer implementations.

2) Multi-language content (from national patent offices)

3) It has socio-economic benefits for enterprises as well as for
inventors, creators, lawyers etc. The more relevant documents inventors
can find, the better they can prepare their patent applications. Not to
mention the patent offices and patent attorneys. Lucrative ;)

4) It's not hard to find expert judgements and maintain a user group which
could really focus on and devote itself to generating relevance judgements
(compared to a nonsense, old news collection).


Murat Yakici
Department of Computer & Information Sciences
University of Strathclyde
Glasgow, UK
The University of Strathclyde is a charitable body, registered in Scotland,
with registration number SC015263.

> I'd like to call a vote on adding the ORP as an official Lucene
> subproject per the proposal at
>   with the committers specified on the Wiki page.
> [] +1 - Yes, I love it
> [] 0 - I don't care
> [] -1 - I don't love it
> Thanks,
> Grant
