lucene-dev mailing list archives

From Simon Willnauer
Subject Re: GSoC
Date Wed, 02 Feb 2011 09:35:55 GMT
Hey David,

I saw that you added a tiny line to the GSoC Lucene wiki - thanks for that.

On Wed, Feb 2, 2011 at 10:10 AM, David Nemeskey wrote:
> Hi guys,
> Mark, Robert, Simon: thanks for the support! I really hope we can work
> together this summer (and before that, obviously).
Same here!
> According to
> , there's
> still some time until the application period. So let me use this week to finish
> my PhD research plan, and get back to you next week.
> I am not really familiar with how the program works, i.e. how detailed the
> application description should be, when mentorship is decided, etc. so I guess
> we will have a lot to talk about. :)

From a 10,000 ft view it works like this:

1. Write up a short proposal describing what your idea is about.
2. Make it public, and publish an implementation plan - how you would
want to realize your proposal. If you don't follow it 100% in the
actual implementation, don't worry. It's just meant to give us an idea
that you know what you are doing and where you want to go. Something
like a rough one-page (A4) design doc.
3. Give other people the chance to apply for the same suggestion (that
is simply how the program works).
4. Let the ASF / us assign one or more possible mentors to it.
5. Let us apply for a slot in GSoC (those are limited for organizations).
6. Get accepted.
7. Rock it!

> (Actually, should we move this discussion private?)
No - we usually do everything in public, except for discussions within
the PMC that are meant to be private for legal or similar reasons.
Let's stick to the mailing list for all communication unless you have
something that clearly should not be public. This also gives other
contributors a chance to help and get interested in your work!

> David
>> Hi David, honestly this sounds fantastic.
>> It would be great to have someone to work with us on this issue!
>> To date, progress is pretty slow-going (minor improvements, cleanups,
>> additional stats here and there)... but we really need all the help we
>> can get, especially from people who have a really good understanding
>> of the various models.
>> In case you are interested, here are some references to discussions
>> about adding more flexibility (with some prototypes etc):
>> _towards_making_lucene_s_scoring_more_flexible
>> On Fri, Jan 28, 2011 at 11:32 AM, David Nemeskey wrote:
>> > Hi all,
>> >
>> > I have already sent this mail to Simon Willnauer, and he suggested that I
>> > post it here for discussion.
>> >
>> > I am David Nemeskey, a PhD student at the Eotvos Lorand University,
>> > Budapest, Hungary. I am doing IR-related research, and we have
>> > considered using Lucene as our search engine. We were quite satisfied
>> > with the speed and ease of use. However, we would like to experiment
>> > with different ranking algorithms, and this is where problems arise.
>> > Lucene only supports the VSM, and unfortunately the ranking architecture
>> > seems to be tailored specifically to its needs.
>> >
>> > I would be very much interested in revamping the ranking component as a
>> > GSoC project. The following modifications should be doable in the
>> > allocated time frame:
>> > - a new ranking class hierarchy, which is generic enough to allow easy
>> > implementation of new weighting schemes (at least bag-of-words ones),
>> > - addition of state-of-the-art ranking methods, such as Okapi BM25,
>> > proximity and DFR models,
>> > - configuration for ranking selection, with the old method as default.
>> >
>> > I believe all users of Lucene would profit from such a project. It would
>> > provide the scientific community with an even more useful research aid,
>> > while regular users could benefit from superior ranking results.
>> >
>> > Please let me know your opinion about this proposal.
