lucene-dev mailing list archives

From David Nemeskey <nemeskey.da...@sztaki.hu>
Subject Re: GSoC
Date Tue, 22 Feb 2011 14:16:25 GMT
Hey,

I have written the proposal. Please let me know if you want more / less of 
certain parts. Should I upload it somewhere?

Implementation plan soon to follow.

Sorry for the late reply; I have been rather busy these past few weeks.

David

On Wednesday, February 02, 2011 10:35:55 Simon Willnauer wrote:
> Hey David,
> 
> I saw that you added a tiny line to the GSoC Lucene wiki - thanks for that.
> 
> On Wed, Feb 2, 2011 at 10:10 AM, David Nemeskey <nemeskey.david@sztaki.hu> wrote:
> > Hi guys,
> > 
> > Mark, Robert, Simon: thanks for the support! I really hope we can work
> > together this summer (and before that, obviously).
> 
> Same here!
> 
> > According to
> > http://www.google-melange.com/document/show/gsoc_program/google/gsoc2011/timeline ,
> > there's still some time until the application period. So let me use this
> > week to finish my PhD research plan, and get back to you next week.
> > 
> > I am not really familiar with how the program works, i.e. how detailed
> > the application description should be, when mentorship is decided, etc.
> > so I guess we will have a lot to talk about. :)
> 
> So from a 10,000 ft view it works like this:
> 
> 1. Write up a short proposal describing what your idea is about.
> 2. Make it public! And publish an implementation plan - how you would
> want to realize your proposal. If you don't follow that 100% in the
> actual implementation, don't worry. It's just meant to give us an idea
> that you know what you are doing and where you want to go. Something
> like a one-page (A4) rough design doc.
> 3. Give other people the chance to apply for the same suggestion (this
> is just how it works).
> 4. Let the ASF / us assign one or more possible mentors to it.
> 5. Let us apply for a slot in GSoC (those are limited for organizations).
> 6. Get accepted.
> 7. Rock it!
> 
> > (Actually, should we move this discussion private?)
> 
> No - we usually do everything in public, except for discussions within
> the PMC that are meant to be private for legal or similar reasons.
> Let's stick to the mailing list for all communication unless you have
> something that clearly should not be public. This also gives other
> contributors a chance to help and get interested in your work!
> 
> simon
> 
> > David
> > 
> >> Hi David, honestly this sounds fantastic.
> >> 
> >> It would be great to have someone to work with us on this issue!
> >> 
> >> To date, progress is pretty slow-going (minor improvements, cleanups,
> >> additional stats here and there)... but we really need all the help we
> >> can get, especially from people who have a really good understanding
> >> of the various models.
> >> 
> >> In case you are interested, here are some references to discussions
> >> about adding more flexibility (with some prototypes etc):
> >> http://www.lucidimagination.com/search/document/72787e0e54f798e4/baby_steps_towards_making_lucene_s_scoring_more_flexible
> >> https://issues.apache.org/jira/browse/LUCENE-2392
> >> 
> >> On Fri, Jan 28, 2011 at 11:32 AM, David Nemeskey <nemeskey.david@sztaki.hu> wrote:
> >> > Hi all,
> >> > 
> >> > I have already sent this mail to Simon Willnauer, and he suggested
> >> > that I post it here for discussion.
> >> > 
> >> > I am David Nemeskey, a PhD student at the Eotvos Lorand University,
> >> > Budapest, Hungary. I am doing IR-related research, and we have
> >> > considered using Lucene as our search engine. We were quite satisfied
> >> > with the speed and ease of use. However, we would like to experiment
> >> > with different ranking algorithms, and this is where problems arise.
> >> > Lucene only supports the vector space model (VSM), and unfortunately
> >> > the ranking architecture seems to be tailored specifically to its needs.
> >> > 
> >> > I would be very much interested in revamping the ranking component as
> >> > a GSoC project. The following modifications should be doable in the
> >> > allocated time frame:
> >> > - a new ranking class hierarchy, which is generic enough to allow easy
> >> > implementation of new weighting schemes (at least bag-of-words ones),
> >> > - addition of state-of-the-art ranking methods, such as Okapi BM25,
> >> > proximity and DFR models,
> >> > - configuration for ranking selection, with the old method as default.
> >> > 
> >> > I believe all users of Lucene would profit from such a project. It
> >> > would provide the scientific community with an even more useful
> >> > research aid, while regular users could benefit from superior ranking
> >> > results.
> >> > 
> >> > Please let me know your opinion about this proposal.
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: dev-help@lucene.apache.org
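
The ranking proposal quoted above (a generic ranking class hierarchy, BM25 as
one of the new models, and configurable selection with the old method as the
default) could be sketched roughly as follows. This is only an illustration in
Python, not Lucene code: TermStats, Similarity, TfIdf, BM25, make_similarity
and score_document are hypothetical names, the TF-IDF weight is a simplified
stand-in for the existing VSM scoring, and the BM25 parameters k1 = 1.2 and
b = 0.75 are commonly used defaults rather than anything specified in this
thread.

```python
import math
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class TermStats:
    """Per-term statistics a bag-of-words model typically needs (hypothetical type)."""
    tf: int             # occurrences of the term in the document
    df: int             # number of documents containing the term
    doc_len: int        # length of the document in tokens
    avg_doc_len: float  # average document length in the collection
    num_docs: int       # total number of documents in the collection


class Similarity(ABC):
    """Base class: each weighting scheme only has to score one term."""

    @abstractmethod
    def score(self, stats: TermStats) -> float:
        ...


class TfIdf(Similarity):
    """Simplified TF-IDF weight, standing in for the old default (assumption)."""

    def score(self, stats: TermStats) -> float:
        if stats.tf == 0:
            return 0.0
        idf = 1.0 + math.log(stats.num_docs / (stats.df + 1))
        return math.sqrt(stats.tf) * idf


class BM25(Similarity):
    """Okapi BM25 term weight, using the non-negative IDF variant."""

    def __init__(self, k1: float = 1.2, b: float = 0.75):
        self.k1 = k1
        self.b = b

    def score(self, stats: TermStats) -> float:
        idf = math.log(1.0 + (stats.num_docs - stats.df + 0.5) / (stats.df + 0.5))
        # Length normalization: longer-than-average documents are penalized.
        norm = 1.0 - self.b + self.b * stats.doc_len / stats.avg_doc_len
        return idf * stats.tf * (self.k1 + 1.0) / (stats.tf + self.k1 * norm)


# Configuration for ranking selection, with the old method as the default.
SIMILARITIES = {"tfidf": TfIdf, "bm25": BM25}


def make_similarity(name: str = "tfidf") -> Similarity:
    return SIMILARITIES[name]()


def score_document(query_terms, stats_by_term, similarity: Similarity) -> float:
    """Sum per-term scores over the query terms that occur in the document."""
    return sum(
        similarity.score(stats_by_term[term])
        for term in query_terms
        if term in stats_by_term
    )
```

Under this layout, adding another bag-of-words model would mean writing one
more Similarity subclass and registering it in SIMILARITIES; proximity models
would need richer statistics than TermStats carries here.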
