lucene-dev mailing list archives

From David Nemeskey <nemeskey.da...@sztaki.hu>
Subject Re: GSoC
Date Thu, 24 Feb 2011 11:14:49 GMT
Please find the implementation plan attached. The word "soon" gets a new 
meaning when power outages are taken into account. :)

As before, comments are welcome.

David

On Tuesday, February 22, 2011 15:22:57 Simon Willnauer wrote:
> I think that is good for now. I should get started on codeawards and
> wrap up our proposals. I hope I can do that this week.
> 
> simon
> 
> On Tue, Feb 22, 2011 at 3:16 PM, David Nemeskey <nemeskey.david@sztaki.hu> wrote:
> > Hey,
> > 
> > I have written the proposal. Please let me know if you want more / less
> > of certain parts. Should I upload it somewhere?
> > 
> > Implementation plan soon to follow.
> > 
> > Sorry for the late reply; I have been rather busy these past few weeks.
> > 
> > David
> > 
> > On Wednesday, February 02, 2011 10:35:55 Simon Willnauer wrote:
> >> Hey David,
> >> 
> >> I saw that you added a tiny line to the GSoC Lucene wiki - thanks for
> >> that.
> >> 
> >> On Wed, Feb 2, 2011 at 10:10 AM, David Nemeskey <nemeskey.david@sztaki.hu> wrote:
> >> > Hi guys,
> >> > 
> >> > Mark, Robert, Simon: thanks for the support! I really hope we can work
> >> > together this summer (and before that, obviously).
> >> 
> >> Same here!
> >> 
> >> > According to
> >> > http://www.google-melange.com/document/show/gsoc_program/google/gsoc2011/timeline ,
> >> > there's still some time until the application period. So let me use
> >> > this week to finish my PhD research plan, and get back to you next
> >> > week.
> >> > 
> >> > I am not really familiar with how the program works, i.e. how detailed
> >> > the application description should be, when mentorship is decided,
> >> > etc. so I guess we will have a lot to talk about. :)
> >> 
> >> So from a 10,000 ft view it works like this:
> >> 
> >> 1. Write up a short proposal describing your idea.
> >> 2. Make it public, and publish an implementation plan: how you would
> >> want to realize your proposal. If you don't follow it 100% in the
> >> actual implementation, don't worry. It's just meant to give us an idea
> >> that you know what you are doing and where you want to go. Something
> >> like a one-page (A4) rough design doc.
> >> 3. Give other people the chance to apply for the same suggestion (this
> >> is how it works though).
> >> 4. Let the ASF / us assign one or more possible mentors to it.
> >> 5. Let us apply for a slot in GSoC (those are limited for organizations).
> >> 6. Get accepted.
> >> 7. Rock it!
> >> 
> >> > (Actually, should we move this discussion private?)
> >> 
> >> No - we usually do everything in public, except for discussions within
> >> the PMC that are meant to be private for legal reasons or similar
> >> things. Let's stick to the mailing list for all communication unless
> >> you have something that should clearly not be public. This also gives
> >> other contributors a chance to help and get interested in your work!
> >> 
> >> simon
> >> 
> >> > David
> >> > 
> >> >> Hi David, honestly this sounds fantastic.
> >> >> 
> >> >> It would be great to have someone to work with us on this issue!
> >> >> 
> >> >> To date, progress is pretty slow-going (minor improvements, cleanups,
> >> >> additional stats here and there)... but we really need all the help
> >> >> we can get, especially from people who have a really good
> >> >> understanding of the various models.
> >> >> 
> >> >> In case you are interested, here are some references to discussions
> >> >> about adding more flexibility (with some prototypes etc):
> >> >> http://www.lucidimagination.com/search/document/72787e0e54f798e4/baby_steps_towards_making_lucene_s_scoring_more_flexible
> >> >> https://issues.apache.org/jira/browse/LUCENE-2392
> >> >> 
> >> >> On Fri, Jan 28, 2011 at 11:32 AM, David Nemeskey <nemeskey.david@sztaki.hu> wrote:
> >> >> > Hi all,
> >> >> > 
> >> >> > I have already sent this mail to Simon Willnauer, and he suggested
> >> >> > that I post it here for discussion.
> >> >> > 
> >> >> > I am David Nemeskey, a PhD student at the Eotvos Lorand University,
> >> >> > Budapest, Hungary. I am doing IR-related research, and we have
> >> >> > considered using Lucene as our search engine. We were quite
> >> >> > satisfied with the speed and ease of use. However, we would like
> >> >> > to experiment with different ranking algorithms, and this is where
> >> >> > problems arise. Lucene only supports the vector space model (VSM),
> >> >> > and unfortunately the ranking architecture seems to be tailored
> >> >> > specifically to its needs.
> >> >> > 
> >> >> > I would be very much interested in revamping the ranking component
> >> >> > as a GSoC project. The following modifications should be doable in
> >> >> > the allocated time frame:
> >> >> > - a new ranking class hierarchy, which is generic enough to allow
> >> >> > easy implementation of new weighting schemes (at least
> >> >> > bag-of-words ones),
> >> >> > - addition of state-of-the-art ranking methods, such as Okapi BM25,
> >> >> > proximity and DFR models,
> >> >> > - configuration for ranking selection, with the old method as
> >> >> > default.
> >> >> > 
> >> >> > I believe all users of Lucene would profit from such a project. It
> >> >> > would provide the scientific community with an even more useful
> >> >> > research aid, while regular users could benefit from superior
> >> >> > ranking results.
> >> >> > 
> >> >> > Please let me know your opinion about this proposal.
> >> > 
> >> > ---------------------------------------------------------------------
> >> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> >> > For additional commands, e-mail: dev-help@lucene.apache.org
> >> 
