lucene-dev mailing list archives

From David Nemeskey <nemeskey.da...@sztaki.hu>
Subject Re: GSoC
Date Thu, 10 Mar 2011 10:54:18 GMT
Ok, I have created a new issue, LUCENE-2959 for this project. I have uploaded 
the pdfs and added the gsoc2011 and lucene-gsoc-2011 labels as well.

David

On 2011 March 09, Wednesday 21:58:53 Simon Willnauer wrote:
> On Wed, Mar 9, 2011 at 5:48 PM, Grant Ingersoll <gsingers@apache.org> wrote:
> > I think we, Lucene committers, need to identify who is willing to mentor.
> > In my experience, it is less than 5 hours a week.  Most of the work
> > is done as part of the community.  Sometimes you have to be tough and
> > fail someone (I did last year) but most of the time, if you take the
> > time to interview the candidates up front, it is a good experience for
> > everyone.
> 
> count me in
> 
> > I'd add it would be useful to have everyone put the lucene-gsoc-11 label
> > on their issues too, that way we can quickly find the Lucene ones.
> 
> done on at least one ;)
> 
> simon
> 
> > Also, feel free to label existing bugs.
> > 
> > On Mar 9, 2011, at 2:11 AM, Simon Willnauer wrote:
> >> Hey David and all others who want to contribute to GSoC,
> >> 
> >> the ASF has applied for GSoC 2011 as a mentoring organization. As an
> >> ASF project we don't need to apply directly, but we need to
> >> register our ideas now. This works, like almost anything in the ASF,
> >> through JIRA. All ideas should be recorded as JIRA tickets labeled
> >> with "gsoc2011". Once this is done they will show up here:
> >> http://s.apache.org/gsoc2011tasks
> >> 
> >> Everybody who is interested in GSoC as a mentor or student should now
> >> read this too http://community.apache.org/gsoc.html
> >> 
> >> 
> >> Thanks,
> >> 
> >> Simon
> >> 
> >> 
> >> 
> >> 
> >> On Thu, Feb 24, 2011 at 12:14 PM, David Nemeskey <nemeskey.david@sztaki.hu> wrote:
> >>> Please find the implementation plan attached. The word "soon" gets a
> >>> new meaning when power outages are taken into account. :)
> >>> 
> >>> As before, comments are welcome.
> >>> 
> >>> David
> >>> 
> >>> On Tuesday, February 22, 2011 15:22:57 Simon Willnauer wrote:
> >>>> I think that is good for now. I should get started on codeawards and
> >>>> wrap up our proposals. I hope I can do that this week.
> >>>> 
> >>>> simon
> >>>> 
> >>>> On Tue, Feb 22, 2011 at 3:16 PM, David Nemeskey <nemeskey.david@sztaki.hu> wrote:
> >>>>> Hey,
> >>>>> 
> >>>>> I have written the proposal. Please let me know if you want more /
> >>>>> less of certain parts. Should I upload it somewhere?
> >>>>> 
> >>>>> Implementation plan soon to follow.
> >>>>> 
> >>>>> Sorry for the late reply; I have been rather busy these past few
> >>>>> weeks.
> >>>>> 
> >>>>> David
> >>>>> 
> >>>>> On Wednesday, February 02, 2011 10:35:55 Simon Willnauer wrote:
> >>>>>> Hey David,
> >>>>>> 
> >>>>>> I saw that you added a tiny line to the GSoC Lucene wiki - thanks
> >>>>>> for that.
> >>>>>> 
> >>>>>> On Wed, Feb 2, 2011 at 10:10 AM, David Nemeskey <nemeskey.david@sztaki.hu> wrote:
> >>>>>>> Hi guys,
> >>>>>>> 
> >>>>>>> Mark, Robert, Simon: thanks for the support! I really hope we can
> >>>>>>> work together this summer (and before that, obviously).
> >>>>>> 
> >>>>>> Same here!
> >>>>>> 
> >>>>>>> According to
> >>>>>>> http://www.google-melange.com/document/show/gsoc_program/google/gsoc2011/timeline,
> >>>>>>> there's still some time until the application period. So let me use
> >>>>>>> this week to finish my PhD research plan, and get back to you next
> >>>>>>> week.
> >>>>>>> 
> >>>>>>> I am not really familiar with how the program works, i.e. how
> >>>>>>> detailed the application description should be, when mentorship is
> >>>>>>> decided, etc., so I guess we will have a lot to talk about. :)
> >>>>>> 
> >>>>>> so from a 10000ft view it works like this:
> >>>>>> 
> >>>>>> 1. Write up a short proposal describing what your idea is about
> >>>>>> 2. make it public! and publish an implementation plan - how you would
> >>>>>> want to realize your proposal. If you don't follow that 100% in the
> >>>>>> actual impl., don't worry. It's just meant to give us an idea that you
> >>>>>> know what you are doing and where you want to go. something like a
> >>>>>> one-page (A4) rough design doc.
> >>>>>> 3. give other people the chance to apply for the same suggestion
> >>>>>> (this is how it works though)
> >>>>>> 4. Let the ASF / us assign one or more possible mentors to it
> >>>>>> 5. let us apply for a slot in GSoC (those are limited for
> >>>>>> organizations)
> >>>>>> 6. get accepted
> >>>>>> 7. rock it!
> >>>>>> 
> >>>>>>> (Actually, should we move this discussion private?)
> >>>>>> 
> >>>>>> no - we usually do everything in public, except for discussions within
> >>>>>> the PMC that are meant to be private for legal reasons or similar
> >>>>>> things. Let's stick to the mailing list for all communication unless
> >>>>>> you have something that should clearly not be public. This also gives
> >>>>>> other contributors a chance to help and get interested in your
> >>>>>> work!!
> >>>>>> 
> >>>>>> simon
> >>>>>> 
> >>>>>>> David
> >>>>>>> 
> >>>>>>>> Hi David, honestly this sounds fantastic.
> >>>>>>>> 
> >>>>>>>> It would be great to have someone to work with us on this issue!
> >>>>>>>> 
> >>>>>>>> To date, progress is pretty slow-going (minor improvements,
> >>>>>>>> cleanups, additional stats here and there)... but we really need
> >>>>>>>> all the help we can get, especially from people who have a really
> >>>>>>>> good understanding of the various models.
> >>>>>>>> 
> >>>>>>>> In case you are interested, here are some references to
> >>>>>>>> discussions about adding more flexibility (with some prototypes
> >>>>>>>> etc):
> >>>>>>>> http://www.lucidimagination.com/search/document/72787e0e54f798e4/baby_steps_towards_making_lucene_s_scoring_more_flexible
> >>>>>>>> https://issues.apache.org/jira/browse/LUCENE-2392
> >>>>>>>> 
> >>>>>>>> On Fri, Jan 28, 2011 at 11:32 AM, David Nemeskey <nemeskey.david@sztaki.hu> wrote:
> >>>>>>>>> Hi all,
> >>>>>>>>> 
> >>>>>>>>> I have already sent this mail to Simon Willnauer, and he
> >>>>>>>>> suggested that I post it here for discussion.
> >>>>>>>>> 
> >>>>>>>>> I am David Nemeskey, a PhD student at the Eotvos Lorand
> >>>>>>>>> University, Budapest, Hungary. I am doing IR-related
> >>>>>>>>> research, and we have considered using Lucene as our search
> >>>>>>>>> engine. We were quite satisfied with the speed and ease of use.
> >>>>>>>>> However, we would like to experiment with different ranking
> >>>>>>>>> algorithms, and this is where problems arise. Lucene only
> >>>>>>>>> supports the VSM, and unfortunately the ranking architecture
> >>>>>>>>> seems to be tailored specifically to its needs.
> >>>>>>>>> 
> >>>>>>>>> I would be very much interested in revamping the ranking
> >>>>>>>>> component as a GSoC project. The following modifications should
> >>>>>>>>> be doable in the allocated time frame:
> >>>>>>>>> - a new ranking class hierarchy, which is generic enough to allow
> >>>>>>>>>   easy implementation of new weighting schemes (at least
> >>>>>>>>>   bag-of-words ones),
> >>>>>>>>> - addition of state-of-the-art ranking methods, such as Okapi
> >>>>>>>>>   BM25, proximity and DFR models,
> >>>>>>>>> - configuration for ranking selection, with the old method as
> >>>>>>>>>   default.
> >>>>>>>>> 
> >>>>>>>>> I believe all users of Lucene would profit from such a project.
> >>>>>>>>> It would provide the scientific community with an even more
> >>>>>>>>> useful research aid, while regular users could benefit from
> >>>>>>>>> superior ranking results.
> >>>>>>>>> 
> >>>>>>>>> Please let me know your opinion about this proposal.
> >>>>>>> 
> >>>>>>> ---------------------------------------------------------------------
> >>>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> >>>>>>> For additional commands, e-mail: dev-help@lucene.apache.org
> >>>>>> 
> > 
> > --------------------------
> > Grant Ingersoll
> > http://www.lucidimagination.com/
> > 
> > Search the Lucene ecosystem docs using Solr/Lucene:
> > http://www.lucidimagination.com/search
> > 
> > 

