lucene-dev mailing list archives

From Simon Willnauer <simon.willna...@googlemail.com>
Subject Re: GSoC
Date Thu, 10 Mar 2011 12:11:35 GMT
awesome thanks!

simon

On Thu, Mar 10, 2011 at 11:54 AM, David Nemeskey
<nemeskey.david@sztaki.hu> wrote:
> Ok, I have created a new issue, LUCENE-2959 for this project. I have uploaded
> the pdfs and added the gsoc2011 and lucene-gsoc-2011 labels as well.
>
> David
>
> On 2011 March 09, Wednesday 21:58:53 Simon Willnauer wrote:
>> On Wed, Mar 9, 2011 at 5:48 PM, Grant Ingersoll <gsingers@apache.org> wrote:
>> > I think we, Lucene committers, need to identify who is willing to mentor.
>> > In my experience, it is less than 5 hours a week.  Most of the work
>> > is done as part of the community.  Sometimes you have to be tough and
>> > fail someone (I did last year) but most of the time, if you take the
>> > time to interview the candidates up front, it is a good experience for
>> > everyone.
>>
>> count me in
>>
>> > I'd add it would be useful to have everyone put the lucene-gsoc-11 label
>> > on their issues too, that way we can quickly find the Lucene ones.
>>
>> done on at least one ;)
>>
>> simon
>>
>> > Also, feel free to label existing bugs.
>> >
>> > On Mar 9, 2011, at 2:11 AM, Simon Willnauer wrote:
>> >> Hey David and all others who want to contribute to GSoC,
>> >>
>> >> the ASF has applied for GSoC 2011 as a mentoring organization. As an
>> >> ASF project we don't need to apply directly, but we need to
>> >> register our ideas now. This works, like almost anything in the ASF,
>> >> through JIRA. All ideas should be recorded as JIRA tickets labeled
>> >> with "gsoc2011". Once this is done, they will show up here:
>> >> http://s.apache.org/gsoc2011tasks
>> >>
>> >> Everybody who is interested in GSoC as a mentor or student should now
>> >> read this too http://community.apache.org/gsoc.html
>> >>
>> >>
>> >> Thanks,
>> >>
>> >> Simon
>> >>
>> >>
>> >>
>> >>
>> >> On Thu, Feb 24, 2011 at 12:14 PM, David Nemeskey
>> >> <nemeskey.david@sztaki.hu> wrote:
>> >>> Please find the implementation plan attached. The word "soon" gets a
>> >>> new meaning when power outages are taken into account. :)
>> >>>
>> >>> As before, comments are welcome.
>> >>>
>> >>> David
>> >>>
>> >>> On Tuesday, February 22, 2011 15:22:57 Simon Willnauer wrote:
>> >>>> I think that is good for now. I should get started on codeawards and
>> >>>> wrap up our proposals. I hope I can do that this week.
>> >>>>
>> >>>> simon
>> >>>>
>> >>>> On Tue, Feb 22, 2011 at 3:16 PM, David Nemeskey
>> >>>> <nemeskey.david@sztaki.hu> wrote:
>> >>>>> Hey,
>> >>>>>
>> >>>>> I have written the proposal. Please let me know if you want more /
>> >>>>> less of certain parts. Should I upload it somewhere?
>> >>>>>
>> >>>>> Implementation plan soon to follow.
>> >>>>>
>> >>>>> Sorry for the late reply; I have been rather busy these past few
>> >>>>> weeks.
>> >>>>>
>> >>>>> David
>> >>>>>
>> >>>>> On Wednesday, February 02, 2011 10:35:55 Simon Willnauer wrote:
>> >>>>>> Hey David,
>> >>>>>>
>> >>>>>> I saw that you added a tiny line to the GSoC Lucene wiki - thanks
>> >>>>>> for that.
>> >>>>>>
>> >>>>>> On Wed, Feb 2, 2011 at 10:10 AM, David Nemeskey
>> >>>>>> <nemeskey.david@sztaki.hu> wrote:
>> >>>>>>> Hi guys,
>> >>>>>>>
>> >>>>>>> Mark, Robert, Simon: thanks for the support! I really hope we can
>> >>>>>>> work together this summer (and before that, obviously).
>> >>>>>>
>> >>>>>> Same here!
>> >>>>>>
>> >>>>>>> According to
>> >>>>>>> http://www.google-melange.com/document/show/gsoc_program/google/gsoc2011/timeline,
>> >>>>>>> there's still some time until the application period. So let me use
>> >>>>>>> this week to finish my PhD research plan, and get back to you next
>> >>>>>>> week.
>> >>>>>>>
>> >>>>>>> I am not really familiar with how the program works, i.e. how
>> >>>>>>> detailed the application description should be, when mentorship is
>> >>>>>>> decided, etc. so I guess we will have a lot to talk about. :)
>> >>>>>>
>> >>>>>> So from a 10,000 ft view it works like this:
>> >>>>>>
>> >>>>>> 1. Write up a short proposal describing what your idea is about.
>> >>>>>> 2. Make it public, and publish an implementation plan: how you would
>> >>>>>> want to realize your proposal. If you don't follow that 100% in the
>> >>>>>> actual implementation, don't worry. It's just meant to give us an
>> >>>>>> idea that you know what you are doing and where you want to go.
>> >>>>>> Something like a rough one-page (A4) design doc.
>> >>>>>> 3. Give other people the chance to apply for the same suggestion
>> >>>>>> (this is how it works though).
>> >>>>>> 4. Let the ASF / us assign one or more possible mentors to it.
>> >>>>>> 5. Let us apply for a slot in GSoC (those are limited for
>> >>>>>> organizations).
>> >>>>>> 6. Get accepted.
>> >>>>>> 7. Rock it!
>> >>>>>>
>> >>>>>>> (Actually, should we move this discussion private?)
>> >>>>>>
>> >>>>>> No - we usually do everything in public, except for discussions within
>> >>>>>> the PMC that are meant to be private for legal reasons or similar
>> >>>>>> things. Let's stick to the mailing list for all communication unless
>> >>>>>> you have something that should clearly not be public. This also gives
>> >>>>>> other contributors a chance to help and get interested in your
>> >>>>>> work!!
>> >>>>>>
>> >>>>>> simon
>> >>>>>>
>> >>>>>>> David
>> >>>>>>>
>> >>>>>>>> Hi David, honestly this sounds fantastic.
>> >>>>>>>>
>> >>>>>>>> It would be great to have someone to work with us on this issue!
>> >>>>>>>>
>> >>>>>>>> To date, progress is pretty slow-going (minor improvements,
>> >>>>>>>> cleanups, additional stats here and there)... but we really need
>> >>>>>>>> all the help we can get, especially from people who have a really
>> >>>>>>>> good understanding of the various models.
>> >>>>>>>>
>> >>>>>>>> In case you are interested, here are some references to
>> >>>>>>>> discussions about adding more flexibility (with some prototypes
>> >>>>>>>> etc):
>> >>>>>>>> http://www.lucidimagination.com/search/document/72787e0e54f798e4/baby_steps_towards_making_lucene_s_scoring_more_flexible
>> >>>>>>>> https://issues.apache.org/jira/browse/LUCENE-2392
>> >>>>>>>>
>> >>>>>>>> On Fri, Jan 28, 2011 at 11:32 AM, David Nemeskey
>> >>>>>>>> <nemeskey.david@sztaki.hu> wrote:
>> >>>>>>>>> Hi all,
>> >>>>>>>>>
>> >>>>>>>>> I have already sent this mail to Simon Willnauer, and he
>> >>>>>>>>> suggested that I post it here for discussion.
>> >>>>>>>>>
>> >>>>>>>>> I am David Nemeskey, a PhD student at the Eotvos Lorand
>> >>>>>>>>> University, Budapest, Hungary. I am doing IR-related
>> >>>>>>>>> research, and we have considered using Lucene as our search
>> >>>>>>>>> engine. We were quite satisfied with the speed and ease of use.
>> >>>>>>>>> However, we would like to experiment with different ranking
>> >>>>>>>>> algorithms, and this is where problems arise. Lucene only
>> >>>>>>>>> supports the vector space model (VSM), and unfortunately the
>> >>>>>>>>> ranking architecture seems to be tailored specifically to its
>> >>>>>>>>> needs.
>> >>>>>>>>>
>> >>>>>>>>> I would be very much interested in revamping the ranking
>> >>>>>>>>> component as a GSoC project. The following modifications should
>> >>>>>>>>> be doable in the allocated time frame:
>> >>>>>>>>> - a new ranking class hierarchy, which is generic enough to allow
>> >>>>>>>>>   easy implementation of new weighting schemes (at least
>> >>>>>>>>>   bag-of-words ones),
>> >>>>>>>>> - addition of state-of-the-art ranking methods, such as Okapi
>> >>>>>>>>>   BM25, proximity and DFR models,
>> >>>>>>>>> - configuration for ranking selection, with the old method as
>> >>>>>>>>>   default.
>> >>>>>>>>>
>> >>>>>>>>> I believe all users of Lucene would profit from such a project.
>> >>>>>>>>> It would provide the scientific community with an even more
>> >>>>>>>>> useful research aid, while regular users could benefit from
>> >>>>>>>>> superior ranking results.
>> >>>>>>>>>
>> >>>>>>>>> Please let me know your opinion about this proposal.
>> >>>>>>>
>> >>>>>>> ---------------------------------------------------------------------
>> >>>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> >>>>>>> For additional commands, e-mail: dev-help@lucene.apache.org
>> >
>> > --------------------------
>> > Grant Ingersoll
>> > http://www.lucidimagination.com/
>> >
>> > Search the Lucene ecosystem docs using Solr/Lucene:
>> > http://www.lucidimagination.com/search
>
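
The ranking proposal quoted at the bottom of this thread asks for a generic ranking class hierarchy, Okapi BM25 among the new methods, and configurable selection with the old method as default. A minimal Java sketch of what such a design could look like follows. It is an illustration under stated assumptions, not the design that was proposed or implemented: every name in it (RankingSketch, TermStats, RankingModel, TfIdf, BM25, forName) is hypothetical and none is a Lucene API, the TfIdf weight is a simplified stand-in for vector-space scoring rather than Lucene's actual formula, and k1 = 1.2 and b = 0.75 are commonly used BM25 defaults assumed here.

```java
import java.util.Locale;

/** Hypothetical sketch of a pluggable ranking hierarchy; not Lucene code. */
public class RankingSketch {

    /** Simplified per-term statistics that a bag-of-words model needs. */
    public static final class TermStats {
        public final double termFreq;     // occurrences of the term in the document
        public final double docFreq;      // number of documents containing the term
        public final double docCount;     // total number of documents in the index
        public final double docLength;    // length of this document, in tokens
        public final double avgDocLength; // average document length in the index

        public TermStats(double termFreq, double docFreq, double docCount,
                         double docLength, double avgDocLength) {
            this.termFreq = termFreq;
            this.docFreq = docFreq;
            this.docCount = docCount;
            this.docLength = docLength;
            this.avgDocLength = avgDocLength;
        }
    }

    /** Root of the hypothetical hierarchy: each weighting scheme scores one term. */
    public interface RankingModel {
        double score(TermStats s);
    }

    /** Simplified TF-IDF weight standing in for the "old method" (an assumption). */
    public static final class TfIdf implements RankingModel {
        @Override
        public double score(TermStats s) {
            double idf = 1.0 + Math.log(s.docCount / (s.docFreq + 1.0));
            return Math.sqrt(s.termFreq) * idf;
        }
    }

    /** Okapi BM25 term weight with the usual k1 and b parameters. */
    public static final class BM25 implements RankingModel {
        private final double k1; // term-frequency saturation
        private final double b;  // strength of document-length normalization

        public BM25(double k1, double b) {
            this.k1 = k1;
            this.b = b;
        }

        @Override
        public double score(TermStats s) {
            // IDF variant with +1 inside the log, so the weight stays non-negative.
            double idf = Math.log(1.0 + (s.docCount - s.docFreq + 0.5) / (s.docFreq + 0.5));
            double lengthNorm = 1.0 - b + b * (s.docLength / s.avgDocLength);
            return idf * (s.termFreq * (k1 + 1.0)) / (s.termFreq + k1 * lengthNorm);
        }
    }

    /** Selects a model by configured name; null or empty falls back to the old method. */
    public static RankingModel forName(String name) {
        if (name == null || name.isEmpty()) {
            return new TfIdf();
        }
        switch (name.toLowerCase(Locale.ROOT)) {
            case "tfidf":
                return new TfIdf();
            case "bm25":
                return new BM25(1.2, 0.75);
            default:
                throw new IllegalArgumentException("Unknown ranking model: " + name);
        }
    }

    public static void main(String[] args) {
        TermStats stats = new TermStats(3, 10, 1000, 120, 100);
        System.out.println("default: " + forName(null).score(stats));
        System.out.println("bm25:    " + forName("bm25").score(stats));
    }
}
```

Keeping the statistics in one value object and the formula behind a single-method interface is what would let a new bag-of-words scheme be added without touching the selection code; proximity models would need positional data that this sketch does not carry.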


