From: Simon Willnauer
Reply-To: simon.willnauer@gmail.com
To: dev@lucene.apache.org
Cc: David Nemeskey
Date: Wed, 9 Mar 2011 08:11:39 +0100
Subject: Re: GSoC
In-Reply-To: <201102241214.49420.nemeskey.david@sztaki.hu>
References: <201101281732.42681.nemeskey.david@sztaki.hu> <201102221516.25419.nemeskey.david@sztaki.hu> <201102241214.49420.nemeskey.david@sztaki.hu>

Hey David and all others who want to contribute to GSoC,

the ASF has applied for GSoC 2011 as a mentoring organization. As an ASF
project we don't need to apply directly, but we do need to register our
ideas now. Like almost anything in the ASF, this works through JIRA. All
ideas should be recorded as JIRA tickets labeled with "gsoc2011". Once
this is done it will show up here: http://s.apache.org/gsoc2011tasks

Everybody who is interested in GSoC as a mentor or student should now
read this too: http://community.apache.org/gsoc.html

Thanks,

Simon

On Thu, Feb 24, 2011 at 12:14 PM, David Nemeskey wrote:
> Please find the implementation plan attached. The word "soon" gets a new
> meaning when power outages are taken into account. :)
>
> As before, comments are welcome.
>
> David
>
> On Tuesday, February 22, 2011 15:22:57 Simon Willnauer wrote:
>> I think that is good for now. I should get started on codeawards and
>> wrap up our proposals. I hope I can do that this week.
>>
>> simon
>>
>> On Tue, Feb 22, 2011 at 3:16 PM, David Nemeskey wrote:
>> > Hey,
>> >
>> > I have written the proposal.
>> > Please let me know if you want more / less
>> > of certain parts. Should I upload it somewhere?
>> >
>> > Implementation plan soon to follow.
>> >
>> > Sorry for the late reply; I have been rather busy these past few weeks.
>> >
>> > David
>> >
>> > On Wednesday, February 02, 2011 10:35:55 Simon Willnauer wrote:
>> >> Hey David,
>> >>
>> >> I saw that you added a tiny line to the GSoC Lucene wiki - thanks for
>> >> that.
>> >>
>> >> On Wed, Feb 2, 2011 at 10:10 AM, David Nemeskey wrote:
>> >> > Hi guys,
>> >> >
>> >> > Mark, Robert, Simon: thanks for the support! I really hope we can work
>> >> > together this summer (and before that, obviously).
>> >>
>> >> Same here!
>> >>
>> >> > According to
>> >> > http://www.google-melange.com/document/show/gsoc_program/google/gsoc2011/timeline ,
>> >> > there's still some time until the application period. So let me use
>> >> > this week to finish my PhD research plan, and get back to you next
>> >> > week.
>> >> >
>> >> > I am not really familiar with how the program works, i.e. how detailed
>> >> > the application description should be, when mentorship is decided,
>> >> > etc. so I guess we will have a lot to talk about. :)
>> >>
>> >> So from a 10000ft view it works like this:
>> >>
>> >> 1. Write up a short proposal describing what your idea is about.
>> >> 2. Make it public, and publish an implementation plan - how you would
>> >> want to realize your proposal. If you don't follow it 100% in the
>> >> actual impl., don't worry. It's just meant to give us an idea that you
>> >> know what you are doing and where you want to go. Something like a
>> >> one-page (A4) rough design doc.
>> >> 3. Give other people the chance to apply for the same suggestion (this
>> >> is how it works though).
>> >> 4. Let the ASF / us assign one or more possible mentors to it.
>> >> 5. Let us apply for a slot in GSoC (those are limited for organizations).
>> >> 6. Get accepted.
>> >> 7. Rock it!
>> >>
>> >> > (Actually, should we move this discussion private?)
>> >>
>> >> No - we usually do everything in public, except for discussions within
>> >> the PMC that are meant to be private for legal reasons or similar
>> >> things. Let's stick to the mailing list for all communication unless
>> >> you have something that should clearly not be public. This also gives
>> >> other contributors a chance to help and get interested in your work!!
>> >>
>> >> simon
>> >>
>> >> > David
>> >> >
>> >> >> Hi David, honestly this sounds fantastic.
>> >> >>
>> >> >> It would be great to have someone to work with us on this issue!
>> >> >>
>> >> >> To date, progress is pretty slow-going (minor improvements, cleanups,
>> >> >> additional stats here and there)... but we really need all the help
>> >> >> we can get, especially from people who have a really good
>> >> >> understanding of the various models.
>> >> >>
>> >> >> In case you are interested, here are some references to discussions
>> >> >> about adding more flexibility (with some prototypes etc):
>> >> >> http://www.lucidimagination.com/search/document/72787e0e54f798e4/baby_steps_towards_making_lucene_s_scoring_more_flexible
>> >> >> https://issues.apache.org/jira/browse/LUCENE-2392
>> >> >>
>> >> >> On Fri, Jan 28, 2011 at 11:32 AM, David Nemeskey wrote:
>> >> >> > Hi all,
>> >> >> >
>> >> >> > I have already sent this mail to Simon Willnauer, and he suggested
>> >> >> > that I post it here for discussion.
>> >> >> >
>> >> >> > I am David Nemeskey, a PhD student at the Eotvos Lorand University,
>> >> >> > Budapest, Hungary. I am doing IR-related research, and we have
>> >> >> > considered using Lucene as our search engine. We were quite
>> >> >> > satisfied with the speed and ease of use. However, we would like
>> >> >> > to experiment with different ranking algorithms, and this is where
>> >> >> > problems arise.
>> >> >> > Lucene only supports the VSM, and unfortunately
>> >> >> > the ranking architecture seems to be tailored specifically to its
>> >> >> > needs.
>> >> >> >
>> >> >> > I would be very much interested in revamping the ranking component
>> >> >> > as a GSoC project. The following modifications should be doable in
>> >> >> > the allocated time frame:
>> >> >> > - a new ranking class hierarchy, which is generic enough to allow
>> >> >> >   easy implementation of new weighting schemes (at least
>> >> >> >   bag-of-words ones),
>> >> >> > - addition of state-of-the-art ranking methods, such as Okapi BM25,
>> >> >> >   proximity and DFR models,
>> >> >> > - configuration for ranking selection, with the old method as
>> >> >> >   default.
>> >> >> >
>> >> >> > I believe all users of Lucene would profit from such a project. It
>> >> >> > would provide the scientific community with an even more useful
>> >> >> > research aid, while regular users could benefit from superior
>> >> >> > ranking results.
>> >> >> >
>> >> >> > Please let me know your opinion about this proposal.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org