Date: Mon, 28 Mar 2011 15:30:22 +0100
Subject: Re: Full text search - is it coming? If yes, approx when.
From: Robert Newson
To: user@couchdb.apache.org

I am a CouchDB committer and author of couchdb-lucene. :)

B.

On 28 March 2011 10:44, Andrew Stuart (SuperCoders) wrote:
> Hi Robert
>
> "there are no publicly known plans to build a native full-text indexing
> feature for CouchDB."
>
> I don't know who is who around here as yet - are you commenting from inside
> knowledge or as an end user/developer?
>
> Thanks
>
>
> On 28/03/2011, at 8:24 PM, Robert Newson wrote:
>
> I have to dispute "There does not seem to be much understanding that
> this could be a killer feature."
>
> Obviously full-text search is a killer feature, but it's trivially
> available now via couchdb-lucene or elasticsearch.
>
> What people are asking for is native full-text search which, to me, is
> essentially asking for an Erlang port of Lucene. We'd love this, but
> it's a huge amount of work. Continually asking others to do
> significant amounts of work is also wearying.
>
> To replace a Lucene-based solution and match its quality and breadth
> is a huge chunk of work and is only necessary to satisfy people who,
> for various reasons, don't want to use Java.
>
> To answer the original post, there are no publicly known plans to
> build a native full-text indexing feature for CouchDB.
>
> B.
>
> On 28 March 2011 10:15, Olafur Arason wrote:
>>
>> There does not seem to be much understanding that this could be a killer
>> feature. People are now relying on Lucene which monitors the _changes
>> feed.
>>
>> Cloudant has done its own implementation which, as far as I gather from
>> the information they have published, makes a view out of all your words.
>> They recommend a Java view because you can then reuse the lexer from
>> Lucene. Then I think they are reusing the reader of the view to make
>> their query. They have a similar syntax to Lucene for the query interface.
>> They are still working on this and I think they don't have that much
>> incentive to open-source it right away. But they have in the past
>> open-sourced their technology, like BigCouch, so I think it's more a
>> matter of when rather than if.
>>
>> I think this is a good solution for full-text search. But I think
>> the Java view does not have direct access to the data, so it could be
>> slow. But Cloudant does clustering on view generation so that helps.
>>
>> But there is also a general problem with the current view system where
>> search technology could be used.
>>
>> Views are really good at sorting but people are using them to
>> do key matches, which they are not designed for. The startkey and
>> endkey are for sorting ranges and are not good for matching, which
>> most resources online are pointing to.
>>
>> For example when you do:
>> startkey = ["key11", "key21"]
>> endkey = ["key19", "key21"]
>>
>> You get ["key11","key22"], ["key11", "key23"] ... ["key12","key21"],
>> ["key12","key22"]...
>> which makes sense when looking up sorting ranges but not when using it to
>> match keys. You can have a range match lookup, but only on the
>> last key and never on two keys. So this would work:
>>
>> startkey = ["key21", "key11"]
>> endkey = ["key21", "key19"]
>>
>> The current view interface could be augmented to accept queries,
>> which could make views much more powerful than they currently are,
>> with the keys used just for sorting and for selecting which values you
>> want shown, which they are designed to do and do really well.
>>
>> This would be a killer feature and could use the new infrastructure
>> from Cloudant search.
>>
>> And don't tell me the Elastic or Lucene interface could do anything
>> close to this :)
>>
>> Regards,
>> Olafur Arason
>>
>> On Mon, Mar 28, 2011 at 04:31, Andrew Stuart (SuperCoders)
>> wrote:
>>>
>>> It would be good to know if full text search is coming as a core feature
>>> and
>>> if yes, approximately when - does anyone know?
>>>
>>> Even an approximate timeframe would be good.
>>>
>>> thanks
>>>
>>
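
For readers following the key-range discussion quoted above, here is a minimal sketch in Python of why a range over compound keys behaves that way. It is not CouchDB code: it assumes that Python's element-by-element list comparison is a fair stand-in for how a view orders array keys (real CouchDB uses its own collation rules, and its query parameters are named startkey and endkey). The helper range_query and the sample keys are hypothetical.

```python
from bisect import bisect_left, bisect_right

def range_query(sorted_keys, startkey, endkey):
    """Return every key k with startkey <= k <= endkey from a sorted list.

    Hypothetical helper: mimics an inclusive range scan over a sorted index.
    """
    lo = bisect_left(sorted_keys, startkey)
    hi = bisect_right(sorted_keys, endkey)
    return sorted_keys[lo:hi]

firsts = ["key11", "key12", "key19"]
seconds = ["key21", "key22", "key23"]

# Index keyed as [first, second]: sorted by first, then by second.
index = sorted([a, b] for a in firsts for b in seconds)

# Range on the first element with a fixed second element: the scan returns
# everything between the two endpoints in sort order, so keys whose second
# element is not "key21" are included as well.
wide = range_query(index, ["key11", "key21"], ["key19", "key21"])

# Index keyed as [second, first]: the fixed part comes first, the ranged
# part last, so the scan stays inside the "key21" group.
swapped = sorted([b, a] for a in firsts for b in seconds)
narrow = range_query(swapped, ["key21", "key11"], ["key21", "key19"])

print(wide)
print(narrow)
```

The design point is the one made in the quoted message: a range scan can only act as a match on the leading key elements, with the range applied to the last element.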