From: Kaktu Chakarabati
To: solr-dev@lucene.apache.org
Subject: Re: Streaming Docs, Terms, TermVectors
Date: Sat, 30 May 2009 11:01:23 -0700
Message-ID: <142154980905301101o13b158abo2e2dd6d3a4479e63@mail.gmail.com>
In-Reply-To: <41A01785-B210-438C-AA6E-A39F33EBB555@dfeatherston.com>
References: <57EDBA85-A4F0-4D79-BA07-E7083871CE7E@apache.org> <41A01785-B210-438C-AA6E-A39F33EBB555@dfeatherston.com>

For a streaming-like solution, it is in fact possible to keep a working
buffer in memory that emits chunks over an HTTP connection, which the server
keeps alive until the full response has been sent. This is quite similar to
how video streaming protocols that operate on top of HTTP work (cf. the more
general discussion at http://ajaxpatterns.org/HTTP_Streaming#In_A_Blink ).

Another (not mutually exclusive) possibility is to introduce a new binary
format for transmitting such data (i.e. a new wt=<..> type) over HTTP (or any
other communication protocol), so that the data can be compressed more
effectively and fit better into memory. One such format, which is widely
used and already has many open source projects implementing it, is Adobe's
AMF ( http://osflash.org/documentation/amf ). It is, however, a proprietary
format, so I'm not sure whether it can be incorporated under Apache
Foundation terms.

-Chak

On Sat, May 30, 2009 at 9:58 AM, Dietrich Featherston wrote:

> I was actually curious about the same thing. Perhaps an endpoint reference
> could be passed in the request where the documents can be sent
> asynchronously, such as a jms topic.
>
> solr/query?q=*:*&epr=/my/topic&eprtype=jms
>
> Then we would need to consider how to break up the response, how to cancel
> a running query, etc.
>
> Is this along the lines of what you're looking for? I would be interested
> in looking at how the request/response contract changes and what types of
> endpoint references would be supported.
>
> Thanks,
> D
>
> On May 30, 2009, at 12:45 PM, Grant Ingersoll wrote:
>
>> Anyone have any thoughts on what is involved with streaming lots of
>> results out of Solr?
>>
>> For instance, if I wanted to get something like 1M docs out of Solr (or
>> more) via *:* query, how can I tractably do this? Likewise, if I wanted to
>> return all the terms in the index or all the Term Vectors.
>>
>> Obviously, it is impossible to load all of these things into memory and
>> then create a response, so I was wondering if anyone had any ideas on how to
>> stream them.
>>
>> Thanks,
>> Grant
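
To make the in-memory working buffer from the reply above a bit more concrete, here is a minimal sketch in Python. It is not Solr code and not a proposed implementation: `stream_chunks`, `frame`, `fake_docs`, and the `buffer_size` parameter are hypothetical names chosen for illustration, and the documents are assumed to arrive lazily as already-serialized strings. The only thing taken from a standard is the framing, which follows HTTP/1.1 chunked transfer coding (hex size, CRLF, payload, CRLF, then a zero-length chunk to finish).

```python
from typing import Iterable, Iterator


def frame(payload: bytes) -> bytes:
    """Wrap a payload as one HTTP/1.1 chunk: hex size, CRLF, data, CRLF."""
    return f"{len(payload):x}".encode("ascii") + b"\r\n" + payload + b"\r\n"


def stream_chunks(docs: Iterable[str], buffer_size: int = 1024) -> Iterator[bytes]:
    """Yield chunked-encoding frames from a bounded working buffer.

    `docs` is any lazy iterable of serialized documents (hypothetical input).
    The buffer is flushed as soon as it reaches `buffer_size` bytes, so memory
    use stays near `buffer_size` plus one document, regardless of how many
    documents the query matches.
    """
    buffer = bytearray()
    for doc in docs:
        buffer += doc.encode("utf-8") + b"\n"
        if len(buffer) >= buffer_size:
            yield frame(bytes(buffer))
            buffer.clear()
    if buffer:
        # Flush whatever is left after the last document.
        yield frame(bytes(buffer))
    # A zero-length chunk tells the client the response is complete.
    yield b"0\r\n\r\n"


def fake_docs(n: int) -> Iterator[str]:
    """Stand-in for a lazily iterated result set (illustration only)."""
    for i in range(n):
        yield f'{{"id": {i}}}'


if __name__ == "__main__":
    for chunk in stream_chunks(fake_docs(5), buffer_size=16):
        print(chunk)
```

Because `stream_chunks` is a generator, a server loop could write each frame to the socket as it is produced and never hold the full result set. How the documents are iterated lazily on the index side, and how a client cancels a running query, are left open here, as they are in the thread.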