Mailing-List: contact java-user-help@lucene.apache.org; run by ezmlm
Reply-To: java-user@lucene.apache.org
MIME-Version: 1.0
References: <4D279DA9.70506@fissore.org>
Date: Sat, 8 Jan 2011 10:24:36 -0800
Subject: Re: is OpenBitSet / SortedVIntList compressed bit map index?
From: Raavan
To: java-user@lucene.apache.org
Content-Type: multipart/alternative; boundary=001485f92356e11df6049959d80a

--001485f92356e11df6049959d80a
Content-Type: text/plain; charset=ISO-8859-1

Also, just for my understanding, is SortedVIntList able to perform some
operations such as AND/OR without decompression? Some of the algorithms
mentioned below claim to do that. But I understand that there are patent
issues surrounding these algorithms.

http://en.wikipedia.org/wiki/Bitmap_index

-Raavan

On Sat, Jan 8, 2011 at 10:14 AM, Raavan wrote:

> Thanks Federico.
>
> >> my primary concern at the moment is serializing bitsets to recover
> >> searcher warmup time
>
> I am also considering doing the same to reduce warmup time during restarts.
>
> It seems one of the disadvantages of SortedVIntList is the performance of
> skipTo(), as per Paul Elschot, since it does not support random access like
> OpenBitSet.
> https://issues.apache.org/jira/browse/LUCENE-1296
>
> Our primary concern is memory usage since we have hundreds of filters and
> a large number of documents. So if the performance is decent, I am thinking of
> using SortedVIntList for all our sparse filters.
>
> -Raavan
>
>
> On Fri, Jan 7, 2011 at 3:11 PM, Federico Fissore wrote:
>
>> First Last, on 07/01/2011 20:55, wrote:
>>
>>> Hi,
>>>
>>> is OpenBitSet / SortedVIntList a compressed bit map index? Which one is
>>> better if memory usage is the primary concern?
>>>
>> SortedVIntList is compressed, OpenBitSet is not
>>
>>> Our filters are sparse. So is SortedVIntList better in that case?
>>>
>> Yes
>>
>>> Are there any other compressed bitmap index implementations which offer
>>> bit map compression at a decent performance assuming filters are sparse?
>>>
>> I'm also looking for alternative implementations of compressed bitsets, so
>> I'm really interested in everybody's experience: my primary concern at the
>> moment is serializing bitsets to recover searcher warmup time
>>
>> I've tried some and roughly tested them: my conclusion was that we (lucene
>> users) already stand on the rolls royce of bitset implementations.
>>
>> federico
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>
--001485f92356e11df6049959d80a--