Date: Sat, 19 May 2012 00:55:20 +0300
Subject: Per User filtering of public/common documents
From: Apostolis Xekoukoulotakis
To: java-user@lucene.apache.org

Let us say that we have N users who care about K of the M common documents that exist in a database. What is the best way to filter the documents?

The results will then be sorted by properties of the documents, properties that are stored in a database (multidimensional score/sorting). Then the top D^(number of properties) results can be extracted to be shown on the user's screen. For this to work, all hits need to be collected from Lucene.

(One of the properties is of course relevance, which is extracted from Lucene.)
(The other properties/rankings of the documents will change a lot even though the documents themselves remain static.)

What is the fastest way to do what I want? Can you explain your answer in terms of the algorithmic complexity of Lucene's internals, so that I understand Lucene?

I have heard that collecting all documents is time consuming. Why is that? Aren't all documents that match the terms of the query sorted by relevance, even though only n of them are selected?

Some random thoughts/solutions:

1. In a new field, add to each document the names of the users that want to see it, then pass the user's name in the query.
2. Create and store a bitmap per user. Problem: the bitmap will change a lot, since it depends on the properties that change dynamically.
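To make the bitmap idea concrete, here is a minimal sketch in plain Java. It does not use the Lucene API at all: the class PerUserFilterSketch, the Hit type and the method topNForUser are hypothetical names made up for illustration, and it assumes document ids are small non-negative integers that can index a java.util.BitSet. It also shows the cost difference I am asking about: keeping only the best n of H hits with a bounded heap costs O(H log n), whereas collecting and sorting every hit costs O(H log H).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch; none of these names come from Lucene.
class PerUserFilterSketch {

    // A matching document: its id and its relevance score.
    static final class Hit {
        final int docId;
        final float score;

        Hit(int docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    // Returns the n best-scoring hits whose docId is set in the user's bitmap,
    // ordered from highest to lowest score.
    static List<Hit> topNForUser(Iterable<Hit> hits, BitSet allowed, int n) {
        List<Hit> result = new ArrayList<>();
        if (n <= 0) {
            return result;
        }
        // Min-heap on score: the head is the weakest hit kept so far.
        PriorityQueue<Hit> heap =
                new PriorityQueue<>(n, (a, b) -> Float.compare(a.score, b.score));
        for (Hit h : hits) {
            if (!allowed.get(h.docId)) {
                continue; // this user does not care about the document
            }
            if (heap.size() < n) {
                heap.add(h);
            } else if (h.score > heap.peek().score) {
                heap.poll(); // drop the weakest kept hit
                heap.add(h);
            }
        }
        result.addAll(heap);
        result.sort((a, b) -> Float.compare(b.score, a.score));
        return result;
    }

    public static void main(String[] args) {
        BitSet allowed = new BitSet();
        allowed.set(0);
        allowed.set(2);
        allowed.set(4);
        List<Hit> hits = Arrays.asList(
                new Hit(0, 0.2f), new Hit(1, 0.9f), new Hit(2, 0.7f),
                new Hit(3, 0.8f), new Hit(4, 0.5f));
        for (Hit h : topNForUser(hits, allowed, 2)) {
            System.out.println(h.docId + " " + h.score);
        }
    }
}
```

The sketch only ranks by relevance. With the other, database-stored properties the same bitmap test would still apply, but the ordering would have to be computed outside the heap, which is why I think all hits are needed.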
Too many questions, sorry for that.

--
Sincerely yours,
Apostolis Xekoukoulotakis