Subject: Getting Terms sequentially without using TermEnumeration
From: Slavisa Radic
To: lucene-user@jakarta.apache.org
Date: 09 Apr 2002 20:42:45 +0200
Message-Id: <1018377765.5160.61.camel@Konfusious.kanet2.de>

Hi,

I have to build a term-document matrix so that I can do some matrix
operations on it. The matrix should look like this (I hope it will be
displayed correctly):

        term1  term2  term3  term4  ...
        ----------------------------------
Doc1    freq   freq   freq   freq
Doc2    freq   ...................
Doc3    ..........................
Doc4    ..........................
  .
  .
  .

I tried this by using IndexReader.terms() and IndexReader.termDocs(term)
to get all the terms and the numbers of the documents that contain them.
As you can imagine, the TermEnumeration object has become too big to fit
into memory.

Is there an "easy" way to get the terms one by one (without writing my
own IndexReader)?
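
For illustration only, the matrix layout described in the message can be held sparsely, storing only the non-zero frequencies and filling them in one (term, document, frequency) posting at a time. The sketch below is plain Java and does not call Lucene; the class name TermDocMatrixSketch and its methods are hypothetical, and the postings in main are made-up sample values standing in for whatever the index would supply.

```java
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: a sparse term-document matrix.
public class TermDocMatrixSketch {
    // One row per document: docId -> (term -> frequency).
    // A missing entry means the term does not occur in that document.
    private final Map<Integer, Map<String, Integer>> rows = new TreeMap<>();

    // Column labels, kept sorted so that printing uses a stable order.
    private final SortedSet<String> terms = new TreeSet<>();

    // Records a single posting; callers can feed these in one term at a time.
    public void addPosting(String term, int docId, int freq) {
        terms.add(term);
        rows.computeIfAbsent(docId, d -> new TreeMap<>()).put(term, freq);
    }

    // Returns the stored frequency, or 0 if the cell was never filled.
    public int freq(int docId, String term) {
        Map<String, Integer> row = rows.get(docId);
        if (row == null) {
            return 0;
        }
        return row.getOrDefault(term, 0);
    }

    public int termCount() {
        return terms.size();
    }

    public int docCount() {
        return rows.size();
    }

    public static void main(String[] args) {
        TermDocMatrixSketch m = new TermDocMatrixSketch();
        // Made-up sample postings (assumption: real ones would come from the index).
        m.addPosting("term1", 1, 3);
        m.addPosting("term1", 2, 1);
        m.addPosting("term2", 1, 5);

        // Print one row per document, one column per term.
        for (Integer docId : m.rows.keySet()) {
            StringBuilder line = new StringBuilder("Doc" + docId);
            for (String term : m.terms) {
                line.append(' ').append(m.freq(docId, term));
            }
            System.out.println(line);
        }
    }
}
```

Because only non-zero cells are stored, memory grows with the number of postings rather than with terms times documents, which matters when most documents contain only a small share of the vocabulary.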