Date: Wed, 22 Oct 2003 22:37:58 -0400
From: Hani Suleiman
To: lucene-dev@jakarta.apache.org
Subject: timestamp timings

I've run this test, and listing files is pretty consistently faster than opening and reading a file.
Listing only becomes slower than reading the file once there are around 20 or more files in the directory, so if timestamps are kept in a dedicated directory, listing stays consistently faster than reading a file. I'm including my micro benchmark inline, so feel free to run it and improve either time ;)

For writing, the stamp-in-name solution is faster (obviously), since we're just creating an empty file rather than creating a file and writing into it. Also, I think that holding the file open is probably a bad idea, given that it would rule out multiple VMs using the same index.

    import java.io.DataInputStream;
    import java.io.DataOutputStream;
    import java.io.File;
    import java.io.FileFilter;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;

    public class FileTest {
        static File dir = new File("testdir");

        // Accepts files named "<name>.<something>".
        static class TimestampFileFilter implements FileFilter {
            private String name;

            TimestampFileFilter(String name) {
                this.name = name;
            }

            public boolean accept(File pathname) {
                return pathname.getName().startsWith(name + ".");
            }
        }

        // Stamp-in-name: the timestamp is the suffix after the last '.'.
        public static long testFileList(String name) {
            TimestampFileFilter filter = new TimestampFileFilter(name);
            File[] files = dir.listFiles(filter);
            long latest = 0;
            for (int i = 0; i < files.length; i++) {
                File timestampFile = files[i];
                String fileName = timestampFile.getName();
                long timestamp = Long.parseLong(fileName.substring(fileName.lastIndexOf('.') + 1));
                if (timestamp > latest) latest = timestamp;
            }
            return latest;
        }

        // Stamp-in-file: the timestamp is a long stored in the file's contents.
        public static long testReadFile(File file) throws IOException {
            FileInputStream fis = new FileInputStream(file);
            DataInputStream dis = new DataInputStream(fis);
            long timestamp = dis.readLong();
            dis.close();
            return timestamp;
        }

        public static void main(String[] args) throws IOException {
            dir.mkdir();
            File file = new File(dir, "blah.1234");
            FileOutputStream fos = new FileOutputStream(new File(dir, "timestamp"));
            DataOutputStream dos = new DataOutputStream(fos);
            dos.writeLong(System.currentTimeMillis());
            dos.close();
            file.createNewFile();
            long now = System.currentTimeMillis();
            int iterations = 100000;
            for(int i=0;i
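
As a rough illustration of the stamp-in-name scheme discussed above, here is a minimal sketch of the write side and the read side together, assuming the timestamps live in their own dedicated directory. The class name StampInNameSketch and the methods writeStamp and latestStamp are hypothetical names made up for this example; they are not part of Lucene or of the benchmark, and the sketch uses newer Java syntax than was available in 2003.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper, not part of Lucene: illustrates the stamp-in-name idea only.
final class StampInNameSketch {

    // Record a timestamp by creating an empty file named "<name>.<timestamp>".
    static void writeStamp(File stampDir, String name, long timestamp) throws IOException {
        if (!stampDir.isDirectory() && !stampDir.mkdirs()) {
            throw new IOException("cannot create " + stampDir);
        }
        // createNewFile returns false if the file already exists, which is fine here.
        new File(stampDir, name + "." + timestamp).createNewFile();
    }

    // Return the largest timestamp recorded for name, or 0 if there is none.
    static long latestStamp(File stampDir, String name) {
        final String prefix = name + ".";
        String[] names = stampDir.list(); // null if stampDir is not a directory
        long latest = 0;
        if (names == null) {
            return latest;
        }
        for (String fileName : names) {
            if (!fileName.startsWith(prefix)) {
                continue;
            }
            try {
                long ts = Long.parseLong(fileName.substring(prefix.length()));
                if (ts > latest) {
                    latest = ts;
                }
            } catch (NumberFormatException e) {
                // Not a stamp file (for example "name.tmp"); skip it.
            }
        }
        return latest;
    }
}
```

Nothing is ever held open here, so several VMs could call either method against the same directory. Old stamp files would still need to be pruned somehow to keep the directory small, which this sketch does not attempt.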