lucene-java-user mailing list archives

From "Maher Martin" <>
Subject RE: Searching an NTFS File Server
Date Wed, 13 Apr 2005 16:51:10 GMT

Does the following concept sound reasonable?

* The Lucene index generator would run under a Windows account that has
full read access to all files stored on the NTFS file server.
* For each file, the following information would have to be extracted:
  - the contents of the file, and
  - the access rights of the file (which Windows users / groups have
read access to the file)

All of this information would be stored with each Document in the index.
Once the index is complete and a user logs into the system, the
following would occur:

* The user's access rights would be read from Active Directory (i.e.
Windows group membership, etc.)
* On submission of a query to Lucene, the user / group access
rights would be appended as required search criteria, and Lucene would
filter out all results that the user shouldn't see.
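
To make the second step concrete, here is a minimal sketch of appending the user's principals to a query as a required clause, written against Lucene's classic query parser syntax. The field name "acl", the class and the method names are assumptions for illustration only. It also assumes that the ACL field was indexed untokenized and that the user's query text is already syntactically valid.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.StringJoiner;

// Sketch only: builds a query string, it does not call Lucene itself.
class AclQuerySketch {

    // Wrap a value in double quotes, escaping backslash and quote,
    // so that names such as CORP\Domain Users survive query parsing.
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Returns: +(userQuery) +acl:("p1" OR "p2" ...)
    // Both clauses are required, so a hit must match the user's query
    // AND carry at least one of the user's principals in the acl field.
    static String restrict(String userQuery, Set<String> principals) {
        if (principals.isEmpty()) {
            throw new IllegalArgumentException("user has no principals");
        }
        StringJoiner acl = new StringJoiner(" OR ", "acl:(", ")");
        for (String p : principals) {
            acl.add(quote(p));
        }
        // Assumption: userQuery is already valid query syntax.
        return "+(" + userQuery + ") +" + acl;
    }

    public static void main(String[] args) {
        Set<String> principals = new LinkedHashSet<>();
        principals.add("CORP\\mmartin");        // hypothetical user name
        principals.add("CORP\\Domain Users");   // hypothetical group name
        System.out.println(restrict("budget report", principals));
    }
}
```

Running main prints +(budget report) +acl:("CORP\\mmartin" OR "CORP\\Domain Users"). The same restriction could presumably be built programmatically rather than as a string, which would avoid the escaping concerns.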

Is this the correct approach to take with Lucene?
And does anybody know how to extract user/group access rights from files
stored on an NTFS drive? OK, I know this is more of a general Java
question - but I'd be interested in hearing from anyone who has faced a
similar requirement and how they solved it.
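
On the question of reading access rights from Java: one possible route, on Java 7 or later, is the standard java.nio.file.attribute.AclFileAttributeView. The sketch below is an illustration, not a complete NTFS permission check. The class and method names are made up. It only looks at the READ_DATA permission and does not evaluate entry order, inheritance, or group nesting. ALLOW and DENY principals are collected separately so that both could be indexed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.AclEntry;
import java.nio.file.attribute.AclEntryPermission;
import java.nio.file.attribute.AclEntryType;
import java.nio.file.attribute.AclFileAttributeView;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class NtfsReaders {

    // Names of principals that have an entry of the given type
    // (ALLOW or DENY) covering READ_DATA. Simplification: entry
    // order and inherited flags are not interpreted here.
    static Set<String> readPrincipals(List<AclEntry> acl, AclEntryType type) {
        Set<String> names = new TreeSet<>();
        for (AclEntry entry : acl) {
            if (entry.type() == type
                    && entry.permissions().contains(AclEntryPermission.READ_DATA)) {
                names.add(entry.principal().getName());
            }
        }
        return names;
    }

    // Reads the ACL of one file. The view is null on file systems
    // that do not expose ACLs, so that case is reported explicitly.
    static List<AclEntry> aclOf(Path file) throws IOException {
        AclFileAttributeView view =
                Files.getFileAttributeView(file, AclFileAttributeView.class);
        if (view == null) {
            throw new UnsupportedOperationException("no ACL view for " + file);
        }
        return view.getAcl();
    }

    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            List<AclEntry> acl = aclOf(Paths.get(arg));
            System.out.println(arg
                    + " allow=" + readPrincipals(acl, AclEntryType.ALLOW)
                    + " deny=" + readPrincipals(acl, AclEntryType.DENY));
        }
    }
}
```

The resulting names would then be added to each Document as values of an ACL field at indexing time. Because permissions can change after indexing, the index would need to be refreshed when ACLs change.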

-----Original Message-----
From: Peter Veentjer - Anchor Men
Sent: 13 April 2005 13:56
Subject: RE: Searching an NTFS File Server

You can use a filter with the IndexSearcher so that it removes all
'unwanted' results. 
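
Independent of any particular Lucene API, the filtering idea boils down to a set intersection: a hit is visible if at least one of the principals stored with it is also one of the user's principals. A minimal in-memory sketch of that check follows. The Hit class and all names in it are hypothetical, not Lucene types.

```java
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class PostFilterSketch {

    // Hypothetical stand-in for a search hit: the file path plus the
    // principals that were recorded as readers at indexing time.
    static final class Hit {
        final String path;
        final Set<String> readers;

        Hit(String path, Set<String> readers) {
            this.path = path;
            this.readers = readers;
        }
    }

    // Keeps only hits whose reader set shares at least one entry with
    // the user's principals (user name plus group memberships).
    // The original hit order is preserved.
    static List<String> visible(List<Hit> hits, Set<String> principals) {
        return hits.stream()
                .filter(h -> !Collections.disjoint(h.readers, principals))
                .map(h -> h.path)
                .collect(Collectors.toList());
    }
}
```

Filtering after the search makes hit counts and paging harder to keep consistent, which is one reason to apply the restriction during the search instead.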

-----Original Message-----
From: Maher Martin 
Sent: 13 April 2005 12:39
Subject: Searching an NTFS File Server

Hello all,

We're currently evaluating search tools to cover the following
requirement:

We have an NTFS file server with 2 TB of files (Word, Excel, PDF, TXT,
etc.). We would like to index all these files and integrate a simple web
application into our intranet which will allow users to log in using
their Windows credentials and search through this file index, returning
only hits for files that match the user's search criteria and that the
user has the *necessary rights* to view.

My questions are:

- How does one generate the list of results so that the list
contains only entries that the user may view? Should the returned list
of results be post-processed to filter out the invalid entries, and if
so, how?

- Is 2 TB of data too large to be handled by Lucene?

- Has anybody implemented a similar type of application?


Thanks in advance

To unsubscribe, e-mail:
For additional commands, e-mail:
