directory-api mailing list archives

From Emmanuel Lécharny <elecha...@gmail.com>
Subject Re: Proper use of LdapConnectionPool
Date Tue, 27 Jan 2015 23:42:41 GMT
On 27/01/15 23:07, Harris, Christopher P wrote:
> Hi, Emmanuel.
>
> "Can you tell us how you do that ? Ie, are you using a plain new connection for each thread you spawn ?"
> Sure.  I can tell you how I am implementing a multi-threaded approach to read all of LDAP/AD into memory.  I'll do the next best thing...paste my code at the end of my response.
>
>
> "In any case, the TimeOut is the default LDapConnection timeout (30 seconds) :"
> Yes, I noticed mention of the default timeout in your User Guide.
>
>
> "You have to set the LdapConnectionConfig timeout for all the created connections to use it. There is a setTimeout() method for that which has been added in 1.0.0-M28."
> When visiting your site while seeking to explore connection pool options, I noticed that you recently released M28 and fixed DIRAPI-217, and decided to update my pom.xml to M28 and test out the PoolableLdapConnectionFactory.  Great job, btw.  Keep up the good work!
>
> Oh, and your example needs to be updated to use DefaultPoolableLdapConnectionFactory instead of PoolableLdapConnectionFactory.
>
>
> "config.setTimeOut( whatever fits you );"
> Very good to know.  Thank you!
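
For reference, the pooled setup looks roughly like this (just a sketch, not tested here; the host, port, bind DN and password are placeholders, and setTimeout() takes milliseconds) :

```java
import org.apache.directory.ldap.client.api.DefaultPoolableLdapConnectionFactory;
import org.apache.directory.ldap.client.api.LdapConnection;
import org.apache.directory.ldap.client.api.LdapConnectionConfig;
import org.apache.directory.ldap.client.api.LdapConnectionPool;

public class PooledConnectionSketch {

    public static LdapConnectionPool createPool() {
        LdapConnectionConfig config = new LdapConnectionConfig();
        config.setLdapHost("ldap.example.com");        // placeholder host
        config.setLdapPort(389);
        config.setName("cn=admin,dc=example,dc=com");  // placeholder bind DN
        config.setCredentials("secret");               // placeholder password
        config.setTimeout(300000L); // applies to every connection the pool creates

        // Since 1.0.0-M28, use DefaultPoolableLdapConnectionFactory
        DefaultPoolableLdapConnectionFactory factory =
            new DefaultPoolableLdapConnectionFactory(config);

        return new LdapConnectionPool(factory);
    }

    public static void main(String[] args) throws Exception {
        LdapConnectionPool pool = createPool();
        LdapConnection connection = pool.getConnection();
        try {
            // ... do your searches with the borrowed connection ...
        } finally {
            pool.releaseConnection(connection); // always give it back to the pool
        }
    }
}
```

The point is that the timeout is set once on the LdapConnectionConfig, and every connection the pool creates inherits it.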
>
>
> "It is the right way."
> Sweeeeeeet!
>
>
> "Side note : you may face various problems when pulling everything from an AD server. Typically, the AD config might not let you pull more than 1000 entries, as there is a hard limit you need to change on AD if you want to get more entries.
>
> Otherwise, the approach - ie, using multiple threads - might seem good, but the benefit is limited. Pulling entries from the server is fast; you should be able to get tens of thousands per second with one single thread. I'm not sure how well AD supports concurrent searches anyway. Last but not least, it's likely that AD does not allow more than a certain number of concurrent threads to run, which might lead to contention at some point."
>
> Ah, this is why I wanted to reach out to you guys.  You guys know this kind of in-depth information about LDAP and AD.  So, I may adapt my code to a single thread then.  I can live with that.  I need to pull about 40k-60k entries, so tens of thousands of entries per second works for me.  I may need to run the code by you then if I go with a single-threaded approach and need to check if I'm going about it in the most efficient manner.
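
About the 1000 entries limit : instead of raising AD's page size limit, the standard way around it is the Paged Results control (RFC 2696), which AD supports. A rough sketch with the API (untested here; the base DN, filter and page size are just examples) :

```java
import org.apache.directory.api.ldap.model.message.SearchRequest;
import org.apache.directory.api.ldap.model.message.SearchRequestImpl;
import org.apache.directory.api.ldap.model.message.SearchScope;
import org.apache.directory.api.ldap.model.message.controls.PagedResults;
import org.apache.directory.api.ldap.model.message.controls.PagedResultsImpl;
import org.apache.directory.api.ldap.model.name.Dn;

public class PagedSearchSketch {

    // Builds a SUBTREE search request that asks the server to return
    // results in pages of 500 entries instead of one big answer.
    public static SearchRequest buildPagedRequest() throws Exception {
        SearchRequest sr = new SearchRequestImpl();
        sr.setBase(new Dn("dc=example,dc=com"));  // example base DN
        sr.setFilter("(objectClass=person)");     // example filter
        sr.setScope(SearchScope.SUBTREE);

        PagedResults paged = new PagedResultsImpl();
        paged.setSize(500); // page size, below AD's default 1000 limit
        sr.addControl(paged);

        return sr;
    }
}
```

After each page, you read the cookie the server sent back in the response's PagedResults control and resend the request with that cookie until the server returns an empty one.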

The problem with the multi-threaded approach is that you *have* to know
which entry has children, because the server won't give you that
information. So you will end up doing a search for every single entry
you get at one level, with scope ONE_LEVEL, and most of the time you
will just get the entry itself. That would more than double the time it
takes to grab everything.

>
>
>
> And now time for some code...
>
> import java.io.IOException;
> import java.util.Iterator;
> import java.util.List;
> import java.util.Map;
> import java.util.concurrent.ConcurrentHashMap;
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Executors;
> import java.util.concurrent.TimeUnit;
> import java.util.logging.Level;
> import java.util.logging.Logger;
>
> import org.apache.commons.pool.impl.GenericObjectPool;
> import org.apache.directory.api.ldap.model.cursor.CursorException;
> import org.apache.directory.api.ldap.model.cursor.SearchCursor;
> import org.apache.directory.api.ldap.model.entry.Entry;
> import org.apache.directory.api.ldap.model.exception.LdapException;
> import org.apache.directory.api.ldap.model.message.Response;
> import org.apache.directory.api.ldap.model.message.SearchRequest;
> import org.apache.directory.api.ldap.model.message.SearchRequestImpl;
> import org.apache.directory.api.ldap.model.message.SearchResultEntry;
> import org.apache.directory.api.ldap.model.message.SearchScope;
> import org.apache.directory.api.ldap.model.name.Dn;
> import org.apache.directory.ldap.client.api.DefaultLdapConnectionFactory;
> import org.apache.directory.ldap.client.api.LdapConnection;
> import org.apache.directory.ldap.client.api.LdapConnectionConfig;
> import org.apache.directory.ldap.client.api.LdapConnectionPool;
> import org.apache.directory.ldap.client.api.LdapNetworkConnection;
> import org.apache.directory.ldap.client.api.DefaultPoolableLdapConnectionFactory;
> import org.apache.directory.ldap.client.api.ValidatingPoolableLdapConnectionFactory;
> import org.apache.directory.ldap.client.api.SearchCursorImpl;
> import org.apache.directory.ldap.client.template.EntryMapper;
> import org.apache.directory.ldap.client.template.LdapConnectionTemplate;
>
> /**
>  * @author Chris Harris
>  *
>  */
> public class LdapClient {
>
>     public LdapClient() {
>     }
>
>     public Person searchLdapForCeo() {
>         return this.searchLdapUsingHybridApproach(ceoQuery);
>     }
>
>     public Map<String, Person> buildLdapMap() {
>         SearchCursor cursor = new SearchCursorImpl(null, 300000, TimeUnit.SECONDS);
>         LdapConnection connection = new LdapNetworkConnection(host, port);
>         connection.setTimeOut(300000);
>         Entry entry = null;
>
>         try {
>             connection.bind(dn, pwd);
>             LdapClient.recursivelyGetLdapDirectReports(connection, cursor, entry, ceoQuery);
>             System.out.println("Finished all Ldap Map Builder threads...");
>         } catch (LdapException ex) {
>             Logger.getLogger(LdapClient.class.getName()).log(Level.SEVERE, null, ex);
>         } catch (CursorException ex) {
>             Logger.getLogger(LdapClient.class.getName()).log(Level.SEVERE, null, ex);
>         } finally {
>             cursor.close();
>             try {
>                 connection.close();
>             } catch (IOException ex) {
>                 Logger.getLogger(LdapClient.class.getName()).log(Level.SEVERE, null, ex);
>             }
>         }
>
>         return concurrentPersonMap;
>     }
>
>     private static Person recursivelyGetLdapDirectReports(LdapConnection connection, SearchCursor cursor, Entry entry, String query)
>             throws CursorException {
>         Person p = null;
>         EntryMapper<Person> em = Person.getEntryMapper();
>
>         try {
>             SearchRequest sr = new SearchRequestImpl();
>             sr.setBase(new Dn(searchBase));
>             StringBuilder sb = new StringBuilder(query);
>             sr.setFilter(sb.toString());
>             sr.setScope( SearchScope.SUBTREE );

Ahhhhh !!!! STOP !!!

Ok, no need to go any further in your code.

You are doing a SUBTREE search on *every single entry* you are pulling
from the base. If you have 40 000 entries, the number of searches
explodes combinatorially, because every entry is fetched again by the
search on each of its ancestors, and the recursion compounds from there.
No wonder you get timeouts... Imagine you have such a tree :

root
  A1
    B1
      C1
      C2
    B2
      C3
      C4
  A2
    B3
      C5
      C6
    B4
      C7
      C8

The search on root will pull A1, A2, B1, B2, B3, B4, C1..C8 (14 entries
-> 14 searches)
Then the search on A1 will pull B1, C1, C2, B2, C3, C4 (6 entries -> 6
searches)
Then the search on A2 will pull B3, C5, C6, B4, C7, C8 (6 entries -> 6
searches)
Then the search on B1 will pull C1, C2 (2 entries -> 2 searches; times
4 B entries = 8)
...

At the end, you have done at least 1 + 14 + 12 + 8 = 35 searches, when
you have only 15 entries...

If you want to see what your algorithm is doing, just do the search
using SearchScope.ONE_LEVEL instead: one search per entry, so roughly
40 000 searches in total, which is way less than what you are doing now.
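
To put numbers on it, a quick sketch in plain Java reproduces the figures for the 15-entry example tree above, using the simplified counting where each entry triggers one extra search per ancestor (the real recursion compounds even further) :

```java
public class SearchCountSketch {

    // Depth of each entry below root in the example tree:
    // 2 entries at depth 1 (A1, A2), 4 at depth 2 (B1..B4),
    // 8 at depth 3 (C1..C8).
    static final int[] DEPTHS = {1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3};

    // Recursive per-entry SUBTREE strategy (simplified, non-compounding):
    // 1 initial search, plus one search each time an entry is returned,
    // i.e. once per ancestor of every entry.
    static int subtreePerEntrySearches() {
        int searches = 1;
        for (int depth : DEPTHS) {
            searches += depth;
        }
        return searches;
    }

    // ONE_LEVEL recursion: every entry gets searched exactly once.
    static int oneLevelSearches() {
        return 1 + DEPTHS.length; // the root plus the 14 entries below it
    }

    public static void main(String[] args) {
        System.out.println(subtreePerEntrySearches()); // 35 searches
        System.out.println(oneLevelSearches());        // 15 searches
        System.out.println(1);                         // one single SUBTREE search
    }
}
```

So even on 15 entries the per-entry SUBTREE approach more than doubles the work, and the gap only grows with the tree.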

But anyway, doing a search on the root with a SUBTREE scope will be way
faster, because you will do only one single search.
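
In other words : one request with SUBTREE scope from the base, and you just iterate the cursor. Something like this sketch (untested here; host, credentials, base DN and filter are placeholders) :

```java
import org.apache.directory.api.ldap.model.cursor.SearchCursor;
import org.apache.directory.api.ldap.model.entry.Entry;
import org.apache.directory.api.ldap.model.message.Response;
import org.apache.directory.api.ldap.model.message.SearchRequest;
import org.apache.directory.api.ldap.model.message.SearchRequestImpl;
import org.apache.directory.api.ldap.model.message.SearchResultEntry;
import org.apache.directory.api.ldap.model.message.SearchScope;
import org.apache.directory.api.ldap.model.name.Dn;
import org.apache.directory.ldap.client.api.LdapConnection;
import org.apache.directory.ldap.client.api.LdapNetworkConnection;

public class SingleSubtreeSearchSketch {

    public static void main(String[] args) throws Exception {
        try (LdapConnection connection =
                 new LdapNetworkConnection("ldap.example.com", 389)) {
            connection.bind("cn=admin,dc=example,dc=com", "secret"); // placeholders

            // One single request, SUBTREE scope from the base:
            SearchRequest sr = new SearchRequestImpl();
            sr.setBase(new Dn("dc=example,dc=com"));  // example base DN
            sr.setFilter("(objectClass=person)");     // example filter
            sr.setScope(SearchScope.SUBTREE);

            try (SearchCursor cursor = connection.search(sr)) {
                while (cursor.next()) {
                    Response response = cursor.get();
                    if (response instanceof SearchResultEntry) {
                        Entry entry = ((SearchResultEntry) response).getEntry();
                        // Build your map here; parent/child links can be
                        // derived from entry.getDn(), with no extra searches.
                    }
                }
            }
        }
    }
}
```

Every entry comes back exactly once, and the hierarchy can be rebuilt in memory from the DNs.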


