zookeeper-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Flavio Junqueira <...@apache.org>
Subject Re: Name resolution in StaticHostProvider
Date Wed, 09 May 2018 15:01:12 GMT
I'm actually now wondering whether we should be using an unchecked exception instead. A lot
of things have changed with exception handling since we wrote this code base initially. An
unchecked exception would actually match better my current mental model of what that signature
should look like.

-Flavio

> On 9 May 2018, at 16:44, Flavio Junqueira <fpj@apache.org> wrote:
> 
> I like the idea of indicating to the application that there is something wrong with the
list of servers so that it has a chance to look into it. With the current code in `ClientCnxn`,
we will log at warn level and hope that someone sees it, but we are not really stopping the
client. Throwing might actually be an improvement as it will output a log message, but I'm
now wondering if we should propagate it all the way to the application. Responding to myself,
one reason for not doing it is that it is not a fatal error unless no server can be resolved.
> 
> -Flavio
> 
>> On 8 May 2018, at 16:06, Andor Molnar <andor@cloudera.com> wrote:
>> 
>> Hi,
>> 
>> Updating this thread, because the PR is still being review on GitHub.
>> 
>> So, the reason why I refactored the original behaviour of
>> StaticHostProvider is that I believe that it's trying to do something which
>> is not its responsibility. Please tell me if there's a good historical
>> reason for that.
>> 
>> My approach is giving the user the following to options:
>> 1- Use static IP addresses, if you don't want to deal with DNS resolution
>> at all - we guarantee that no DNS logic will involved in this case at all.
>> 2- Use DNS hostnames if you have a reliable DNS service for resolution
>> (with HA, secondary servers, backups, etc.) - we must use DNS in the right
>> way in this case e.g. do NOT cache IP address for a longer period that DNS
>> server allows to and re-resolve after TTL expries, because it's mandatory
>> by protocol.
>> 
>> My 2 cents here:
>> - the fix which was originally posted for re-resolution is a workaround and
>> doesn't satisfy the requirement for #2,
>> - the solution is already built-in in JDK and DNS clients in the right way
>> - can't see a reason why we shouldn't use that
>> 
>> I checked this in some other projects as well and found very similar
>> approach in hadoop-common's SecurityUtil.java. It has 2 built-in plugins
>> for that:
>> - Standard resolver uses java's built-in getByName().
>> - Qualified resolver still uses getByName(), but adds some logic to avoid
>> incorrect re-resolutions and reverse IP lookups.
>> 
>> Please let me know your thoughts.
>> 
>> Regards,
>> Andor
>> 
>> 
>> 
>> 
>> 
>> 
>> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar <andor@cloudera.com> wrote:
>> 
>>> Hi Abe,
>>> 
>>> Unfortunately we haven't got any feedback yet. What do you think of
>>> implementing Option #3?
>>> 
>>> Regards,
>>> Andor
>>> 
>>> 
>>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <andor@cloudera.com> wrote:
>>> 
>>>> Did anybody happen to take a quick look by any chance?
>>>> 
>>>> I don't want to push this too hard, because I know it's a time consuming
>>>> topic to think about, but this is a blocker in 3.5 which has been hanging
>>>> around for a while and any feedback would be extremely helpful to close it
>>>> quickly.
>>>> 
>>>> Thanks,
>>>> Andor
>>>> 
>>>> 
>>>> 
>>>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar <andor@cloudera.com>
>>>> wrote:
>>>> 
>>>>> Hi all,
>>>>> 
>>>>> We need more eyes and brains on the following PR:
>>>>> 
>>>>> https://github.com/apache/zookeeper/pull/451
>>>>> 
>>>>> I added a comment few days ago about the way we currently do DNS name
>>>>> resolution in this class and a suggestion on how we could simplify things
a
>>>>> little bit. We talked about it with Abe Fine, but we're a little bit
unsure
>>>>> and cannot get a conclusion. It would be extremely handy to get more
>>>>> feedback from you.
>>>>> 
>>>>> To add some colour to it, let me elaborate on the situation here:
>>>>> 
>>>>> In general, the task that StaticHostProvider does is to get a list of
>>>>> potentially unresolved InetSocketAddress objects, resolve them and iterate
>>>>> over the resolved objects by calling next() method.
>>>>> 
>>>>> *Option #1 (current logic)*
>>>>> - Resolve addresses with getAllByName() which returns a list of IP
>>>>> addresses associated with the address.
>>>>> - Cache all these IP's, shuffle them and iterate over.
>>>>> - If client is unable to connect to an IP, remove all IPs from the list
>>>>> which the original servername was resolved to and re-resolve it.
>>>>> 
>>>>> *Option #2 (getByName())*
>>>>> - Resolve address with getByName() instead which returns only the first
>>>>> IP address of the name,
>>>>> - Do not cache IPs,
>>>>> - Shuffle the *names* and resolve with getByName() *every time* when
>>>>> next() is called,
>>>>> - JDK's built-in caching will prevent name servers from being flooded
>>>>> and will do the re-resolution automatically when cache expires,
>>>>> - Names with multiple IPs will be handled by DNS servers which (if
>>>>> configured properly) return IPs in different order - this is called DNS
>>>>> Round Robin -, so getByName() will return different IP on each call.
>>>>> 
>>>>> *Options #3*
>>>>> - There's a small problem with option#2: if DNS server is not configured
>>>>> properly and handles the round-robin case in a way that it always return
>>>>> the IP list in the same order, getByName() will never return the next
ip,
>>>>> - In order to overcome that, use getAllByName() instead, shuffle the
>>>>> list and return the first IP.
>>>>> 
>>>>> All feedback if much appreciated.
>>>>> 
>>>>> Thanks,
>>>>> Andor
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
> 


Mime
View raw message