zookeeper-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Flavio Junqueira <...@apache.org>
Subject Re: Name resolution in StaticHostProvider
Date Tue, 08 May 2018 20:55:39 GMT
The refactoring did not seem justifiable at first, so my reaction to it. You have clarified
the reason for including the changes, and I actually like it.

About the exception, there are two points for me:

1- You don't really need to throw to execute the plan you described.
2- In the case we do throw, which is not entirely unreasonable, I'd think about the expected
behaviour of a method called "next()". In general, I'd expect it to either return me the next
item or error saying that it cannot return an item. The "UnknownHostException" is not doing
the latter, though. It is indicating that one of the elements of the host provider set is
"broken" (not resolvable). That's one interpretation. Another interpretation is that the logic
in "next" needs to indicate to the caller that there is a problem with an item in the set
of elements of the host provider. 

For (2) let's converge on what semantics we are trying to provide first, please.

-Flavio

> On 8 May 2018, at 21:20, Andor Molnar <andor@cloudera.com> wrote:
> 
> Sorry, I thought you were against the whole refactoring.
> 
> "2- That we discuss separately whether we want to change the behaviour of
> the next()in the HostProvider interface."
> 
> From this it seemed to me, it's not just a polishing issue, but maybe I've
> gotten you wrong.
> Anyway, there're 2 contention points we've encountered so far:
> 
> 1. Do we need to refactor at all?
> 2. If we do so, shall next() throw UnknownHostException or deal with error
> inside.
> 
> And I'd still go with:
> 1. Yes
> 2. Yes, throw
> 
> So, I would leave the PR as it is now.
> 
> Andor
> 
> 
> 
> 
> 
> On Tue, May 8, 2018 at 12:11 PM, Flavio Junqueira <fpj@apache.org> wrote:
> 
>> Can you list what the contention points are according to you? Feel free to
>> do that in the PR as well, I want to make sure we have the same
>> understanding of the points that still need to be resolved. From where I
>> stand, there is no major issue pending other than one polishing issue that
>> I brought upon in the PR.
>> 
>> -Flavio
>> 
>>> On 8 May 2018, at 21:08, Andor Molnar <andor@cloudera.com> wrote:
>>> 
>>> I'm happy to do that once we have an agreement.
>>> 
>>> 
>>> 
>>> 
>>> On Tue, May 8, 2018 at 8:34 AM, Flavio Junqueira <fpj@apache.org> wrote:
>>> 
>>>> It might be a good idea to document whatever we end up doing.
>>>> 
>>>> -Flavio
>>>> 
>>>>> On 8 May 2018, at 17:22, Andor Molnar <andor@cloudera.com> wrote:
>>>>> 
>>>>> "If refactoring is necessary to address the issue, then it should be
>> part
>>>>> of the PR."
>>>>> 
>>>>> Agreed. I think it is and it also makes things a lot more simpler, so
>>>> it's
>>>>> easier to review.
>>>>> 
>>>>> "I'm not sure what kind of confirmation you are after here. Could you
>>>>> elaborate, please?"
>>>>> 
>>>>> I'm just wondering what could have been the reason for caching host
>>>>> addresses in StaticHostProvider in the first place.
>>>>> 
>>>>> "The other solution, if I remember enough of it, tried to avoid
>> resolving
>>>>> unnecessarily, but perhaps the DNS client cache is enough..."
>>>>> 
>>>>> That's exactly my point: what JDK is doing internally is perfectly fine
>>>> for
>>>>> us and we don't need extra logic here.
>>>>> 
>>>>> "Could you elaborate on this point, please? What exactly do you mean
>> with
>>>>> this statement?"
>>>>> 
>>>>> See my previous point. Caching is already done in all DNS servers in
>> the
>>>>> chain and also there's also caching in JDK already, which means by
>>>> calling
>>>>> getByName() you don't necessarily fire a DNS request, only when the TTL
>>>> is
>>>>> expired.
>>>>> 
>>>>> Andor
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Tue, May 8, 2018 at 8:12 AM, Flavio Junqueira <fpj@apache.org>
>> wrote:
>>>>> 
>>>>>> Hi Andor,
>>>>>> 
>>>>>> Thanks for your work on addressing the issue.
>>>>>> 
>>>>>>> On 8 May 2018, at 16:06, Andor Molnar <andor@cloudera.com>
wrote:
>>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> Updating this thread, because the PR is still being review on
GitHub.
>>>>>>> 
>>>>>>> So, the reason why I refactored the original behaviour of
>>>>>>> StaticHostProvider is that I believe that it's trying to do something
>>>>>> which
>>>>>>> is not its responsibility.
>>>>>> 
>>>>>> If refactoring is necessary to address the issue, then it should
be
>> part
>>>>>> of the PR. Otherwise, it is better to refactor in a separate PR.
>>>>>> 
>>>>>> 
>>>>>>> Please tell me if there's a good historical
>>>>>>> reason for that.
>>>>>> 
>>>>>> I'm not sure what kind of confirmation you are after here. Could
you
>>>>>> elaborate, please?
>>>>>> 
>>>>>>> 
>>>>>>> My approach is giving the user the following to options:
>>>>>>> 1- Use static IP addresses, if you don't want to deal with DNS
>>>> resolution
>>>>>>> at all - we guarantee that no DNS logic will involved in this
case at
>>>>>> all.
>>>>>> 
>>>>>> Sounds fine.
>>>>>> 
>>>>>>> 2- Use DNS hostnames if you have a reliable DNS service for
>> resolution
>>>>>>> (with HA, secondary servers, backups, etc.) - we must use DNS
in the
>>>>>> right
>>>>>>> way in this case e.g. do NOT cache IP address for a longer period
>> that
>>>>>> DNS
>>>>>>> server allows to and re-resolve after TTL expries, because it's
>>>> mandatory
>>>>>>> by protocol.
>>>>>> 
>>>>>> Sounds fine.
>>>>>> 
>>>>>>> 
>>>>>>> My 2 cents here:
>>>>>>> - the fix which was originally posted for re-resolution is a
>> workaround
>>>>>> and
>>>>>>> doesn't satisfy the requirement for #2,
>>>>>> 
>>>>>> The other solution, if I remember enough of it, tried to avoid
>> resolving
>>>>>> unnecessarily, but perhaps the DNS client cache is enough to avoid
the
>>>>>> unnecessary round-trips. This observation actually brings me to the
>> next
>>>>>> point:
>>>>>> 
>>>>>>> - the solution is already built-in in JDK and DNS clients in
the
>> right
>>>>>> way
>>>>>> 
>>>>>> Could you elaborate on this point, please? What exactly do you mean
>> with
>>>>>> this statement?
>>>>>> 
>>>>>>> - can't see a reason why we shouldn't use that
>>>>>>> 
>>>>>>> I checked this in some other projects as well and found very
similar
>>>>>>> approach in hadoop-common's SecurityUtil.java. It has 2 built-in
>>>> plugins
>>>>>>> for that:
>>>>>>> - Standard resolver uses java's built-in getByName().
>>>>>>> - Qualified resolver still uses getByName(), but adds some logic
to
>>>> avoid
>>>>>>> incorrect re-resolutions and reverse IP lookups.
>>>>>>> 
>>>>>>> Please let me know your thoughts.
>>>>>>> 
>>>>>>> Regards,
>>>>>>> Andor
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar <andor@cloudera.com>
>>>> wrote:
>>>>>>> 
>>>>>>>> Hi Abe,
>>>>>>>> 
>>>>>>>> Unfortunately we haven't got any feedback yet. What do you
think of
>>>>>>>> implementing Option #3?
>>>>>>>> 
>>>>>>>> Regards,
>>>>>>>> Andor
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <andor@cloudera.com>
>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Did anybody happen to take a quick look by any chance?
>>>>>>>>> 
>>>>>>>>> I don't want to push this too hard, because I know it's
a time
>>>>>> consuming
>>>>>>>>> topic to think about, but this is a blocker in 3.5 which
has been
>>>>>> hanging
>>>>>>>>> around for a while and any feedback would be extremely
helpful to
>>>>>> close it
>>>>>>>>> quickly.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Andor
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar <andor@cloudera.com
>>> 
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Hi all,
>>>>>>>>>> 
>>>>>>>>>> We need more eyes and brains on the following PR:
>>>>>>>>>> 
>>>>>>>>>> https://github.com/apache/zookeeper/pull/451
>>>>>>>>>> 
>>>>>>>>>> I added a comment few days ago about the way we currently
do DNS
>>>> name
>>>>>>>>>> resolution in this class and a suggestion on how
we could simplify
>>>>>> things a
>>>>>>>>>> little bit. We talked about it with Abe Fine, but
we're a little
>> bit
>>>>>> unsure
>>>>>>>>>> and cannot get a conclusion. It would be extremely
handy to get
>> more
>>>>>>>>>> feedback from you.
>>>>>>>>>> 
>>>>>>>>>> To add some colour to it, let me elaborate on the
situation here:
>>>>>>>>>> 
>>>>>>>>>> In general, the task that StaticHostProvider does
is to get a list
>>>> of
>>>>>>>>>> potentially unresolved InetSocketAddress objects,
resolve them and
>>>>>> iterate
>>>>>>>>>> over the resolved objects by calling next() method.
>>>>>>>>>> 
>>>>>>>>>> *Option #1 (current logic)*
>>>>>>>>>> - Resolve addresses with getAllByName() which returns
a list of IP
>>>>>>>>>> addresses associated with the address.
>>>>>>>>>> - Cache all these IP's, shuffle them and iterate
over.
>>>>>>>>>> - If client is unable to connect to an IP, remove
all IPs from the
>>>>>> list
>>>>>>>>>> which the original servername was resolved to and
re-resolve it.
>>>>>>>>>> 
>>>>>>>>>> *Option #2 (getByName())*
>>>>>>>>>> - Resolve address with getByName() instead which
returns only the
>>>>>> first
>>>>>>>>>> IP address of the name,
>>>>>>>>>> - Do not cache IPs,
>>>>>>>>>> - Shuffle the *names* and resolve with getByName()
*every time*
>> when
>>>>>>>>>> next() is called,
>>>>>>>>>> - JDK's built-in caching will prevent name servers
from being
>>>> flooded
>>>>>>>>>> and will do the re-resolution automatically when
cache expires,
>>>>>>>>>> - Names with multiple IPs will be handled by DNS
servers which (if
>>>>>>>>>> configured properly) return IPs in different order
- this is
>> called
>>>>>> DNS
>>>>>>>>>> Round Robin -, so getByName() will return different
IP on each
>> call.
>>>>>>>>>> 
>>>>>>>>>> *Options #3*
>>>>>>>>>> - There's a small problem with option#2: if DNS server
is not
>>>>>> configured
>>>>>>>>>> properly and handles the round-robin case in a way
that it always
>>>>>> return
>>>>>>>>>> the IP list in the same order, getByName() will never
return the
>>>> next
>>>>>> ip,
>>>>>>>>>> - In order to overcome that, use getAllByName() instead,
shuffle
>> the
>>>>>>>>>> list and return the first IP.
>>>>>>>>>> 
>>>>>>>>>> All feedback if much appreciated.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Andor
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
>> 


Mime
View raw message