zookeeper-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andor Molnar <an...@cloudera.com>
Subject Re: Name resolution in StaticHostProvider
Date Tue, 08 May 2018 15:22:12 GMT
"If refactoring is necessary to address the issue, then it should be part
of the PR."

Agreed. I think it is and it also makes things a lot more simpler, so it's
easier to review.

"I'm not sure what kind of confirmation you are after here. Could you
elaborate, please?"

I'm just wondering what could have been the reason for caching host
addresses in StaticHostProvider in the first place.

"The other solution, if I remember enough of it, tried to avoid resolving
unnecessarily, but perhaps the DNS client cache is enough..."

That's exactly my point: what JDK is doing internally is perfectly fine for
us and we don't need extra logic here.

"Could you elaborate on this point, please? What exactly do you mean with
this statement?"

See my previous point. Caching is already done in all DNS servers in the
chain and also there's also caching in JDK already, which means by calling
getByName() you don't necessarily fire a DNS request, only when the TTL is
expired.

Andor




On Tue, May 8, 2018 at 8:12 AM, Flavio Junqueira <fpj@apache.org> wrote:

> Hi Andor,
>
> Thanks for your work on addressing the issue.
>
> > On 8 May 2018, at 16:06, Andor Molnar <andor@cloudera.com> wrote:
> >
> > Hi,
> >
> > Updating this thread, because the PR is still being review on GitHub.
> >
> > So, the reason why I refactored the original behaviour of
> > StaticHostProvider is that I believe that it's trying to do something
> which
> > is not its responsibility.
>
> If refactoring is necessary to address the issue, then it should be part
> of the PR. Otherwise, it is better to refactor in a separate PR.
>
>
> > Please tell me if there's a good historical
> > reason for that.
>
> I'm not sure what kind of confirmation you are after here. Could you
> elaborate, please?
>
> >
> > My approach is giving the user the following to options:
> > 1- Use static IP addresses, if you don't want to deal with DNS resolution
> > at all - we guarantee that no DNS logic will involved in this case at
> all.
>
> Sounds fine.
>
> > 2- Use DNS hostnames if you have a reliable DNS service for resolution
> > (with HA, secondary servers, backups, etc.) - we must use DNS in the
> right
> > way in this case e.g. do NOT cache IP address for a longer period that
> DNS
> > server allows to and re-resolve after TTL expries, because it's mandatory
> > by protocol.
>
> Sounds fine.
>
> >
> > My 2 cents here:
> > - the fix which was originally posted for re-resolution is a workaround
> and
> > doesn't satisfy the requirement for #2,
>
> The other solution, if I remember enough of it, tried to avoid resolving
> unnecessarily, but perhaps the DNS client cache is enough to avoid the
> unnecessary round-trips. This observation actually brings me to the next
> point:
>
> > - the solution is already built-in in JDK and DNS clients in the right
> way
>
> Could you elaborate on this point, please? What exactly do you mean with
> this statement?
>
> > - can't see a reason why we shouldn't use that
> >
> > I checked this in some other projects as well and found very similar
> > approach in hadoop-common's SecurityUtil.java. It has 2 built-in plugins
> > for that:
> > - Standard resolver uses java's built-in getByName().
> > - Qualified resolver still uses getByName(), but adds some logic to avoid
> > incorrect re-resolutions and reverse IP lookups.
> >
> > Please let me know your thoughts.
> >
> > Regards,
> > Andor
> >
> >
> >
> >
> >
> >
> > On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar <andor@cloudera.com> wrote:
> >
> >> Hi Abe,
> >>
> >> Unfortunately we haven't got any feedback yet. What do you think of
> >> implementing Option #3?
> >>
> >> Regards,
> >> Andor
> >>
> >>
> >> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <andor@cloudera.com>
> wrote:
> >>
> >>> Did anybody happen to take a quick look by any chance?
> >>>
> >>> I don't want to push this too hard, because I know it's a time
> consuming
> >>> topic to think about, but this is a blocker in 3.5 which has been
> hanging
> >>> around for a while and any feedback would be extremely helpful to
> close it
> >>> quickly.
> >>>
> >>> Thanks,
> >>> Andor
> >>>
> >>>
> >>>
> >>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar <andor@cloudera.com>
> >>> wrote:
> >>>
> >>>> Hi all,
> >>>>
> >>>> We need more eyes and brains on the following PR:
> >>>>
> >>>> https://github.com/apache/zookeeper/pull/451
> >>>>
> >>>> I added a comment few days ago about the way we currently do DNS name
> >>>> resolution in this class and a suggestion on how we could simplify
> things a
> >>>> little bit. We talked about it with Abe Fine, but we're a little bit
> unsure
> >>>> and cannot get a conclusion. It would be extremely handy to get more
> >>>> feedback from you.
> >>>>
> >>>> To add some colour to it, let me elaborate on the situation here:
> >>>>
> >>>> In general, the task that StaticHostProvider does is to get a list of
> >>>> potentially unresolved InetSocketAddress objects, resolve them and
> iterate
> >>>> over the resolved objects by calling next() method.
> >>>>
> >>>> *Option #1 (current logic)*
> >>>> - Resolve addresses with getAllByName() which returns a list of IP
> >>>> addresses associated with the address.
> >>>> - Cache all these IP's, shuffle them and iterate over.
> >>>> - If client is unable to connect to an IP, remove all IPs from the
> list
> >>>> which the original servername was resolved to and re-resolve it.
> >>>>
> >>>> *Option #2 (getByName())*
> >>>> - Resolve address with getByName() instead which returns only the
> first
> >>>> IP address of the name,
> >>>> - Do not cache IPs,
> >>>> - Shuffle the *names* and resolve with getByName() *every time* when
> >>>> next() is called,
> >>>> - JDK's built-in caching will prevent name servers from being flooded
> >>>> and will do the re-resolution automatically when cache expires,
> >>>> - Names with multiple IPs will be handled by DNS servers which (if
> >>>> configured properly) return IPs in different order - this is called
> DNS
> >>>> Round Robin -, so getByName() will return different IP on each call.
> >>>>
> >>>> *Options #3*
> >>>> - There's a small problem with option#2: if DNS server is not
> configured
> >>>> properly and handles the round-robin case in a way that it always
> return
> >>>> the IP list in the same order, getByName() will never return the next
> ip,
> >>>> - In order to overcome that, use getAllByName() instead, shuffle the
> >>>> list and return the first IP.
> >>>>
> >>>> All feedback if much appreciated.
> >>>>
> >>>> Thanks,
> >>>> Andor
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message