zookeeper-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andor Molnar <an...@cloudera.com>
Subject Re: Name resolution in StaticHostProvider
Date Thu, 17 May 2018 22:19:18 GMT
Flavio, did you have a chance to think about whether we should use
unchecked exception here?

Andor



On Sat, May 12, 2018 at 1:02 PM, Norbert Kalmar <nkalmar@cloudera.com>
wrote:

> Hi,
>
> In my opinion, when the client calls next(), and sees that this call could
> return with an UnknownHostException, it will assume next() is giving back
> the next available address in the list, whether it's a valid address or
> not.
> If next() would throw something like "NoValidAddressFoundException", then
> it would now it iterated the whole list, and there is no chance there is
> any valid address left.
>
> So the way I see it using checked exception helps here to understand next,
> and to signal it to the caller that it should prepare that there was a
> failed lookup, and do something about it.
>
> I would go with checked exception, but I don't know all that well the
> callers or the mechanic here.
>
> The rules of thumb I follow with checked vs unchecked exception is that "Is
> it reasonable for the caller to recover from this?" If yes, throw a checked
> exception. In this case, if next() only looks one element from the list,
> but there's an error with that address. then I think it is reasonable to
> expect the caller to recover.
>
> But I'm also open to pros and cons :)
>
> Regards,
> Norbert
>
>
>
> On Sat, May 12, 2018 at 4:09 PM Andor Molnar <andor@cloudera.com> wrote:
>
> > Hi team,
> >
> > Anyone else has thoughts about whether we should use checked or unchecked
> > exception here?
> >
> > Thanks,
> > Andor
> >
> >
> >
> >
> > On Thu, May 10, 2018 at 8:19 AM, Andor Molnar <andor@cloudera.com>
> wrote:
> >
> > > Interesting idea. What difference you think it could make comparing to
> > > checked?
> > >
> > >
> > >
> > > On Wed, May 9, 2018 at 8:01 AM, Flavio Junqueira <fpj@apache.org>
> wrote:
> > >
> > >> I'm actually now wondering whether we should be using an unchecked
> > >> exception instead. A lot of things have changed with exception
> handling
> > >> since we wrote this code base initially. An unchecked exception would
> > >> actually match better my current mental model of what that signature
> > should
> > >> look like.
> > >>
> > >> -Flavio
> > >>
> > >> > On 9 May 2018, at 16:44, Flavio Junqueira <fpj@apache.org> wrote:
> > >> >
> > >> > I like the idea of indicating to the application that there is
> > >> something wrong with the list of servers so that it has a chance to
> look
> > >> into it. With the current code in `ClientCnxn`, we will log at warn
> > level
> > >> and hope that someone sees it, but we are not really stopping the
> > client.
> > >> Throwing might actually be an improvement as it will output a log
> > message,
> > >> but I'm now wondering if we should propagate it all the way to the
> > >> application. Responding to myself, one reason for not doing it is that
> > it
> > >> is not a fatal error unless no server can be resolved.
> > >> >
> > >> > -Flavio
> > >> >
> > >> >> On 8 May 2018, at 16:06, Andor Molnar <andor@cloudera.com>
wrote:
> > >> >>
> > >> >> Hi,
> > >> >>
> > >> >> Updating this thread, because the PR is still being review on
> GitHub.
> > >> >>
> > >> >> So, the reason why I refactored the original behaviour of
> > >> >> StaticHostProvider is that I believe that it's trying to do
> something
> > >> which
> > >> >> is not its responsibility. Please tell me if there's a good
> > historical
> > >> >> reason for that.
> > >> >>
> > >> >> My approach is giving the user the following to options:
> > >> >> 1- Use static IP addresses, if you don't want to deal with DNS
> > >> resolution
> > >> >> at all - we guarantee that no DNS logic will involved in this
case
> at
> > >> all.
> > >> >> 2- Use DNS hostnames if you have a reliable DNS service for
> > resolution
> > >> >> (with HA, secondary servers, backups, etc.) - we must use DNS
in
> the
> > >> right
> > >> >> way in this case e.g. do NOT cache IP address for a longer period
> > that
> > >> DNS
> > >> >> server allows to and re-resolve after TTL expries, because it's
> > >> mandatory
> > >> >> by protocol.
> > >> >>
> > >> >> My 2 cents here:
> > >> >> - the fix which was originally posted for re-resolution is a
> > >> workaround and
> > >> >> doesn't satisfy the requirement for #2,
> > >> >> - the solution is already built-in in JDK and DNS clients in the
> > right
> > >> way
> > >> >> - can't see a reason why we shouldn't use that
> > >> >>
> > >> >> I checked this in some other projects as well and found very
> similar
> > >> >> approach in hadoop-common's SecurityUtil.java. It has 2 built-in
> > >> plugins
> > >> >> for that:
> > >> >> - Standard resolver uses java's built-in getByName().
> > >> >> - Qualified resolver still uses getByName(), but adds some logic
to
> > >> avoid
> > >> >> incorrect re-resolutions and reverse IP lookups.
> > >> >>
> > >> >> Please let me know your thoughts.
> > >> >>
> > >> >> Regards,
> > >> >> Andor
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar <andor@cloudera.com>
> > >> wrote:
> > >> >>
> > >> >>> Hi Abe,
> > >> >>>
> > >> >>> Unfortunately we haven't got any feedback yet. What do you
think
> of
> > >> >>> implementing Option #3?
> > >> >>>
> > >> >>> Regards,
> > >> >>> Andor
> > >> >>>
> > >> >>>
> > >> >>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar <andor@cloudera.com
> >
> > >> wrote:
> > >> >>>
> > >> >>>> Did anybody happen to take a quick look by any chance?
> > >> >>>>
> > >> >>>> I don't want to push this too hard, because I know it's
a time
> > >> consuming
> > >> >>>> topic to think about, but this is a blocker in 3.5 which
has been
> > >> hanging
> > >> >>>> around for a while and any feedback would be extremely
helpful to
> > >> close it
> > >> >>>> quickly.
> > >> >>>>
> > >> >>>> Thanks,
> > >> >>>> Andor
> > >> >>>>
> > >> >>>>
> > >> >>>>
> > >> >>>> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar <
> andor@cloudera.com
> > >
> > >> >>>> wrote:
> > >> >>>>
> > >> >>>>> Hi all,
> > >> >>>>>
> > >> >>>>> We need more eyes and brains on the following PR:
> > >> >>>>>
> > >> >>>>> https://github.com/apache/zookeeper/pull/451
> > >> >>>>>
> > >> >>>>> I added a comment few days ago about the way we currently
do DNS
> > >> name
> > >> >>>>> resolution in this class and a suggestion on how we
could
> simplify
> > >> things a
> > >> >>>>> little bit. We talked about it with Abe Fine, but
we're a little
> > >> bit unsure
> > >> >>>>> and cannot get a conclusion. It would be extremely
handy to get
> > more
> > >> >>>>> feedback from you.
> > >> >>>>>
> > >> >>>>> To add some colour to it, let me elaborate on the
situation
> here:
> > >> >>>>>
> > >> >>>>> In general, the task that StaticHostProvider does
is to get a
> list
> > >> of
> > >> >>>>> potentially unresolved InetSocketAddress objects,
resolve them
> and
> > >> iterate
> > >> >>>>> over the resolved objects by calling next() method.
> > >> >>>>>
> > >> >>>>> *Option #1 (current logic)*
> > >> >>>>> - Resolve addresses with getAllByName() which returns
a list of
> IP
> > >> >>>>> addresses associated with the address.
> > >> >>>>> - Cache all these IP's, shuffle them and iterate over.
> > >> >>>>> - If client is unable to connect to an IP, remove
all IPs from
> the
> > >> list
> > >> >>>>> which the original servername was resolved to and
re-resolve it.
> > >> >>>>>
> > >> >>>>> *Option #2 (getByName())*
> > >> >>>>> - Resolve address with getByName() instead which returns
only
> the
> > >> first
> > >> >>>>> IP address of the name,
> > >> >>>>> - Do not cache IPs,
> > >> >>>>> - Shuffle the *names* and resolve with getByName()
*every time*
> > when
> > >> >>>>> next() is called,
> > >> >>>>> - JDK's built-in caching will prevent name servers
from being
> > >> flooded
> > >> >>>>> and will do the re-resolution automatically when cache
expires,
> > >> >>>>> - Names with multiple IPs will be handled by DNS servers
which
> (if
> > >> >>>>> configured properly) return IPs in different order
- this is
> > called
> > >> DNS
> > >> >>>>> Round Robin -, so getByName() will return different
IP on each
> > call.
> > >> >>>>>
> > >> >>>>> *Options #3*
> > >> >>>>> - There's a small problem with option#2: if DNS server
is not
> > >> configured
> > >> >>>>> properly and handles the round-robin case in a way
that it
> always
> > >> return
> > >> >>>>> the IP list in the same order, getByName() will never
return the
> > >> next ip,
> > >> >>>>> - In order to overcome that, use getAllByName() instead,
shuffle
> > the
> > >> >>>>> list and return the first IP.
> > >> >>>>>
> > >> >>>>> All feedback if much appreciated.
> > >> >>>>>
> > >> >>>>> Thanks,
> > >> >>>>> Andor
> > >> >>>>>
> > >> >>>>>
> > >> >>>>>
> > >> >>>>>
> > >> >>>>
> > >> >>>
> > >> >
> > >>
> > >>
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message