knox-dev mailing list archives

From "Kevin Risden (Jira)" <j...@apache.org>
Subject [jira] [Updated] (KNOX-843) Add support for load balancing across HA instances
Date Sat, 08 Feb 2020 22:57:00 GMT

     [ https://issues.apache.org/jira/browse/KNOX-843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Risden updated KNOX-843:
------------------------------
    Description: 
*Summary*
Knox should support load balancing multiple clients across HA service instances.

*Problem Statement*
The HA mechanism in Knox today only supports failover. This means that for every client that
connects to Knox, the same backend is chosen even if multiple backends are available. That
single backend is used until it fails, at which point Knox chooses another backend and sends
all clients to the new one.

The above design is simple, but has a few limitations:
* Even if the number of backends scales up, Knox has no way to utilize the additional capacity
* If any client causes a backend error, Knox fails over all clients to a new backend

*Background Information*
There are two types of backend services - stateful and stateless. Stateful backend services
(like HiveServer2) require that once a client initiates a session, that client is always sent
to the same backend. Stateless backend services (like HBase REST) don't require clients to
always be sent to the same backend. When stateful backend services are in use, load balancing
can only be done across multiple clients.

There are two options for stateful load balancing:
* Maintain the state at the server level and match clients to that state
* Maintain the state at the client level and use it on the server side

Maintaining the state on the client side is preferable, since it can then be used across
multiple Knox instances, even in the case of a Knox failover.

*Solution*
One option for maintaining client state, described in the original design doc
([^KNOX-843.pdf]), is to store the state in a cookie that is sent back to the client on the
response. The general flow is as follows:
* clientA makes a request with no KNOX_BACKEND cookie
* Knox checks if HA is enabled and dispatches the request to a random backend
* On the response to clientA, Knox sets the KNOX_BACKEND cookie to a hash of the server that serviced the request
* On its second request, clientA sends back the KNOX_BACKEND cookie
* Knox inspects the request and dispatches it to the backend matching the hash in the KNOX_BACKEND cookie
* If that backend is down, Knox fails over and, on the response, updates the KNOX_BACKEND cookie with the server that actually handled the request
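
The flow above can be sketched roughly as follows. This is a minimal illustration, not Knox's actual implementation: the `pick_backend`/`dispatch` helper names, the use of SHA-256, and the cookie handling shape are all assumptions.

```python
import hashlib
import random

COOKIE_NAME = "KNOX_BACKEND"

def hash_backend(backend_url):
    # Hash the backend URL so internal host details are not leaked to clients.
    return hashlib.sha256(backend_url.encode("utf-8")).hexdigest()

def pick_backend(backends, cookie_value):
    # First request (no cookie): pick a random backend.
    if cookie_value is None:
        return random.choice(backends)
    # Subsequent requests: route to the backend whose hash matches the cookie.
    for backend in backends:
        if hash_backend(backend) == cookie_value:
            return backend
    # Cookie references a backend that is gone: fail over to another one.
    return random.choice(backends)

def dispatch(backends, request_cookies):
    backend = pick_backend(backends, request_cookies.get(COOKIE_NAME))
    # The response always carries the cookie for the backend that served it,
    # so the client re-learns its backend after a failover.
    response_cookies = {COOKIE_NAME: hash_backend(backend)}
    return backend, response_cookies
```

Because the cookie value is a deterministic hash of the backend, any Knox instance can resolve it to the same backend, which is what allows the state to survive Knox restarts.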

This approach requires that clients support cookies, and most do, since this is how many load
balancers implement sticky sessions today [1].

Some solution benefits include:
* Multiple clients can effectively use all the available backend services.
* A single backend going down won't affect all clients using Knox - only the clients using
that backend.
* The client-side cookie can be used against different Knox servers. This allows Knox restarts
where clients bounce to another Knox server while maintaining their backend session state.

Some solution limitations include:
* Clients must support cookies. This should generally be the case today, since many load
balancers require cookies for sticky sessions - so if a user is concerned with HA, there is
most likely a sticky-session load balancer in front of Knox already.

*Implementation*
The HAProvider provides a per-topology, per-service configuration to opt into this new feature.
Since this is a change in behavior, it should be an opt-in optimization. The HAProvider would
expose a config flag that the dispatch can use to decide whether to insert and honor cookies
when dispatching requests.
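
As a sketch, the opt-in could be an extra key in the existing HaProvider service parameter in the topology. The `enableLoadBalancing` key name below is illustrative, not a confirmed Knox config key; `maxFailoverAttempts` and `failoverSleep` are existing HaProvider parameters.

```
<provider>
    <role>ha</role>
    <name>HaProvider</name>
    <enabled>true</enabled>
    <param>
        <name>HIVE</name>
        <value>enableLoadBalancing=true;maxFailoverAttempts=3;failoverSleep=1000</value>
    </param>
</provider>
```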

The cookie being sent back to the client needs to hide backend service information so internal
host details are not leaked. One option is to hash the backend before placing it in the cookie.
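
For example, a one-way hash gives the client a stable opaque token without revealing the internal host (a sketch; the hash function Knox actually uses may differ):

```python
import hashlib

def backend_cookie_value(backend_url):
    # One-way hash: the client sees a stable opaque token, not the internal host.
    return hashlib.sha256(backend_url.encode("utf-8")).hexdigest()
```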

This shouldn't affect failover, since failover is internal to Knox. In the case of a failover,
the KNOX_BACKEND cookie should be updated on the response to the client with the server that
actually handled the request; all state changes happen on the client-side response from Knox.
There are some edge cases where failover may need to be decided per client instead of for the
entire backend, but that shouldn't cause a big issue here.

[1] https://www.haproxy.com/blog/load-balancing-affinity-persistence-sticky-sessions-what-you-need-to-know/


> Add support for load balancing across HA instances
> --------------------------------------------------
>
>                 Key: KNOX-843
>                 URL: https://issues.apache.org/jira/browse/KNOX-843
>             Project: Apache Knox
>          Issue Type: New Feature
>          Components: Server
>            Reporter: Sam Hjelmfelt
>            Assignee: Kevin Risden
>            Priority: Major
>             Fix For: 1.5.0
>
>         Attachments: KNOX-843.001.patch, KNOX-843.002.patch, KNOX-843.patch, KNOX-843.pdf
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
