helix-user mailing list archives

From kishore g <g.kish...@gmail.com>
Subject Re: Dynamically configuring instances
Date Tue, 26 Feb 2013 16:46:14 GMT
This is the JIRA I had created for this requirement

On Tue, Feb 26, 2013 at 8:30 AM, kishore g <g.kishore@gmail.com> wrote:

> Hi Vinayak,
> We have encountered a similar scenario at LinkedIn.
> In one case they have a predefined set of ports per host, and nodes are
> created in Helix upfront for those ports. When the nodes start up they
> check whether anyone has already taken that instance (host:port); if not,
> they join the cluster with the host name.
> In another use case, they come up with the name (host_port) based on what's
> available, create the instance using Helix admin, and then join the
> cluster.
> The reasoning behind having a consistent naming scheme is to provide a
> consistent mechanism for assigning partitions to nodes even after restarts.
> This is important for stateful systems where we don't want to move the data
> on restarts. Another (not really technical but more practical) reason is to
> avoid rogue instances connecting to the cluster with random IDs due to code
> bugs or misconfiguration.
> But we don't really enforce having a host and port as part of the instance
> name (at least not by design). All we enforce is a unique name for each
> instance across the cluster. So a node can come up with a unique ID and add
> itself to the cluster. It can still set its port, host, and other
> information in the config so that it's discoverable by other nodes in the
> system.
> Probably in your case, you should also drop the instance on disconnect.
> NOTE: there are some command line tools that assume the host:port format;
> those are bugs and need to be fixed. But if you use the Java API directly
> you should not have a problem. In any case, let us know if you have a
> problem setting your own unique ID.
> This requirement has come up multiple times at LinkedIn and on other
> threads. Would a feature like auto-creating the instance on join and
> deleting it on leave be helpful? We could have this flag set at the cluster
> level when the cluster is created, so that we can throw an exception if the
> flag is set to false and the node has not already been created.
> Thanks
> Kishore G
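
[Editor's sketch] The "auto-create on join, delete on leave" flag proposed above can be illustrated with a toy in-memory model. This is not Helix code and does not use the Helix API: the class ClusterRegistry, the allowAutoJoin flag, and all method names are hypothetical, chosen only to show the intended behavior (reject an unknown node when the flag is false, otherwise create it on join and drop it again on leave).

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Toy in-memory model of a cluster's instance registry; hypothetical, not Helix. */
class ClusterRegistry {
    private final boolean allowAutoJoin; // hypothetical cluster-level flag
    private final Set<String> configured = ConcurrentHashMap.newKeySet();
    private final Set<String> autoCreated = ConcurrentHashMap.newKeySet();
    private final Set<String> live = ConcurrentHashMap.newKeySet();

    ClusterRegistry(boolean allowAutoJoin) {
        this.allowAutoJoin = allowAutoJoin;
    }

    /** An administrator creates the instance upfront. */
    void addInstance(String name) {
        configured.add(name);
    }

    /** A node connects under the given unique name. */
    void join(String name) {
        if (!configured.contains(name)) {
            if (!allowAutoJoin) {
                throw new IllegalStateException("instance not created: " + name);
            }
            configured.add(name);  // auto-create on join
            autoCreated.add(name); // remember it so leave() can drop it
        }
        if (!live.add(name)) {
            throw new IllegalStateException("name already taken: " + name);
        }
    }

    /** A node disconnects; auto-created instances are dropped again. */
    void leave(String name) {
        live.remove(name);
        if (autoCreated.remove(name)) {
            configured.remove(name);
        }
    }

    boolean isConfigured(String name) {
        return configured.contains(name);
    }

    public static void main(String[] args) {
        ClusterRegistry registry = new ClusterRegistry(true);
        registry.join("agent-42");
        System.out.println("configured after join: " + registry.isConfigured("agent-42"));
        registry.leave("agent-42");
        System.out.println("configured after leave: " + registry.isConfigured("agent-42"));
    }
}
```

Instances created upfront by an administrator survive a leave in this model, which matches the stateful case described above where partition assignment should stay stable across restarts.
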
> On Mon, Feb 25, 2013 at 7:28 PM, Vinayak Borkar <vborky@yahoo.com> wrote:
>> Hi Shi,
>> Thanks for your response.
>> The Helix documentation suggests that the recommended way to name
>> instances is to use a combination of host name and port number. I suppose
>> adding the port number allows multiple instances to run on the same
>> machine. However, this also means that each instance needs to be provided a
>> dedicated port number that is known upfront and is ensured to be stable
>> across cluster restarts.
>> In my particular application, no assumption is made about the
>> availability of specific ports on the machine that runs the agent. Instead,
>> the agent on startup opens a socket with port 0, getting a free port
>> assigned to the socket, which is then used for further communication with
>> that agent for the duration that the agent is alive. This strategy of not
>> depending on specific ports allows us to run multiple agents on the same
>> machine (mostly for testing) without worrying about the agents trying to
>> bind to the same port for RPC. In production this scheme lets our agents
>> run on the server machines without regard to which ports are available,
>> allowing for zero configuration.
>> I am in the process of porting this application to use Helix as the
>> cluster management platform and trying to figure out what the best way
>> would be to do so. To get around the problem, I think I will need to figure
>> out a more stable way to name my instances so that they maintain their name
>> regardless of which port they are bound to.
>> Have you encountered other use cases that needed an alternate way to name
>> the instances instead of using hostname and port numbers?
>> Thanks,
>> Vinayak
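
[Editor's sketch] The port-0 strategy described above, combined with a name that does not depend on the port, might look roughly like the following. It uses only the Java standard library. The class AgentIdentity, the helper loadOrCreateId, the "agent-" prefix, and the idea of persisting a generated ID in a local file are all assumptions made for illustration; the thread does not say how the stable name was eventually chosen.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

/** Hypothetical helper: a stable agent name that is independent of the bound port. */
class AgentIdentity {

    /** Returns the ID stored in idFile, generating and persisting one on first start. */
    static String loadOrCreateId(Path idFile) throws IOException {
        if (Files.exists(idFile)) {
            return new String(Files.readAllBytes(idFile), StandardCharsets.UTF_8).trim();
        }
        String id = "agent-" + UUID.randomUUID();
        Files.write(idFile, id.getBytes(StandardCharsets.UTF_8));
        return id;
    }

    public static void main(String[] args) throws IOException {
        // Demo location only; a real agent would keep this file somewhere durable.
        Path idFile = Files.createTempDirectory("agent-demo").resolve("instance.id");
        String name = loadOrCreateId(idFile);

        // Port 0 asks the operating system for any free port.
        try (ServerSocket socket = new ServerSocket(0)) {
            int port = socket.getLocalPort();
            // The name stays the same across restarts; the port may change each time
            // and would be published separately (for example, in the instance config).
            System.out.println(name + " listening on port " + port);
        }
    }
}
```

With this split, the name identifies the agent across restarts, while the host and the currently bound port are treated as runtime information that other nodes look up.
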
>>> Hi Vinayak:
>>> In this scenario, the Helix admin command / API (see
>>> http://helix.incubator.apache.org/Tutorial.html) can be used to add the
>>> instance with the newly generated name into the cluster, and then the
>>> instance can start with that name. But doing this may require the
>>> idealstate of the resource hosted in the Helix cluster to be re-calculated
>>> after the new instance is added, unless the resource is in auto-rebalance
>>> mode.
>>> Can you share some more details about your use case?
>>> Thanks,
>>> -Shi
>>> On Mon, Feb 25, 2013 at 1:29 PM, Vinayak Borkar <vborky@yahoo.com> wrote:
>>>> Hi guys,
>>>> I am trying to use Helix in a system where the "instances" start up and
>>>> listen on a free port that is not pre-configured before the application
>>>> starts. This is done so that the application does not rely on the
>>>> availability of specific ports. As a result, the instance name (host,
>>>> port) is not known upfront. However, Helix requires the instance to be
>>>> created in Helix before it connects. Any ideas on how to get out of this
>>>> situation? Is there a way to tell Helix to create an instance on
>>>> receiving a connection from an instance?
>>>> Thanks,
>>>> Vinayak
