8

I'm experimenting a little with ACS using the DC/OS orchestrator, and while spinning up a cluster within a single region seems simple enough, I'm not quite sure what the best practice would be for doing deployments across multiple regions.

Azure itself does not seem to support deploying to more than one region right now. With that assumption, I guess my only other option is to create multiple, identical clusters in all the regions I wish to be available, and then use Azure Traffic Manager to route incoming traffic to the nearest available cluster.

While this solution works, it also causes a few issues I'm not 100% sure on how I should work around.

  1. Our deployment pipelines must make sure to deploy to all regions when deploying a new version of a service. If we have a East US and North Europe region, during deployments from our CI tool I have to connect to the Marathon API in both regions to trigger the new deployments. If the deployment fails in one region, and succeeds in the other, I suddenly have a disparity between the two regions.
  2. If i have a service using local persistent volumes deployed, let's say PostgreSQL or ElasticSearch, it needs to have instances in both regions since service discovery will only find services local to the region. That brings up the problem of replication between regions to keep all state in all regions; this seem to require some/a lot of manual configuration to get to work.

Has anyone ever used a setup somewhat like this using Azure Container Service (or really Amazon Container Service, as I assume the same challenges can be found there) and have some pointers on how to approach this?

Trond Nordheim
  • 1,406
  • 2
  • 12
  • 13

2 Answers2

1

You have multiple options for spinning up across regions. I would use a custom installation together with terraform for each of them. This here is a great starting point: https://github.com/bernadinm/terraform-dcos

Distributing agents across different regions should be no problem, ensuring that your services will keep running despite failures.

Distributing masters (giving you control over the services during failures) is a little more diffult as it involves distributing a zookeeper quorum across high latency links, so you should be careful in choosing the "distance" between regions.

Have a look at the documentation for more details.

js84
  • 3,551
  • 1
  • 17
  • 22
0

You are correct ACS does not currently support Multi-Region deployments.

Your first issue is specific to Marathon in DC/OS, I'll ping some of the engineering folks over there to see if they have any input on best practice.

Your second point is something we (I'm the ACS PM) are looking at. There are some solutions you can use in certain scenarios (e.g. ArangoDB is in the DC/OS universe and will provide replication). The DC/OS team may have something to say here too. In ACS we are evaluating the best approaches to providing solutions for this use case but I'm afraid I can't give any indication of timeline.

An alternative solution is to have your database in a SaaS offering. This takes away all the complexity of managing redundancy and replication.

rgardler
  • 594
  • 2
  • 7