AFAIK, the autoscaler sticks strictly to the min/max specified. We don't currently have a way of identifying whether the actual IaaS has enough quota. If instance creation fails in one partition because its resources are exhausted, the autoscaler could switch to the next partition. However, I'm not sure whether we support this now. It would be a nice-to-have feature, so that the autoscaler doesn't keep blindly creating instances in the exhausted partition.
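To make the partition fallback idea concrete, here is a rough sketch in Python. It is not how the autoscaler works today; `PartitionExhausted`, `create_with_fallback` and `fake_iaas` are hypothetical names made up for illustration, and it assumes the IaaS layer can signal that a partition is out of resources.

```python
class PartitionExhausted(Exception):
    """Hypothetical error: the IaaS rejected the request for lack of quota."""


def create_with_fallback(partitions, create_instance):
    """Try each partition in order, skipping the ones that report exhaustion.

    `partitions` is an ordered list of partition ids. `create_instance` is a
    caller-supplied (hypothetical) function that starts one instance in the
    given partition or raises PartitionExhausted.
    Returns (instance or None, list of partitions found exhausted).
    """
    exhausted = []
    for partition in partitions:
        try:
            return create_instance(partition), exhausted
        except PartitionExhausted:
            # Remember the partition so the caller can stop retrying it.
            exhausted.append(partition)
    return None, exhausted


# Stand-in for the IaaS: partition "p1" has no quota left, "p2" still has room.
def fake_iaas(partition):
    if partition == "p1":
        raise PartitionExhausted(partition)
    return f"instance-in-{partition}"


instance, skipped = create_with_fallback(["p1", "p2"], fake_iaas)
print(instance, skipped)
```

Returning the list of exhausted partitions lets the caller avoid them on the next scaling round instead of hitting the same failure again.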
As a workaround, we can configure the min/max to match the resources actually available at the IaaS level. Then we can assume that the autoscaler will not try to create instances beyond that limit.
FYI: If the autoscaler moves an instance to the terminating list, the member being terminated is no longer counted among the currently available instances. So whenever the autoscaler fails to terminate an instance, it can keep creating new instances to satisfy the min/max, because the terminating instance is not counted.
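A toy model of that counting behaviour, in Python. This is only a simplified sketch, not the real autoscaler code: `instances_to_create` is a hypothetical helper, it only looks at the min, and it assumes the terminate call failed so the member is still running on the IaaS.

```python
def instances_to_create(active, min_count):
    """Return how many instances a naive scaler would start.

    Only `active` members are counted; anything on the terminating list is
    ignored, mirroring the behaviour described above.
    """
    return max(0, min_count - len(active))


active = ["m1", "m2"]
terminating = []

# m2 is moved to the terminating list, but the terminate call fails,
# so m2 is in fact still running on the IaaS.
terminating.append(active.pop())

to_create = instances_to_create(active, min_count=2)

# What the IaaS actually has to host after the scaler reacts.
running_on_iaas = len(active) + len(terminating) + to_create
print(to_create, running_on_iaas)
```

With a min of 2, the scaler starts one replacement while the stuck member is still alive, so the IaaS ends up hosting 3 instances. If terminations keep failing, the gap between what the autoscaler counts and what actually runs keeps growing.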