We're running Bamboo OnDemand builds on Amazon EC2 machines with VPC. For some reason, the VPC network connection isn't 100% reliable, and so occasionally a given TCP connection attempt will fail.
In our VM startup script, we need to download and setup a bunch of stuff via the VPC connection. Most of the time, that works as expected. But occasionally, when a connection drops, the script fails. When this happens, the VM is not suitable for running builds.
But Bamboo OnDemand doesn't *know* that the VM isn't suitable for running builds, so it tries scheduling a build there anyway. Usually the build fails pretty fast, for a reason having nothing to do with the code that it's building. If there's a large queue of waiting builds, bad VMs will tend to churn through the queue, quickly failing all of the builds there.
Is there a way for the VM to send a signal back to Bamboo saying that the startup script failed and that the agent on that machine should be disabled?
If your script fails, just do:
rm -rf /opt/bamboo-elastic-agent
killall -9 java
That will prevent the agent from starting and will kill any started agent. Bamboo will notice that the agent went down and remove from the list eventually (no builds will be sent to the agent in any case).
Online forums and learning are now in one easy-to-use experience.
By continuing, you accept the updated Community Terms of Use and acknowledge the Privacy Policy. Your public name, photo, and achievements may be publicly visible and available in search engines.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.