Dynamically launched self-hosted runners

Radu Cristescu April 25, 2025

Hi

I have this crazy plan that may just work. The way I understand it, Bitbucket says dynamic runners are not possible, and there has to be at least one runner available for any tag combination.

But here's what I'm thinking: dynamic pipelines, and some stuff that the Kubernetes autoscaler does with its use of undocumented APIs.

When the dynamic pipeline runs, there's no need for a runner to exist for any labels it will attach. So there's an opportunity to launch an AWS EC2 instance that's been set up just right to start a runner on boot, with the right UUID.

The most pressing problem is that I have only 25 seconds to do everything and return a pipeline, and that is non-negotiable as it's a hard Bitbucket limit.

So far, my tests show that I can start an unmodified Linux instance in about 20 seconds (sometimes it's 17). That leaves me with little wiggle room to do supporting operations.

I have 25 seconds to do this:

  1. Create a runner programmatically
  2. Start an EC2 instance (or more, if I want to run parallel steps in parallel)
  3. Start the runner and have it register with Bitbucket
  4. Return a reply

And once the runner completes its work (whether successfully or not), I have to shut down and terminate the instance and remove the runner registration from Bitbucket. Reusing runner registrations doesn't seem like a good idea in a concurrent environment, as I don't see any locking mechanism that I can use to prevent double allocation. I don't know right now if there are any events, hooks, or stable log messages I can use to trigger the shutdown.

In my dynamic pipeline, I would generate UUID labels based on some scheduling logic. I can share the biggest `size` instance requested by all the sequential steps, but I have to start one instance per parallel step if I want to keep things running smoothly.
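A minimal sketch of that scheduling rule, assuming a pipeline is described as a list where a dict is a sequential step and a nested list is a group of parallel steps. The input shape, the `dyn-` label prefix, the `SIZE_ORDER` list, and the names `InstancePlan` and `plan_instances` are all hypothetical, not anything Bitbucket defines:

```python
import uuid
from dataclasses import dataclass

# Assumed ordering of step sizes, smallest to largest (hypothetical list).
SIZE_ORDER = ["1x", "2x", "4x", "8x"]


@dataclass
class InstancePlan:
    label: str   # unique runner label the generated steps would target
    size: str    # largest size requested by the steps sharing the instance
    steps: list  # names of the steps that run on this instance


def new_label():
    # One fresh UUID-based label per instance, so registrations are never shared.
    return f"dyn-{uuid.uuid4()}"


def plan_instances(pipeline):
    """Return one plan shared by all sequential steps, plus one per parallel step."""
    plans = []
    sequential = [s for s in pipeline if isinstance(s, dict)]
    if sequential:
        # The shared instance must be as big as the biggest sequential step.
        biggest = max((s.get("size", "1x") for s in sequential), key=SIZE_ORDER.index)
        plans.append(InstancePlan(new_label(), biggest, [s["name"] for s in sequential]))
    for stage in pipeline:
        if isinstance(stage, list):
            # Each parallel step gets its own instance so nothing queues.
            for step in stage:
                plans.append(InstancePlan(new_label(), step.get("size", "1x"), [step["name"]]))
    return plans
```

For a pipeline with a `2x` build step, two parallel test steps, and a default-size deploy step, this yields three plans: one `2x` instance for build and deploy, and one instance per test step.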

Please let me know if this looks workable or if it's terminally insane and I should look for other drugs :)

Radu Cristescu April 28, 2025

A runner seems to take about 30 seconds to show up as ONLINE, so this won't cut it even if I make the EC2 boot up instantaneously, because I can't return the dynamic pipeline before the runner is registered, for my scenario to work.
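The arithmetic behind that conclusion, as a small sketch using the rough timings quoted in this thread. The constant and function names are made up for illustration, and the figures are observations, not guarantees:

```python
DYNAMIC_PIPELINE_LIMIT_S = 25  # hard limit for returning the dynamic pipeline
EC2_BOOT_S = 20                # measured boot time (sometimes 17)
RUNNER_ONLINE_S = 30           # observed time for a runner to show up as ONLINE


def startup_seconds(boot_s, online_s):
    # The runner can only start registering once the instance has booted.
    return boot_s + online_s


def fits_in_limit(boot_s, online_s, limit_s=DYNAMIC_PIPELINE_LIMIT_S):
    return startup_seconds(boot_s, online_s) <= limit_s


print(startup_seconds(EC2_BOOT_S, RUNNER_ONLINE_S))  # 50
print(fits_in_limit(EC2_BOOT_S, RUNNER_ONLINE_S))    # False
print(fits_in_limit(0, RUNNER_ONLINE_S))             # False, even with an instant boot
```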

Next idea:

Inject a step into pipelines, based on their declared step sizes and parallelism, that registers a runner, creates an EC2 instance, and waits for the runner to become ONLINE.

That's 50 seconds to start up, so it's not ideal for short steps.

Still to investigate:

How to determine that the runner is idle and shut down the instance (from inside) - EC2 would be set to terminate on stop.
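For the terminate-on-stop part, here is a sketch of the arguments that could be passed to boto3's `run_instances`. The parameter names are the real EC2 ones, but the AMI, the `RUNNER_UUID` variable, and the `/opt/start-runner.sh` script are placeholders for whatever the image actually contains:

```python
def build_run_instances_kwargs(ami_id, instance_type, runner_uuid):
    """Build keyword arguments for ec2_client.run_instances(**kwargs)."""
    user_data = "\n".join([
        "#!/bin/bash",
        # Hand the pre-created runner UUID to the instance at boot.
        f"export RUNNER_UUID='{runner_uuid}'",
        # Hypothetical script baked into the AMI that starts the runner.
        "/opt/start-runner.sh",
    ])
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        # A shutdown issued from inside the instance then terminates it
        # instead of leaving it in the stopped state.
        "InstanceInitiatedShutdownBehavior": "terminate",
        "UserData": user_data,
    }
```

With this in place, anything running on the instance can end its life with a plain `shutdown -h now`, without needing AWS credentials on the box.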

I could run an `after-script` to terminate the EC2 instance and delete the runner. This should be more reliable than tailing runner logs and looking for patterns, or polling the runner status endpoint (undocumented, but used by the K8s autoscaler), because the runner may be idle while waiting for, e.g., a group of parallel steps to finish and return to the linear path.

Better idea still: Make use of the REST APIs for pipeline and step status, as I see that BITBUCKET_STEP_UUID and BITBUCKET_PIPELINE_UUID are available. The EC2 can get these in user-data.
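A rough sketch of what that polling could look like from inside the instance. It uses the pipeline step endpoint of the Bitbucket Cloud REST API. Treat the `state.name == "COMPLETED"` check, the bearer-token auth, and the polling intervals as assumptions to verify against the API docs, and the function names as made up:

```python
import json
import time
import urllib.request
from urllib.parse import quote

API = "https://api.bitbucket.org/2.0"


def step_url(workspace, repo_slug, pipeline_uuid, step_uuid):
    # UUIDs include curly braces, so they are percent-encoded here.
    return (f"{API}/repositories/{workspace}/{repo_slug}/pipelines/"
            f"{quote(pipeline_uuid)}/steps/{quote(step_uuid)}")


def fetch_step(url, token):
    # Assumes an access token that is allowed to read pipelines.
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def wait_for_step(get_step, poll_s=10, max_polls=360, sleep=time.sleep):
    """Poll until the step reports COMPLETED; return False if we give up."""
    for _ in range(max_polls):
        step = get_step()
        # Assumed response shape: {"state": {"name": "COMPLETED", ...}, ...}
        if step.get("state", {}).get("name") == "COMPLETED":
            return True
        sleep(poll_s)
    return False
```

On the instance, this would be called as `wait_for_step(lambda: fetch_step(url, token))` with the UUIDs read from user-data, followed by the shutdown once it returns.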

I will have to invent some clever scheduler logic to generate instructions for each step and EC2 to coordinate everything.
