Locking shared resources based on a combination of the selected agent and the job type?

We have several Continua CI agents.
Each agent should be able to execute a job of a certain type, but only one job of that type at a given time.
We are trying to find a way to use shared resource locks that will distribute jobs across the agents based on the availability of the agent+type combination:

A job of type X is triggered - if a type X job is already running on agent “A”, then send the job to one of the other available agents, where a type X job is NOT currently running.

What would be the best practice way of achieving this goal?
Thanks!

Hi Arik,

There does not currently appear to be a good way to achieve this goal with Continua CI, so we are now looking into what we can change to make this possible using shared resource locks.

Meanwhile, can you elaborate on what you mean by “job” and “type”? Would “job” be a build or an individual stage? Does “type” refer to the configuration, project or a variable value?

Hi Dave,

First, thanks for looking into this for us!

By “job” I meant a Continua CI Configuration.
And by “type” I meant a certain type of configuration that we would like to limit to only one concurrent build instance per Continua CI agent.
This limitation requirement is a result of the way things are set up on our end:
These types of configurations each run a very heavy FinalBuilder project on the agent.
These FinalBuilder projects operate on very big git mono-monster-repositories that are pre-cloned on each agent in dedicated directories.
Since I can’t have two builds of the same “type” (same FinalBuilder project / same local repository path on the agent) running on the same local-on-agent git repository directory, I want the ability to distribute these builds to any available agent (all of the agents hold the same copies of the same repos) where this type of configuration/build/job isn’t currently running.

If it matters for the sake of this discussion - our configurations are made up of only one stage; most of what this single stage does is set some variables for FinalBuilder and trigger the FinalBuilder project to run on the agent.

Hi Arik,

We made some changes in version 1.9.2.754 so that the $Build.SharedResources$ expression is populated before the agent requirements are evaluated.

Now, if you set up a Server Shared Resource of type Quota List and enter a list of agent host names;

[Image: Quota List shared resource with agent host names entered as labels]

you can then set your stage to acquire a lock on any label (agent host name) from this shared resource.

To ensure that the stage runs on the agent with the same hostname as the acquired lock label, add an agent requirement. e.g. $Build.SharedResources.Server.Testing.MultiAgentResource.Labels.Any(Equals, $Agent.Hostname$)$ Equals True.

Note that you can replace $Agent.Hostname$ with any other agent property (built-in, collected or custom). Using a carefully defined list of custom agent properties should allow you to globally change which agents are allocated to stages for multiple configurations by editing the agent properties.
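
If it helps to picture how these pieces interact, here is a rough model of the behaviour (plain Python, purely illustrative - the function names and the in-memory lock table are assumptions of this sketch, not Continua CI internals):

    held_locks = {}                                      # label (agent hostname) -> build holding it
    labels = ["cciagent01", "cciagent02", "cciagent03"]  # the quota list labels

    def acquire_any_label(build_id):
        """Model of the 'Acquire Any' operation: take the first free label, or wait."""
        for label in labels:
            if label not in held_locks:
                held_locks[label] = build_id
                return label
        return None  # no free label -> the build stays queued

    def agent_requirement_met(acquired_label, agent_hostname):
        """Model of the agent requirement above: only the agent whose hostname
        matches the acquired label may run the stage."""
        return acquired_label == agent_hostname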

Let us know whether this works for your use case.


Thank you Dave!
I’ve only recently had a chance to evaluate this:

First, I just wanted to note that agent conditions seem to break this:
Agent conditions are probably evaluated after the stage’s agent requirements have sent the build to the agent matching the shared resource lock expression, so if the matching agent has a condition that prevents this build from running on it, the build will wait forever instead of being sent to one of the other free agents that don’t have this condition.

Other than that, it seems to work as expected when I only use $Agent.Hostname$.
I’m not sure how to utilize this for our more complex shared resource locking requirement.

I’ll try to better explain what I’m after - let me know whether you think this is possible to achieve.

Say I have 3 agents: agent1,agent2,agent3
And I have 3 build configurations: C1,C2,C3

I’d like to enforce the following:

  • Only one build of a specific configuration is allowed to run on a specific agent at a time.
    So one agent can run C1, C2 and C3 simultaneously, but can’t accept another C1 while a C1 is already running on it - the new C1 will be routed to a different agent that’s not currently running a C1, and the same goes for all other configurations (see the small sketch below).
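
To make that rule concrete, here is a tiny illustrative sketch (plain Python, not Continua CI - just the scheduling rule I’m after, with made-up names):

    running = set()                         # (agent, configuration) pairs currently building
    agents = ["agent1", "agent2", "agent3"]

    def route_build(configuration):
        """Send a new build to any agent that isn't already running this configuration."""
        for agent in agents:
            if (agent, configuration) not in running:
                running.add((agent, configuration))
                return agent
        return None  # every agent is already running this configuration -> the build waits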

So far I’ve successfully utilized your agent requirement expression solution and it works as expected -
But I can’t figure out exactly how to add a configuration lock that will allow different configurations to run simultaneously on the same agent, while preventing any specific configuration from running more than one build on the same agent at a time.

I don’t know if it matters - but I’m using shared resource locks at the configuration scope, not the stage scope, as our configurations are mostly just one stage each.
(The heavy lifting is done via a FinalBuilder project that is run from that one stage.)

=====

BTW, the next requirement/wish after that :slight_smile: is to also have the ability to lock based on the services that are published by these configurations:

Each of the C1,C2,C3 configurations - when run - compiles and deploys 1 or more services from its services list.
(User selects which services to build and deploy in the queue options variable input form):
C1 can compile and deploy any of: C1_Svc1,C1_Svc2,C1_Svc3
C2 can compile and deploy any of: C2_Svc1,C2_Svc2,C2_Svc3
C3 can compile and deploy any of: C3_Svc1,C3_Svc2,C3_Svc3

So in addition to the previous requirement (distributing the builds for each configuration on all agents)
I’d like to handle situations like this one:

User A triggers configuration C1 and selects to deploy C1_Svc2 → runs on agent2 for example
User B triggers configuration C1 and selects to deploy C1_Svc2 → runs on agent3 for example

If we find a solution for the first requirement (agent + configuration lock), then these two builds will run simultaneously on two different agents - but since it’s the same service being deployed to our (QA) environment, they will attempt to deploy to the same destination VM at the same time, so I’d like to know whether I can prevent this somehow.

Hi Arik,

You appear to be using Continua CI as a Continuous Deployment server rather than as a Continuous Integration server. The agents have really been designed as build runners rather than deployment targets.

You can, however, still achieve these aims, but you will need to create a shared resource list per configuration.

Agent conditions are evaluated per stage, so for this scenario, if you are using agent conditions, you need to acquire the shared resource locks at the stage scope.

Create three shared resources the same as Server.Testing.MultiAgentResource above. e.g. Server.Testing.MultiAgentResources.C1, Server.Testing.MultiAgentResources.C2 and Server.Testing.MultiAgentResources.C3. Set up each configuration as above, but using the relevant shared resource associated with the configuration for each shared resource lock and agent requirement.
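
To illustrate why per-configuration resources give you the behaviour you described, here is a rough model (plain Python, purely illustrative - the resource names follow the examples above, everything else is an assumption of this sketch):

    agent_labels = ["cciagent01", "cciagent02", "cciagent03"]
    resources = {                                    # one quota list per configuration
        "Server.Testing.MultiAgentResources.C1": set(agent_labels),
        "Server.Testing.MultiAgentResources.C2": set(agent_labels),
        "Server.Testing.MultiAgentResources.C3": set(agent_labels),
    }

    def acquire_for(configuration):
        """A C1 build can only lock a label in the C1 resource, so two C1 builds can
        never hold the same agent label, while C1, C2 and C3 can still share one agent."""
        free = resources["Server.Testing.MultiAgentResources." + configuration]
        if not free:
            return None                              # all agents already run this configuration
        label = sorted(free)[0]                      # 'first available' label
        free.remove(label)
        return label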

Regarding your second requirement, you could add new quota list shared resources for each configuration, e.g. Server.Testing.MultiServiceResources.C1, Server.Testing.MultiServiceResources.C2 and Server.Testing.MultiServiceResources.C3, each with items Svc1, Svc2 and Svc3. Add a configuration-level shared resource lock to each configuration. This would require a lock on a label of the shared resource associated with the configuration. Use the “acquire expression” operation to match the service selection variable with a label.

This lock would be acquired before any agent is selected.
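
As a rough illustration of the “acquire expression” idea (plain Python, not Continua CI code - the resource names follow the examples above; the way the variable is resolved here is an assumption of this sketch):

    service_locks = {                                # label -> build holding it (None = free)
        "Server.Testing.MultiServiceResources.C1": {"Svc1": None, "Svc2": None, "Svc3": None},
        "Server.Testing.MultiServiceResources.C2": {"Svc1": None, "Svc2": None, "Svc3": None},
        "Server.Testing.MultiServiceResources.C3": {"Svc1": None, "Svc2": None, "Svc3": None},
    }

    def acquire_service_label(configuration, selected_service, build_id):
        """The acquire expression resolves the service selection variable to a label name;
        a second build selecting the same service waits until that label is released."""
        labels = service_locks["Server.Testing.MultiServiceResources." + configuration]
        if labels[selected_service] is None:
            labels[selected_service] = build_id
            return True
        return False                                 # same service already locked -> build waits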

It’s possible that we could make changes in the future to allow operators such as “starts with” to be used with the “acquire expression” shared resource lock operation. This would reduce the need for so many shared resource lists.

Thank you kindly for the detailed response Dave!

Sorry, I think I wasn’t 100% accurate:
We don’t deploy services ON our Continua CI agents (if that’s what you meant here) - the agents only build and compile the services, then transfer them to some of our many VM servers, where they are deployed and run (using massive FinalBuilder projects with winrs commands to control everything we need on the remote application servers).
(Also, we do not deploy to our production environment using triggers or other automated methods…
For our production deployments we usually run FinalBuilder directly to have more control over the entire process; these are only done every couple of weeks.)

Thanks for clarifying that - if this ever becomes a real issue for us, then I’ll follow your suggestion here.

Thank you very much for this, it works just as expected! I should have thought about this one myself :sweat_smile:

Thank you for this as well!
One thing though: each configuration has a list of services that a user checks in the queue options window using a “Checkbox Select” prompt type.
So after the user has submitted the job to the queue - we have a variable that holds a list of services to deploy.
Can the “Acquire Expression” operation acquire multiple labels?
If so - can you kindly assist us with building an expression that would try to acquire all labels matching all checked items from a “Checkbox select” type prompt?

No, it can only acquire one label. If you want to acquire more than one label, repeat the line for each label.

There isn’t currently any way to choose which shared resource locks to acquire based on a dynamic list.

It may be possible to achieve your aims if you split the build into stages (one for each service). You can then use logic in control flow actions or skip conditions to define which stages are run based on your services list variable. This would also have the benefit of locking access to each service for a shorter time (e.g. the time it takes a stage to build one service rather than several). You could still define your build workflow in a single FinalBuilder project, but use targets or variables to specify which set of actions to run for each stage.
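
For example, the skip-condition logic could boil down to something like this (plain Python, purely illustrative - the stage names and the comma-separated variable format are assumptions of this sketch):

    ALL_STAGES = ["Build_Svc1", "Build_Svc2", "Build_Svc3"]   # one stage per service

    def stages_to_run(selected_services):
        """Run only the stages whose service was ticked in the checkbox-select prompt."""
        selected = {s.strip() for s in selected_services.split(",") if s.strip()}
        return [stage for stage in ALL_STAGES if stage.replace("Build_", "") in selected]

    # e.g. stages_to_run("Svc1, Svc3") -> ["Build_Svc1", "Build_Svc3"]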

We have been using the “agent hostnames as labels” locking mechanism for a couple of weeks now and it seems to work well!
I only wish there was a way to have the system select the least busy agent for the next job being queued.

Right now we have multiple configurations, each with its own multi-agent shared resource lock (all configurations use the same label names, because these are our actual agent hostnames…).

What usually happens then is:
When configurations are queued to build, they acquire the first available label on their relevant shared resource lock.

Because “cciagent01” is first in the label order in each of the shared resources, it is the label acquired for all configuration types.
The result: all configurations run their first builds on “cciagent01”.

Now, I know I could probably set a different label order for the various configurations, and have some of them ordered differently (cciagent03 first, for example).
But that doesn’t always work as expected (I save, reload the page and see the same order, as if my changes were not saved - the labels are back in the original order with “cciagent01” first).
And if it does work, or if I delete the lock and re-create it in the desired order, I have to re-configure the locking conditions for all the configurations that were using that lock (as the warning at the bottom of “Edit Shared Resource” states).

I’d ultimately wish for an option that would help us select the least busy agent,
but I understand that labels aren’t really agents, and that we are already sort of abusing the mechanism to achieve things it wasn’t originally designed for.
Would you consider adding an option (a server property I could set?) that changes the “acquire any label” operation behavior to select randomly from the free labels on a shared resource quota list lock?

Hi Arik,

Can you confirm that you are using the “Acquire Any” operation to acquire the shared resource lock from the quota list?

The labels currently do not have an order number in the database, so the “first” label chosen by this operation is currently arbitrary. Without any priority defined, it does make sense for the label to be chosen at random, so we’ll make this change for the next version without requiring a server property to be set.
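
For what it’s worth, the planned change amounts to something like this (illustrative Python only, not the actual implementation):

    import random

    def acquire_any(free_labels):
        """Instead of taking an arbitrary 'first' free label, pick one at random,
        which spreads the first builds of each configuration across the agents."""
        return random.choice(free_labels) if free_labels else None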


I confirm, we indeed use “Acquire Any” as discussed earlier in this thread.

Thank you for that Dave! We’ll upgrade when it is released of course.

Hi Arik,

This has now been implemented in v1.9.2.978
