[1.6.0.132] Unable to initialise workspace on a remote agent

Hi,

We have a Continua configuration with quite a large SVN repository that we are able to run on a local agent. We had some timeout issues before, which were solved by the ‘top-level branches’ feature and by increasing the timeout on the repository. However, we are not able to use it with concurrent builds, because workspace initialisation times out on the first run and there is no obvious way to increase stage timeouts. The following error is logged:

"Stage Controller

There was an error with stage: Build. Message: Error initialising workspace
Exception: ProcessException
Message: Running C:\Program Files\VSoft Technologies\ContinuaCI Agent\hg\hg.exe with arguments “clone --noupdate \server\Continua\Rc\d199e2b3 C:\CI_WS\Repos\d199e2b3 --config ui.username=Continua --noninteractive” on agent failed with return code -1 and error output: "The process timed out or was forcibly terminated after 1 hour, 0 minutes. The timeout was 1 hour, 0 minutes."
Stack Trace: at Continua.Shared.Utils.Mercurial.Run(ProcessArguments args, String workingDir, Func2 checkResult, Boolean runRecoverIfRequired, Boolean allowTermination)
at Continua.Shared.Utils.Mercurial.Clone(String remoteUrl, String localUrl)
at Continua.Modules.Builds.Agent.FileSync.AgentRepositoryCache.<>c__DisplayClassa.<syncfrom>b__8()
at Continua.Shared.Utils.ReadWriteLockList1.WithWriteLock(TId id, CancellationTokenSource cancelTokenSource, Action action)
at Continua.Modules.Builds.Agent.FileSync.AgentRepositoryCache.SyncFrom(TransportContextDTO source, String cacheRevision)
at Continua.Modules.Builds.Agent.AgentRepositoryHelper.SyncCache(String cacheRevision)
at Continua.Modules.Builds.Agent.AgentBuildHelper.SyncRepoCache(BuildRepositoryDTO buildRepositoryDTO, AgentRepositoryHelper repositoryHelper)
at Continua.Modules.Builds.Agent.AgentBuildHelper.SyncSourceFromServer(IEnumerable1 rules, AgentWorkspaceSyncContext workspaceCtx)
at Continua.Modules.Builds.Agent.AgentBuildHelper.InitialiseWorkspaceOnAgent(IAgentCallbackProxy proxy, TransportContextDTO source, Guid callId)
at Continua.Modules.Builds.Agent.AgentBuildRunner.OnInitialisingWorkspace(Transition1 inState) "

Regards,
Ilya

Hi Ilya,

Can you tell us more about the structure of your repository? If the repository is extremely large, it may be best to split it into separate Continua repositories, each with the default and branches paths set to different subfolders.

Are all folders required for the build? If not, it may be worth using Exclude Patterns to reduce the number of files exported to the repository cache.

I’ll also look into whether we can increase the timeout; however, one hour is already a long time to wait for the build to initialise.

Can you also let us know what version of Continua you are running, the size of the repository cache folder (e.g. \server\Continua\Rc\d199e2b3) and the size of the Subversion working copy? 

Hi Dave,

We use 1.6.0.132 for both the server and the agents. The SVN working copy is about 12GB in size and has about 105k files and 5k folders. The Mercurial repo is about 16.3GB in size.

I want to highlight that this 1hr timeout is only an issue on the first run for a particular configuration. For example, the local agent did the initial deploy in about 50 mins, and subsequent updates/checkouts take about 10 mins, so this is specifically an issue with the initial deploy and not with normal runs. Adding a remote agent into the equation slows things down significantly, so the clone hits the 1hr threshold, which leads to the hg clone process being killed by the agent.
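To put the numbers in perspective, here is a rough back-of-the-envelope calculation. It assumes the clone time is dominated by copying the 16.3GB Mercurial repo and that 1GB = 1000MB; the function name is just for illustration, and real clones also spend time on hashing and disk I/O, so treat the result as a lower bound on the throughput needed.

```python
def required_throughput_mb_per_s(repo_size_gb, timeout_minutes):
    """Sustained transfer rate (MB/s) needed to copy repo_size_gb
    before timeout_minutes elapse. Assumes 1 GB = 1000 MB."""
    return repo_size_gb * 1000 / (timeout_minutes * 60)


# 16.3 GB repo against the 1 hour limit, and against a 2 hour limit
one_hour = required_throughput_mb_per_s(16.3, 60)
two_hours = required_throughput_mb_per_s(16.3, 120)
print(f"1 hour limit: {one_hour:.1f} MB/s, 2 hour limit: {two_hours:.1f} MB/s")
```

In other words, the link between the server share and the remote agent would need to sustain roughly 4.5 MB/s for the whole hour just to move the data, which leaves little headroom.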

Unfortunately, there is no way to build against a specific changeset in Continua, and we had issues before where it would ‘lose’ some of the latest changes, so we prefer not to split the repository and/or add more complexity, even if a build run does not use 100% of the files. It also looks like Exclude Patterns will not help during the clone operation, unless I’m missing something here.

We are not arguing against the 1hr timeout per se, but it seems rather arbitrary and should be adjustable, or waived completely, at least for the initial clone. If I simply copy the exact hg clone command revealed in Process Explorer and run it manually, there is nothing that prevents it from completing successfully in less than 2hr.
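To illustrate what I mean by an adjustable timeout, here is a minimal sketch in Python. It is not how Continua works internally; `run_with_timeout` and `pick_timeout` are made-up names, and the idea of a separate (or absent) limit for the initial clone is just my suggestion.

```python
import subprocess
import sys


def pick_timeout(is_initial_clone, normal_timeout_s=3600, initial_clone_timeout_s=None):
    """Choose the limit for a run. None means 'wait indefinitely',
    which is the hypothetical default for the initial clone here."""
    return initial_clone_timeout_s if is_initial_clone else normal_timeout_s


def run_with_timeout(cmd, timeout_s=None):
    """Run cmd and return its exit code, or -1 if it exceeded timeout_s."""
    try:
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed and reaped the child here
        return -1


# Hypothetical usage with the clone command from the error above
# (source and destination paths are placeholders):
#   cmd = ["hg", "clone", "--noupdate", source_repo, dest_repo,
#          "--config", "ui.username=Continua", "--noninteractive"]
#   exit_code = run_with_timeout(cmd, pick_timeout(is_initial_clone=True))
```

With something like this, the first clone on a new agent could run to completion while normal updates keep the 1hr limit.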

P.S.
Force-killing the hg process leaves a lock file in the server’s repo, which prevents further updates and runs for any configuration that uses the repo in question; this is not exactly optimal. I believe the agent should clean up locks after killing its child process.
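As a rough sketch of the cleanup I have in mind (Python, purely illustrative): Mercurial keeps its locks at `.hg/store/lock` and `.hg/wlock` inside a repository, so after killing the child the agent could remove whichever of those was left behind. The function name is hypothetical, I am assuming the agent knows which repo the killed process was working on, and this is only safe when no other hg process is still using that repo.

```python
from pathlib import Path

# Lock locations Mercurial uses inside a repository
HG_LOCK_FILES = (Path(".hg") / "store" / "lock", Path(".hg") / "wlock")


def remove_stale_hg_locks(repo_root):
    """Delete leftover Mercurial lock files under repo_root.

    Only call this after the hg process that held the locks has been
    killed and nothing else is using the repository.
    Returns the list of lock paths that were removed.
    """
    removed = []
    for relative in HG_LOCK_FILES:
        lock_path = Path(repo_root) / relative
        # On Unix the lock can be a dangling symlink, which exists()
        # reports as missing, so check is_symlink() as well.
        if lock_path.exists() or lock_path.is_symlink():
            lock_path.unlink()
            removed.append(lock_path)
    return removed
```

Calling it on a repo with no leftover locks simply returns an empty list, so it would be harmless to run after every forced termination.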

Thanks,
Ilya