Repository transfer to remote agent fails

Hi,

I set up Continua CI (Beta 1.5.0.51) with a remote agent. The transfer of the repository files fails with this error message: “The server repository cache at ssh://192.168.182.37:9010// is missing”.

These are the messages and stack traces from the debug log:

Message: Running c:\program files\vsoft technologies\continuaci agent\hg\hg.exe with arguments "id -r 0 ssh://192.168.182.37:9010// --ssh ""c:\program files\vsoft technologies\continuaci agent\Putty\plink.exe" -ssh -C -l f31d12a1b92341f2bd19a55c3b50ec17 -pw a885cd3ef8884df4abe9b75d36660a19 -batch" --config ui.username=Continua --noninteractive" on agent failed with return code 255 and error output: "abort: no suitable response from remote hg!"
Stack Trace:    at Continua.Shared.Utils.Mercurial.Run(ProcessArguments args, String workingDir, Func2 checkResult, Boolean runRecoverIfRequired, Boolean allowTermination)
   at Continua.Shared.Utils.Mercurial.GetFirstRevisionId(String repository)
   at Continua.Modules.Builds.Agent.FileSync.AgentRepositoryCache.CheckServerRepositoryCacheExists(String sourcePath)

Medium: 3:29:29 PM [Debug] SyncFrom >> Server cache missing - cleaning agent cache folder DirectoryPath=C:\ContinuaCI\Workspace\Repos\eb8efea7
Medium: 3:29:29 PM [Debug] SyncFrom >> Throwing exception: The server repository cache at ssh://192.168.182.37:9010// is missing
Medium: 3:29:29 PM [Debug] SyncSourceFromServer >> End
Medium: 3:29:29 PM [Debug] InitialiseWorkspaceOnAgent >> Exception: An error occurred while syncing files from the server to the agent. Details: Exception: Exception
Message: The server repository cache at ssh://192.168.182.37:9010// is missing
Stack Trace:    at Continua.Modules.Builds.Agent.FileSync.AgentRepositoryCache.<>c__DisplayClassa.<syncfrom>b__8()
   at Continua.Shared.Utils.ReadWriteLockList1.WithWriteLock(TId id, CancellationTokenSource cancelTokenSource, Action action)
   at Continua.Modules.Builds.Agent.FileSync.AgentRepositoryCache.SyncFrom(TransportContextDTO source, String cacheRevision)
   at Continua.Modules.Builds.Agent.AgentRepositoryHelper.SyncCache(String cacheRevision)
   at Continua.Modules.Builds.Agent.AgentBuildHelper.SyncRepoCache(BuildRepositoryDTO buildRepositoryDTO, AgentRepositoryHelper repositoryHelper)
   at Continua.Modules.Builds.Agent.AgentBuildHelper.SyncSourceFromServer(IEnumerable1 rules, AgentWorkspaceSyncContext workspaceCtx)
   at Continua.Modules.Builds.Agent.AgentBuildHelper.InitialiseWorkspaceOnAgent(AgentCallbackProxy proxy, TransportContextDTO source, Guid callId)

Medium: 3:29:29 PM [Debug] [Background Monitor] Running background monitor 'Agent Callback Proxy'
Medium: 3:29:29 PM [Debug] [AgentCallbackProxy] Monitoring call queue. There are 1 items on queue.
Medium: 3:29:29 PM [Debug] Creating Agent Callback service
Medium: 3:29:29 PM [Debug] Creating Agent Callback service, Create IP Client. Hostname: '192.168.182.37', Port: '9000'.
Medium: 3:29:29 PM [Agent Build Runner] Error initialising workspace: Exception: Exception
Message: The server repository cache at ssh://192.168.182.37:9010// is missing
Stack Trace:    at Continua.Modules.Builds.Agent.FileSync.AgentRepositoryCache.<>c__DisplayClassa.<syncfrom>b__8()
   at Continua.Shared.Utils.ReadWriteLockList1.WithWriteLock(TId id, CancellationTokenSource cancelTokenSource, Action action)
   at Continua.Modules.Builds.Agent.FileSync.AgentRepositoryCache.SyncFrom(TransportContextDTO source, String cacheRevision)
   at Continua.Modules.Builds.Agent.AgentRepositoryHelper.SyncCache(String cacheRevision)
   at Continua.Modules.Builds.Agent.AgentBuildHelper.SyncRepoCache(BuildRepositoryDTO buildRepositoryDTO, AgentRepositoryHelper repositoryHelper)
   at Continua.Modules.Builds.Agent.AgentBuildHelper.SyncSourceFromServer(IEnumerable1 rules, AgentWorkspaceSyncContext workspaceCtx)
   at Continua.Modules.Builds.Agent.AgentBuildHelper.InitialiseWorkspaceOnAgent(AgentCallbackProxy proxy, TransportContextDTO source, Guid callId)
   at Continua.Modules.Builds.Agent.AgentBuildRunner.OnInitialisingWorkspace(Transition1 inState)
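
For anyone who wants to repeat the failing check by hand on the agent machine, here is a minimal Python sketch that rebuilds the same "hg id -r 0" command line shown in the debug log and runs it. The function names, the placeholder credentials, and the timeout are assumptions made for illustration; only the hg and plink arguments are taken from the log message above.

```python
import subprocess


def build_hg_id_args(hg_exe, repo_url, plink_exe, user, password):
    """Build the argument list for the 'hg id -r 0' check seen in the log.

    The flags mirror the logged command; the function itself is hypothetical.
    """
    # plink is passed to hg as a single --ssh value, with its path quoted
    ssh_cmd = f'"{plink_exe}" -ssh -C -l {user} -pw {password} -batch'
    return [
        hg_exe, "id", "-r", "0", repo_url,
        "--ssh", ssh_cmd,
        "--config", "ui.username=Continua",
        "--noninteractive",
    ]


def run_check(args, timeout=60):
    """Run the command and return (return code, stderr text)."""
    result = subprocess.run(args, capture_output=True, text=True, timeout=timeout)
    return result.returncode, result.stderr.strip()


if __name__ == "__main__":
    args = build_hg_id_args(
        r"c:\program files\vsoft technologies\continuaci agent\hg\hg.exe",
        "ssh://192.168.182.37:9010//",
        r"c:\program files\vsoft technologies\continuaci agent\Putty\plink.exe",
        "USER_FROM_LOG",      # placeholder, substitute the value from your own log
        "PASSWORD_FROM_LOG",  # placeholder, substitute the value from your own log
    )
    print(run_check(args))
```

In the failing case reported here, the return code was 255 and the error output was "abort: no suitable response from remote hg!".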

Regards

Kay Zumbusch

Hi Kay,

We’re working on replicating this issue. Can you check that the user the SSH service is running under has write access to the data share folder on the server? 

Hi Dave,

The permissions look fine. ‘Everyone’ has full access to the directory and the network share. And the SSH server is running under local system.

Regards

Kay Zumbusch

Hi Kay,

We’ve added some event log messages to the SSHD service in the new beta build. This will be uploaded to our download page shortly. Can you install this and then check for any ContinuaCISSH Windows event log entries related to the data share? If it says that the data share is blank, try adding the following key to Continua.Server.Service.exe.config, pointing to the local path of your data share folder.

<add key="Continua.Builds.DataShareLocalPath" value="C:\ContinuaShare" />

Restart the Continua.SSH service and then check the event log once again.
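
If you would rather script the config change than edit the file by hand, the following is a minimal Python sketch using the standard library's ElementTree. It assumes the key belongs in the appSettings node directly under the root element of Continua.Server.Service.exe.config; the function names and the install path in the example call are hypothetical. Note that ElementTree discards XML comments when it rewrites a file, so take a backup of the config first.

```python
import xml.etree.ElementTree as ET


def upsert_app_setting(root, key, value):
    """Add or update <add key=... value=... /> under <appSettings>."""
    app_settings = root.find("appSettings")
    if app_settings is None:
        # Create the node if the config does not have one yet
        app_settings = ET.SubElement(root, "appSettings")
    for node in app_settings.findall("add"):
        if node.get("key") == key:
            node.set("value", value)  # key already present: update in place
            return
    ET.SubElement(app_settings, "add", {"key": key, "value": value})


def set_app_setting_in_file(config_path, key, value):
    """Apply the change to a config file on disk (back it up first)."""
    tree = ET.parse(config_path)
    upsert_app_setting(tree.getroot(), key, value)
    tree.write(config_path, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Hypothetical path; adjust to wherever the server is installed
    set_app_setting_in_file(
        r"C:\path\to\Continua.Server.Service.exe.config",
        "Continua.Builds.DataShareLocalPath",
        r"C:\ContinuaShare",
    )
```

The service still needs to be restarted afterwards for the new value to be picked up.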

Hi Dave,

I installed the new beta release and now everything seems to work fine over SSH. I disabled access to the network share and started a new build. This time the working copy was transferred via SSH and the whole build finished successfully. I will try to reproduce the problem I mentioned earlier with a new repository. Maybe it’s just the initial transfer to the agent that fails.

Regards

Kay Zumbusch

Hi Dave,

I just got the error message again but I have no additional log information I can provide. The SSH service seems to get the correct path to the server’s workspace. I found the following event log message:

Log Name:      Application
Source:        ContinuaCISSHD
Date:          7/15/2014 3:01:29 PM
Event ID:      0
Task Category: None
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      srv007.domain.local
Description:
Data Share Path : C:\ContinuaCI

The local path to the data share is correct. Neither the Continua CI server nor the SSH service logged error messages to the Windows event log. The messages in the error log file are slightly different but the stack traces are the same:

Medium: 3:02:29 PM [Debug] [ServerAgentController.CompleteInitialisation] callId e14a7f3b-b3a9-4606-aeb3-fdfa2c34c0e6, result : Error
Medium: 3:02:29 PM [Debug] [ServerAgentController.CompleteInitialisation] done
Medium: 3:02:29 PM [Debug] [ServerAgentController.MonitorStageExecution] refreshing agent object from db, LastActive : 7/15/2014 1:02:02 PM
Medium: 3:02:29 PM [Debug] [ServerAgentController.MonitorStageExecution] agent refreshed, LastActive : 7/15/2014 1:02:02 PM
Medium: 3:02:29 PM [Debug] [ServerAgentController.MonitorStageExecution] Init Workspace Completed
Medium: 3:02:29 PM [Debug] [ServerAgentController.MonitorStageExecution] call.result : Error, stopRequested :False
Medium: 3:02:29 PM [Debug] [ServerAgentController.Execute] Async call result is Error
Medium: 3:02:29 PM [Debug] [ServerAgentController.Execute] Async call error message is : Error initialising workspace
Exception: Exception
Message: The server repository cache at ssh://192.168.182.37:9010// is missing
Stack Trace:    at Continua.Modules.Builds.Agent.FileSync.AgentRepositoryCache.<>c__DisplayClassa.b__8()
   at Continua.Shared.Utils.ReadWriteLockList1.WithWriteLock(TId id, CancellationTokenSource cancelTokenSource, Action action)
   at Continua.Modules.Builds.Agent.FileSync.AgentRepositoryCache.SyncFrom(TransportContextDTO source, String cacheRevision)
   at Continua.Modules.Builds.Agent.AgentRepositoryHelper.SyncCache(String cacheRevision)
   at Continua.Modules.Builds.Agent.AgentBuildHelper.SyncRepoCache(BuildRepositoryDTO buildRepositoryDTO, AgentRepositoryHelper repositoryHelper)
   at Continua.Modules.Builds.Agent.AgentBuildHelper.SyncSourceFromServer(IEnumerable1 rules, AgentWorkspaceSyncContext workspaceCtx)
   at Continua.Modules.Builds.Agent.AgentBuildHelper.InitialiseWorkspaceOnAgent(AgentCallbackProxy proxy, TransportContextDTO source, Guid callId)
   at Continua.Modules.Builds.Agent.AgentBuildRunner.OnInitialisingWorkspace(Transition1 inState)

Medium: 3:02:29 PM [Debug] [ServerAgentController.Execute] Async call Done
Medium: 3:02:29 PM [Debug] [ServerAgentController.Execute] Done
Medium: 3:02:29 PM [Stage Controller] There was an error with stage: @link(1004, af2eab19-08de-4721-960a-a36900f7d9f6)[Build]. Message: Error initialising workspace
Exception: Exception
Message: The server repository cache at ssh://192.168.182.37:9010// is missing
Stack Trace:    at Continua.Modules.Builds.Agent.FileSync.AgentRepositoryCache.<>c__DisplayClassa.<syncfrom>b__8()
   at Continua.Shared.Utils.ReadWriteLockList1.WithWriteLock(TId id, CancellationTokenSource cancelTokenSource, Action action)
   at Continua.Modules.Builds.Agent.FileSync.AgentRepositoryCache.SyncFrom(TransportContextDTO source, String cacheRevision)
   at Continua.Modules.Builds.Agent.AgentRepositoryHelper.SyncCache(String cacheRevision)
   at Continua.Modules.Builds.Agent.AgentBuildHelper.SyncRepoCache(BuildRepositoryDTO buildRepositoryDTO, AgentRepositoryHelper repositoryHelper)
   at Continua.Modules.Builds.Agent.AgentBuildHelper.SyncSourceFromServer(IEnumerable1 rules, AgentWorkspaceSyncContext workspaceCtx)
   at Continua.Modules.Builds.Agent.AgentBuildHelper.InitialiseWorkspaceOnAgent(AgentCallbackProxy proxy, TransportContextDTO source, Guid callId)
   at Continua.Modules.Builds.Agent.AgentBuildRunner.OnInitialisingWorkspace(Transition1 inState)

Hi Kay,

Just to let you know that we’re still working on this. We’ve replicated the issue here today on one server, and have spent most of the day attempting to figure out what is different - without success.

We hope to have more to report tomorrow.

Hi Dave,

There’s no need to rush. You guys have done incredible work with Continua CI so far, and a problem is better fixed thoroughly once than twice. We worked around our limitations accessing the network share by reducing the security level on the CI server and are okay for now.

Regards

Kay Zumbusch

Hi Kay,

We’ve made some fixes to the SSH service related to the way that we specify the remote hostname and the path to hg.exe. This fixes the issue that we were seeing on one server, and we should have a new beta build out on Monday once we’ve done some more testing on another update.

Hi Dave,

I still can’t get SSH transfers working reliably. After updating to 168, I restarted the server and agents, and they claimed to be able to use UNC and SSH for transfers. After I disabled the local user account on the Continua CI server used for UNC transfers, both agents claimed to support SSH but not UNC transfer. So far so good. I might add that our build servers are standalone workgroup servers while our Continua CI server is a member server of our Active Directory. Therefore we added a local user account to the Continua CI server with the same username and password as on the build servers to work around the SSH issues.

After I successfully ran a build using SSH transfer, all further transfers failed and both agents no longer supported SSH transfers. I was not able to open a connection to the SSH server using PuTTY. The agents’ properties showed no support for UNC or SSH. What’s really strange is that both agents lost their SSH capabilities although only one of the build servers was used for the two builds. After restarting the SSH service, both agents got their SSH capabilities back and I got a login prompt. It looks like the SSH server crashes under certain conditions. On the next build, SSH connectivity was lost during the workspace sync to the agent.
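
A quick way to tell whether the SSH service is still listening, without opening PuTTY each time, is a plain TCP connection test against the SSH port. The sketch below is a minimal Python example under stated assumptions: the host and port come from the ssh:// URL in the error message earlier in this thread, and the function name and timeout are hypothetical. It only shows that something accepts connections on the port, not that logins work.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False


if __name__ == "__main__":
    # Host and port taken from ssh://192.168.182.37:9010// in the log
    if tcp_port_open("192.168.182.37", 9010):
        print("SSH port is accepting connections")
    else:
        print("SSH port is not reachable; the service may have stopped")
```

Running this from an agent machine before and after a failing build would show whether the port stops accepting connections at the same moment the agents lose their SSH capability.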

For now we’ll fall back to UNC transfer using the local user account on our Continua CI server, but we would like to get SSH transfers working.

Regards

Kay Zumbusch

Hi Kay,

Are there any errors in the Windows event log on the server relating to ContinuaCISSHD?

Hi Dave,

The only error message I found is this:

Error Stopping SSH : Exception : EOleException
System call failed

There are no SSH error messages in the server’s or the agents’ log files. Logging to file was enabled permanently on both our server and our agents.

Kind regards

Kay Zumbusch

Hi Kay,

This error doesn’t really tell us much, so I’ve uploaded a new version 1.5.0.179.


This includes a change to the SSH service to write to a debug log. To enable it, insert <add key="Continua.SSHD.DebugLogFolder" value="C:\logfolder" /> into the appSettings node in Continua.Server.Service.exe.config before restarting the service.

If you could replicate the issue with debugging enabled and send the log to support@finalbuilder.com, then this will hopefully allow us to identify what is stopping the service. Writing to the debug log will slow the service down, so remember to remove the key once we’re done.