Apologies if this comes off as negative; that's not the intent.
Some questions/concerns about repos and their error states. Stating the obvious, if a repository poll fails for any reason it goes into an error state - and fair enough. What I'm finding frustrating is that network-related errors have to be manually cleared, even if a subsequent poll is successful. In situations where remote repositories are involved, it's proving extremely troublesome. Modem reboots, server reboots, etc. all cause repository errors that require manual intervention even though they are only minor interruptions (I realise that you can set exclusion periods in Continua; unfortunately, in some cases the times are not clear cut).
Having to manually and actively monitor the repositories to ensure triggers will queue builds successfully is proving to be a real problem (overnight triggers are particularly annoying).
Is it not possible to have a repository 'self-heal' and clear its own error state if the next poll is successful? Having said that, I can think of numerous reasons why this would be difficult (especially given the number of available version control systems), but I'm struggling to work out how I can help Continua be more self-managing, without requiring daily intervention to clear repo errors (sadly, in my case it is multiple times daily).
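To make the 'self-heal' idea concrete, here is a rough sketch of the behaviour being asked for. It is not how Continua is implemented; the `Repository` class, the `poll` function and the `fetch` callback are all made-up names for illustration, and only network-style failures are treated as recoverable.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Repository:
    """Hypothetical stand-in for a polled repository."""
    name: str
    error: Optional[str] = None  # None means the repository is healthy


def poll(repo: Repository, fetch: Callable[[], None]) -> bool:
    """Poll once. A failure records the error; a success clears any earlier one."""
    try:
        fetch()
    except OSError as exc:
        # ConnectionError and TimeoutError are both subclasses of OSError.
        repo.error = str(exc)
        return False
    repo.error = None  # the 'self-heal' step: no manual clearing needed
    return True
```

With this shape, a modem reboot would put the repository into an error state for one poll, and the next successful poll would bring it back without anyone touching it.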
My recollection of how it works is that when a repo errors, it is put into an error state for a period of time, and then it is retried (this was added recently). I suspect that’s not working 100% as designed, so we’ll take a look and do some more testing.
I’m not sure if you noticed but we added an option to triggers called “Force Repository Check”. It’s on by default so you may not have had to modify it. Basically what it does is, when the trigger starts a build it forces a check on all repositories attached to that Configuration. By forcing a check you get the behavior you were after… intermittent faults are usually sorted out when the build runs and the build will only continue to run when all repositories are in a proper working state. The flip side of that is… if the repository is still in an error state when those checks happen then the build will fail.
If you’ve done some digging around behind the scenes you might’ve noticed Continua caches each repository you create. When you disable “Force Repository Check”, the trigger will start a build using the latest version in the repository cache and it won’t go off and check for the latest version of your repository. What this could mean is that your build has been built using potentially old data from your repository. The good thing is your build still happens even though repositories may be down. This method works well in a few situations… especially for us in house. We have a source repository and a binaries repository. The source repository is updated frequently, the binaries very rarely. We don’t want a build to stop just because the binaries repository can’t be reached, because there’s a high chance nothing has changed since the last build.
To make sure a certain repository isn’t in an error state, you could create a trigger for that repository. That will ensure that specific repository is not in an error state… but by disabling the check you will allow all the others to be in an error state, so you’d use the latest version in the cache for those. I know you mentioned you had about 8 repositories and a nightly trigger. In that situation (and with “Force Repository Check” disabled), there’s no way to specify that a certain repository should be checked and the others not… it’s all or nothing with time triggers.
Do you think it would be useful for your timed triggers to specify which repositories should be updated to the latest and which can just use the latest cached version if they can’t be updated?
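As a rough illustration of the two behaviours described above (and only that, not Continua's actual code), the sketch below models a trigger starting a build with the check forced or not. `Repo`, `start_build` and `fetch_latest` are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Repo:
    """Hypothetical repository with a locally cached revision."""
    name: str
    cached_revision: str
    in_error: bool = False


def start_build(
    repos: List[Repo],
    force_check: bool,
    fetch_latest: Callable[[Repo], str],
) -> Dict[str, str]:
    """Return the revision each repository contributes to the build."""
    revisions: Dict[str, str] = {}
    for repo in repos:
        if force_check:
            try:
                # A successful check refreshes the cache and clears the error.
                repo.cached_revision = fetch_latest(repo)
                repo.in_error = False
            except OSError as exc:
                # Still unreachable during the forced check: the build fails.
                repo.in_error = True
                raise RuntimeError(f"build failed: {repo.name} unreachable") from exc
        # Without the check, whatever is in the cache is used as-is.
        revisions[repo.name] = repo.cached_revision
    return revisions
```

In the source/binaries scenario, calling this with `force_check=False` builds from the cached binaries even when that repository cannot be reached, while `force_check=True` stops the build at the first repository that fails its check.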
Our main reason to use ‘Force Repository Check’ is to clear error states rather than get the latest changes, and unfortunately for us that means we’d be running it across all repos regardless (we just want something to build). In that vein, there are repos we would be happy to have in an error state when the build starts (and just use their cached versions). So what may be more useful in our specific situation is:
1. ability to force repo checks (which is what we have right now - ‘all or nothing’ is fine)
2. ability to select which repos are OK to use their cached versions if (even after a forced check) they are still in an error state
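A minimal sketch of how those two points could combine, purely to pin down the request: every repo gets a forced check, and a hypothetical per-repo flag (`allow_cached_on_error`, which does not exist in Continua as far as this thread shows) decides whether a failed check blocks the build. All names here are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class RepoState:
    """Hypothetical repository record with a per-repo fallback setting."""
    name: str
    cached_revision: str
    allow_cached_on_error: bool = False  # assumed option, not a real setting


def resolve_revisions(
    repos: List[RepoState],
    fetch_latest: Callable[[str], str],
) -> Dict[str, str]:
    """Force a check on every repo; fall back to the cache only where allowed."""
    revisions: Dict[str, str] = {}
    for repo in repos:
        try:
            repo.cached_revision = fetch_latest(repo.name)
        except OSError:
            if not repo.allow_cached_on_error:
                raise  # this repo must be current, so the build cannot proceed
            # Otherwise keep going with the stale cached revision.
        revisions[repo.name] = repo.cached_revision
    return revisions
```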
In saying that, though, I feel we’re hijacking forced checks for our own purposes rather than what they were more likely intended for, which is ensuring the latest changes are in the build. If there were a ‘Force check on repos in error state’ option, it would more accurately describe our intentions (though the blanket ‘all or nothing’ forced check is really fine).
Hope that helps? And thanks for the info on what’s going on behind the scenes. It’s helped confirm my suspicions.