Crash starting build after service restart


(Vincent Parrett) #21

There’s really not much we can do about this. We have looked at ways to reduce the memory usage when capturing process output, unfortunately the .net framework doesn’t make that easy. We’ll tkae another look at it.

(Vincent Parrett) #22

From memory the filtering is done after the list, as it uses regular expressions. Unfortunately subversion is pretty limited when it comes to listing branches and filtering on the command line (ie zero functionality). Git and Mercurial at least have built in functionality to list the branches.

(Simon Kennedy) #23

So its caching the entire list output in memory? Can’t stream to temp file? Seems odd that it does not apply the filter when reading from the output buffer. Most of the folders under /newlook/branches’ (~80%) is skipped by the repository filter.

(Vincent Parrett) #24

That’s something we’ll have to look at, the output comes in chunks, not always complete lines, so parsing on the fly is a bit more complicated. It is something we know about, but it hasn’t been an issue until now.

(Simon Kennedy) #25

I pretty sure you that .net will buffer output from a command line. Since ‘svn list’ command outputs one entry per line I would have thought it would be relatively straight forward to wait for a line and then run a pre-compiled reg-ex over it (discarding unmatched lines).

I however don’t know any of your internals, I’m merely speculating.

I guess Ill have to try bumping the memory on the machine and hope its enough.

(Vincent Parrett) #26

There are always things we could do better, but with limited resources we have to pick our battles.

(Vincent Parrett) #27

I had a look at the code, we use




which are read inside threads. They are StreamReaders, which does have a ReadLine method, however when reading large volumes of output the performance is terrible. To work around that, we call .Read with an 8K buffer and build up the output string.

Looking at this code, there’s definitely an opportunity to improve things here, but it will take some major refactoring, which is not without risk. I’ll discuss this with the lead dev (Dave) when he returns from leave.

(Simon Kennedy) #28

Unfortunately changing the memory limit of the server to 4GB does not resolve the issue :frowning:

(Vincent Parrett) #29

Ok, we will need a debug log so we can see where it’s failing. A diagnostics report would also be useful so we can see what settings you have. Hopefully we can set up a scenario to reproduce it here first, and profile to see exactly where the memory usage is.

(Simon Kennedy) #30

I’m thinking that our ‘/branches/peercodereviews’ folder is probably why this operation is choking. We almost daily add new folders (which are branches of trunk) to this folder. It’s possible that we hit some threshold where it takes too much memory to read all the file names into memory. Seems possible the initial failure may have caused Continua to reset that repository and otherwise we may not have seen this issue?

Ill start collecting logs/diagnostics.

(Vincent Parrett) #31

I did wonder why it was resetting the repo in the first place. That said, we’re looking at the code today with the hope of finding some ways to decrease memory usage.

(Vincent Parrett) #32

Hi Simon

I spent most of today looking at this, and attempting to rectify, but it’s going to take a while.

The code in question is used literally everywhere and the risk of breaking unrelated things with a quick fix is very high. Hopefully we’ll have something by the end of next week. In the mean time, I can only suggest temporarily increasing the memory on the vm until it gets past the repo reset phase.

(Simon Kennedy) #33

I’ve captured a log at time of crash plus diagnostics.

I’ve tried bumping the machine to 4GB however the Continua service hits ~2.5 GB and then crashes.

In the interim were going to disable the failing repository so that we can a -least perform some builds.

(Vincent Parrett) #34

Thanks, I have removed the attachment since it has email addresses in the log!

This is all we need for now I think.

(Dave Sparks) #35

This issue has now been fixed in version