Continua CI Server service takes long time to start, reports postgres error messages


Hi, I’m attempting to upgrade from 1.5 to latest 1.7 and have run into issues starting the server after installation. I see multiple errors after each attempt to start the service but
each time it takes a really long time to start (windows itself reports that as an error). The first error I see is a Postgres timeout error:

The description for Event ID 0 from source PostgreSQL cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event: 

ERROR:  canceling statement due to statement timeout
STATEMENT:  ALTER TABLE public.builds_unittest ALTER assemblyname TYPE varchar(256), ALTER assemblyname SET NOT NULL

Followed by a error from Continua Server service itself:

Could not initialise database:

System.Exception: An error occurred executing the following sql:
ALTER TABLE public.builds_unittest ALTER assemblyname TYPE varchar(256), ALTER assemblyname SET NOT NULL
The error was ERROR: 57014: canceling statement due to statement timeout
 —> Npgsql.NpgsqlException: ERROR: 57014: canceling statement due to statement timeout
   at Npgsql.NpgsqlState.d__0.MoveNext()
   at Npgsql.ForwardsOnlyDataReader.GetNextResponseObject(Boolean cleanup)
   at Npgsql.ForwardsOnlyDataReader.GetNextRowDescription()
   at Npgsql.ForwardsOnlyDataReader.NextResultInternal()
   at Npgsql.ForwardsOnlyDataReader…ctor(IEnumerable1 dataEnumeration, CommandBehavior behavior, NpgsqlCommand command, NotificationThreadBlock threadBlock, Boolean preparedStatement, NpgsqlRowDescription rowDescription)<br>&nbsp; &nbsp;at Npgsql.NpgsqlCommand.GetReader(CommandBehavior cb)<br>&nbsp; &nbsp;at Npgsql.NpgsqlCommand.ExecuteNonQuery()<br>&nbsp; &nbsp;at FluentMigrator.Runner.Processors.Postgres.PostgresProcessor.Process(String sql)<br>&nbsp; &nbsp;--- End of inner exception stack trace ---<br>&nbsp; &nbsp;at FluentMigrator.Runner.Processors.Postgres.PostgresProcessor.Process(String sql)<br>&nbsp; &nbsp;at FluentMigrator.Runner.StopWatch.Time(Action action)<br>&nbsp; &nbsp;at FluentMigrator.Runner.MigrationRunner.ExecuteExpressions(ICollection1 expressions)
   at FluentMigrator.Runner.MigrationRunner.ApplyMigrationUp(IMigrationInfo migrationInfo, Boolean useTransaction)
   at FluentMigrator.Runner.MigrationRunner.MigrateUp(Boolean useAutomaticTransactionManagement)
   at FluentMigrator.Runner.Initialization.TaskExecutor.Execute()
   at Continua.Migrations.Runner.Execute(String database, String connectionString, String fileName)
   at Continua.Server.Service.Migrator.Begin()
Exiting…

Final error message is oddly:

Application is [sic] Errored

The Continua Server service seems to running but seems unresponsive from the browser.

Do I have to restore from backup? 

Hi Simon,

The error is with a database migration. Continua needs to make some changes some database columns, but the commands are timing out.

Can you access the database with the PostgreSQL admin tool (%ProgramFiles%\VSoft Technologies\ContinuaCI\Server\PostgreSQL\bin\pgAdmin3.exe)? Connect using the details in %ProgramFiles%\VSoft Technologies\ContinuaCI\Server\Continua.Server.Service.exe.config under hibernate-configuration -> session-factory -> property name=“connection.connection_string”.

Open the Query Tool and run:

 SELECT count(*) FROM builds_unittest;
 
I suspect the number is large. Normally Continua will clean up this table when it cleans up builds - it’s possible that the clean up was not working correctly in an older version (or not set). 

Before making any changes make sure you have a backup of the database. Note that the installer will have taken a backup of the original database and stored this in %ProgramData%\VSoft\ContinuaCI. 

Try running the command that failed in the query tool:

ALTER TABLE public.builds_unittest ALTER assemblyname TYPE varchar(256), ALTER assemblyname SET NOT NULL;

If this works without timing out then try restarting the Continua CI Server service. Otherwise try garbage collecting and re-indexing with the following commands, before running the above statement again:

VACUUM FULL builds_unittest;
REINDEX TABLE builds_unittest;

If the command still times out then we need to reduce the data in the table. Assuming you don’t need old unit test results, you can run the following query to remove unit tests from last year and before for example: 

DELETE FROM builds_unittest
    USING builds_build
    WHERE builds_unittest.buildid = builds_build.id
    AND builds_build.created < ‘2015-01-01’;


Alternatively we can set the timeout to a large value before running the ALTER TABLE command as follows: 

SET statement_timeout TO 30000;
ALTER TABLE public.builds_unittest ALTER assemblyname TYPE varchar(256), ALTER assemblyname SET NOT NULL

This sets it to 30 seconds - try different values until the statement works. Then restart the Continua CI Server service.

Let us know how you get on with this. 





 


Thanks for your help.

The number returned is 3806327 so I’m guessing thats high.

I tried running the SQL specified:

ALTER TABLE public.builds_unittest ALTER assemblyname TYPE varchar(256), ALTER assemblyname SET NOT NULL;

But I get this error:

ERROR: cannot alter type of a column used by a view or rule
DETAIL: rule _RETURN on view builds_unittestfixturesummary_view depends on column “assemblyname”

I’ll try running the re-index commands to see if that helps


Seems like the re-index commands did the job. The Continua Server Service has started!

Thanks, thats a relief.

P.S I could not get the ALTER TABLE command to run from the ‘pgAdmin3’ GUI even though the command looks exactly the same in the event log.

Kept on getting the same error as before.

HI Simon,

Great that you’ve got this working. I recommend checking the cleanup settings to ensure that old builds get cleaned up in the database.