VSoft Technologies Blogs

VSoft Technologies Blogs - posts about our products and software development.


Delphi XE includes regular expression support, something that has been requested many times over the years. In this blog post I'll show some basic usage of regular expressions in Delphi. I'm assuming you already understand regular expressions and the associated terminology; if not, take a look here for some tutorials.

The regular expression engine in Delphi XE is PCRE (Perl Compatible Regular Expressions). It's a fast engine that complies with generally accepted regex syntax and has been around for many years. Users of earlier versions of Delphi can use it through TPerlRegEx, a Delphi class wrapper around it.

The XE interface to PCRE is a layer of units based on contributions from various people: the PCRE API header translations in RegularExpressionsAPI.pas (Florent Ouchet and co), the wrapper class TPerlRegEx (Jan Goyvaerts) in RegularExpressionsCore.pas, and the record wrappers in RegularExpressions.pas (myself). That last unit is based on code we currently use in FinalBuilder 6 & 7; it's well tested and has proven to be very reliable in our products.

RegularExpressions.pas is what you will use in your code. It's loosely based on the .NET regex interfaces.

The main type in RegularExpressions.pas is TRegEx. TRegEx is a record with a bunch of methods and static class methods for matching with regular expressions. The static versions of the methods are provided for convenience and should only be used for one-off matches. If you are matching in a loop, or repeating the same search often, you should create an 'instance' of the TRegEx record and use the non-static methods.
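To make the distinction concrete, here is a minimal sketch of both styles. The pattern, the sample strings and the Demo procedure are made up for illustration, and the unit name is the XE one (RegularExpressions), which later releases expose as System.RegularExpressions.

```delphi
program StaticVersusInstanceSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils,
  RegularExpressions; // XE unit name; System.RegularExpressions in XE2 and later

procedure Demo;
const
  // sample input, purely illustrative
  Lines : array[0..2] of string = ('hello world', 'Hello World!', 'goodbye');
var
  regexpr : TRegEx;
  line    : string;
begin
  // one-off match: the static class method builds the regex for this call only
  if TRegEx.IsMatch('hello world', '\bworld\b') then
    WriteLn('static IsMatch found a match');

  // repeated matching: create the record once and reuse it inside the loop
  regexpr := TRegEx.Create('\bworld\b', [roIgnoreCase]);
  for line in Lines do
    if regexpr.IsMatch(line) then
      WriteLn('matched : ' + line);
end;

begin
  Demo;
end.
```

The static call is handy for a single test, while the second form avoids setting up the expression again on every pass through the loop.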

So let's look at how we might use TRegEx to find some text in a string.

procedure FindSomething(const searchMe : string);
var
  regexpr : TRegEx;
  match   : TMatch;
  i       : integer;
begin
  // create our regex instance, and we want to do a case insensitive search, in multiline mode
  regexpr := TRegEx.Create('^.*\b(\w+)\b\sworld',[roIgnoreCase,roMultiline]);
  match := regexpr.Match(searchMe);
  if not match.Success then
  begin
    WriteLn('No Match Found');
    exit;
  end;

  while match.Success do
  begin
    WriteLn('Match : [' + match.Value + ']');
    //group 0 is the entire match, so count will always be at least 1 for a match
    if match.Groups.Count > 1 then
    begin
      for i := 1 to match.Groups.Count -1 do
        //print the captured group, not the whole match
        WriteLn('     Group[' + IntToStr(i) + '] : [' + match.Groups[i].Value + ']');
    end;
    match := match.NextMatch;
  end;
end;

In the above example, we are trying to extract the word before "world" and capture it in a group. The Match method always returns a TMatch, even when no match is found, so you should check the Success property of the TMatch to determine whether a match was found. The same applies to TMatch.NextMatch, which makes it easy to iterate over the matches. You could also call TRegEx.Matches, which returns a TMatchCollection that supports enumeration (using the for in construct), e.g. :

  matches := regexpr.Matches(searchMe);
  for match in matches do
  begin
     if match.success then
     begin
     //do stuff with match here
     end;
  end;

Something to remember when working with groups is that a match's Groups collection always returns the entire match as group 0, so the groups from your expression start at 1. You will notice I don't free any of the TRegEx, TMatch or TGroup values. That's because they are records with methods rather than classes, which keeps memory management simple and helps avoid memory leaks. My original code used interfaces and reference counting, but Embarcadero preferred to use records (as they have done with other new stuff introduced in recent releases).
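Putting enumeration and group numbering together, the following is a small self-contained sketch rather than code from the sample app. The ListMatches procedure, the pattern and the input string are illustrative assumptions, and the unit name is the XE one.

```delphi
program MatchesAndGroupsSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils,
  RegularExpressions; // XE unit name; System.RegularExpressions in XE2 and later

procedure ListMatches(const searchMe : string);
var
  regexpr : TRegEx;
  match   : TMatch;
  i       : Integer;
begin
  // capture the word that precedes "world", ignoring case
  regexpr := TRegEx.Create('\b(\w+)\sworld', [roIgnoreCase]);

  // TMatchCollection can be enumerated directly with for..in
  for match in regexpr.Matches(searchMe) do
  begin
    WriteLn('Match : [' + match.Value + ']');
    // group 0 is the whole match, so groups from the expression start at 1
    for i := 1 to match.Groups.Count - 1 do
      WriteLn('  Group[' + IntToStr(i) + '] : [' + match.Groups[i].Value + ']');
  end;
  // nothing to free: TRegEx, TMatch and TGroup are records
end;

begin
  ListMatches('Hello world, and goodbye cruel World');
end.
```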

I have created a simple example application which will help in testing regular expressions :

The source to this app can be downloaded from here

We'll expand on this app in the next post


Introducing FinalBuilder 7

It's been a long time in the making, but FinalBuilder 7 is almost here. FinalBuilder 7 has been in private beta testing for a few weeks, and today we are opening the beta to existing customers.

If you have an account on our website, and have a license for any version of FinalBuilder then you should have access to the FinalBuilder 7 Beta Forum. If you don't see it, try logging out and in again.

What's new in FinalBuilder 7?

NEW IDE

Well, as the saying goes, a picture is worth a thousand words, so let's take a look :

FinalBuilder 7 sports a new IDE. The FinalBuilder 7 IDE is capable of opening multiple projects (whereas FB6 only allows one at a time). The new IDE includes docking, something that has been requested many times over the years, so you can lay out the IDE to suit your taste. The current theme is modelled on Visual Studio; however, by the time FB7 ships we should have an alternative theme available too.

There are many little enhancements and tweaks in the IDE. The most requested feature that we did implement is the renaming of variables. In FB6, when you rename a variable it does not rename references to it in the project. FB7 will attempt to find and rename all references (there may be a few cases we have missed; if you find any, let us know).

Core Changes

Unicode Support

FinalBuilder is now compiled with a Unicode-aware compiler. The native internal string type is UTF-16, and we have worked hard to ensure that all actions that work with files respect the original file's encoding. This has been a massive endeavour, and we are still in the process of testing actions to ensure compliance. The user interface should fully support Unicode. Note that while FinalBuilder may support Unicode, many of the tools it calls do not.

Variable Types

In FinalBuilder 6 and earlier, all FinalBuilder variables are implemented as Variants. Those of you who have done some COM programming, or used Visual Basic 6 or earlier, or Delphi, might be familiar with Variants. A Variant is a type that can store values of various types. Variants are very useful in many cases, but are not without their problems. For example, if you want to store the string "03" in a Variant, you have to put the double quotes around the 03, otherwise we have no way of knowing it's not a number. To circumvent this problem, in FinalBuilder 7 we allow you to specify the variable type :

When you specify a variable type, that is how the variable will be treated, so for example if you attempt to assign a string value to an integer variable it will fail. Typed variables also allow you to specify a format string, which will be used when the variable is evaluated in an expression. For example, if you have a build number variable, that you always want to use as 4 digits, with zeros padding out the left, setting the format string to %.4d will do this.
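As an illustration of what %.4d does, here is a tiny Delphi sketch. It assumes FinalBuilder's format strings follow the same rules as Delphi's own Format function, which this post does not state outright; the BuildNumber variable is made up for the example.

```delphi
program BuildNumberFormatSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  BuildNumber : Integer; // hypothetical build number variable
begin
  BuildNumber := 42;
  // a precision of 4 on %d pads the number on the left with zeros
  WriteLn(Format('%.4d', [BuildNumber])); // prints 0042
end.
```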

Local Variables

In addition to the existing variable scopes (environment, application, user, project) and action list parameters, FinalBuilder 7 includes a local variable scope. The Action Group Action has a new property page that allows you to define the local variables.

These variables are only visible to child actions of the group. Since groups can be nested, this provides the ability to override variables locally, and define variables for temporary use.

Local variables cannot be persisted and cannot be used as environment variables.

 

New Actions

Of course as with every new release we added a bunch of actions:

- NDepend Action
- Git Actions
- Plastic SCM Actions
- Check If Host Exists Action
- SetupBuilder Actions
- SecureZIP Actions
- Signtool Actions
- Hyper V Actions
- Mercurial Actions
- SSH Actions
- XML Node Exists Action

Many of the existing actions have also been enhanced, and fileset support has been added to more actions.

What about FinalBuilder Server 7?

The FinalBuilder Server 7 beta will be ready in a day or two. It's currently being used to build FB7 (FB7 is built using FB7 and FB Server 7).

What now

FinalBuilder 7 is feature complete for the 7.0 release, although there are still many more features we would like to implement in updates. Right now we are in test/bug fix mode, and that is all the team are working on. So now it's time for some feedback: let us know what you think and, more importantly, if you find something that isn't working right then let us know. Please keep all beta bug reports and feedback to the FinalBuilder 7 Beta Forum. If you don't see the Beta forum and you are a customer, please contact support at finalbuilder dot com and we'll grant you access.

 


FinalBuilder's IDE is message driven. When you select an action in an actionlist for example, the actionlist view publishes a message to which other parts of the IDE can subscribe. The publish/subscribe mechanism used in the IDE has gone through a few revisions over the years, in the quest for the perfect publish/subscribe mechanism...

Of course when I say "perfect", what I'm really after is the architecture that is the easiest to use, and more importantly, the easiest to maintain. In earlier versions of the IDE, we used simple interfaces with an event args object that can carry various payloads :

 
type
IEventArgs = interface
  function EventID : integer;
  ..
  property IntfData : IInterface..
  property StringData : string...
  property IntData : integer..
  ...
end;
ISubscriber = interface
  procedure OnIDEEvent(const AEventArgs : IEventArgs);
end;

This has worked well for a few revisions, but with FinalBuilder 7 we have a new IDE, the number of messages being published has doubled, and the code was starting to look pretty ugly. Each subscriber's OnIDEEvent method was one long case statement, and we were always having to refer back to where the message was sent from to confirm exactly what the payload was. Not at all scalable. So I started looking for new ideas. In a C# app that a colleague here wrote, he used interfaces with generics to create a publish/subscribe mechanism :

 
public interface IConsumer { }
public interface IConsumerOf<T> : IConsumer where T : IMessage
{
  void Consume(T message);
}

Well, Delphi 2010 has generics support and I'm already making extensive use of them (hard to imagine now how I did without them!), so I figured I'd see if I could use the same technique in Delphi :

 
IMessage = interface
['{BCBD228C-F184-461F-B4EE-2FC7A757C0AC}']
end;

ISubScriber = interface
['{B0D49727-272F-4D51-98B3-AA6E0708DD44}']
end;

//note the constraint so only message objects can be published
ISubScriberOf<T : IMessage> = interface(ISubScriber)
//No guid for generic interfaces!
  procedure Consume(const message : T);
end;

TMyMessage = class(TInterfacedObject, IMessage)
end;

TMyOtherMessage = class(TInterfacedObject, IMessage)
end;

TMySubscriber = class(TInterfacedObject, ISubScriberOf<TMyMessage>, ISubScriberOf<TMyOtherMessage>)
protected
  procedure Consume(const message : TMyMessage);overload;
  procedure Consume(const message : TMyOtherMessage);overload;
  ....
end;

Works perfectly... except now I have a bunch of methods named Consume, and my class needs to implement an interface per message. According to my well-thumbed copy of Delphi in a Nutshell, a class can implement up to 9999 interfaces, so that's not a problem, but it's just not quite as neat as I'd hoped. It worked well in the C# app as there were only a few message types. Most of my subscribers are handling 20+ messages, and navigating 20+ Consume methods isn't all that maintainable.

Back to looking at the mother of all case statements in my existing architecture, it reminded me of a WndProc method and that got me thinking. How do message handlers work in the VCL? You know, this sort of thing :

 
procedure WMActivate(var Message: TWMActivate); message WM_ACTIVATE;

I had always just assumed it was compiler magic... (back to Delphi in a Nutshell again). Well, they are dynamic methods, which are invoked via TObject.Dispatch. This method takes a message record and, based on the message id, finds the matching method in the object's dynamic method table (which is compiler generated). If it is not found there, it looks up the class hierarchy and eventually calls DefaultHandler if no match was found. Delphi's Windows controls use this mechanism to dispatch Windows messages, but there's really nothing Windows specific about it, and it looks like it could be used for any messages. The key is that the messages must be records, because TObject.Dispatch treats them as such, and it looks at the first 4 bytes (DWORD) as the message id. If you look in Messages.pas you will see that most messages are typically 16 bytes long, and they are packed records. In my initial tests everything worked fine just keeping the first 4 bytes for the message id; however, in my IDE strange things happened (random AVs). It turns out the messages really do need to be at least 16 bytes. So my message types look like this :

 
type
  TMyMessage = packed record
    MsgID : Cardinal;
    Unused : array[1..12] of Byte;
    MyPayload : whatever;
    .....
    constructor Create(APayload : whatever);
  end;

Note that I use a constructor on the record. Constructors on records are really just pseudo constructors, but they serve a purpose here: this is where I make sure the message gets assigned the correct message id (from a constant). Without the constructor we would have to assign it to the message before it is sent, which opens the possibility of using the wrong id by mistake, and that could cause random access violations :

 
constructor TMyMessage.Create(APayload : whatever);
begin
  MsgID := IDE_MYMESSAGE; //constant 
  MyPayLoad := APayload;
end;

One caveat with constructors on records is that they must have at least one parameter, so if your message has no payload then just use a dummy parameter. Our Subscriber interface now looks like this :

 
type
  ISubscriber = interface
    procedure Dispatch(var Message);
  end;

The Dispatch method on the interface is declared the same as TObject.Dispatch, and TObject.Dispatch is our actual implementation on our subscriber class, so we don't need to do much other than declare that it implements the interface and then provide the message handlers :

 
TMySubscriber = class(TInterfacedObject,ISubscriber,...)
protected
  procedure DoMyMessage(var Message : TMyMessage);message IDE_MYMESSAGE;
  ....
end;

Our publisher interface is quite simple too :

 
IPublisher = interface
  procedure SendMessage(var Message);
  procedure Subscribe(const subscriber : ISubscriber; const filter : integer = 0);
  procedure UnSubscribe(const subscriber : ISubscriber);
end;

The SendMessage method takes an untyped var parameter (just like Dispatch), so any message type can be passed to it. The Subscribe method has an extra parameter, filter, which allows you to specify which messages a subscriber is interested in. If it's not specified then all messages will be passed to the subscriber's Dispatch method. The filter isn't strictly necessary, but it does provide an opportunity for a small optimisation if the subscriber really only handles a few messages.
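To show how the pieces fit together, here is a minimal console sketch of the mechanism. It is not the FinalBuilder code: the TPublisher class, the Text payload and the value chosen for IDE_MYMESSAGE are assumptions made for illustration, and the filter parameter is left out to keep it short.

```delphi
program DispatchPubSubSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils,
  Generics.Collections;

const
  IDE_MYMESSAGE = $0401; // hypothetical message id

type
  TMyMessage = packed record
    MsgID  : Cardinal;
    Unused : array[1..12] of Byte; // pads the header out to 16 bytes
    Text   : string;               // illustrative payload
    constructor Create(const AText : string);
  end;

  ISubscriber = interface
    procedure Dispatch(var Message);
  end;

  TMySubscriber = class(TInterfacedObject, ISubscriber)
  protected
    // ISubscriber.Dispatch is satisfied by the inherited TObject.Dispatch
    procedure DoMyMessage(var Message : TMyMessage); message IDE_MYMESSAGE;
  end;

  // hypothetical publisher; the real one also supports a filter
  TPublisher = class
  private
    FSubscribers : TList<ISubscriber>;
  public
    constructor Create;
    destructor Destroy; override;
    procedure SendMessage(var Message);
    procedure Subscribe(const subscriber : ISubscriber);
    procedure UnSubscribe(const subscriber : ISubscriber);
  end;

constructor TMyMessage.Create(const AText : string);
begin
  MsgID := IDE_MYMESSAGE; // the id is always set here, never by the sender
  Text := AText;
end;

procedure TMySubscriber.DoMyMessage(var Message : TMyMessage);
begin
  WriteLn('received : ' + Message.Text);
end;

constructor TPublisher.Create;
begin
  inherited Create;
  FSubscribers := TList<ISubscriber>.Create;
end;

destructor TPublisher.Destroy;
begin
  FSubscribers.Free;
  inherited;
end;

procedure TPublisher.SendMessage(var Message);
var
  subscriber : ISubscriber;
begin
  // each subscriber routes the record to its matching message handler
  for subscriber in FSubscribers do
    subscriber.Dispatch(Message);
end;

procedure TPublisher.Subscribe(const subscriber : ISubscriber);
begin
  FSubscribers.Add(subscriber);
end;

procedure TPublisher.UnSubscribe(const subscriber : ISubscriber);
begin
  FSubscribers.Remove(subscriber);
end;

procedure Demo;
var
  publisher  : TPublisher;
  subscriber : ISubscriber;
  msg        : TMyMessage;
begin
  publisher := TPublisher.Create;
  try
    subscriber := TMySubscriber.Create;
    publisher.Subscribe(subscriber);
    msg := TMyMessage.Create('hello');
    publisher.SendMessage(msg);
  finally
    publisher.Free;
  end;
end;

begin
  Demo;
end.
```

A subscriber that has no handler for a given id simply falls through to DefaultHandler, so adding a new message type does not require touching existing subscribers.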

So is this the "perfect" publish/subscribe mechanism for Delphi? Probably not. It's kinda neat how it's using something that's been there in TObject probably back as far as Delphi 1 (I will have to dig out my D1 source disk and have a look!). We spent a day replacing our old mechanism with this one throughout the entire FB7 IDE, and it's performing flawlessly. I've read that dynamic methods are a bit slower than non-dynamic methods, but to be honest we haven't noticed any change in performance. What we have noticed is how much easier our code is to read and maintain.

A sample D2010 app with full source is available here.


Hybrid Version Control

Distributed version control systems are gaining in usage and popularity, but many organisations still use traditional centralised VCSs like Subversion and Visual SourceSafe. Recently I've been using a hybrid setup and getting many of the benefits of a DVCS without needing to move the whole team to a new VCS platform.
 
When I started with VSoft a few months back my first chunk of work was to create Mercurial actions for the upcoming FinalBuilder 7. It was the first time I'd used a DVCS and after the initial shock I became quite fond of it.
 
Internally we use a CVCS - Surround SCM. If you've never used Surround you can think of it as VSS done right. It uses similar concepts and abstractions but without all the pain and frustration. We're considering moving to a DVCS but haven't yet worked out if the increased flexibility is worth the extra overhead for our relatively small dev team.
 
After getting to know Mercurial though, I knew what I was missing out on. Primarily for me that was version control of my local changes. This is becoming more important as FinalBuilder 7 gets closer to completion and breaking the build becomes a bigger deal. I also like being able to easily clone and sync my repositories locally as a basic backup strategy.
 
The setup I've come up with couldn't be simpler. If I'm going to work on a VS.NET solution I will:
  • check out the solution from Surround
  • hg init the solution directory to create a repository
  • hg commit -A which adds/removes all file changes to the repository
  • work locally, committing whenever I feel like it
  • when the solution is in a state where it won't break the build, check it in to Surround
 
To make things easier I've set up this alias in my Mercurial.ini:
[alias]
cam = commit -A -m
 
So to commit I just run hg cam "commit message".
 
Recently I started checking my .hg directory into Surround as well. That allows me to maintain the history of all my local commits, without cluttering up the Surround check-in logs.
 

Caveats

There are some things to be aware of when using a hybrid system like this. While you could still use Mercurial for merging work from different developers, it is much more complex than in a pure DVCS setup. I'll leave it to you to work out the details: it's not something I plan on ever doing.
 
If you're checking your .hg folder in to your CVCS you need to be careful that it doesn't become corrupted through concurrent updates/merging. Of course if it does become corrupted you don't lose much by deleting and recreating it, because the check-in history for your major changes is in the CVCS's check-in logs.
 

Other uses for Mercurial

Because Mercurial repositories are so easy to create I've started using them for all sorts of things. For example, at home I have a perl script that runs nightly and exports the contents of my Wordpress blogs to XML. Previously I included the date in the filename and ended up with (literally) hundreds of files in my backup directory. 
 
Now I've set up a Mercurial repository, removed the date component from the filename and have my backup script commit after it downloads the latest version. The directory is now a lot cleaner as well as being smaller, because Mercurial only stores the changes between each night's backup.
 

Further reading

While running a hybrid system doesn't give you all the advantages of a pure DVCS it is a major improvement over a plain CVCS. It also allows you and your team to get comfortable with the DVCS methodology before moving away from CVCS completely.
 
For more information on Mercurial, see:

  


Using interfaces and reference counting in Delphi works great for the most part. It's a feature I use a lot, and I'm a big fan of using interfaces to tightly control what parts of a class a consumer has access to. But there is one big Achilles heel with reference counting in Delphi: you cannot keep circular references, at least not easily, without causing memory leaks.

Consider this trivial example :

 
IChild = interface;
IParent = interface
['{62DC70E1-8D82-4012-BF01-452EB0F7F45A}']
  procedure AddChild(const AChild : IChild);
end;

IChild = interface
['{E1DB1DA0-55D6-408E-8143-072CA433412D}']
end;

TParent = class( TInterfacedObject, IParent )
private
  FChild : IChild;
  procedure AddChild(const AChild : IChild);
public
  destructor Destroy; override;
end;

TChild = class( TInterfacedObject, IChild )
private
  FParent : IParent;
public
  constructor Create( AParent : IParent );
  destructor Destroy; override;
end;

implementation
	
constructor TChild.Create(AParent: IParent);
begin
  inherited Create;
  FParent := AParent;
  AParent.AddChild(Self);
end;

destructor TChild.Destroy;
begin
  FParent := nil;
  inherited;
end;
	
procedure TParent.AddChild(const AChild: IChild);
begin
  FChild := AChild;
end;

destructor TParent.Destroy;
begin
  if Assigned( FChild ) then
    FChild := nil;
  inherited;
end;

procedure Test;
var 
  MyParent : IParent;
  MyChild : IChild;
begin
  MyParent := TParent.Create;
  MyChild := TChild.Create(MyParent);
  MyChild := nil;
  MyParent := nil;
end;

Both parent and child are now orphaned: each still holds a reference to the other, so neither reference count ever reaches zero, and we have no reference to them and no way to free them! Ideally, the parent would control the life of the child, but the child would not control the life of the parent.

So how can we get around this? Well a technique that I have used a lot in the past is to not hold a reference to the parent in the child, but rather just a pointer to the parent.

 
TChild = class(TInterfacedObject,IChild) 
private 
  FParent : Pointer; 
  ... 
end;

constructor TChild.Create(AParent : IParent);
begin
  FParent := Pointer(AParent);
end;

function TChild.GetParent : IParent;
begin
  result := IParent(FParent); // if the parent has been released then we are passing out a bad reference!
                              // a nil reference would be preferable as it's easy to check.
end;

This works well for the most part, but it does have the potential for access violations if you do not understand, or at least know, how the child is referencing the parent.

For example :

 
var
  child : IChild;
  parent : IParent;
begin
  parent := TParent.Create;
  child := TChild.Create(parent);
  parent := nil; //parent will now be freed, since nothing has a reference to it.
  .......
  parent := child.GetParent; //kaboom
end;
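For the safe case, where the parent outlives every use of the child's back-pointer, the pointer technique can be sketched as a complete console program. This is an illustrative reconstruction rather than production code: the GetParent method on IChild and the WriteLn calls in the destructors are additions so that the destruction order is visible.

```delphi
program BackPointerSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  IChild = interface;

  IParent = interface
    ['{62DC70E1-8D82-4012-BF01-452EB0F7F45A}']
    procedure AddChild(const AChild : IChild);
  end;

  IChild = interface
    ['{E1DB1DA0-55D6-408E-8143-072CA433412D}']
    function GetParent : IParent;
  end;

  TParent = class(TInterfacedObject, IParent)
  private
    FChild : IChild; // counted reference: the parent owns the child
    procedure AddChild(const AChild : IChild);
  public
    destructor Destroy; override;
  end;

  TChild = class(TInterfacedObject, IChild)
  private
    FParent : Pointer; // uncounted back-pointer: does not keep the parent alive
    function GetParent : IParent;
  public
    constructor Create(const AParent : IParent);
    destructor Destroy; override;
  end;

procedure TParent.AddChild(const AChild : IChild);
begin
  FChild := AChild;
end;

destructor TParent.Destroy;
begin
  WriteLn('TParent.Destroy');
  FChild := nil; // releases the last reference to the child
  inherited;
end;

constructor TChild.Create(const AParent : IParent);
begin
  inherited Create;
  FParent := Pointer(AParent); // no AddRef happens here
  AParent.AddChild(Self);
end;

destructor TChild.Destroy;
begin
  WriteLn('TChild.Destroy');
  inherited;
end;

function TChild.GetParent : IParent;
begin
  // only valid while the parent is still alive
  Result := IParent(FParent);
end;

procedure Test;
var
  MyParent : IParent;
  MyChild  : IChild;
begin
  MyParent := TParent.Create;
  MyChild := TChild.Create(MyParent);
  MyChild := nil;  // the parent still holds the child
  MyParent := nil; // parent is destroyed, which then releases the child
end;

begin
  ReportMemoryLeaksOnShutdown := True; // flags anything the cycle would have leaked
  Test;
end.
```

Because the child no longer counts a reference to the parent, releasing MyParent runs TParent.Destroy first and TChild.Destroy second, and nothing is left behind.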

One of my colleagues kindly pointed out that C# doesn't suffer from this problem, and he uses circular references all the time without even thinking about it. While discussing this, he mentioned the WeakReference class in .NET. It basically allows you to hold a reference to an object without affecting its lifecycle (i.e. not influencing when it will be garbage collected). I figured there must be a way to do this in Delphi, and so set about creating a WeakReference class for Delphi.

I wasn't able to find a reliable way to do this with any old TInterfacedObject descendant; however, by creating a TWeakReferencedObject class and using generics in Delphi 2010, I did manage to implement something that works well and is not too cumbersome. Let's take a look at our Child/Parent example using a weak reference.

The important part in this is the use of the WeakReference to the parent in the Child class. So instead of declaring

FParent : IParent;

we have

FParent : IWeakReference<IParent>;

We create it using

FParent := TWeakReference<IParent>.Create(parent); //value is an IParent instance

This is how our TChild.GetParent /SetParent methods look now :

 
function TChild.GetParent: IParent; 
begin 
  if FParent <> nil then 
    result := FParent.Data as IParent 
  else 
    result := nil; 
end; 

procedure TChild.SetParent(const value: IParent); 
begin 
  if (FParent <> nil) and FParent.IsAlive then 
    FParent.Data.RemoveChild(Self); 
  FParent := nil; 
  if value <> nil then 
    FParent := TWeakReference<IParent>.Create(value); 
end; 

Note the use of the IsAlive property on our weak reference: it tells us whether the referenced object is still available, and provides a safe way to get a concrete reference to the parent.

I still think this is something that could be solved in a better way by the Delphi compiler/VCL guys n girls.

Hopefully someone will find this useful, the code is available for download here - Updated Sunday 28/3/2010

Feedback welcome. I'm about to start making extensive use of this code, so if you see any holes then please do let me know!