ASP Kitchen: ASPWatch.com articles: VBScript Regular Expressions

VBScript Regular Expressions

Introduction

From a programming perspective, one of the biggest annoyances I have faced with VBScript is the lack of decent string handling functions. Sure, with a bit of imagination, functions such as InStr, Left and Mid can be combined to help with the job at hand. But for anyone used to the Perl programming language, VBScript can appear to be extremely inflexible.

VBScript Regular Expressions

While still some way off the effectiveness of Perl, version 5 of VBScript has much improved text handling through its support for Regular Expressions. For anyone who hasn't encountered the term, Regular Expressions have been an essential part of that nasty operating system known as Unix for many years (and were considered so useful that they were incorporated into Perl). They can be cryptic and difficult to learn, but they allow sophisticated pattern matching in strings.

Regular Expressions are quite involved (it is possible to buy whole books devoted to their use!), and the purpose of this article is simply to draw your attention to the potential of using them within VBScript. If you want to know more, then try some of the references at the bottom of the article.

Using the VBScript RegExp object

Support for Regular Expressions was added to VBScript through the RegExp object, introduced in VBScript version 5. This is the version of the VBScript scripting engine released with Internet Explorer 5. If you don't have IE5 on your server, but want to upgrade the scripting engine, then download the new scripting engine from www.microsoft.com/scripting.

If you don't know which version of the scripting engine you are using, a small script can be used to display it.
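As a minimal sketch of such a version check, the page below relies on VBScript's built-in ScriptEngine, ScriptEngineMajorVersion, ScriptEngineMinorVersion and ScriptEngineBuildVersion functions. The layout of the output line is an assumption made for illustration, not necessarily what the original script printed:

```vbscript
<%
' Sketch of a scripting engine version check for an ASP page.
' The wording of the output line is an assumption.
Response.Write ScriptEngine() & " version " & _
    ScriptEngineMajorVersion() & "." & _
    ScriptEngineMinorVersion() & "." & _
    ScriptEngineBuildVersion()
%>
```

If the major version reported is 5 or higher, the RegExp object should be available on that server.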
Using the RegExp object

Once you have determined that your web server is running the correct version of VBScript, the following lines of code will demonstrate the use of the RegExp object. The first line of code creates the string that will be searched for a sub-string:

    StringToSearch = "http://www.foo.com"

The RegExp object can then be created:

    Set RegularExpressionObject = New RegExp

This object has three properties: Pattern, IgnoreCase and Global. Pattern specifies the Regular Expression that should be searched for. IgnoreCase should be set to True for a case-insensitive search or False for a case-sensitive one (the default is False, so searches are case sensitive unless you say otherwise). Finally, the Global property should be set to True if the search should match all occurrences of the pattern, or False if just the first occurrence should be matched:

    With RegularExpressionObject

Once the RegExp object's properties have been set, it is time to test the Regular Expression. This is done using something like the line below, which uses the Test method of the RegExp object to see if the Regular Expression is found in the StringToSearch string.

    expressionmatch = RegularExpressionObject.Test(StringToSearch)

The Test method will return True if the Regular Expression was found, and False if it was not found.

    If expressionmatch Then

Finally, the RegExp object is destroyed since it is no longer required.

    Set RegularExpressionObject = Nothing

More RegExp object methods

As well as the Test method, the VBScript RegExp object has two further methods: Execute and Replace.

The RegExp Execute method

The RegExp Execute method is a more sophisticated version of the Test method. As well as seeing if the Regular Expression is found within a string, it will also return the number of matches made within that string, and the position in the string at which each match was made. As with the Test method, the RegExp object's Global and IgnoreCase properties are useful.
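The Test and Execute methods described above can be sketched together as follows. This is an illustration rather than the article's original listing: the pattern "foo", the extended search string and the names ExpressionMatches and CurrentMatch are all assumptions chosen to show the effect of the IgnoreCase and Global properties.

```vbscript
<%
' Sketch only: the pattern, the search string and the variable names
' ExpressionMatches and CurrentMatch are illustrative assumptions.
Dim StringToSearch, RegularExpressionObject
Dim ExpressionMatches, CurrentMatch

StringToSearch = "http://www.foo.com and http://www.FOO.com"

Set RegularExpressionObject = New RegExp
With RegularExpressionObject
    .Pattern = "foo"       ' the Regular Expression to look for
    .IgnoreCase = True     ' treat "foo" and "FOO" as the same
    .Global = True         ' find every occurrence, not just the first
End With

' Test only reports whether the pattern occurs at all
If RegularExpressionObject.Test(StringToSearch) Then
    Response.Write "Pattern found<br>"
End If

' Execute returns a Matches collection describing each match
Set ExpressionMatches = RegularExpressionObject.Execute(StringToSearch)
Response.Write ExpressionMatches.Count & " match(es)<br>"

For Each CurrentMatch In ExpressionMatches
    ' FirstIndex is the zero-based position of the match in the string
    Response.Write "'" & CurrentMatch.Value & "' at position " & _
        CurrentMatch.FirstIndex & "<br>"
Next

' Release the objects once they are no longer required
Set ExpressionMatches = Nothing
Set RegularExpressionObject = Nothing
%>
```

With Global and IgnoreCase both set to True, both "foo" and "FOO" are reported. Setting Global to False would report only the first match, and setting IgnoreCase to False would skip "FOO".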
The RegExp Replace method

This can be used to replace a part of a string using Regular Expression matching. For example, it can change .co.uk into .com.

Real life Regular Expressions

So far, this article has shown how to use the VBScript RegExp object to manipulate and test strings, but there is nothing here that couldn't already be done with other VBScript functions. The power of Regular Expressions only becomes apparent when more complex situations are encountered. For example, a short VBScript function can strip out all the HTML tags from strings.

The function works because it replaces HTML tags with an empty string. HTML tags are identified using the Regular Expression held in the Pattern property. This is a sequence of special characters; the description that follows corresponds to the pattern <[^>]+>. This means that an HTML tag should start with a "<". It should then contain one or more characters except for a greater than sign ">". This is indicated by enclosing the greater than sign in square brackets, and using the plus sign (which means match the preceding character one or more times). The ^ symbol denotes that the character should NOT appear. Finally, it should contain a greater than sign to close the HTML tag.

A dollar sign anchors a pattern to the end of a string, so the following will match .co.uk only at the end of a string:

    .Pattern = ".co.uk$"

Use a bar to specify that several expressions should be matched. The following will match .co.uk or .com at the end of a string:

    .Pattern = ".co.uk$|.com$"

Strictly speaking, an unescaped dot matches any single character, so a stricter pattern would escape each dot with a backslash, as in "\.co\.uk$".

Using Regular Expressions in Visual Basic

Few people seem to know that Regular Expressions can also be used with Visual Basic 5 or 6. All you need to do is to include a reference to "Microsoft VBScript Regular Expressions 1.0" in your project. Visual Basic code that replaces some HTML in a TextBox called Text1 with plain text would begin by declaring the object:

    Dim RegularExpressionObject As New VBScript_RegExp_10.RegExp

Note that more recent versions of the VBScript Regular Expressions library now exist, so it may also be possible to make a reference to "Microsoft VBScript Regular Expressions 5.5". If this library is used then the first line of the Visual Basic code will have to be modified to:

    Dim RegularExpressionObject As New VBScript_RegExp_55.RegExp

Further Reading
Author details

Brett Burridge has worked as a web developer since 1997 and has developed web applications for a range of corporations, start-up businesses and educational establishments. Brett is presently employed as an Internet developer and technical writer through his own company, Winnersh Triangle Web Solutions Limited. The company produces a number of innovative products, including a range of software documentation tools, which include the ASP Documentation Tool, the .NET Documentation Tool for VB.NET and C#, and the SQL Server Documentation Tool. Other products include The Website Utility, which functions as a website error checker, search engine optimizer and ASP/ASP.NET search engine builder application.

As well as ASPAlliance, Brett has written articles for Ariadne.ac.uk, ASPToday and the software documentation portal www.softwaredocumentation.info, and has contributed recipes to the ASP.NET Developer's Cookbook.

Outside web development, Brett is interested in travelling (with travel logs from New York, Hong Kong and Tokyo), digital photography, tropical fishkeeping and collecting contemporary works of art by artists such as Doug Hyde.

Article history

"VBScript regular expressions" originally published on ASPWatch.com on November 10 1999. Republished on ASPAlliance.com on 28 September 2001.
© page content copyright Brett Burridge 1998 - 2009.