Remove White Space from Text

By Steven Smith

[Example/ClassicASP]
[Example/VB.Net]
[Example/C#]

In the course of improving this website's search engine, I wrote a routine that extracts the text from an article given its URL, strips out the HTML, and converts every run of white space, including carriage returns and newlines, into a single space. This reduces the size of the text, which is then stored in the database and used for full-text searches. To do the white space replacement, I used regular expressions (with some help from Remas).
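
As a rough illustration of the pipeline just described, here is a minimal sketch in plain JavaScript. It is not the routine used on the site: the function name extractSearchText is hypothetical, the tag-stripping pattern is a naive assumption (the article does not show how the HTML was removed), and the final trim() is an extra step.

```javascript
// Hypothetical helper: the article does not show its HTML-stripping code.
function extractSearchText(html) {
  // Naive tag removal (assumption): swap each tag for a space so that
  // words separated only by markup do not run together.
  const withoutTags = html.replace(/<[^>]*>/g, " ");
  // Collapse every run of white space, newlines included, to one space.
  // trim() is an extra step that is not part of the article's function.
  return withoutTags.replace(/\s+/g, " ").trim();
}

const page = "<h1>Title</h1>\n<p>First   line.\r\n\tSecond line.</p>";
console.log(extractSearchText(page));
// prints: Title First line. Second line.
```

A pattern like this is only good enough for building a search index from simple markup; it does not handle script blocks, comments, or a ">" inside an attribute value.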

My code was written using ASP and VBScript (version 5.5 for RegExp support), but I'll show how it can easily be done in ASP.NET. For a very quick intro to RegExp in ASP.NET, see my previous article, Replace In ASP.NET.

First, let's look at the source code of the ASP function:

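Function RemoveWhiteSpace(strText)
    Dim RegEx
    Set RegEx = New RegExp
    ' \s+ matches one or more white space characters, including line breaks
    RegEx.Pattern = "\s+"
    RegEx.Multiline = True
    ' Global = True replaces every match, not just the first one
    RegEx.Global = True
    strText = RegEx.Replace(strText, " ")
    RemoveWhiteSpace = strText
End Function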
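
To make the behavior of the "\s+" pattern concrete, the same idea can be sketched in plain JavaScript. This is an illustration rather than the article's code; the name removeWhiteSpace simply mirrors the function above, and the sample string is made up.

```javascript
// Plain JavaScript sketch of the same idea; not the article's ASP code.
function removeWhiteSpace(text) {
  // \s matches spaces, tabs, carriage returns and line feeds.
  // The g flag plays the role of RegEx.Global = True.
  // No m flag is needed in JavaScript: it only changes how ^ and $ match.
  return text.replace(/\s+/g, " ");
}

const sample = "one \t two\r\n\r\nthree\n";
console.log(JSON.stringify(removeWhiteSpace(sample)));
// prints "one two three "
```

Note that leading or trailing white space is reduced to a single space rather than removed, so a caller that wants none at the ends would need to trim the result separately.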

OK, now let's see how it would be done in ASP.NET. Just to make this article more interesting, the example was written in all three standard languages of .NET (VB, C#, and JScript); the VB version is listed here.

<%@ Import Namespace="System.IO" %>
<%@ Import Namespace="System.Text.RegularExpressions" %>
<script language="VB" runat="server">

Sub SubmitBtn_Click(sender As Object, e As EventArgs)
    Dim strInput As String
    Dim strOutput As String
    strInput = Text1.Text
    ' Replace each run of white space (including line breaks) with one space
    strOutput = Regex.Replace(strInput, "\s+", " ")
    output.Text = strOutput
End Sub

</script>
<html>
<body>
<a href="/stevesmith/articles/removewhitespace.asp">Return To Article</a>
<form runat="server">
<table width="100%">
<tr>
    <td valign="top">Add Text, including line breaks, etc.</td>
    <td valign="top">Text, in PRE tags, without whitespace (may scroll right a long way)</td>
</tr>
<tr>
    <td valign="top">
    <asp:TextBox TextMode="multiline" id="Text1" width="200px"
    height="80px" runat="server" />
    <asp:Button OnClick="SubmitBtn_Click" Text="Format Text" Runat="server"/>
    </td>
    <td valign="top">
    <pre><asp:label id="output" runat="server" /></pre>
    </td>
</tr>
</table>
</form>
</body>
</html>

The full source of the example is shown. You can run the example and see how it works.

Steven Smith, MCSE + Internet (4.0)
Last Modified: 6/12/2009 10:58:23 AM