Introduction
Scenario: you are writing an application for a web site that takes textual user input and stores it permanently, in a database for example, for later presentation on the site. A simple example application might be a discussion forum.
Depending on the nature of your user population, you may need to limit the possibility of offensive text being entered, and presented back, via your web site. Several options spring to mind as to how to achieve this, including:
.NET Solution
In keeping with the OO nature of ASP.NET I'll implement this as a user control. To encourage even better re-use you could implement it as a custom control, perhaps a composite one, but this has its cons as well, not least the added complexity. Look out for another article on this topic!
So, you need to register the user control with your aspx page:
changing the path to the source file (.ascx), of course, to match your own setup. The next step is to instantiate the control in your web form:
There are several options for how and when to utilise the component but typically you'll have an ASP.NET web form which would include a server side button control, for example:
as well as the ASP:textbox you are using to capture textual data.
When clicked, btnSubmit posts back to the server and causes the specified sub to be run there (assuming any specified client-side validation is satisfied).
So, we now have the textual information accessible via ASP.NET's server-side model, and within the btnSubmit sub we can check it via our user control before updating the chosen data store. Along the lines of:
From this you can see we are accessing the CheckString function of our user control, cleanup1, submitting the suspect text for checking and returning a 'cleaned up version'. In this instance we're actually highlighting any suspect words to the user before storing the data to give them an opportunity to alter their prose.
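The check-before-store flow described above can be sketched independently of ASP.NET. The Python below is only an illustration of the logic, not the article's handler: `handle_submit`, `checker` and `store` are hypothetical names, and the checker is passed in as a plain function so that any cleaning routine can be plugged in.

```python
def handle_submit(text, checker, store):
    """Store the text only if the checker leaves it unchanged.

    checker: hypothetical function returning a cleaned-up copy of the text.
    store:   hypothetical function that persists accepted text.
    """
    cleaned = checker(text)
    if cleaned != text:
        # Something was masked: hand the cleaned version back so the user
        # can see what was flagged and revise it before anything is stored.
        return {"stored": False, "text": cleaned}
    store(text)
    return {"stored": True, "text": text}
```

For instance, with a checker that masks one stand-in word, a clean message is stored while a flagged one is returned to the user with the mask in place.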
Next, let's look at the control itself. We obviously need to specify which words count as naughty, and I've done this with an XML document which is loaded into an ArrayList data structure, triggered by the page load event:
With this implementation you can see that the ArrayList is populated with the text of every node whose node type is text, so the schema is not wholly prescriptive. The simple XML file I used was based on:
With the actual words replaced to protect the innocent. You'll need to alter this to list the words you want to catch.
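As a rough illustration of the loading step, here is a minimal Python sketch that collects the text of every element into a list, whatever the elements are called. It is not the control's own code: the element names `words` and `word`, the sample document and the helper `load_word_roots` are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical word list; the element names are placeholders, since the
# loader below does not depend on them.
SAMPLE_XML = """<words>
  <word>word root 1</word>
  <word>word root 2</word>
</words>"""


def load_word_roots(xml_text):
    """Return the non-empty text content of every element in the document."""
    root = ET.fromstring(xml_text)
    roots = []
    for element in root.iter():
        # element.text is None or whitespace for container elements.
        text = (element.text or "").strip()
        if text:
            roots.append(text)
    return roots
```

Because the loader ignores element names, renaming or nesting the elements does not change the resulting list, which is the sense in which the schema is not prescriptive.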
Returning to CheckString, you can see that the implementation simply looks for any part-word match in the submitted text against the elements of the ArrayList and replaces each match with four asterisks. You might want to expand this functionality.
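The replacement logic can be sketched in a few lines. This is a Python approximation under stated assumptions rather than the control's actual CheckString: the name `check_string` and the `mask` parameter are hypothetical, and matching is made case-insensitive here, which the original may or may not do.

```python
import re


def check_string(text, word_roots, mask="****"):
    """Replace every occurrence of each root, even inside a longer word."""
    cleaned = text
    for root in word_roots:
        if not root:
            continue  # an empty root would match at every position
        # re.escape treats the root as literal text; the lambda keeps the
        # mask literal too. IGNORECASE is an assumption of this sketch.
        cleaned = re.sub(re.escape(root), lambda _m: mask, cleaned,
                         flags=re.IGNORECASE)
    return cleaned
```

With the stand-in root "darn", the input "This is darned silly" comes back as "This is ****ed silly": only the matching part of the word is masked, as described above.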
Finally, note that in the XML document I've used the phrase 'word root 1'. This is important: only the roots of suspect vocabulary need to be placed in the XML document, which reduces the effort involved for you. It should limit the number of XML elements you need to cover the commonly used expletives, but it also means care must be taken not to exclude perfectly acceptable words.
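To make the risk concrete, the sketch below uses the harmless stand-in root "cat" and compares plain part-word matching with one possible refinement, matching the root only at the start of a word. The refinement is a suggestion of this example, not something the control does, and `find_hits` and `prefix_only` are hypothetical names.

```python
import re


def find_hits(text, root, prefix_only=False):
    """List the pieces of text that a root would cause to be flagged."""
    pattern = re.escape(root)
    if prefix_only:
        # Require a word boundary before the root, then take the whole word.
        pattern = r"\b" + pattern + r"\w*"
    return [m.group(0) for m in re.finditer(pattern, text, flags=re.IGNORECASE)]
```

On the text "concatenate the cats", part-word matching flags both "concatenate" and "cats", whereas the prefix-only variant flags just "cats". The trade-off is that a prefix-only rule misses roots buried inside compound words, so which behaviour is preferable depends on your word list.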