
Is It Human?
... or Creative use of Server Variables

    Have you ever wondered how many of those clicks on your hit counter are just spiders munching everything in sight? Want to know how to catch the "real people"? For example, I recently built an all-graphics site that wouldn't register with the search engines because there was no body text. As a workaround, I built a secondary page which displayed relevant keywords to robots and redirected users to the real site, with the help of a fairly simple function.

    The function is called isHuman, and it returns a boolean value: true for human, false for robot. Pretty simple, if you think about it... but how in the world do you pull off the mask? If you haven't met before, I'd like to introduce you to the Server Variables collection.

    Many of you already know about these variables, and have used them for everything from security to personalization. Take notice of the one called HTTP_USER_AGENT; it contains a string which identifies the browser name, version, and operating system. You may also notice, if you're using a Microsoft browser, that the word "Mozilla" (a Netscape trademark word) heads up the string. This dates back to the early IE days when a new-to-the-web MS was trying to make its browser accepted, and the best way to tell the server was to say "It's just like Netscape"... but that's for another story.

    Your user agent is CCBot/1.0 (+http://www.commoncrawl.org/bot.html)
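
    A line like the one above can be produced with very little code. The following is a minimal sketch, assuming it runs inside an ASP script block; the variable name strAgent is only illustrative. The value is passed through Server.HTMLEncode because the header is supplied by the client and should not be written into the page raw.

```vbscript
' Read the raw User-Agent header sent by the client.
Dim strAgent
strAgent = Request.ServerVariables("HTTP_USER_AGENT")

' The header is client-supplied, so encode it before writing it into HTML.
Response.Write "Your user agent is " & Server.HTMLEncode(strAgent)
```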

    So we can see the browser type. How can we use that? Well, as it turns out, certain keywords are unique to "human" browsers, and others are unique to the robots. The following function, called isHuman, returns a boolean (true/false) value based on three tests for these keywords.


Function isHuman()

   Dim strBrowser, strAccepted, strRejected, strCrawlers
   Dim arrAccepted, arrRejected, arrCrawlers
   Dim strHumanCookie
   Dim intCount
   Dim booIsHuman
   Dim strRefresh
		

    First we set booIsHuman to determine the default value of the function (True for unknown browsers). If you wish, you can change this initialization or eliminate it and trap a type mismatch error if the test is inconclusive (you could also make it a string and return whatever you want, but this way seems cleaner to me).

   booIsHuman = True
		

    If the function has been run already during this visit AND if the browser supports cookies, we can save time and processor cycles by finding the isHuman cookie and bypassing the rest of the function.

   strHumanCookie = Request.Cookies("isHuman")

   If strHumanCookie = "Y" Then
      booIsHuman = True
   ElseIf strHumanCookie = "N" Then
      booIsHuman = False
   Else
		

    Next we retrieve the user agent string. Creating the ServerVariables collection has considerable overhead, so we'll do it only if the user isn't already labeled. After that, we'll build the criteria strings for our three tests. FYI, the criteria that ship with this function are derived from Juan Llibre's latest version of browscap.ini, downloaded from asp.net.do.

      strBrowser = UCase(Request.ServerVariables("HTTP_USER_AGENT"))

      ' This is a list of keywords that will be found in "human" browsers.
      strAccepted = "Mozilla|PRODIGY|NaviPress|Lynx|libwww|amaya|iCab|Cyberdog|Mosaic"
      strAccepted = strAccepted & "|O'Reilly|HotJava|Java1|JDK|Nokia|Amiga|IBrowse"
      strAccepted = UCase(strAccepted)
	  
      ' This is a list of keywords that will be found in robot browsers.
      strRejected = "Powermarks|BorderManager|NetMind|EZResult|WebWhacker"
      strRejected = strRejected & "|Robot|Crawler"
      strRejected = UCase(strRejected)
	  
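      ' This is a list of complete user agent strings for known robot browsers.
      ' No square brackets around the entries: Test 1 compares each one to the
      ' whole agent string, and a real agent string never includes them.
      strCrawlers = "Mozilla 4.0|Mozilla/4.x (Win95)"
      strCrawlers = strCrawlers & "|Mozilla/5.0 (compatible; MSIE 5.0)"
      strCrawlers = UCase(strCrawlers)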

      arrAccepted = Split(strAccepted,"|")
      arrRejected = Split(strRejected,"|")
      arrCrawlers = Split(strCrawlers,"|")
		

    Test 1: check to see if the entire string matches that of a known spider/bot; this is only for the few that do not contain a specialized bot keyword. This test will automatically disqualify the agents in arrCrawlers.

      For intCount = 0 To UBound(arrCrawlers)
         If strBrowser = arrCrawlers(intCount) Then
            booIsHuman = False
            Exit For
         End If
      Next
		

    Test 2: check for the presence of an "accepted browser" keyword; this will let through a few robot agents, which will be picked up by the final test. Because booIsHuman starts out True, a match here simply confirms the default.


      If booIsHuman = True Then
         For intCount = 0 To UBound(arrAccepted)
            If InStr(strBrowser,arrAccepted(intCount)) > 0 Then
               booIsHuman = True
               Exit For
            End If
         Next
      End If
		

    Test 3: check for the presence of a "rejected bot" keyword; any bots that make it through the Accepted test will be screened out here.

      If booIsHuman = True Then
         For intCount = 0 To UBound(arrRejected)
            If InStr(strBrowser,arrRejected(intCount)) > 0 Then
               booIsHuman = False
               Exit For
            End If
         Next
      End If
		

    We're almost done! Set the cookie to label the user for next time. Cookies travel in the HTTP headers, so unless the function runs before any page output, remember to use Response.Buffer = True; otherwise your application will fail with the "The HTTP headers are already written to the client browser" error.

      If booIsHuman = True Then
         Response.Cookies("isHuman") = "Y" 
      ElseIf booIsHuman = False Then
         Response.Cookies("isHuman") = "N"
      Else
         booIsHuman = "Unknown Browser"
      End If
   
   End If
		

    Pass the value out the door, and the suspense is over. You'll be relieved to know that you've been declared human.

   isHuman = booIsHuman

End Function
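
    The three tests above all walk a pipe-delimited list, so the matching logic can be tried on its own. The following is a rough sketch, not part of the original function: the helper names ContainsAnyKeyword and MatchesAnyEntry and the sample agent strings are made up for illustration. It assumes it is saved as a .vbs file and run with cscript under Windows Script Host, which is why it uses WScript.Echo instead of Response.Write.

```vbscript
Option Explicit

' Hypothetical helper: True if strText contains any keyword from a
' pipe-delimited list. The comparison ignores case.
Function ContainsAnyKeyword(strText, strKeywords)
   Dim arrKeywords, intCount
   ContainsAnyKeyword = False
   arrKeywords = Split(strKeywords, "|")
   For intCount = 0 To UBound(arrKeywords)
      ' vbTextCompare makes InStr case-insensitive, so no UCase is needed.
      If InStr(1, strText, arrKeywords(intCount), vbTextCompare) > 0 Then
         ContainsAnyKeyword = True
         Exit For
      End If
   Next
End Function

' Hypothetical helper: True if strText equals any whole entry in the list.
Function MatchesAnyEntry(strText, strEntries)
   Dim arrEntries, intCount
   MatchesAnyEntry = False
   arrEntries = Split(strEntries, "|")
   For intCount = 0 To UBound(arrEntries)
      If StrComp(strText, arrEntries(intCount), vbTextCompare) = 0 Then
         MatchesAnyEntry = True
         Exit For
      End If
   Next
End Function

' Sample agent strings, chosen for illustration only.
WScript.Echo CStr(ContainsAnyKeyword("ExampleCrawler/1.0", "Robot|Crawler"))
WScript.Echo CStr(ContainsAnyKeyword("Lynx/2.8", "Robot|Crawler"))
WScript.Echo CStr(MatchesAnyEntry("Mozilla 4.0", "Mozilla 4.0|Mozilla/4.x (Win95)"))
```

    The first and third calls report a match and the second does not. Passing vbTextCompare is a design choice that differs from the function above, which gets the same effect by upper-casing both sides before comparing.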
         

    Download the entire documented function for your own use; add it to your library or include it solo.
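
    Once the function is in your library, the gateway page described at the start of this article might look something like the sketch below. It assumes the code sits in an ASP script block at the very top of the page and that isHuman() has already been made available, for example through a server-side include. The redirect target main.asp and the placeholder keyword text are hypothetical.

```vbscript
' Runs at the very top of the gateway page, before any HTML is sent.
' Assumes isHuman() is available, for example through a server-side include.

' Buffer the output so isHuman() can still set its cookie header.
Response.Buffer = True

If isHuman() Then
   ' Hypothetical address of the all-graphics site.
   Response.Redirect "main.asp"
Else
   ' Robots stay on this page and are served the keyword text.
   Response.Write "<p>Keywords describing the site go here.</p>"
End If
```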

peterbrunone@aspalliance.com


 

