r2 - 28 Apr 2005 - 13:00:24 - CrawfordCurrie

Spider Hit Detection

It would be useful to distinguish between spider hits and user hits on a wiki. This could make statistics more valid and help us understand which search engines have indexed a site.

There seem to be a few ways of doing this:

  1. change TWiki.pm's writeLog to write the agent into the $extra field (done - SVN 4099); it seems that field is never used when the user is not identified?
  2. alter the "do remember remote user" block in TWiki/Users.pm so that it knows the IP addresses of spiders, and change the registered user to a bot accordingly.
  3. write a plugin to record hits by spiders

(3) would not help us factor out records. (2) would be dependable for the major bots but impossible for bots run by individuals, and it depends on a feature that many feel should disappear (because of NAT problems). (1) requires a table to round out/munge the hundreds of agents listed at http://www.psychedelix.com/agents.html into useful classifications.
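To make the table idea in (1) concrete, here is a minimal sketch of how agent strings might be munged into a handful of classes. It is written in Python rather than as TWiki code, and everything in it is an assumption for illustration: the name `classify_agent`, the class labels, and the few patterns shown are hypothetical, and a usable table would need many more entries.

```python
import re

# Hypothetical classification table: (pattern, label), checked in order.
# The patterns are illustrative only and far from complete.
AGENT_CLASSES = [
    (re.compile(r"googlebot", re.I), "spider:google"),
    (re.compile(r"slurp", re.I), "spider:yahoo"),
    (re.compile(r"msnbot", re.I), "spider:msn"),
    # Catch-all for agents that announce themselves as bots.
    (re.compile(r"bot|crawl|spider", re.I), "spider:other"),
]


def classify_agent(agent):
    """Return a coarse label for a User-Agent string (hypothetical helper)."""
    if not agent:
        return "unknown"
    for pattern, label in AGENT_CLASSES:
        if pattern.search(agent):
            return label
    # Anything that matches no spider pattern is treated as a user hit.
    return "user"
```

A statistics script could call `classify_agent` on the agent string that writeLog stores and count "user" hits separately from the "spider:*" classes. Ordering matters: the specific patterns come before the catch-all so that known engines keep their own label.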

Thoughts?

-- MartinCleaver - 24 Apr 2005

There is no specific proposal here, so it shouldn't be assigned to DakarRelease.

 