
AccessStatsPluginDev Discussion: Page for developer collaboration, enhancement requests, patches and improved versions on AccessStatsPlugin contributed by the TWikiCommunity.
• Please let us know what you think of this extension.
• For support, check the existing questions, or ask a new support question in the Support web!
• Please report bugs below

Feedback on AccessStatsPlugin

Please feel free to discuss implementation details and new features on this page.

Description

Parses Apache access log files, including gzipped ones, to produce statistics.

To do

  • Replace the die with an error message when opening the log file fails - DONE 21 Feb 2006
  • Add support for parsing multiple access log files, i.e. the zipped log file history - DONE 21 Feb 2006
  • As Tobias highlights below, the plugin needs to be secured to prevent data mining. It could also be quite demanding for servers with a large log file history. This is acceptable in my opinion if we can restrict access to the statistics to specific users or groups. If an unauthorized user views a page that uses the ACCESSSTATS tag, we could just output an error message instead of the statistics. - OPEN
  • Currently the access log files are read and parsed once for each ACCESSSTATS tag on a page. Maybe there is a way to read the log files only once per page? Should we use the commonTagsHandler rather than a registered tag handler? - OPEN
  • As Tobias recommends below, one should be able to limit the scope of the access log search to the TWiki installation directory. A possible solution is suggested here. - DONE 27 Feb 2006
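
The "read the log file only once per page" item could be sketched roughly as below. This is Python rather than the plugin's Perl, purely for illustration, and LogCache, count_hits and the loader callback are hypothetical names, not part of the plugin or of the TWiki API. The idea is only that every tag on a page shares one lazily loaded copy of the log lines.

```python
import re


class LogCache:
    """Holds the log lines for the duration of one page rendering."""

    def __init__(self, loader):
        # loader is any callable returning an iterable of log lines
        # (hypothetical; the real plugin reads files from disk).
        self._loader = loader
        self._lines = None
        self.loads = 0  # how many times the logs were actually read

    def lines(self):
        # Read the logs on first use only; later tags reuse the result.
        if self._lines is None:
            self._lines = list(self._loader())
            self.loads += 1
        return self._lines


def count_hits(cache, pattern):
    """Count the cached log lines that match the given regular expression."""
    regex = re.compile(pattern)
    return sum(1 for line in cache.lines() if regex.search(line))
```

With one LogCache created per page rendering, ten tags on a page would cost one read of the logs plus ten in-memory scans, instead of ten reads.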

Possible new features

Could display the matched lines, or part of them (e.g. the IP address).

-- StephaneLenclud - 18 Feb 2006

Discussion

Thanks Stephane for contributing this Plugin and sharing it with the TWikiCommunity!

I made some minor changes to the Plugin topic, feel free to roll that back in the next release.

How about measuring and documenting the PluginBenchmarks numbers?

-- PeterThoeny - 20 Feb 2006

Version 1.001 is now available. I'll try to have a look at the PluginBenchmarks at some point. I would like to optimize things a bit before benchmarking, though. I'd like to find a way to read the access log files only once per page rendering instead of once for each tag.

-- StephaneLenclud - 21 Feb 2006

Is the regex of this Plugin limited to TWiki sites? If not, this Plugin possibly opens the door to unwanted data mining on the server. It should be configurable (outside of TWiki pages) to limit access to the stats of TWiki sites only. But then, it should be enough to parse TWiki's own log files to get this information instead of touching Apache's log files.

I would wish to see some notes about security on the Plugin site before using it.

-- TobiasRoeser - 25 Feb 2006

No, the regexp is not limited to TWiki sites. The idea was to get hit counts for attachments. Since that does not appear to be available through the TWiki web statistics topic, I thought of getting it from the Apache logs directly; I'm not sure you can get it from the TWiki logs, can you? You are completely right that one could use the ACCESSSTATS tag to get any kind of information from the access log. I did not bother solving that issue because my TWiki is not open for editing at the moment. I have an action open above for securing access to the statistics. I've now edited that entry in the To Do list and added a new one. An easy way to limit the regexp to the TWiki installation directory would be to disable the default parameter in the ACCESSSTATS tag. I'll publish a version in which it's easy to enable or disable usage of the default parameter through settings in the .pm file.
Thanks for your input.

-- StephaneLenclud - 27 Feb 2006
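
One way to picture the scope restriction discussed above is sketched below, in Python for illustration only (the plugin itself is Perl). The /twiki/ prefix, REQUEST_RE and count_twiki_hits are assumptions made for this example, not the plugin's actual settings or code. The user-supplied pattern is applied only to the request path, and only when that path lies under the configured prefix, so other fields of the log line (such as client IP addresses) cannot be matched.

```python
import re

# Assumption: the URL prefix under which the TWiki installation is served.
TWIKI_URL_PREFIX = "/twiki/"

# Extracts the request path from a common/combined log format line,
# e.g. ... "GET /twiki/pub/Main/Doc.pdf HTTP/1.1" 200 512
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+)')


def count_twiki_hits(lines, user_pattern, prefix=TWIKI_URL_PREFIX):
    """Count requests under prefix whose path matches user_pattern."""
    user_re = re.compile(user_pattern)
    hits = 0
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # not a request line we understand
        path = match.group(1)
        if not path.startswith(prefix):
            continue  # outside the TWiki installation: never counted
        if user_re.search(path):
            hits += 1
    return hits
```

Keeping the prefix in server-side configuration rather than in a tag parameter means it cannot be overridden from a TWiki page.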

Stephane, I've not looked at your code, so I don't know when you're parsing, or how you're making sure that you don't re-parse my 6GB logfile.

I implemented something similar for mrtg - it's committed in http://svn.twiki.org/svn/twiki/trunk/tools/admin/mrtg

and it gives us the pretty graph at http://twiki.org/~sdowideit/mrtg/twiki/twiki.html showing the number of TWiki topic requests served per 5-minute block. I guess I should be packaging up an MrtgContrib...

-- SvenDowideit - 09 Apr 2007

Thanks for the pointer. That plugin makes sure it parses 6 GB of logs for each tag :-) It does no caching and saves no persistent data. It's the very first TWiki plugin I developed, as an exercise, and its implementation is very straightforward.

I just wanted to know how many downloads I had on certain attachments. Basically, you publish a document on your site and you just want to measure the public interest in that single document.

If my web site ever becomes very popular I'll surely implement some persistence so as not to parse the whole log history constantly :-) I'm not planning to turn this into a full-blown log parser. I was just fixing an issue I'd had since I moved my server from openSUSE to Ubuntu, and adding the plugin to SVN at the same time.

-- StephaneLenclud - 09 Apr 2007

Hi guys, when I tried to use the plugin I got:

TWiki detected an internal error - please check your TWiki logs and webserver logs for more information.

Can't opendir path: Permission denied

I changed the owners to root:apache and the permissions on the /var/log/httpd directory and the access_log files to 654. It seems to need the execute bit set for the plugin to work.

-- PeterStephens - 12 Aug 2007

I think that is expected: on Linux, the execute bit on a directory is what allows access to the files inside it, and most of my directories have it set. The log files themselves are -rw-r--r--. To make sure the problem does not recur whenever a new log file is created, you have to set up your log rotation system to use the desired permissions. On my Ubuntu installation, for instance, I had to edit /etc/logrotate.d/apache2 and fix the permissions specified in there.

-- StephaneLenclud - 13 Aug 2007
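
The permission requirements discussed above can be checked with a small script. The sketch below is Python for illustration only, and check_log_access is a hypothetical helper, not part of the plugin. It has to be run as the user the web server runs as, since os.access reports what the current user may do.

```python
import os


def check_log_access(directory, file_names):
    """Return a list of human-readable permission problems (empty if none)."""
    if not os.path.isdir(directory):
        return [f"{directory}: not a directory"]
    problems = []
    # Listing a directory needs read permission; opening the files
    # inside it needs execute (search) permission.
    if not os.access(directory, os.R_OK | os.X_OK):
        problems.append(f"{directory}: need read and execute permission")
    for name in file_names:
        path = os.path.join(directory, name)
        # The log files themselves only need to be readable.
        if not os.access(path, os.R_OK):
            problems.append(f"{path}: missing or not readable")
    return problems
```

For the setup described above, one would call it with "/var/log/httpd" and the access_log file names and expect an empty list.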

Hi, Stephane:

I use %ACCESSSTATS{attachment="TeamSpace-URD.doc"}% to count the downloads of my attachment file and it shows 3! After I downloaded the file 3 more times, I can see it appear in my httpd log files 6 times, but the variable still shows 3! Does anybody have the same problem?

BTW, will the plugin parse all my access log like access_log, access_log.1, access_log.2...?

-- MagicYang - 02 Jan 2008

Hi MagicYang,

Sorry for the late answer. It should indeed parse those files and the gz files too; that was the intention anyway. However, you may find that this plug-in's behavior was tuned for my particular Apache setup and that it is not parsing some of your log files.

I'm guessing your problem comes from the fact that your access files are named access_log whereas mine are named access.log. To fix that, open the AccessStatsPlugin.pm file and check the values of $accessLogFileName and $accessLogDirectory. If it's still not working you may want to take a look at the getAccessLogLines sub.

-- StephaneLenclud - 19 Feb 2008
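
The naming issue described above (access_log versus access.log, plus rotated and gzipped copies) can be illustrated with the sketch below. It is Python for illustration only and does not reproduce the plugin's getAccessLogLines sub; select_log_files and open_log are hypothetical names, and the rotation naming scheme (base name, then an optional .N, then an optional .gz) is an assumption about a typical logrotate setup.

```python
import gzip
import re


def select_log_files(names, base_name):
    """Pick base_name, base_name.N and base_name.N.gz out of a directory listing."""
    # re.escape keeps the dot in "access.log" literal, so a base name of
    # "access.log" does not accidentally match "access_log".
    pattern = re.compile(r"^" + re.escape(base_name) + r"(\.\d+)?(\.gz)?$")
    return sorted(name for name in names if pattern.match(name))


def open_log(path):
    """Open a plain or gzipped log file for reading as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", errors="replace")
    return open(path, "r", errors="replace")
```

A mismatch between the configured base name and the names on disk simply yields fewer files, which would explain a hit count that stops growing.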

MartinSeibert: Thanks for your interest in this plug-in. However, I'm not planning to add screenshots to the documentation page. All it does is replace the %ACCESSSTATS% tag with a number. For instance, it can give you the number of times an attachment was downloaded.

-- StephaneLenclud - 19 Feb 2008

Stephane: Okay. Thank you.

-- MartinSeibert - 20 Feb 2008

That plug-in really needs caching now. At least my server does since it now needs to parse 2 years of logs for each %TAG%. Any Perl/TWiki API I should use for storing my cached data?

Some ideas:

  • We could cache the hit counts of compressed logs and get the final number on the fly by adding results from uncompressed logs.
  • We could use a crontab job to do the caching for us, but then how on earth would the job know which regex/attachment/topic to look for?
  • We could also possibly use some AJAX magic to parse the logs asynchronously.

-- StephaneLenclud - 20 Feb 2008
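
The first idea in the list above, caching the counts for compressed logs and adding the live counts on top, might look roughly like the sketch below. It is Python for illustration only; total_hits, the fingerprint field and count_fn are hypothetical. The sketch assumes that a compressed log never changes once written and that it can be identified by a rotation-proof fingerprint (for example size plus modification time), because rotation renames the files.

```python
def total_hits(files, pattern, cache, count_fn):
    """Sum hits over all log files, reusing cached counts for compressed ones.

    files    -- iterable of (path, is_compressed, fingerprint) tuples
    cache    -- dict mapping "fingerprint|pattern" to a hit count; it could
                be persisted between requests, e.g. as a JSON file
    count_fn -- callable (path, pattern) -> int that scans one file
    """
    total = 0
    for path, is_compressed, fingerprint in files:
        if is_compressed:
            # Compressed history is assumed immutable: scan it once only.
            key = f"{fingerprint}|{pattern}"
            if key not in cache:
                cache[key] = count_fn(path, pattern)
            total += cache[key]
        else:
            # Live logs keep growing, so they are always rescanned.
            total += count_fn(path, pattern)
    return total
```

Only the uncompressed logs are scanned on each request, so the cost per tag no longer grows with the length of the log history.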

I've seen the light after installing AWStats on my machine. Sooner or later I'm planning to completely modify that plug-in to get statistics from the Codev.AWStats database.

I wonder if I should just deprecate the current behavior or implement a separate AWStatsPlugin instead? Keeping the plug-in in its current state does not make much sense as it does not scale at all. Pages just won't load on my server with only one year of logs.

-- StephaneLenclud - 16 Apr 2008

Can you tell me why this plugin reads the Apache log and not the TWiki log?

-- VickiBrown - 2009-10-20

I changed the modification policy of this plugin to PleaseFeelFreeToModify after checking with StephaneLenclud.

-- PeterThoeny - 2011-05-03

Topic revision: r21 - 2011-05-03 - PeterThoeny
Copyright © 1999-2017 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.