I've been wondering about more capable searching, e.g. auto-generating a search for FormTemplateSystem, which would require AND capability.

In an idle moment I changed Search.pm so it could search using pure Perl (with topics fully read by Store routine). Conclusions and thoughts:

  • Code change was very simple
  • On my home test TWiki installation (Windows 2000) the performance was not noticeably different from using grep. The only big web was TWiki, with both searches taking 2-3 seconds.
  • Was trivial to then add AND searching and search that covers both topic name and topic content.
  • Perhaps this isn't too surprising, as matching in Perl is well optimised.
  • This implementation could be made faster by matching each line as it's read and abandoning the file read once the search matches, but this will only help where there are many topics that match.
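The line-by-line idea in the last bullet can be sketched as follows. This is only an illustration in Python, not the Perl change made to Search.pm (which is not shown in this topic); the function names `first_match_line` and `topic_matches` are made up for the example.

```python
import re
from typing import Iterable


def first_match_line(lines: Iterable[str], pattern: str) -> int:
    """Return the 1-based number of the first matching line, or 0 if none.

    Lines are consumed lazily, so reading stops at the first match.
    """
    regex = re.compile(pattern)
    for number, line in enumerate(lines, start=1):
        if regex.search(line):
            return number  # abandon the rest of the input here
    return 0


def topic_matches(path: str, pattern: str) -> bool:
    """Hypothetical helper: True if any line of the topic file matches."""
    # A file handle iterates line by line, so a hit near the top of a
    # long topic avoids reading the remainder of the file.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return first_match_line(handle, pattern) > 0
```

The saving only appears for topics that do match; a topic with no hit is still read to the end.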

I've coded it so that if the egrep/fgrep value from TWiki.cfg is false, it switches to the Perl-based search.
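A minimal sketch of that fallback, in Python for illustration only. The config keys `egrepCmd` and `fgrepCmd` and the rule "fall back if either is false" are assumptions; the post does not say exactly which values are tested.

```python
def choose_search_backend(config: dict) -> str:
    """Pick "grep" when both commands are configured, else "builtin".

    The key names are assumed for this example, not taken from TWiki.cfg.
    """
    egrep = config.get("egrepCmd")
    fgrep = config.get("fgrepCmd")
    # An empty string, None, 0 or a missing key all count as false,
    # which mirrors how a false value would be treated in Perl.
    if egrep and fgrep:
        return "grep"
    return "builtin"
```

With this shape, a Windows installation without grep would just leave the two values empty.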

Shall I add to the core? If so and everyone finds it as fast as grep then we could drop the grep dependency.

-- JohnTalintyre - 23 Jun 2001

Would be excellent if there is no noticeable performance hit! What kind of commands do you specify for an AND search?

Could you do some timing with writeDebugTimes? Keep in mind that the biggest performance hit with the current search is not grep (one system call), but the external rcs calls (hundreds of calls); and those will go away once we get it from the topic meta data. It would be interesting to know the timings of the grep search vs internal Perl search with rcs calls disabled (for tests change Store routine to return the same dummy default values for version and author).

In case there is no noticeable performance hit we can put it into the core. In that case we should do the search in TWiki::Store so that later on a different back-end can be created without impacting the rest of the code. The search function in Store should return a list of hits with the topic info (topic name, timestamp, version, author) so that the file needs to be opened only once.
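To make the "open the file only once" point concrete, here is a rough Python sketch of a store-level search that returns hit records. It is not TWiki::Store code. The `Hit` type, the `search_store` name, the in-memory `topics` dict and the exact `%META:TOPICINFO{...}%` first-line layout are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import Dict, List

# Assumed layout of a meta line at the top of a topic.
META_LINE = re.compile(r'^%META:TOPICINFO\{(.*)\}%\s*$')
META_FIELD = re.compile(r'(\w+)="([^"]*)"')


@dataclass
class Hit:
    topic: str
    timestamp: str
    version: str
    author: str


def search_store(topics: Dict[str, str], pattern: str) -> List[Hit]:
    """Search topic bodies; return topic info taken from the same text."""
    regex = re.compile(pattern)
    hits = []
    for name in sorted(topics):
        parts = topics[name].split("\n", 1)
        meta = META_LINE.match(parts[0])
        if meta:
            fields = dict(META_FIELD.findall(meta.group(1)))
            body = parts[1] if len(parts) > 1 else ""
        else:
            # No meta line: the whole text is body, topic info stays empty.
            fields = {}
            body = topics[name]
        if regex.search(body):
            hits.append(Hit(name, fields.get("date", ""),
                            fields.get("version", ""),
                            fields.get("author", "")))
    return hits
```

Because the caller only sees `Hit` records, a different back-end could fill them in some other way without changes to the search page code.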

-- PeterThoeny - 23 Jun 2001

I'll try and do some timings over the next couple of days. With version and author in meta tags there's no RCS hit (I think). That is an advantage of upgrading all topics to meta format, rather than waiting for individual saves. For AND I simply looped around a match //, breaking out if there was a match.
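One possible reading of that loop, sketched in Python rather than the Perl actually used: each term is tried in turn and the loop stops at the first term that fails, so a topic is a hit only if every term matched. The function name `matches_all` and the choice to search topic name and text together are assumptions for the example.

```python
import re
from typing import Iterable


def matches_all(terms: Iterable[str], topic_name: str, text: str) -> bool:
    """True only if every term matches the topic name or the topic text."""
    # Searching name and content together covers both, as described
    # earlier in this topic.
    haystack = topic_name + "\n" + text
    for term in terms:
        if not re.search(term, haystack):
            return False  # one missing term rules the topic out
    return True
```

For example, `matches_all(["Form", "Template"], "FormTemplateSystem", "")` is true on the topic name alone.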

-- JohnTalintyre - 24 Jun 2001

Oh well, had to be too good to be true. On a Solaris box I found grep about 10 times faster than Perl for a large web. So the only plus points are that it is easier to change functionality, and that it could be useful as an option to save people getting grep working on Windows.

-- JohnTalintyre - 27 Jun 2001

  1. I've never had a problem getting grep to work on Windows :-)
  2. I might really want AND type functionality and when I do so I might be willing to wait.

How about making it part of the AdvancedSearch capability?

-- MartinCleaver - 27 Jun 2001

Even though I still want to take a stab at an AdvancedSearch implementation, and I'm convinced that it should probably be the next milestone after the TWikiReleaseSpring2001 is out (but I have no time to put into that in the near future), here are my two cents.

An easy way to have the AND / OR capability even for this upcoming release is to use repeated greps on each of the ANDed terms. If the performance hit wrt Perl is as you say, that would reduce it by a factor of 5 for most searches.
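The repeated-grep idea could look roughly like this. It is a Python sketch under assumptions, not the script that was eventually written: `grep_files` and `and_search` are made-up names, terms are treated as fixed strings, and a `grep` that supports `-l`, `-F` and `-e` is assumed to be on the path.

```python
import subprocess
from typing import Callable, List, Sequence


def grep_files(term: str, files: Sequence[str]) -> List[str]:
    """Return the files containing the fixed string term, via one grep call."""
    if not files:
        return []
    result = subprocess.run(
        ["grep", "-l", "-F", "-e", term, "--", *files],
        capture_output=True, text=True)
    # grep exits with 0 on a match, 1 on no match, and above 1 on error.
    if result.returncode > 1:
        raise RuntimeError(result.stderr.strip())
    return result.stdout.splitlines()


def and_search(terms: Sequence[str], files: Sequence[str],
               matcher: Callable[[str, Sequence[str]], List[str]] = grep_files
               ) -> List[str]:
    """AND search: each term is only searched in files that survived so far."""
    remaining = list(files)
    for term in terms:
        if not remaining:
            break  # nothing left to narrow down
        remaining = matcher(term, remaining)
    return remaining
```

This costs one grep call per term, and each later call runs over a shorter file list than the one before.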

-- EdgarBrown - 27 Jun 2001

The AND searching is now implemented as a separate Perl script called andgrep, easily used with any TWiki version. See CategorySearchForm for more.

-- RichardDonkin - 17 Jul 2001

TopicClassification:
FeatureBrainstorming
Topic revision: r7 - 2001-07-17 - RichardDonkin
 