TWiki Formatted Search in Topics

Abstract

%SEARCH is extended to format and display each location in a topic that meets the specified search criteria.

Overview

Normally, the results of a FormattedSearch summarize each topic meeting the search criteria. The format="..." parameter gives considerable flexibility in how the topic information is presented. With the patch below, %SEARCH is extended to format and display each location in a topic that meets the specified search criteria. This is done by adding a hitformat parameter to %SEARCH and generalizing the format string in an upward-compatible manner.

Syntax

To indicate that a FormattedSearchinTopics is desired, a hitformat="..." parameter is specified in the %SEARCH{...}%. Whenever a hit is found in a topic, the hitformat string is used to format it: the "$hit" in that string is replaced by the indicated text of the hit. (See the discussion below on $pattern(...) for how the "indicated text" is specified.)

format="..." is used as in a normal FormattedSearch, except that the $pattern(...) variable is mandatory and specifies exactly what in a topic constitutes a hit. Unlike a normal FormattedSearch, the character following $pattern need not be "("; it may be any character, and it serves as the starting delimiter. The ending delimiter of the $pattern specification is the same character, except that '(' is closed by ')', '<' by '>', '{' by '}', and '[' by ']'. Any use of the starting or ending delimiter within the actual pattern must be preceded by '\'.

This generalization of delimiters is available because, as discussed below, every pattern string must indicate the hit by use of (...) within the string, and it gets untidy adding '\' everywhere. A wise choice of delimiters allows easier specification of a pattern string without excessive '\'s.

Examples of pattern strings would be

      $pattern(abc\(def\)ghi)
      $patternxabc(def)ghix
      $pattern@abc(def)ghi@
      $pattern<abc(def)ghi>
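
The delimiter rules can be illustrated with a small Python sketch. This is not the patch's Perl code; the function name split_format and the decision to drop the backslash only in front of an escaped delimiter (leaving every other escape, such as \n, untouched) are assumptions based on the description above.

```python
# Hypothetical helper illustrating the $pattern delimiter rules.
# Not taken from the patch; names and details are assumptions.
CLOSERS = {"(": ")", "<": ">", "{": "}", "[": "]"}


def split_format(fmt):
    """Split a format string into (prepattern, pattern, postpattern)."""
    marker = "$pattern"
    start = fmt.find(marker)
    if start < 0 or start + len(marker) >= len(fmt):
        raise ValueError("format string needs $pattern plus a delimiter")
    opener = fmt[start + len(marker)]
    # Bracket-like openers get their mirror image; anything else closes itself.
    closer = CLOSERS.get(opener, opener)
    pattern = []
    i = start + len(marker) + 1
    while i < len(fmt):
        ch = fmt[i]
        if ch == "\\" and i + 1 < len(fmt):
            nxt = fmt[i + 1]
            # An escaped delimiter loses its backslash; other escapes stay as-is.
            pattern.append(nxt if nxt in (opener, closer) else ch + nxt)
            i += 2
        elif ch == closer:
            return fmt[:start], "".join(pattern), fmt[i + 1:]
        else:
            pattern.append(ch)
            i += 1
    raise ValueError("unterminated $pattern")
```

Under these assumptions, all four example strings above yield the same pattern string, abc(def)ghi.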

The pattern string is a Perl RegularExpression, in which each nested (...) (or \(...\) if '(' and ')' were chosen as delimiters) specifies what will be substituted for $hit in the hitformat="..." string if the entire pattern string matches some text in the topic. This permits only a portion of a hit to be selected. Note that the search within a topic is always case-insensitive.
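
As a rough illustration of how groups select the hit text, here is a Python sketch. It uses Python's re module in place of Perl regular expressions (the two agree for simple patterns like these), and the function name find_hits is hypothetical. That every group of every match yields its own hit, in order, is an assumption.

```python
import re


def find_hits(pattern, text):
    """Return the text captured by each (...) group of each match.

    Matching is case-insensitive, as described for searches within a topic.
    """
    hits = []
    for match in re.finditer(pattern, text, re.IGNORECASE):
        # Only the grouped portions count as hits, not the whole match.
        hits.extend(g for g in match.groups() if g is not None)
    return hits
```

For instance, find_hits(r"license: (\w+)", "License: GPL") selects only "GPL", even though the whole pattern had to match "License: GPL".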

To prevent display of the actual %SEARCH{...}% string in the search results, hits with the string "hitformat=" in them are ignored, and hits with "%SEARCH" in them will not cause a second %SEARCH to be performed.
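
The first of these rules amounts to a simple filter, sketched here in Python. The function name visible_hits is hypothetical, and the sketch does not model how a nested %SEARCH is prevented from being expanded.

```python
def visible_hits(hits):
    """Drop hits that contain the text 'hitformat=' (assumed substring test)."""
    return [hit for hit in hits if "hitformat=" not in hit]
```

This is why a line holding the %SEARCH{...}% call itself does not show up among the results, while an ordinary line mentioning the search term does.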

Designing a FormattedSearchinTopics

There are several processing steps in a FormattedSearchinTopics, and different information is available at each step for output as the results:

  • The webs specified by the web="..." parameter are searched for topics matching the "text", search="text", and topic="..." parameters. These parameters (as modified by regex="..." and similar parameters) determine which topics will be inspected for hits. In essence, this is a quick first-level search that limits the topics to be searched more carefully for hits.
  • For these topics, the hitformat and format parameter strings are used to format the output as follows:
    • The hitformat string is divided into three parts: the prehit text before the "$hit" string, the "$hit" string itself, and the posthit text after the "$hit" string.
    • The format string is also divided into three parts: the prepattern string before the $pattern(...), the pattern string in the $pattern(...), and the postpattern string after the $pattern(...).
    • The pattern string has various hit strings indicated by (...).
  • Taking liberties with new lines, the output of a successful FormattedSearchinTopics will be presented as follows:
       Optional header string as specified by the header="..." parameter
       prepattern string for topic 1
          prehit string
          hit #1 in topic 1
          posthit string
          prehit string
          hit #2 in topic 1
          posthit string
          ...
       postpattern string for topic 1
       prepattern string for topic 2
          prehit string
          hit #1 in topic 2
          posthit string
       ...
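
The layout above can be mimicked with a short Python sketch. It is not the patch's implementation: the function name render, the (name, text) topic pairs, the handling of only the $topic variable, and the choice to skip topics without hits are all assumptions made for illustration.

```python
import re


def render(topics, pattern, prepattern, postpattern, hitformat, header=""):
    """Assemble output in the order header, prepattern, hits, postpattern.

    topics is a list of (name, text) pairs; only $topic is expanded here.
    """
    prehit, _, posthit = hitformat.partition("$hit")
    lines = [header] if header else []
    for name, text in topics:
        hits = [
            group
            for match in re.finditer(pattern, text, re.IGNORECASE)
            for group in match.groups()
            if group is not None
        ]
        if not hits:
            continue  # assumption: topics without hits produce no output
        lines.append(prepattern.replace("$topic", name))
        lines.extend(prehit + hit + posthit for hit in hits)
        lines.append(postpattern.replace("$topic", name))
    return "\n".join(lines)
```

Each hit is wrapped in the prehit and posthit text, and each topic's block of hits is wrapped in the prepattern and postpattern strings, matching the nesting shown in the layout.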

Example

If we want to display the paragraphs wherever "GPL" is mentioned, a search like:

%SEARCH{ "GPL" hitformat="   * $hit<br>" scope="text" regex="on" nosearch="on" nototal="on" header="*Web: $web*" format="<br>Topic: [[$topic]]<br>$pattern(\([^\n\r]*GPL[^\n\r]*\))<br>"}%

would generate output in a virgin Feb2003 release of TWiki like:

Web: TWiki
Topic: FormattedSearchinTopics
* If we want to display the paragraphs wherever "GPL" is mentioned, a search like:


Topic: GnuGeneralPublicLicense
* TWiki has a GPL (GNU General Public License). What is GPL?
* TWiki is distributed under the GNU General Public License, see TWikiDownload. GPL is one of the free software licenses that protects the copyright holder, and at the same time allows users to redistribute the software under the terms of the license. Extract:
* * See the GNU General Public License for more details, published at http://www.gnu.ai.mit.edu/copyleft/gpl.html
* Please note that TWiki is not distributed under the LGPL (Lesser General Public Licence), which implies TWiki can only be used with software that is licensed under conditions compliant with the GPL. Embedding in proprietary software requires an alternative license. Contact the author for details.


Topic: TWikiFuncModule
* http://www.gnu.org/copyleft/gpl.html


Topic: TWikiSite
* * TWiki is developed as Free Software under the GNU/GPL


Topic: WebHome
* * TWiki is developed as Free Software under the GNU/GPL

Related Discussions

-- HarryFelder - 07 Mar 2003

Harry, this is a well-defined spec and implementation. It is also powerful and flexible because you can specify a pre-pattern for each topic.

Nevertheless, the MultipleSearchesInSameTopic is easier to understand and use. This is what is now in the core code. The topic="" parameter is pending, see SearchTopicNameAndTopicText.

-- PeterThoeny - 29 Sep 2003

Harry, I tried to install your patch but it failed. The patch wants to change the file "lib/TWiki/Search.pm~" (Sat Jan  4 20:36:46 2003), but the installed file is actually "lib/TWiki/Search.pm" (Jan  5  2003).

Any chance of fixing this? I don't want to wait for the next version of TWiki to appear and I find your spec. easy to understand...

Peter, why not put both implementations into the core?

-- SimonHardyFrancis - 15 Jul 2004

Category: TWikiPatches

Topic revision: r5 - 2005-09-30 - WillNorris
Copyright © 1999-2017 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.