DeleteMe petition: please keep this page for historical reasons -- DanielKabs - 14 Apr 2005

Features are implemented (SearchScopeForTopicAndText, KeywordSearchWithImplicitAnd) and content no longer relevant.

These same topics are covered in SearchDoesNotWorkAsExpected. -- ArthurClemens - 16 Aug 2003

SearchSuggestion

This topic mentions 3 improvements to TWiki search performance and usage. One solution is offered, the other 2 are still pending AFAIK.



Problem

Context: Finding and accessing stored information is a substantial part of TWiki. The "search" element provides that functionality. In my opinion its implementation is rather poor and impairs TWiki's usability.

Example: Try for yourself: enter "search text". If you have JavaScript enabled, the words are automagically filled in. The plain text search (which can be included on any page by cutting and pasting a bunch of strange HTML code) should (at least) return this page as a result. But it doesn't.

Problems: The TWiki search does not search in the topic names, only in the topic bodies. Worse, it searches for the literal occurrence of the whole search string. This is contrary to the usual syntax of popular search engines, which return documents that contain all the keywords; there, one uses quotation marks to indicate a phrase search.

Solution I therefore recommend:

  1. Plain text search should search both topic names and bodies.
  2. Any text search should search for every word entered and not for the exact phrase, as long as "search regular expression" is unchecked, i.e. text search should mimic the standard behaviour one is used to from popular search engines.
  3. Document the /cgi-bin/search/Codev/ CGI script and its features: how do I use it on my own pages?
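
The behaviour asked for in points 1 and 2 can be sketched in a few lines of Python. This is only an illustration of the idea, not TWiki's implementation: the function names and the sample topics are made up, and matching is a plain case-insensitive substring test.

```python
import re

def parse_query(query):
    """Split a query into terms; a double-quoted run is kept as one phrase."""
    terms = []
    for phrase, word in re.findall(r'"([^"]+)"|(\S+)', query):
        terms.append((phrase or word).lower())
    return terms

def search(topics, query):
    """Return the names of topics whose name or body contains every term."""
    terms = parse_query(query)
    hits = []
    for name, body in topics.items():
        # Point 1: look at the topic name as well as the body.
        haystack = (name + "\n" + body).lower()
        # Point 2: every term must occur (implicit AND), in any order.
        if terms and all(term in haystack for term in terms):
            hits.append(name)
    return sorted(hits)

# Hypothetical sample data, for illustration only.
topics = {
    "SearchSuggestion": "Ideas for improving how text is found.",
    "WebHome": "Welcome. Use search to find text in topics.",
    "PhraseDemo": "Enter search text here.",
}

print(search(topics, "search text"))    # all keywords, anywhere
print(search(topics, '"search text"'))  # exact phrase only
```

With the unquoted query, SearchSuggestion is found because "search" occurs in its name and "text" in its body; the quoted query only matches the topic that contains the literal phrase.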

Related Issues about searching TWiki :

Update: -- DanielKabs - 12 Feb 2003, MattWilkie - 06 Feb 2003

Discussion

Regarding "Plain text search should search both topic names and bodies": this could be done with a new scope="all" switch (the existing ones are scope="topic" and scope="text").

Other search enhancements are described in the topics you found.

The search script parameters are documented in TWikiVariables; they are identical to those of the %SEARCH{...}% variable.
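
Since the script takes the same parameters as the variable, a link to a search can be built by URL-encoding them. The following Python sketch assumes a hypothetical host (wiki.example.com) and assumes that the search string is passed as a parameter named "search"; only the /cgi-bin/search/Codev/ path and the scope values come from this topic, so check TWikiVariables before relying on the names.

```python
from urllib.parse import urlencode

def search_url(base, web, **params):
    """Build a link to the search script of one web (e.g. Codev)."""
    return f"{base}/search/{web}/?{urlencode(params)}"

# Hypothetical host; the parameter name "search" is an assumption.
url = search_url("http://wiki.example.com/cgi-bin", "Codev",
                 search="search text", scope="text")
print(url)
```

urlencode takes care of escaping the space in the search string, so the result can be pasted into a page as an ordinary link.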

-- PeterThoeny - 24 Feb 2001

The listed suggestions date from last year. I wonder why your suggestion of implementing the AltaVista-like search style hasn't managed to get into the FeatureToDo TopicClassification. What is the current status of the search feature debate?

In my opinion a powerful search function is as important as a search syntax that is familiar to users. It is hard enough to get people to use Wiki, so why not ease acceptance by offering an AltaVista-like search syntax?

Cheers Daniel


Moved over from: NativeVersionFormat [ EdgarBrown - 24 Mar 2001 ]

If you make a switch to DBM, please consider how an external search engine can index / search the content.

I plan to add an external search engine to my TWiki site (under construction) that would search and index the plain text files (.txt) to provide more extensive searching capability than TWiki currently provides, for example proximity searches such as <word_1> w/10 <word_2> (word_2 within 10 words of word_1). (I am struggling slightly with the order here: should it be "word_1 within 10 words of word_2"? The order doesn't really matter for w/10, but if we define a "within n words before (or after)" option, then we need the order that is most intuitive. -rhk)
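
To make the ordering question concrete, here is a small Python sketch of a proximity test. It is not taken from any of the search engines mentioned; the function name and the "ordered" flag are invented for illustration. With ordered=False the two words may appear in either order (the plain w/10 case); with ordered=True the second word must come after the first.

```python
import re

def within(text, first, second, n, ordered=False):
    """True if `second` occurs within n words of `first` in text."""
    words = re.findall(r"\w+", text.lower())
    pos_first = [i for i, w in enumerate(words) if w == first.lower()]
    pos_second = [i for i, w in enumerate(words) if w == second.lower()]
    for i in pos_first:
        for j in pos_second:
            gap = j - i  # positive when `second` follows `first`
            if ordered:
                if 0 < gap <= n:
                    return True
            elif 0 < abs(gap) <= n:
                return True
    return False

text = "The index is rebuilt whenever a topic is saved"
print(within(text, "index", "topic", 10))                # either order
print(within(text, "topic", "index", 10, ordered=True))  # topic must come first
```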

When I find the right search engine (Alta Vista personal, htdig, zyindex, google, ???), this will be easy as long as the content is stored in plain text files. As a Linux / twiki / Perl newbie, I am not sure whether this approach can be made workable if the content is stored in a database.

Aside: Why am I planning to search the "raw" content in the .txt files instead of the "cooked" HTML? I'm not entirely sure. I suspect some of those search engines deal better with real plain text files than with dynamically created HTML files. (Some might have a problem with the "dynamic" files, and others with the HTML.) Also, with plain text files the engine does not have to struggle to ignore the HTML tags, and ignoring them would (usually?) be the preferred behavior. -rhk

(Of course, if DBM is just an option, and I still have the ability to choose to store the content in .txt files, I can stick with that option. Still, if the DBM format provides other advantages, it would be nice to use it.)

-- RandyKramer - 24 Mar 2001

As I understand it, most search engines are nothing more than a ton of Perl code that generates the indexes (normally a very time-consuming process), plus functions that use that index to generate search results.

In the case of TWiki, we might want to use some of that Perl code but modify it to suit our needs. For example, most of the information on a TWiki site is static, but when a page is edited we need to redo the indexes; otherwise searches would not work for recently modified pages. That means we have to use some sort of continuous indexing, where the index is updated to reflect the new file whenever a page is edited.

Thus, at a bare minimum, we should be able to remove the previous page contents from the index and add the new page contents to it without re-triggering a full index operation. This is something that I guess most search engines won't do, but the procedure should be simplified by TWiki's reliance on RCS (or CVS).
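
A rough Python sketch of such a per-page update follows. It is an assumption-laden toy, not a proposal for the actual code: the class and method names are invented, the index lives in memory, and instead of asking RCS for the previous revision it simply remembers the word set it last saw for each topic.

```python
import re
from collections import defaultdict

class TopicIndex:
    """Toy inverted index that is updated one topic at a time."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> names of topics containing it
        self.words_of = {}                # topic name -> words seen at last save

    def update(self, topic, text):
        """Re-index a single topic after it has been saved."""
        new_words = set(re.findall(r"\w+", text.lower()))
        old_words = self.words_of.get(topic, set())
        # Remove the words that the edit deleted from the page.
        for word in old_words - new_words:
            self.postings[word].discard(topic)
            if not self.postings[word]:
                del self.postings[word]
        # Add the words that the edit introduced.
        for word in new_words - old_words:
            self.postings[word].add(topic)
        self.words_of[topic] = new_words

    def lookup(self, word):
        return sorted(self.postings.get(word.lower(), set()))

index = TopicIndex()
index.update("WebHome", "Welcome to the wiki")
index.update("SearchSuggestion", "Search the wiki")
index.update("WebHome", "Welcome home")  # page edited: only WebHome is re-indexed
print(index.lookup("wiki"))
print(index.lookup("home"))
```

Only the words that differ between the old and the new revision are touched, so the cost of an edit depends on the size of the change, not on the size of the site.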

-- EdgarBrown - 24 Mar 2001


Solution 1: Plain text search should search both topic names and bodies


Solution 2: Search should search for every word entered and not for the exact phrase

Technically this is solved with RegularExpression search (a semicolon ';' stands for "and"), but this is not very user-friendly.
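
The semicolon convention can be mimicked in Python as follows. This is a sketch of the idea only: the function name is invented, and Python's regular expression dialect is not necessarily the one the TWiki search script uses.

```python
import re

def and_search(text, pattern):
    """True if every ';'-separated regular expression matches the text."""
    parts = [p for p in pattern.split(";") if p]
    return bool(parts) and all(re.search(p, text, re.IGNORECASE) for p in parts)

body = "Plain text search should cover topic names"
print(and_search(body, "search;topic"))   # both words present
print(and_search(body, "search;bodies"))  # "bodies" is missing
```

A user-friendly front end could turn a keyword query such as "search topic" into "search;topic" behind the scenes, so that users never have to type the semicolon themselves.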


Solution 3: Document the CGI-script and its features

Pending...


Category: TWikiPatches
search test blurb: I recommend to use only small amounts of soap for cleaning the hands. Do you know if wsdl is a web service still in use?
Topic revision: r27 - 2005-04-14 - DanielKabs
Copyright © 1999-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.