Question

Plucene installed successfully, along with the backend parsers and updated CPAN libraries. Indexing succeeds (topics and attachments), but search doesn't return results from attachments; it seems to search only within topics. Any suggestions on how to debug this? The Plucene index log reports success, and the Apache log doesn't mention anything regarding Plucene.

Environment

TWiki version: TWikiRelease04x00x05
TWiki plugins: DefaultPlugin, EmptyPlugin, InterwikiPlugin
Server OS: VM Debian Stable Linux install
Web server: Apache
Perl version:  
Client OS: Linux
Web Browser:  
Categories: Plugins, Add-Ons

-- MiloValenzuela - 17 Nov 2006

Answer

As it says in the SearchEnginePluceneAddOn topic, it doesn't index the contents of attachments.

You might consider using the SearchEngineSwishEAddOn instead.

-- CrawfordCurrie - 16 Dec 2006

Actually, SearchEnginePluceneAddOn does index attachments. I am not sure, however, how to debug your case.

-- PeterThoeny - 16 Dec 2006

Is there any way to identify "what" gets indexed? I assume that indexing means the attachments are converted to some sort of text format so that they can be searched afterwards. Is there any way to check this?

-- MiloValenzuela - 21 Dec 2006

Sorry, you are right, it indexes PDF, HTML and text attachments, but not office or M$ documents, which is why I never started using it.

There appears to be a system in Plucene for plugging in "back end parsers", which I assume are responsible for converting the attachments to a canonical (indexable) form. Whether that is text or not...

-- CrawfordCurrie - 22 Dec 2006

It also indexes M$ documents if you install the ExtraBackendParsers.zip parsers attached to the SearchEnginePluceneAddOnDev topic.

-- PeterThoeny - 24 Dec 2006

I did install the plugins you mention (with their respective dependencies). No luck. Do you know where I could find the "canonical form" that CrawfordCurrie mentioned?

-- MiloValenzuela - 29 Dec 2006

The backend parsers transform a proprietary format (.doc, .pdf) into intermediate HTML for indexing. Sorry, I am not familiar enough with the add-on to help debug.

-- PeterThoeny - 29 Dec 2006
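
For anyone trying to picture the "backend parser" step described above, here is a rough sketch of the general idea: pick a converter by file extension, turn the attachment into intermediate HTML, and note which attachments have no converter and would therefore never be searchable. This is not the add-on's actual code (the add-on is written in Perl; this is Python purely for illustration), and every name in it (PARSERS, convert_attachment, report) is hypothetical.

```python
import html
from pathlib import Path


def text_to_html(data: bytes) -> str:
    """Hypothetical converter: wrap plain text in minimal HTML."""
    text = data.decode("utf-8", errors="replace")
    return "<html><body><pre>" + html.escape(text) + "</pre></body></html>"


def html_passthrough(data: bytes) -> str:
    """Hypothetical converter: HTML attachments need no conversion."""
    return data.decode("utf-8", errors="replace")


# Hypothetical registry keyed by lower-case file extension.
# A real installation would register further parsers (e.g. for .doc or .pdf).
PARSERS = {
    ".txt": text_to_html,
    ".html": html_passthrough,
    ".htm": html_passthrough,
}


def convert_attachment(name: str, data: bytes):
    """Return intermediate HTML, or None if no parser is registered."""
    parser = PARSERS.get(Path(name).suffix.lower())
    return parser(data) if parser else None


def report(attachments: dict) -> dict:
    """Split attachment names into those that convert and those skipped."""
    result = {"indexed": [], "skipped": []}
    for name in sorted(attachments):
        converted = convert_attachment(name, attachments[name])
        key = "indexed" if converted is not None else "skipped"
        result[key].append(name)
    return result
```

Running something like report() over the attachments of one topic would show at a glance which file types fall through without a parser, which may be a useful first check when search finds topics but not attachments.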

Topic revision: r8 - 2006-12-29 - PeterThoeny
 