Question
Plucene installed successfully with the BackEnd parsers and updated CPAN libraries. Indexing succeeds (topics and attachments), but search doesn't return results from attachments (it seems to search only within topics). Any suggestions on how to debug this? The Plucene index log reports success, and the Apache log doesn't mention anything regarding Plucene.
Environment
--
MiloValenzuela - 17 Nov 2006
Answer
As it says in the SearchEnginePluceneAddOn topic, it doesn't index attachments.
You might consider using the SearchEngineSwishEAddOn instead.
--
CrawfordCurrie - 16 Dec 2006
Actually, SearchEnginePluceneAddOn does index attachments. However, I am not sure how to debug your case.
--
PeterThoeny - 16 Dec 2006
Is there any way to identify what gets indexed? I assume that indexing means the attachments are converted to some sort of text format so that they can be searched afterwards. Is there any way to check this?
--
MiloValenzuela - 21 Dec 2006
Sorry, you are right, it indexes PDF, HTML and text attachments, but not office or M$ documents, which is why I never started using it.
There appears to be a system in Plucene for plugging in "back end parsers", which I assume are responsible for converting the attachments to a canonical (indexable) form. Whether that form is text or not, I can't say.
--
CrawfordCurrie - 22 Dec 2006
It also indexes M$ documents if you install the ExtraBackendParsers.zip parsers attached to the SearchEnginePluceneAddOnDev topic.
--
PeterThoeny - 24 Dec 2006
I did install the plugins you mention (with their respective dependencies). No luck. Do you know where the "canonical form" that CrawfordCurrie mentioned could be found?
--
MiloValenzuela - 29 Dec 2006
The backend parsers transform a proprietary format (.doc, .pdf) into intermediate HTML for indexing. Sorry, I am not familiar enough with the add-on to help debug.
--
PeterThoeny - 29 Dec 2006
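For anyone trying to picture the backend-parser idea described above, here is a minimal conceptual sketch in Python. It is not the add-on's code (the add-on is written in Perl), and every name in it (`PARSERS`, `to_intermediate_html`, `text_to_html`, `html_passthrough`) is hypothetical. It only illustrates the general pattern: pick a converter by file extension, produce intermediate HTML, and return nothing when no converter is registered. Under that assumption, an attachment type with no registered parser would be skipped without any error, which may be one reason a search finds topics but not attachments.

```python
import html
from pathlib import Path


def text_to_html(raw: bytes) -> str:
    """Wrap plain text in minimal HTML so it can follow the same indexing path."""
    text = raw.decode("utf-8", errors="replace")
    return "<html><body><pre>" + html.escape(text) + "</pre></body></html>"


def html_passthrough(raw: bytes) -> str:
    """HTML attachments are already in the intermediate form."""
    return raw.decode("utf-8", errors="replace")


# Hypothetical registry mapping a file extension to its converter.
# Extra parsers (for example for .doc) would be registered here.
PARSERS = {
    ".txt": text_to_html,
    ".html": html_passthrough,
    ".htm": html_passthrough,
}


def to_intermediate_html(filename: str, raw: bytes):
    """Return intermediate HTML, or None if no parser handles this extension."""
    parser = PARSERS.get(Path(filename).suffix.lower())
    return parser(raw) if parser else None


if __name__ == "__main__":
    # List which sample attachments would be converted and which would be skipped.
    for name in ["notes.txt", "page.html", "report.doc"]:
        status = "converted" if to_intermediate_html(name, b"sample") else "skipped"
        print(name, status)
```

Logging the converter's output per attachment, as the loop at the bottom does, is one way to see what actually reaches the indexer. Whether the add-on exposes a similar hook is something I have not verified.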