SID-00771: Mediawiki to twiki conversion (mediawiki2twiki dump parse)
| Status: | Answered | TWiki version: | 4.3.2 | Perl version: | 5.008008 |
| Category: | MediaWikiToTWikiAddOn | Server OS: | Ubuntu 8.04.4 LTS | Last update: | 15 years ago |
When using mediawiki2twiki plugin with
./mediawiki2twiki --file /home/dsi/dump.xml --dry --debug -max
I get the following error:
DEBUG: opening /home/dsi/dump.xml
DEBUG: would create directory /var/www/twiki/data/MediaWiki
DEBUG: would create directory /var/www/twiki/pub//MediaWiki
unable to properly parse comprehensive dump files at /var/www/twiki/lib//CPAN/lib//Parse/MediaWikiDump.pm line 490.
at /var/www/twiki/lib//CPAN/lib//Parse/MediaWikiDump.pm line 490
Parse::MediaWikiDump::Pages::parse_page('Parse::MediaWikiDump::Pages=HASH(0x62d990)', 'ARRAY(0xe0f820)') called at /var/www/twiki/lib//CPAN/lib//Parse/MediaWikiDump.pm line 72
Parse::MediaWikiDump::Pages::next('Parse::MediaWikiDump::Pages=HASH(0x62d990)') called at /var/www/twiki/lib//TWiki/Contrib/MediaWikiToTWikiAddOn/Converter.pm line 268
TWiki::Contrib::MediaWikiToTWikiAddOn::Converter::convert('TWiki::Contrib::MediaWikiToTWikiAddOn::Converter=HASH(0x1204d40)') called at /var/www/twiki/lib//TWiki/Contrib/MediaWikiToTWikiAddOn.pm line 83
TWiki::Contrib::MediaWikiToTWikiAddOn::main() called at mediawiki2twiki line 82
--
EduardoFarias - 2010-03-03
Discussion and Answer
I'm having exactly the same problem (using Fedora 12). Same line number and everything. Does it have something to do with the double slash (//) in some of the library paths?
--
DougLinder - 2010-04-28
I have not used this tool, and those messages are not very useful. Things to check:
1. Are /var/www/twiki/data and /var/www/twiki/pub writable by you, i.e. the user running mediawiki2twiki?
2. Possibly a missing TWiki lib. Do you run mediawiki2twiki from the /var/www/twiki/tools directory?
Please also report back if you get it to run, so that other folks can learn.
--
PeterThoeny - 2010-04-28
I'm having this same problem using Fedora 6.
Just in case it can help, I'm running the script from the tools directory, and the pub directory is writable by the user that's running it.
--
DavidNotivol - 2010-05-05
Closing this question after more than 30 days of inactivity. Feel free to reopen if needed. Consider engaging one of the TWiki consultants if you need timely help. We invite you to get involved with the community; you are more likely to get community support if you support the open source project!
--
PeterThoeny - 2010-07-04
Looking a little closer at the source code around the failing line 490 in lib/CPAN/lib/Parse/MediaWikiDump.pm:
    elsif ($state eq 'in_revision') {
        if ($$token[0] eq '/revision') {
            #If a comprehensive dump file is parsed
            #it can cause uncontrolled stack growth and the
            #parser only returns one revision out of
            #all revisions - if we run into a
            #comprehensive dump file, indicated by more
            #than one <revision> section inside a <page>
            #section then die with a message

            #just peeking ahead, don't want to update
            #the index
            $token = $$buffer[$i + 1];
            if ($$token[0] eq 'revision') {
                die "unable to properly parse comprehensive dump files";
            }
So, essentially this is a failsafe, not a bug. The problem is with the dump file. It would be helpful if the following instructions were appended to the add-on page:
When exporting a dump file from MediaWiki (http://meta.wikimedia.org/wiki/MediaWiki#Database_dump), using the --full directive creates a "comprehensive" dump file, which means every revision of every topic is included. This breaks the import script.
The solution is to use the --current directive instead when exporting the MediaWiki dump. This exports a dump with only the latest revision of each topic.
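If you are not sure which kind of dump you have, you can check it before running the converter. Below is a minimal sketch in Python; it is not part of the add-on, and the function name first_multi_revision_page and the sample dumps are made up for illustration. It assumes the usual MediaWiki export layout, where each <page> contains a <title> followed by one or more <revision> elements, and it applies the same test as the failsafe quoted above: more than one <revision> inside a <page> means a comprehensive dump.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that MediaWiki export files put on tags."""
    return tag.rsplit("}", 1)[-1]


def first_multi_revision_page(source):
    """Return the title of the first <page> that holds more than one <revision>,
    or None if every page has at most one.

    `source` is a file path or a binary file object. The file is streamed,
    so large dumps are not loaded into memory at once.
    """
    title = None
    revisions = 0
    for event, elem in ET.iterparse(source, events=("start", "end")):
        name = local_name(elem.tag)
        if event == "start":
            if name == "page":
                # A new page begins: reset the per-page counters.
                title, revisions = None, 0
            continue
        if name == "title":
            title = elem.text
        elif name == "revision":
            revisions += 1
            if revisions > 1:
                return title
            elem.clear()  # free the revision text we no longer need
        elif name == "page":
            elem.clear()
    return None


# Hypothetical sample data: one comprehensive dump, one current-only dump.
SAMPLE_FULL = b"""<mediawiki>
  <page>
    <title>Main Page</title>
    <revision><id>1</id><text>old</text></revision>
    <revision><id>2</id><text>new</text></revision>
  </page>
</mediawiki>"""

SAMPLE_CURRENT = b"""<mediawiki>
  <page>
    <title>Main Page</title>
    <revision><id>2</id><text>new</text></revision>
  </page>
</mediawiki>"""

print(first_multi_revision_page(io.BytesIO(SAMPLE_FULL)))     # Main Page
print(first_multi_revision_page(io.BytesIO(SAMPLE_CURRENT)))  # None
```

For a real file, pass the path instead of a BytesIO object, e.g. first_multi_revision_page("/home/dsi/dump.xml"). If it returns a title, the dump is comprehensive and should be re-exported with --current before running mediawiki2twiki.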
--
OctavianDrulea - 2010-09-24
Thanks for reporting this, Octavian! I added a note to the MediaWikiToTWikiAddOnDev page that the docs need to be updated.
--
PeterThoeny - 2010-09-25