Good printing is an important requirement for us and I guess for others - think TWikiForBookAuthoring, DocBook.

I came up with a solution using HTMLDOC, a GPLed tool from http://www.easysw.com/htmldoc/. The idea is simple:

  1. The view script pipes its output into the htmldoc converter tool; I added the option pdf=on
  2. You will most likely want an extra skin, like view.pdf.tmpl
  3. The htmldoc call and options come from the template pdfcall.tmpl. The absolute minimum is something like /usr/bin/htmldoc -t pdf --book %PDFBODY%. See http://www.easysw.com/htmldoc/htmldoc.pdf for all the options available.
  4. The optional titlepage layout comes from pdftitle.tmpl. A temporary file %PDFTITLE% holds the HTML output needed for the htmldoc call.
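As a rough illustration of step 3, here is a minimal sketch of how a pdfcall.tmpl-style command line could be expanded. The helper name expand_pdfcall and the /tmp paths are made up for this example and are not taken from the attached script. It splits the template into words first and substitutes afterwards, so a file name containing spaces stays a single argument.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the attached script): expand the
# %PDFBODY% and %PDFTITLE% placeholders in a pdfcall-style template
# and return the command as a list of arguments.
sub expand_pdfcall {
    my ($template, %files) = @_;
    my @args;
    for my $word (split ' ', $template) {
        # Unknown or unset placeholders are left untouched.
        $word =~ s/%(PDFBODY|PDFTITLE)%/exists $files{$1} ? $files{$1} : "%$1%"/ge;
        push @args, $word;
    }
    return @args;
}

# The minimal call quoted above, with an assumed temporary file name.
my @cmd = expand_pdfcall(
    '/usr/bin/htmldoc -t pdf --book %PDFBODY%',
    PDFBODY => '/tmp/body.html',
);
print join(' ', @cmd), "\n";
```

The resulting list could be handed to the list form of system(), which bypasses the shell.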

See the quick'n dirty implementation based on TWiki20011201.zip in the attachments. (Note that I renamed the files from my previous 20001201-based implementation to avoid confusion with the "print" skin.) To get an impression, check the PDF printout of the TWiki documentation set: TWikiDocumentation.pdf was built using .../twiki/view/TWiki/TWikiDocumentation?pdf=on&skin=pdf

Open Issues:

  1. Search paths for multiple images
  2. Tmp file mess
  3. Security?
  4. Implementation style?

-- PeterKlausner - 11 July 2002

The problem is a wider one that needs an architectural solution to prevent these hacks getting messy. I think we need to add a Filters mechanism the purpose of which is to convert from WikiML into other formats. In this case, you go WikiML -> HTML -> PDF, but this should not always be the case.

-- MartinCleaver - 12 Jul 2002

Zope has something similar to this in its BackTalk "product" (FYI: Zope product = plugin). That environment, written in Python, uses the ReportLab Toolkit software library (also written in Python) to convert structured text into PDF. Interestingly, it goes directly from the WikiML (or a dialect version of it at least) into an intermediate form (XML?) that the Toolkit library ultimately processes into PDF. The Toolkit library is open source (GPL-ish?) but I do not want to even ponder what it would take to re-implement something similar in Perl. It may be possible to leverage it as-is (at the expense of requiring Python for PrintUsingPDF on TWiki). More information on the ReportLab Toolkit can be found at http://www.reportlab.org/rl_toolkit.html

Of course this does not solve the problem of having an intermediate file. But it does point at the idea of providing a more flexible and universal intermediate file. This architectural paradigm separates content from presentation. The "view" script would then become one "pretty printer" to render HTML. A different "print" script (or perhaps a view?out=pdf) could be used to render PDF or any other format (e.g. LaTeX, etc.). The nice thing about this is that if a user does not like the default "view?out=html" pretty printer script, a new one can be made relatively easily (e.g. view?out=myformat) without modifying the default pretty printer. I could see this being useful for allowing completely customized rendering of web content (a.k.a. skin?)....

-- PeterSanza - 31 Dec 2002

Here is another one based on HTMLDOC:

  • Install htmldoc on your twiki machine.
  • Copy the following script >>pdf<< into your twiki/bin directory and adjust the three paths specified at the top of the script.
  • Now you can view every topic within twiki by substituting "pdf" for "view" in the URL.

I know the script looks terrible. I am not a Perl guy, so I don't know how to do better. It's just a dirty hack. But the "temp file mess" is solved and there are a couple of parameters you can pass along. These are (currently):

| Parameter | Values | Description |
| format | ps1, ps2, ps3, pdf11, pdf12, pdf13, pdf14, html | htmldoc '-t' parameter. Specifies the output format. Default is 'pdf14'. |
| linkstyle | plain, underline | Defines whether links are rendered underlined or not. |
| toclevels | [numeric] | Number of hierarchy levels to include in the table of contents. Use toclevels=0 to suppress table of contents generation. |
| firstpage | p1, toc, c1 | Specifies which page is displayed initially: the first page including content (p1), the table of contents (toc), or the page including the first chapter's header (c1). |
| size | letter, a4, WxH{in,cm,mm}, etc. | Page size to be generated (defaults to a4). |
| bodycolor | [html color code] | Background color of the document. |
| browserwidth | [numeric] | A very useful parameter: it allows scaling all images. It is not a percentage value, though. |
| skin | [installed skin] | Your favorite pdf skin, used for generating the document. Defaults to plain. |
| footer | fff | Formatting of the footer. |
| header | fff | Formatting of the header. |
| tocfooter | fff | Formatting of the footer within the TOC section. |
| tocheader | fff | Formatting of the header within the TOC section. |
| shiftHeaders | [number] | Shifts all html headers. E.g. if you specify 2 here, every <h2> header will be an <h4> when passed to htmldoc. Negative numbers are possible, too. The maximum is 6, as html does not provide a deeper hierarchy. |
| titlepg | on, off | Determines whether a title page is generated (defaults to on). |
| orientation | portrait, landscape | Controls the page orientation; default is portrait. There are htmldoc comment directives for this, but it is useful to be able to do it external to the content. |

fff = heading format string; (See htmldoc documentation.)

See the htmldoc documentation for further information on these parameters. Add your own as well (but do not forget to submit your changes :-) ).
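To make the shiftHeaders behaviour concrete, here is a small sketch of one way to do such a shift with a regular expression. The function name shift_headers is invented for this example, and clamping to the range h1..h6 is an assumption about how out-of-range levels should be handled; the attached script may do it differently.

```perl
use strict;
use warnings;

# Hypothetical helper: shift every <hN> and </hN> tag by $shift levels.
# Levels that would fall outside h1..h6 are clamped (an assumption).
sub shift_headers {
    my ($html, $shift) = @_;
    $html =~ s{<(/?)(h)([1-6])\b}{
        my $level = $3 + $shift;
        $level = 1 if $level < 1;
        $level = 6 if $level > 6;
        '<' . $1 . $2 . $level;
    }gie;
    return $html;
}

# With a shift of 2, an <h2> header becomes an <h4> header.
print shift_headers('<h2 class="x">Intro</h2>', 2), "\n";
```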

Ah well, here's a little example how to pass the parameters:
http://your.twiki.host.net/path/to/twiki/bin/pdf/YourFavoriteWeb/TheDamnTopicToView?skin=print&toclevels=6&bodycolor=eeeeee
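For readers who want to add their own parameters, the following is a minimal sketch of how URL parameters like the ones above could be translated into htmldoc arguments. The helper htmldoc_args and the lookup table are hypothetical and are not copied from the attached script; the htmldoc options themselves (-t, --size, --toclevels, --no-toc, --no-title, --landscape and so on) are taken from the htmldoc documentation. Input validation is left out to keep the sketch short.

```perl
use strict;
use warnings;

# Parameters that map one-to-one onto an htmldoc option taking a value.
my %VALUE_OPTION = (
    linkstyle    => '--linkstyle',
    firstpage    => '--firstpage',
    bodycolor    => '--bodycolor',
    browserwidth => '--browserwidth',
    footer       => '--footer',
    header       => '--header',
    tocfooter    => '--tocfooter',
    tocheader    => '--tocheader',
);

# Hypothetical helper: translate URL parameters into htmldoc arguments.
sub htmldoc_args {
    my (%p) = @_;
    my @args = ('-t', $p{format} // 'pdf14', '--size', $p{size} // 'a4');

    # toclevels=0 suppresses the table of contents altogether.
    if (defined $p{toclevels}) {
        push @args, $p{toclevels} eq '0'
            ? '--no-toc'
            : ('--toclevels', $p{toclevels});
    }
    push @args, '--no-title'  if ($p{titlepg}     // 'on')       eq 'off';
    push @args, '--landscape' if ($p{orientation} // 'portrait') eq 'landscape';

    # Sorted so that the generated command line is reproducible.
    for my $name (sort keys %VALUE_OPTION) {
        push @args, $VALUE_OPTION{$name}, $p{$name} if defined $p{$name};
    }
    return @args;
}

# Two of the parameters from the example URL above.
print join(' ', htmldoc_args(toclevels => 6, bodycolor => 'eeeeee')), "\n";
```

Building a list rather than one long string means the arguments can be passed to the list form of system(), so values coming from the URL never reach a shell.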

-- PatrickOhl - 14 Jan 2003

Of course you can easily 'fake' a twiki pdf skin by forwarding all requests to the other script now. Here is an example template for doing so: >>view.pdf.tmpl<<

-- PatrickOhl - 14 Jan 2003

Updated the pdf script. There are more parameters now. See the table above. These still aren't all parameters provided by htmldoc. Feel free to append some as you need.

-- PatrickOhl - 17 Jan 2003

Can I just add that this add-on is simply brilliant. Just made my day being able to easily provide pdf files of our technical docs stored in Twiki without needing to install Acrobat Distiller.

Thanks for the work

-- NathanReeves - 11 Feb 2003

I did some documentation a while ago using something called Simple Document Format - SDF for short. It was originally created by "ianc@mincom.com" but seems to have been more or less abandoned (http://www.mincom.com/mtr/sdf/ shows a 404). I believe it was later adopted by the OpenLDAP team, who use it for code documentation.

SDF uses a somewhat wiki-like base syntax and generates documents of other formats (heard that one before?). The first thing that came to my mind was whether there were any parts in it useable for TWiki-plugins etc, since the SDF tools happen to be written in Perl.

-- ConnyBrunnkvist - 11 Feb 2003


1. I get an error message when I try to view the page with /pdf/ instead of /view/ :
[Wed Feb 12 00:09:42 2003] TWiki.pm: Can't locate TWiki.pm in @INC (@INC contains: ../lib . . /usr/libdata/perl/5.00503/mach /usr/libdata/perl/5.00503 /usr/local/lib/perl5/site_perl/5.005/i386-freebsd /usr/local/lib/perl5/site_perl/5.005 . .) at pdf line 28. BEGIN failed--compilation aborted at pdf line 28.

I tried to muck around with the variables, but to no avail. Any ideas? I am using the January 2003 release.

2. I cannot download the view.pdf.tmpl file, as it creates a pdf file in doing so. Maybe a zipped version would be better.

-- ArthurClemens - 11 Feb 2003

TWikiRelease01Feb2003 has a different way of discovering the Perl libs: it depends on the setlib.cfg file located in the twiki/bin directory. Because of that, you need to change other scripts like twiki/bin/pdf:

Change from:

    use CGI::Carp qw( fatalsToBrowser );
    use CGI;
    use lib ( '.' );
    use lib ( '../lib' );
    use TWiki;
    use IO::File;
    use POSIX qw(tmpnam);

Change to:

    BEGIN { unshift @INC, '.'; require 'setlib.cfg'; }
    use CGI::Carp qw( fatalsToBrowser );
    use CGI;
    use TWiki;
    use IO::File;
    use POSIX qw(tmpnam);

It would be nice to package these scripts into a Plugins.AddOnPackage

-- PeterThoeny - 12 Feb 2003

Has anyone gotten this to work on the TWikiRelease01Feb2003? With Peter's input I don't get errors anymore, but I only get an empty PDF (0 bytes) as result.

-- ArthurClemens - 13 Feb 2003

If the script runs well TWiki-wise, it is probably htmldoc's fault. It is very picky about the HTML, especially in the block up to the first <H1> or the title sheet. I addressed a few of these problems in the PdfPlugin (which I want to post really soon now). You should try to catch htmldoc's error output and work your way up from a simple test page. You might also try pagemode first.
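One way to catch htmldoc's error output is to redirect STDERR into a temporary file while the command runs. The sketch below is only an assumption about how this could be done; run_capturing_stderr is a made-up name, and the demonstration runs perl itself as a stand-in command so that it works without htmldoc installed.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: run a command (list form, no shell), collect
# whatever it writes to STDERR, and return (exit code, error text).
sub run_capturing_stderr {
    my (@cmd) = @_;
    my ($errfh, $errfile) = tempfile(UNLINK => 1);

    # Point STDERR at the temporary file for the duration of the call.
    open(my $saved_err, '>&', \*STDERR) or die "Cannot save STDERR: $!";
    open(STDERR, '>&', $errfh)          or die "Cannot redirect STDERR: $!";
    my $status = system(@cmd);
    open(STDERR, '>&', $saved_err)      or die "Cannot restore STDERR: $!";

    # Read back everything the command wrote.
    seek($errfh, 0, 0);
    local $/;
    my $errors = <$errfh>;
    my $code = $status == -1 ? -1 : $status >> 8;
    return ($code, defined $errors ? $errors : '');
}

# Stand-in command; for htmldoc the list might look like
# ('htmldoc', '-t', 'pdf14', '--webpage', '-f', $pdf_file, $html_file).
my ($code, $errors) = run_capturing_stderr($^X, '-e', 'print STDERR "demo error\n"; exit 2');
print "exit code: $code\nerrors: $errors";
```

Writing the messages to the web server's error log, or showing them in the browser while debugging, makes it much easier to see which part of the HTML htmldoc objects to.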

HTH -- PeterKlausner - 14 Feb 2003

Regarding that Zope/Python product mentioned above.. a product I worked on used a similar technique. We converted our data to XML and then ran it through an XSL processor (Apache's Xalan) using XSLT style sheets to produce a document consisting of XSL Formatting Objects. We then ran that through Apache's Fop processor to produce PDF. The process was somewhat slow, but with the advantage that XSL:FO is really designed for printing and gives you the necessary tools to place things intentionally on the page. Also, in theory the tools are replaceable with any other XSL processors, so you're not stuck with one. Basically, what TWiki would need would be code to convert its markup to XML, and an XSLT style sheet for that XML. The rest is already done.

-- ChristopherMasto - 14 Feb 2003

Arthur, I've managed to get the output to PDF working with the latest version of TWiki (Feb03). Didn't actually make any changes outside of what Peter suggested.

-- NathanReeves - 13 Mar 2003

Great add-on, love it! Really pushes Twiki ahead for documentation/publishing.

Minor mods to pdf script :

  • if 'toclevels=0' use the --no-toc htmldoc option to suppress toc generation
  • include a 'titlepg' param which, if set to off, uses the --no-title htmldoc option to suppress title page generation.

Updated the parameter table above to document these.

-- RobWalker - 28 Aug 2003

See PdfPlugin for the adhoc version of a plugin; note that above mentioned toc-level stuff is controlled via template.

-- PeterKlausner - 01 Sep 2003

Good you make it a plugin!

I did not get it to work earlier (see above), but it probably chokes on CSS, instead of ignoring it. Is this your experience too, Peter? What exactly are the limitations for page HTML? I understand it manages HTML 3.2. What are do's and don'ts?

-- ArthurClemens - 01 Sep 2003

Attempted to use PdfPlugin with BeijingRelease, feedback recorded in PdfPluginDev.

Added orientation parameter to the pdf script and updated the parameter table to reflect this.

-- TonyMartindale - 04 Sep 2003

Was having problems with the rendering of topics which had TWikiDrawPlugin pictures in them. I hacked our version of the pdf.cgi script to include the following transformation after the topic text is read, and before the tags are processed:

    # Rewrite TWikiDraw %DRAWING{name}% tags into conventional
    # %ATTACHURL%/name.gif tags (braces escaped for newer Perl versions)
    $text =~ s|%DRAWING\{(.*?)\}%|%ATTACHURL%/$1.gif|g;

It's a pretty simple hack - mod the DRAWING tag into an ATTACHURL tag for the .gif file produced by the TWikiDrawPlugin. I'm no Perl/Twiki expert, so I figured it better to include here as a comment for those better and wiser to consider a more robust fix.

-- RobWalker - 16 Sep 2003

Due to my naïve reliance on TWikiSyndication, I missed that one...

Pulling in pictures works with HtmlDoc 1.8.23 and AthensRelease. Accessing pictures via --path is extremely tedious. Fortunately, the newest HtmlDoc pulls images via HTTP. Unfortunately, this requires you to use the full URL. Obviously, the TWikiDrawPlugin doesn't do this and your hack fixes this. I will fix the PdfPlugin to rectify such tags.
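Since htmldoc needs the full URL to fetch an image over HTTP, one possible generic fix is to prefix root-relative src attributes with the host before the HTML is handed over. This is only a sketch under that assumption: absolutize_img_src is an invented name, the host is the placeholder used elsewhere on this page, and the PdfPlugin may well solve it differently.

```perl
use strict;
use warnings;

# Hypothetical helper: turn root-relative image sources such as
# src="/twiki/pub/..." into full URLs. Sources that already carry a
# scheme, and protocol-relative ones (//host/...), are left alone.
sub absolutize_img_src {
    my ($html, $host) = @_;
    $html =~ s{(<img\b[^>]*?\bsrc\s*=\s*["'])(/(?!/)[^"']*)}{$1$host$2}gi;
    return $html;
}

my $html = '<img alt="drawing" src="/twiki/pub/YourFavoriteWeb/SomeTopic/drawing.gif">';
print absolutize_img_src($html, 'http://your.twiki.host.net'), "\n";
```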

As to the do's and don'ts:

  • The summary up to the first <h1> shouldn't contain any HTML at all :-(
  • The title page template is very fragile; it takes trial and error
  • The body seems fairly tolerant
  • CSS works for me, i.e. it is simply ignored. I partly use this to make online-only stuff invisible. But for tables and other environments I'd like to retain some formatting, so I'm not for TWikiUsingCSS only. Problems might arise if you have style sheets in the body; I never used those. I just linked the sheet from the <head> and added a few class= attributes and <span>s.

-- PeterKlausner - 17 Sep 2003

Anyone got this working with mod_perl on Win32? I'm getting htmldoc running, but it never actually spits data out to the browser. I have to stop and restart apache to get htmldoc to stop. I had it working fine with Cygwin prior to my testing with Mod_perl.

-- NathanReeves - 03 Oct 2003

I got it working on Win32 - had to install the free win32 version of HTMLDOC - otherwise it would spin forever looking for fonts, etc. I lucked out on this by not unlinking the temp html files and trying to manually run htmldoc. I also had a little trouble with temp file locations on XP. The one thing that it doesn't seem to be doing is walking the entire topic web - it just renders the first page. Also, I had to use the --webpage option and remove the compression arguments in pdfcall.

-- GeorgePeden - 19 Oct 2003

I installed HTMLDOC and pdf.pl successfully, but now I'm having some stylistic issues regarding setting typesize for headlines and body text, as well as changing the document title shown in the header. Does anyone have suggestions?

-- ChristianSchmidt - 03 Mar 2004

The pdf script doesn't work out of the box with TWikiRelease01Sep2004, for two reasons. The first is the one covered by PeterThoeny in Feb 2003: the release discovers the Perl libs in a different manner. The second is that TWiki::getRenderedVersion has moved into Render.pm and is therefore now TWiki::Render::getRenderedVersion. I have uploaded a patch, pdf.20040901.patch, which fixes both problems. Apply it to the pdf script thus:

patch pdf pdf.20040901.patch

-- DaveKnight - 11 Nov 2004

TWiki::Render::getRenderedVersion is an undocumented function. Please use TWiki::Func::renderText instead.

I have not looked into details, but isn't the functionality now covered by the PdfPlugin?

-- PeterThoeny - 12 Nov 2004

Yes, to some extent. But IMHO, PdfPlugin is best suited to producing a book, and the pdf script is better for a single page. I want both.

PdfPlugin also requires some patches to work in TWikiRelease01Sep2004. I've been preparing patches for both. I will double-check that I use TWiki::Func::renderText before submitting.

Anyway, thanks Dave and Peter!

-- KaoruMaeda - 12 Nov 2004

Am I missing something, or is HTMLDOC no longer Open Source?

-- ChrisHogan - 11 Jan 2005

13 is not always unlucky... just weird... http://www.easysw.com/htmldoc/faq.php?13#13

-- MartinCleaver - 11 Jan 2005

Point to note - htmldoc 1.8.24 has a new CGI feature. On our servers this would dump bad info back and make the plugin fail. We downgraded to 1.8.23, which predates the CGI features of htmldoc, and that gave us no errors.

http://www.htmldoc.org/software.php?VERSION=1.8.23

-- TerryRankine - 31 Jan 2005

For what it's worth (and not trying to step on anyone's toes), I've done a significant re-write of the pdf script idea to better integrate it with the TWiki rendering operations and preference variables. If you're interested, I published it at GenPDFAddOn. It's very much a beta version, so I welcome comments, etc.

-- BrianSpinar - 02 Feb 2005

Setting an environment variable in the pdf script, somewhere before the call to htmldoc, makes the script work again with htmldoc 1.8.24:

$ENV{'HTMLDOC_NOCGI'}=1;

-- HubertWeikert - 08 Dec 2005

This script doesn't work with twiki 4.0.4, because internal functions have moved again. I've attached a patch which makes it work again, as well as improving the rewriting of image links. I went ahead and changed almost all of the TWiki:: function calls into TWiki::Func:: calls in the hopes of improving the script's robustness against future upgrades.

You need to apply the previous patch before applying my patch. Download both the patches, and then say:

patch pdf < pdf.20040901.patch
patch pdf < pdf.4.0.4.patch

... and you should be good to go.

-- AndrewMoise - 15 Aug 2006

Whoops -- someone pointed out that I accidentally uploaded the wrong file as pdf.4.0.4.patch. I've now uploaded the actual patch.

-- AndrewMoise - 18 Oct 2006

Topic attachments
| Attachment | History | Size | Date | Who | Comment |
| TWikiDocumentation.pdf | r2 r1 | 424.2 K | 2002-07-11 - 23:52 | PeterKlausner | Demo print out TWiki Reference Manual |
| pdf | r5 r4 r3 r2 r1 | 15.3 K | 2003-09-03 - 09:47 | TonyMartindale | A simple 'htmldoc' script. |
| pdf.20040901.patch | r1 | 2.9 K | 2004-11-11 - 16:43 | DaveKnight | Patch against pdf to make it work with "Version: 01 Sep 2004 $Rev: 1742" |
| pdf.4.0.4.patch | r2 r1 | 10.2 K | 2006-10-18 - 21:18 | AndrewMoise | Patch against pdf to make it work with 4.0.4 |
| pdfcall.tmpl | r1 | 0.3 K | 2002-07-11 - 23:58 | PeterKlausner | Htmldoc call with tons of options |
| pdftitle.tmpl | r1 | 0.8 K | 2002-07-11 - 23:59 | PeterKlausner | Title page referenced in call |
| print | | 6.8 K | 2001-05-10 - 16:04 | PeterKlausner | A clone from view 20001201, v1.2 |
| printarg.tmpl | | 0.2 K | 2001-05-10 - 16:05 | PeterKlausner | Formatting arguments for htmldoc |
| printbody.tmpl | | 0.2 K | 2001-05-08 - 14:42 | PeterKlausner | Template for actual text pages |
| printtitle.tmpl | | 0.6 K | 2001-05-08 - 14:42 | PeterKlausner | Just the title page (if you want one...) |
| view.diff | r1 | 2.8 K | 2002-07-11 - 23:57 | PeterKlausner | diff to view from 20011201 |
| view.pdf.tmpl.zip | r3 r2 r1 | 14.4 K | 2003-02-17 - 10:58 | PatrickOhl | 'Fake' TWiki skin. (Forwards only). - zipped + img |
Topic revision: r43 - 2006-10-18 - AndrewMoise
 
Copyright © 1999-2015 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.