While I was introducing someone I work with to TWiki, he wondered if the enumerated lists could use things other than numbers (i.e., letters and Roman numerals).

HTML 4.01 has a type attribute on the <OL> tag which can accomplish this, although the W3C warns that it is deprecated and that this should be done with style sheets instead. TWiki doesn't appear to be set up to render from style sheets yet, but it would be a good thing to keep in mind for the future if things become driven by style sheets.
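
For reference, each value of the type attribute has a direct CSS list-style-type counterpart. The following is only a minimal sketch of that mapping, in case list rendering ever moves to style sheets; the helper name css_ol_tag is hypothetical and nothing like it exists in TWiki.pm.

```perl
use strict;
use warnings;

# HTML <OL type="..."> values and their CSS list-style-type equivalents.
my %css_for_type = (
    '1' => 'decimal',
    'A' => 'upper-alpha',
    'a' => 'lower-alpha',
    'I' => 'upper-roman',
    'i' => 'lower-roman',
);

# Hypothetical helper (not in TWiki.pm): build an <ol> tag styled via CSS.
sub css_ol_tag {
    my ($type) = @_;
    my $style = $css_for_type{$type} || 'decimal';    # unknown types fall back to numbers
    return qq{<ol style="list-style-type: $style">};
}

print css_ol_tag($_), "\n" for qw(1 A a I i);
```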

This request even includes the change. In TWiki.pm, add the line matching ([AaIi]) (shown in red on the original page) at around line 2611:

            # Lists and paragraphs
            s/^\s*$/<p \/>/o                 && ( $isList = 0 );
            m/^(\S+?)/o                      && ( $isList = 0 );
            s/^(\t+)(\S+?):\s/<dt> $2<\/dt><dd> /o && ( $result .= &emitList( "dl", "dd", length $1 ) );
            s/^(\t+)\* /<li> /o              && ( $result .= &emitList( "ul", "li", length $1 ) );
            s/^(\t+)\d+\.? ?/<li> /o         && ( $result .= &emitList( "ol", "li", length $1 ) );
            s/^(\t+)([AaIi])\s?/<li> /o      && ( $result .= &emitList( qq!ol type="$2"!, "li", length $1 ) );
            if( ! $isList ) {
                $result .= &emitList( "", "", 0 );
                $isList = 0;
            }

With this additional line in place, there are now five possible types of enumerated lists:

Character   Result                     Sample
1           Arabic numerals            1, 2, 3...
A           Uppercase letters          A, B, C...
a           Lowercase letters          a, b, c...
I           Uppercase Roman numerals   I, II, III, IV...
i           Lowercase Roman numerals   i, ii, iii, iv...
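
To see which opening tag each character leads to, here is a small standalone sketch. It assumes emitList simply wraps its first argument in angle brackets; ol_open_tag is a hypothetical helper for illustration, not a function in TWiki.pm.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of TWiki.pm: report the opening tag that a
# tab-indented line would get, trying the patterns in the same order as the patch.
sub ol_open_tag {
    my ($line) = @_;
    return '<ol>'             if $line =~ /^\t+\d+\.? ?/;       # existing numbered lists
    return qq{<ol type="$1">} if $line =~ /^\t+([AaIi])\s?/;    # the added line
    return;                                                     # not an ordered list item
}

for my $line ( "\t1 first", "\tA first", "\ta first", "\tI first", "\ti first" ) {
    ( my $shown = $line ) =~ s/\t/\\t/;
    print "$shown => ", ol_open_tag($line), "\n";
}
```

Because the trailing \s? is optional, a tab-indented line that merely begins with a word such as "Another" also matches the added pattern, which is worth keeping in mind when weighing compatibility.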

If this would be useful, feel free to incorporate it.

-- MarkFeit - 20 Nov 2003

Wow, what a nice simple one line upgrade. Thanks Mark!

I wouldn't be too concerned about type being deprecated in the HTML spec. As far as I know, browser support for controlling list numbering via style sheets is still quite limited outside Mozilla/Safari/Konqueror, so the attribute will be supported for a long time to come.

-- MattWilkie - 21 Nov 2003

This is already in the O'Wiki alpha and the TWiki alphas:

            s/^(\t+)([1AaIi]\.|\d+\.?) ?/<li> /o && ( $result .= &emitList( "ol", "li", length $1, $2 ) );
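
The alpha's pattern captures the marker and hands it to emitList as a fourth argument. How emitList then uses it is not shown here, so the following is only a sketch under that assumption; parse_ol_line is a hypothetical helper that derives a list depth and a type value from one line.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the alpha's pattern and return (depth, type).
# How the real four-argument emitList uses the marker is an assumption here.
sub parse_ol_line {
    my ($line) = @_;
    return unless $line =~ /^(\t+)([1AaIi]\.|\d+\.?) ?/;
    my ( $depth, $marker ) = ( length $1, $2 );
    ( my $type = $marker ) =~ s/\.$//;        # "A." -> "A", "12." -> "12"
    $type = '' unless $type =~ /^[AaIi]$/;    # plain numbers need no type
    return ( $depth, $type );
}

for my $line ( "\t1. one", "\t12 twelve", "\t\tA. alpha", "\ti. first", "\tA word" ) {
    my @parsed = parse_ol_line($line);
    ( my $shown = $line ) =~ s/\t/\\t/g;
    print "$shown => ", ( @parsed ? "depth $parsed[0], type '$parsed[1]'" : "no match" ), "\n";
}
```

Unlike the one-line patch, this pattern requires a dot after a letter marker, so a tab-indented line starting with an ordinary word is left alone, while bare numbers such as "12" still match as before.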

Anyone interested in installing a fully working version simply and easily can grab the O'Wiki alpha release. People interested in sticking with a standard code base can grab my script for building a usable install from a standard release and an alpha release; it is available from the TWiki alpha release.

Another problem caused by slow release cycles - reimplementation of features.

-- MS - 22 Nov 2003

One comment on the implementation in the alpha: The changes to emitList chew up some extra CPU cycles evaluating regexps and with the if...else that generates the <OL> tag. This isn't a real big deal in general but could generate extra load for topics having very long lists hosted on heavily-loaded servers. My version of the patch saves some of those cycles and also doesn't require the mods to emitList. (I cut my teeth when CPU cycles and memory weren't cheap, so I'm kind of anal-retentive about things like that. Now that I'm using TWiki at work, I have an excuse to spend a little bit of time making improvements to the code.)

Side note: I did some experimentation with catching and renumbering lists that were already pre-numbered with roman numerals or letter sequences (aa, ab, ac, ad, etc.) but found that it opened some really ugly cans of worms with how some things in the markup are treated. I have a rough idea how to make it work neatly, and I'll tuck it away to work on when I have some time.

    Mark,

    Unfortunately, your code is (benchmarked) slower than the code in the alpha release. Whilst you save cycles in the emitList function you lose them in the pattern match. Rather than have a single pattern match across all of the text you have 2 - which means you're effectively parsing the whole text a second time. (Integrating the \d+ in breaks backwards data compatibility - as I suspect you found.)

    Test data source: cd $TWIKIHOME/data/TWiki06x00 ; cat *txt |grep -v ^.META > ../testdata.txt - size is ~711K
    Benchmark iterations: 5000
    Body of Version 1

       $_=$source;
       # Version 1
       s/^(\t+)([1AaIi]\.|\d+\.?) ?/<li> /o && ( $result .= &emitListStandard( "ol", "li", length $1, $2 ) );

    Body of Version 2

       $_=$source;
       # Version 2
       s/^(\t+)(\d+\.?)\s?/<li> /o && ( $result .= &emitListNew( "ol", "li", length $1) );
       s/^(\t+)([1AaIi]|\d+)\.\s?/<li> /o && ( $result .= &emitListNew( qq!ol type="$2"!, "li", length $1 ) );

    NB. This is a slight variation of your patch that includes backwards compatibility with existing functionality.

    • emitListStandard is the standard function
    • emitListNew is the same as the standard function, but with the olType conditional commented out.

    Results:

    • version 1 is 5 ms per iteration
    • version 2 is 5.2 ms per iteration

    Upping the iters to 20000 gives:

    • version 1 : 5.1 ms
    • version 2 : 5.2 ms
    Not a very scientific benchmark (it needs more mixed data to be truly fair, but it should be fairly representative of the content at people's locations), but it matches the intuition that you're doing 2 pattern matches across the entire text instead of a simple true/false check. When the difference is of the order of 100 microseconds (and the wrong way at that), I'd be inclined to leave the code as is myself if the only benefit is performance :-)

    There are other areas of the code that need attention first. (That said, if you get a version with one match that's backwards compatible, that'd be great, since I'm certain these figures would then be reversed :-) )
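
A comparison along these lines can be reproduced with Perl's core Benchmark module. This is only a rough sketch: the sample lines are made up rather than taken from real topic text, the emitList calls are left out, and the sub names are hypothetical, so it measures the pattern matching alone.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up sample data; the test above used ~711K of real topic text.
my @lines = map { ( "\t$_. numbered item", "\tA. lettered item", "plain text line" ) } 1 .. 200;

# Version 1: one combined pattern per line.
sub one_pattern {
    my $hits = 0;
    for my $line (@lines) {
        my $copy = $line;
        $hits++ if $copy =~ s/^(\t+)([1AaIi]\.|\d+\.?) ?/<li> /;
    }
    return $hits;
}

# Version 2: two patterns tried in turn on every line.
sub two_patterns {
    my $hits = 0;
    for my $line (@lines) {
        my $copy = $line;
        $hits++ if $copy =~ s/^(\t+)(\d+\.?)\s?/<li> /;
        $hits++ if $copy =~ s/^(\t+)([1AaIi]|\d+)\.\s?/<li> /;
    }
    return $hits;
}

# Run each version a fixed number of times and print a comparison table.
cmpthese( 2000, { one_pattern => \&one_pattern, two_patterns => \&two_patterns } );
```

Both subs convert the same number of lines, so any difference in the table comes from the matching strategy; the actual timings will depend on the data and the machine.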

    -- MS - 22 Nov 2003

    Hm, you're right, although I'd wager that some of those microseconds could be recouped with some tidying up of emitList. Now you've got me interested in making this thing good and fast, so I'm going to have to scrape up some time to work on it. :-)

    -- MarkFeit - 22 Nov 2003

-- MarkFeit - 22 Nov 2003

This looks like a sensible enhancement request.

Question: Do we need to worry about compatibility? For example, how likely is this:

   I write this paragraph with three leading spaces. The first
line will be interpreted as a bullet with Roman letter I if we
implement this new feature.

Mark, if you find time to write a solid patch we would appreciate it here on TWiki.org.

    I'm on the fence about this because of the aforementioned cans of worms. My inclination is to let paragraphs like that one be numbered. The definition of the markup pretty much implies that lines beginning with a multiple of three spaces will get special treatment, and this change would expand on that treatment. One thing I experimented with was matches that would, like the existing code does with Arabic numerals, handle lists that were pre-enumerated with letters or Roman numerals and properly renumber them. For example:

       Markup...         Would render as...
    
       i Rome             i. Rome
       xi Paris          ii. Paris
       mmi Teaneck      iii. Teaneck
       vii London        iv. London
    
       aa Red           a. Red
       ad Green         b. Green
       zap Blue         c. Blue
       boing Plaid      d. Plaid
    

    Because the letters that make up Roman numerals are a subset of the letters that can make up lettered enumerations, you have to match for Romans before letters. Under some circumstances, that could cause unexpected behavior when someone pastes in part of a list that happens to begin with a lettered enumeration that looks like Roman numerals. For example:

       Markup...         Would render as...
    
       b. Peach          a. Peach
       c. Pineapple      b. Pineapple
       d. Grape          c. Grape
       e. Starfruit      d. Starfruit
    
    ...but...
    
       c. Pineapple        i. Pineapple
       d. Grape           ii. Grape
       e. Starfruit      iii. Starfruit
       f. Guava           iv. Guava
    

    If we were able to analyze the list as a whole, it would be trivial to accurately select which way to enumerate it, but that's not something I want to take on. Or we could just simplify it and handle 1., 2., 3. as is currently done and only use the alternate numbering schemes when the line matches /^(\t+)[1AaIi]\s/. The more complete version would be nice, but the simplified one is ...well... simpler.
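
The ambiguity can be sketched in a few lines of Perl. The function names are hypothetical, markers are passed without their trailing dots, and "Roman" is checked loosely (any run of Roman-numeral letters), which is an assumption made for illustration rather than the matching actually experimented with.

```perl
use strict;
use warnings;

# Hypothetical classifier: guess the numbering scheme from one marker.
# "Roman" is checked loosely (any run of Roman-numeral letters) as an assumption.
sub scheme_for_marker {
    my ($marker) = @_;
    return '1' if $marker =~ /^\d+$/;
    return 'i' if $marker =~ /^[ivxlcdm]+$/;    # Romans must be tried before letters
    return 'I' if $marker =~ /^[IVXLCDM]+$/;
    return 'a' if $marker =~ /^[a-z]+$/;
    return 'A' if $marker =~ /^[A-Z]+$/;
    return;
}

# Line-at-a-time view: the first marker alone decides the scheme.
sub scheme_from_first { return scheme_for_marker( $_[0] ) }

# Whole-list view: keep the Roman reading only if every marker allows it.
sub scheme_from_all {
    my @markers = @_;
    my $first   = scheme_for_marker( $markers[0] );
    return $first unless defined $first && lc $first eq 'i';
    my $non_roman = grep { !/^[ivxlcdm]+$/i } @markers;
    return $first if $non_roman == 0;
    return $first eq 'i' ? 'a' : 'A';
}

for my $list ( [qw(b c d e)], [qw(c d e f)], [qw(i xi mmi vii)] ) {
    printf "%-12s first-line: %s  whole-list: %s\n",
        "@$list", scheme_from_first(@$list), scheme_from_all(@$list);
}
```

The list starting at "c" is read as Roman when only the first marker is considered, but as letters once the later markers "e" and "f" are taken into account, which is the whole-list analysis described above.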

    Any thoughts?

    -- MarkFeit - 22 Nov 2003

-- PeterThoeny - 22 Nov 2003

Topic revision: r8 - 2008-08-14 - RafaelAlvarez
Copyright © 1999-2017 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.