Idea: SEARCH With Regular Expression Sort

Spec TBD.

-- Contributors: PeterThoeny - 05 Oct 2006

Discussion

The discussion below was moved here from AutoIncTopicNameOnSave.

-- PeterThoeny - 05 Oct 2006

I think that the discussion about sort order begs the real question: Rather than padding numbers out to a fixed length for the sorting...

How about extending SORT to handle numeric order?

  • Much as UNIX sort -k 4 -n does

Instead of getting into field numbers, how about providing a way of extracting a sort key (e.g. with a regexp) and then specifying a sort order on that key?

  • e.g. s/Item\([0-9][0-9]*\).*/\1\t&/
  • and then sort -k 1 -n

-- AndyGlew - 02 Oct 2006
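
For illustration, the key-extraction idea above can be sketched in plain Perl, outside of TWiki. This is only a sketch under assumptions: sort_by_item_number is a hypothetical helper, not an existing TWiki function, and the Item prefix and the sample topic names are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of TWiki): sort topic names numerically by
# the number that follows the "Item" prefix. Names that do not match the
# pattern sort last, alphabetically.
sub sort_by_item_number {
    my @topics = @_;
    return map  { $_->[1] }                       # undecorate: keep the name
           sort {
               (defined $a->[0] && defined $b->[0])
                   ? ($a->[0] <=> $b->[0] || $a->[1] cmp $b->[1])
             : defined $a->[0] ? -1               # keyed names come first
             : defined $b->[0] ?  1
             :                   $a->[1] cmp $b->[1]
           }
           map  { [ (/^Item([0-9]+)/ ? $1 : undef), $_ ] }   # decorate: [key, name]
           @topics;
}

# Sample topic names are assumptions for the demo.
print join("\n", sort_by_item_number(qw(Item10 Item9 Item6565 Item0 WebHome))), "\n";
```

The key is extracted once per name and the names are then sorted on the key, which is the same two-step idea as piping sed output into sort.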

Sort numerically by topic name: Not sure how this can be defined in a generic & useful way with a %SEARCH{}%.

-- PeterThoeny - 02 Oct 2006

If you support regexps

  • define a regexp to extract the fields, concatenating them in order from primary through lesser keys
  • concatenate using something standard - tab or the like
  • this defines fields
  • then specify a numeric/alphabetic sort on a field basis.

E.g. Item0-Subject, Item6565-subject

%SEARCH{ topic="Ite*", sort_regexp( s/^Item\([0-9][0-9]*\).*/\1\t&/ ), field1=numeric }%

-- AndyGlew - 05 Oct 2006
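
The field-based proposal above could look roughly like this in Perl. It is a sketch, not a spec: sort_by_fields, its argument order, and the 'numeric'/'alpha' type names are all assumptions made for the example, and the capture groups of the regexp stand in for the tab-separated fields.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not an existing TWiki function).
#   $regex : capture groups define the sort fields, primary key first
#   $types : array ref with one entry per field, 'numeric' or 'alpha'
sub sort_by_fields {
    my ($regex, $types, @names) = @_;

    # Decorate: [ name, field1, field2, ... ]; non-matching names get no fields.
    my @keyed = map { [ $_, ($_ =~ $regex) ] } @names;

    my @sorted = sort {
        my $res = 0;
        for my $i (0 .. $#$types) {
            my ($p, $q) = ($a->[$i + 1], $b->[$i + 1]);
            if ($types->[$i] eq 'numeric') {
                $res = ($p // 0) <=> ($q // 0);    # missing field counts as 0
            } else {
                $res = ($p // '') cmp ($q // '');  # missing field counts as ''
            }
            last if $res;                          # first differing field decides
        }
        $res || $a->[0] cmp $b->[0];               # full name as final tiebreak
    } @keyed;

    return map { $_->[0] } @sorted;
}

# Sample names follow the Item<number>-<subject> pattern from the discussion.
my @out = sort_by_fields(qr/^Item([0-9]+)-(.*)$/, [ 'numeric', 'alpha' ],
                         qw(Item6565-subject Item0-Subject Item12-beta Item12-alpha));
print "$_\n" for @out;
```

A numeric field is expected to capture digits only; with warnings enabled, Perl complains if a non-numeric capture is compared with <=>.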

This could be useful for some wiki applications, although it is a bit complex to use. We should find a spec that is easy to grasp and is flexible. For example, sort with regex could be on topic name, a form field value, or a regex on topic text.

-- PeterThoeny - 05 Oct 2006

Do we need to specify the regular expression? Just specify "numeric" and let the code figure it out. The numeric is really a flag saying sort any embedded numbers as numbers.

I sketched out a test program that sorts a list of items containing either prefixed or postfixed numbers (i.e. 1item or item1). The code figures out which case applies and sorts accordingly.

Here is the testdata:
item1
item2
item21
item31
item3
item04
item50
item0005
item100

The Result:
item1
item2
item3
item04
item0005
item21
item31
item50
item100

The test Code:

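#!/usr/bin/perl

use strict;
use warnings;
use Data::Dumper;

# Compare two [first, second] pairs produced by the split in Main.
sub by_numeric {
    my($res);

    if( $a->[0] =~ m/^\d+$/ && $b->[0] =~ m/^\d+$/ ){
        # Prefixed numbers (1item): number first, then text
        $res = $a->[0] <=> $b->[0];
        return( ($res == 0) ? $a->[1] cmp $b->[1] : $res );
    } elsif( $a->[1] =~ m/^\d+$/ && $b->[1] =~ m/^\d+$/ ){
        # Postfixed numbers (item1): text first, then number
        $res = $a->[0] cmp $b->[0];
        return( ($res == 0) ? $a->[1] <=> $b->[1] : $res );
    }
    # Mixed prefixed/postfixed pair: fall back to a plain string compare
    return( join('', @$a) cmp join('', @$b) );
} # by_numeric

sub Main {
    my(@data, @split);

    @data = <STDIN>;

    foreach ( @data ){
        $_ =~ s/[\r\n]+$//;    # strip trailing CR/LF
        # Split into [number, text] or [text, number]; other lines are skipped
        if( $_ =~ m/^(\d+)([^0-9]+)$/ || $_ =~ m/^([^0-9]+)(\d+)$/ ){
            push(@split, [$1, $2]);
        }
    }
    print STDOUT "in: ", Dumper(\@data), "split: ", Dumper(\@split), "\n";
    print STDOUT "sort: ", Dumper([ sort by_numeric @split]), "\n";
    print STDOUT "joined:\n", join("\n", map { join('', @$_); } sort by_numeric @split), "\n";

}

Main();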

Is this what you want? Could always be extended to handle embedded numbers if needed.

-- CraigMeyer - 06 Oct 2006
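
As a rough idea of how the embedded-number case might be handled, here is a sketch of a chunk-by-chunk comparison in Perl. The names chunks and natural_cmp are hypothetical, and the tie-breaking rules (shorter name first, then plain string order) are assumptions, not something agreed on in this topic.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a string into alternating digit and non-digit chunks,
# e.g. "a1b20" -> ("a", "1", "b", "20").
sub chunks {
    return $_[0] =~ /(\d+|\D+)/g;
}

# Hypothetical comparator: walk both names chunk by chunk, comparing
# numerically when both chunks are all digits and as strings otherwise.
sub natural_cmp {
    my ($x, $y) = @_;
    my @p = chunks($x);
    my @q = chunks($y);
    while (@p && @q) {
        my ($c, $d) = (shift @p, shift @q);
        my $res = ($c =~ /^\d+$/ && $d =~ /^\d+$/) ? $c <=> $d : $c cmp $d;
        return $res if $res;
    }
    # One name is a prefix of the other (or they differ only in leading
    # zeros): fewer remaining chunks first, then plain string order.
    return @p <=> @q || $x cmp $y;
}

# Sample names are assumptions for the demo.
my @sorted = sort { natural_cmp($a, $b) } qw(1item item10 item2 a1b20 a1b3 item04);
print "$_\n" for @sorted;
```

Because every run of digits is compared as a number, this covers prefixed, postfixed and embedded numbers with one comparator, at the cost of splitting each name on every comparison.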

I experimented with extending SEARCH with order=numeric, splitting each topic name into a non-numeric part, a numeric part, and whatever is left. It seems to do what you wanted. Here are the code fragments:

sub by_numeric {
    return( $a->[0] cmp $b->[0] ||                # 1st term non-numeric
            ($a->[1] || 0) <=> ($b->[1] || 0) ||  # 2nd term numeric (empty counts as 0)
            $a->[2] cmp $b->[2]                   # Optional 3rd term non-numeric
            );
} # by_numeric

In Search.pm, in sub searchWeb, add the following just before if( $sortOrder eq 'modified' ), turning that existing if into an elsif:

if( $sortOrder eq 'numeric' ){
    @topicList = map { join('', @$_); } sort by_numeric
                 map { ($_ =~ m/^([^0-9]+)(\d*)(.*)$/) ? [$1, $2, $3] :
                       [$_, '', '']; } @topicList;
} elsif( $sortOrder eq 'modified' ){

-- CraigMeyer - 06 Oct 2006
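
To try the order=numeric fragments outside of TWiki, they can be wrapped in a small standalone script along these lines. This is a sketch under assumptions: sort_topics_numeric is a hypothetical wrapper standing in for the branch in searchWeb, and the topic names are invented for the demo.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Compare [prefix, number, rest] triples: prefix as text, number as a
# number (an empty number counts as 0), rest as text.
sub by_numeric {
    return $a->[0] cmp $b->[0]
        || ($a->[1] || 0) <=> ($b->[1] || 0)
        || $a->[2] cmp $b->[2];
}

# Hypothetical wrapper standing in for the order=numeric branch.
sub sort_topics_numeric {
    my @topicList = @_;
    return map  { join('', @$_) }                 # rebuild the topic name
           sort by_numeric
           map  { /^([^0-9]+)(\d*)(.*)$/ ? [$1, $2, $3] : [$_, '', ''] }
           @topicList;
}

# Sample topic names are assumptions for the demo.
my @topics = qw(Item10Done Item9 Item10 Item100 WebHome 2006Plan Item9Draft);
print "$_\n" for sort_topics_numeric(@topics);
```

Only the first run of digits after a non-numeric prefix is compared as a number; a name that starts with a digit is treated as a single text key.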

Topic revision: r3 - 2006-10-06 - CraigMeyer
 
Copyright © 1999-2025 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.