
Nedit Macro: Search (TWiki) Records

Current Status: (Deleting old updates.) The macro now generally works for both large files (tested on 1.4 MB) and small ones. There was no major memory leak, simply a screwup on my part: I created an infinite loop by never updating Start. (Thanks to Joerg Fischer for finding that, and sorry for the noise on the discuss@neditPLEASENOSPAM.org list.) There are improvements to be made; first on the list is to start a new search from the current position of the cursor (when the macro is invoked) rather than from the beginning of the file. Other bugs almost certainly remain. Thanks to everyone who helped (or endured my questioning)!

(On my home machine (System12), the program is in file `/nedit_srch.macro.r2.) Revision r1 converted the original approach of searching in the file (buffer) with search to an approach of creating a string containing the file and searching that with search_string, to avoid what I thought was a memory problem (but which turned out to be the infinite loop mentioned above). This was done at the suggestion of Eddy De Greef (thanks!). There is a potential memory problem with macros in Nedit, in that garbage collection is suspended during execution of a macro (if I understand correctly).

Other tidbits: return is the command to exit a macro. Next time I write a macro I should develop it in a separate file (which I did) but then load it via the load_macro_file("<pathname>") command (rather than copying and pasting to and from the little macro dialog window).

Another step forward in my quest for an askSam workalike in Linux (aka my offline TWiki-like thing). I store multiple records of information in plain text files with free-format TWiki markup for formatting and separation of records. This macro allows me to search those files for records containing multiple search terms, which makes finding information much easier.

This macro was written in/for Nedit 5.3. After getting the macro to this point, I did some further research on the newer versions of Nedit, specifically 5.5 (which seems to be the current version, but which was not shipped by default in Mandrake 10). Nedit 5.5 has a feature called rangesets, which I believe would let me highlight the found terms in a record.

Unfortunately, I have to set this aside for at least a little while (2 to 4 weeks), and am not sure this will be my highest priority when I have time available again.

Documentation

Basic Goal

Background: Because there is no askSam for Linux, I've been gradually developing something similar on my own. (Step 1 was finding and using a wiki (TWiki), step 2 was using the TWiki markup in a plain text file (with multiple records per file), edited with Nedit and the folding macros I wrote for Nedit, and sorting while folded.) This is step 3--searching for search terms within a single record.

Currently (but probably soon to change), I've been using TWiki's 2nd level heading markup ("\n---++ ") to separate records in the plain text file(s). A "plain" Nedit search even on multiple terms doesn't help me pinpoint a record--the best approach up to now has been to try to pick the most obscure search term I can think of to minimize the noise.

This new search looks for records (i.e., text between two instances of "\n---++ ") containing all of the requested search terms.

The strategy is as follows:

  • Find an instance of (hopefully) the rarest search term (I've assumed that the rightmost term will be the rarest--it's easy enough to revise the program to use the leftmost term if that turns out to be more likely)
  • Select that entire record (by searching backward and forward for the record separator (default is "\n---++ ") or bof/eof)
  • Search that selected record for the remaining search terms--if all are found, highlight that record, put the cursor at the rarest term, and exit the search. (Currently this is not very effective because, with the entire record highlighted, I can't see where the cursor is. Also, I'd like to position things so the top of the record is at the top of the page, unless that doesn't show a hit.)
  • If not all are found, move to the next record and resume the search for the rarest term. If found, search that record for the remaining terms. (My phrasing here is not 100% accurate. More accurate is: "move past the end of the record just searched (to avoid getting another hit in the same record), and resume the search for the rarest term" (i.e., regardless of record boundaries); when found, establish the boundaries of this new candidate record and check for the other terms.) Note that when doing a backward search, the move is past the beginning of the record just searched.
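
The strategy above can be sketched outside Nedit. What follows is a minimal Python model of the logic, not the macro itself: find_record and its parameters are hypothetical names, all terms are treated as literals (no regex mode), and the rightmost term is assumed to be the rarest.

```python
def find_record(text, terms, rec_sep="\n---++ ", start=0, forward=True):
    """Return (begin, end, hit) of the next record containing every term, else None."""
    if not terms or "" in terms:
        return None  # nothing to search for (empty terms would never advance)
    rarest, others = terms[-1], terms[:-1]  # assumption: rightmost term is rarest
    pos = start
    while True:
        # Find the rarest term, regardless of record boundaries.
        if forward:
            hit = text.find(rarest, pos)
        else:
            hit = text.rfind(rarest, 0, pos)
        if hit == -1:
            return None  # no more candidate records
        # Record boundaries: nearest separator starting at or before the hit
        # (or bof), and nearest separator after it (or eof).
        begin = text.rfind(rec_sep, 0, hit + len(rec_sep))
        if begin == -1:
            begin = 0
        end = text.find(rec_sep, hit)
        if end == -1:
            end = len(text)
        record = text[begin:end]
        if all(term in record for term in others):
            return begin, end, hit
        # Move past this record so the next search cannot hit it again.
        pos = end if forward else begin
```

Passing the returned end (or the returned begin, when searching backward) as the next start corresponds to the Next and Previous modes described below.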

I've also implemented the following "modes" (chosen by buttons on the first dialog screen):

  • Literal: start a literal search for terms entered. Todo: currently from the beginning of the file, but I'm beginning to think it would be better (more intuitive, more similar to other searches, and a help with the current problem) to start from the current location of the cursor instead
  • Regex: start a regex search for terms entered (all terms treated as regexes--although I don't expect a problem, I haven't tested this yet). Todo: see above
  • Next: resume a search, looking for the next matching record
  • Previous: resume a search, looking for the previous matching record
  • First: resume a search, looking for the first matching record
  • Last: resume a search, looking for the last matching record
  • Chg RS: change the record separator, in case I want to use a different record separator in a particular file or for a particular reason. This, like sorting after folding on the wrong record separator, can result in a very messed up file, so be careful, and until you are very confident in what you are doing, back up your work before changing this. (Maybe when I've settled on a more permanent record separator I'll remove this option.)
  • Cancel a search by clicking the window's X

Unfortunately, there are limitations of Nedit (5.3) that keep me from making the program as fancy/clean/intuitive as I'd like, but I may do more as: I learn more, Nedit develops, I develop Nedit, or I switch to another editor.

Because I'm not sure it's of any value, the searches currently do not wrap. Or maybe, more accurately, I expect the behavior to be more intuitive (for me) if the default is not to wrap. The options to find the first or last record, combined with next and previous, let me essentially wrap manually. If I find it of value, I could switch to wrap by default or provide an option.

One limitation (others are mentioned or discussed below) is that I apparently cannot assign shortcut keys to the mode push buttons mentioned above. A workaround is to create separate macros (probably without the string dialog) for those I'd want to use via a shortcut key (most likely Next, Previous, First, Last, but maybe not Literal, Regex, or Chg RS). I haven't done this yet, but it is trivial (cut and paste, choose logical shortcut keys).

I do use quite a few (8 to 10?) global variables to persist information between searches, especially for continued searches. Most of those globals are "naturally" initialized at appropriate times during the normal sequence of events. Two of them, however, are not, and I had to initialize them in the ~/.neditmacro file.

Program

Initialization

Two variables have to be initialized at the start (of each instance of??) nedit. To do so, create (or append to existing) file ~/.neditmacro and include the following:

# Initializations for Search Record macro
$Rec_sep = "\n---++ " # Note: For TWiki markup, the trailing space is important to 
#                             avoid matching, for example, "\n---+++"
$Srch_strng = ""
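
Why the trailing space in the record separator matters can be checked with a small Python illustration (not Nedit macro code; the sample text is made up for the demonstration). Without the space, a third-level heading is counted as a record boundary too.

```python
# Made-up sample: one top-level heading, two records, one subheading.
sample = "\n---+ Top\n---++ First record\n---+++ Subheading\n---++ Second record\n"

with_space = sample.count("\n---++ ")    # matches only second-level headings
without_space = sample.count("\n---++")  # also matches "\n---+++"

print(with_space, without_space)
```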

Program

Do_srch = 0

# The dialog call is a single statement; a trailing backslash continues it on the next line
Srch_strng = string_dialog("SEARCH TWIKI-LIKE THING BY RECORD\n\nCurrent record separator (RS) is [" \
    $Rec_sep "]\n\nEnter search terms, with the least likely last\n\nCurrent search terms (for " \
    "continued search):\n\n[" $Srch_strng "]", \
    "New Literal", "New Regex", "Next", "Previous", "First", "Last", "Chg RS")

# Srch_strng = "AbiWord   development  QNX startup venture" # for test
# Srch_strng = "" # for test
# $string_dialog_button = 1 # for test

# Values of the string_dialog_button reflect which button is pressed--if you 
#   add/rearrange buttons you will need to revise the program accordingly.  
#   (Current) meanings of the values of the $string_dialog_button:
# 0 search window closed by operator--exit macro
# 1 Literal--do a search assuming all search terms are literal
# 2 Regex--do a search assuming all search terms are regexes
# 3 Next--repeat previous search, searching forward from last find
# 4 Previous--repeat previous search, searching backward from last find
# 5 First--repeat previous search, starting over from the top of the document
# 6 Last--repeat previous search, last in document (because current search won't wrap)
# 7 Chg RS--change the Record Separator ($Rec_sep)

if ($string_dialog_button == 1) {
# Setup new search: literal 
  $Srch_strng = Srch_strng
  $Lst_succ_rec_begin = 0
  $Lst_succ_rec_end = 0
  $Match_count = 0
  $Doc_len = $text_length
  $Srch_dir = "forward"
  $Cursor = $cursor
  $Srch_type = "literal"
  Start = 0
  Do_srch = 1
}

if ($string_dialog_button == 2) {
# Setup new search: regex 
  $Srch_strng = Srch_strng
  $Lst_succ_rec_begin = 0
  $Lst_succ_rec_end = 0
  $Match_count = 0

  $Doc_len = $text_length
  $Srch_dir = "forward"
  $Cursor = $cursor
  $Srch_type = "regex"
  Start = 0
  Do_srch = 1
}

if ($string_dialog_button == 3) {
# Resume existing search: find next
  $Doc_len = $text_length
  $Srch_dir = "forward"
  Start = $Lst_succ_rec_end
  Do_srch = 1
}

if ($string_dialog_button == 4) {
# Resume existing search: find previous
  $Doc_len = $text_length
  $Srch_dir = "backward"
  Start = $Lst_succ_rec_begin
  Do_srch = 1
}

if ($string_dialog_button == 5) {
# Resume existing search: find first in document
  $Lst_succ_rec_begin = 0
  $Lst_succ_rec_end = 0
  $Srch_dir = "forward"
  $Doc_len = $text_length
  Start = 0
  Do_srch = 1
}

if ($string_dialog_button == 6) {
# Resume existing search: find last in document
  $Lst_succ_rec_begin = $text_length
  $Lst_succ_rec_end = $text_length
  $Srch_dir = "backward"
  $Doc_len = $text_length
  Start = $Doc_len
  Do_srch = 1
}

if ($string_dialog_button == 7) 
# Change record separator--be careful
  $Rec_sep = Srch_strng

#$Srch_strng = "AbiWord   development  QNX startup venture" # for test
#Start = 0 # for test
#$Srch_type = "literal" # for test
#$Srch_dir = "forward" # for test
#$Match_count = 0 # for test
#$Rec_sep = "\n---++" # for test

# abort search if no search terms
if ($Srch_strng == "") 
  Do_srch = 0 
else {
  $Srch_arry = split($Srch_strng, "\\s+", "regex")
  Num_trms = $Srch_arry[]
  if (Num_trms == 0)
    Do_srch = 0
}

if (Do_srch == 1) {
  Eof = 0
  Allmatch = 0

  File = get_range(0, $text_length)

  while ((Allmatch == 0) && (Eof == 0)) { 
    # Look for a candidate record: the next hit on the last (hopefully rarest) term
    Temp = search_string(File, $Srch_arry[Num_trms - 1], Start, $Srch_type, $Srch_dir)
    if (Temp == -1) 
      Eof = 1
    else {
      # Establish the boundaries of the record containing the hit
      Begin_rec = search_string(File, $Rec_sep, Temp, "literal", "backward")
      if (Begin_rec == -1) 
        Begin_rec = 0 
      End_rec = search_string(File, $Rec_sep, Temp, "literal", "forward") 
      if (End_rec == -1)
        End_rec = $Doc_len # no later separator: the record runs to the end of the file
      Allmatch = 1
      Record = get_range(Begin_rec, End_rec)
      for (i=(Num_trms - 1); i > 0; i--) {
        if (search_string(Record, $Srch_arry[i - 1], 0, $Srch_type, "forward") == -1) {
          Allmatch = 0
          # Move past this record so the next search cannot hit it again
          if ($Srch_dir == "forward") {
            Start = End_rec + 1
            if (End_rec >= $text_length) 
              Eof = 1
          }
          else {
            Start = Begin_rec - 1  
            if (Begin_rec <= 0) 
              Eof = 1 # Bad name for Eof--at this point it's only intended to end the routine
          }
        }
      }
    }
  }

  if (Allmatch == 1) {
    $Lst_succ_rec_begin = Begin_rec
    $Lst_succ_rec_end = End_rec
    $Match_count += 1
    select(Begin_rec, End_rec)
    set_cursor_pos(Temp) # or (Begin_rec) 
    #beginning_of_line()
  }
  else 
    dialog("No match")
}

Nedit Macro Gotchas

  • concatenation requires no operator (e.g., no "+")
  • if requires {} only if there is more than one statement in the if or else clause--syntax error if present for one statement, logic error if missing for more than one statement. for doesn't have the same problem, not sure about while (not tested)
  • at first I hadn't found a statement (break, exit, ...) to leave an entire macro early, and assumed I needed a big if or while around the entire code. (I'm setting variables (like Do_srch, Eof, or Cancel) in various places, then testing for them.) Update: return exits a macro (see "Other tidbits" above)
  • all (character) escapes in macros must be doubled (\n becomes \\n, \\ becomes \\\\)
  • udfs (user defined functions) can be defined with "define { }"; passed parameters are referenced as $1, $2, ..., but udfs must be stored separately from macro menu macros and not nested within other udf definitions--a typical storage location is ~/.neditmacro. Being reluctant to depend on another file (later I had to use .neditmacro anyway), and/or worried about potential performance issues, I avoided a udf and put the main logic in an if Do_srch.
  • limitations of string dialog: can't show a default value, can't add short-cut keys for buttons (??)
  • debugging is not fun, dialog("message") is useful
  • garbage collection not performed while a macro is running (and maybe searches allocate a lot of memory??)

Future Enhancements / Notes

  • Don't put warnings about inaccurate continued search after editing--the same is true of all editors / searches
  • Revise so cursor points to a successful match for the last search term (first or last in record depending on search direction)--done, but with record highlighted I can't see the cursor
  • select the found record (so that a find is obvious and so that a followup ordinary search within the selection is a search within that record)
  • someday it would be nice to "highlight" each found term in the record--don't think the tools exist to do in Nedit today (5.3), although I should examine the syntax highlighting tools (see below)
  • having gone this far, it seems that something to limit viewing / editing to a single record can't be that far away--can I "spawn" a new window, display a "substring" (the found record) in that window, and "retrieve" changes back to the main file on a "save"/quit? -- Yes, see more below
  • find command to move to the beginning of a line--done--beginning_of_line()
  • test feeding a (the existing search) string to the string dialog before displaying--in any case, consider dropping the verbose option (which would have displayed the current search string)--doesn't work, workaround: show current search terms as part of the "label" of the string_dialog
  • start new searches from the current location of the cursor instead of from the top of the file
  • place top of record at top of window, unless that keeps a hit from being visible (look for a suitable positioning command). Nedit 5.5 has the "scroll_to_line( lineNum )" command, which is documented as:

"Scroll to position line number lineNum at the top of the pane. The first line of a file is line 1."

Don't know if I might have overlooked that in 5.3. (I will look for that and rangesets again in 5.3; otherwise, it looks like two good reasons to upgrade, plus tabs.)

Update: scroll_to_line() is in Nedit 5.3, but, near as I can tell, rangesets are not.
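
The "top of record at top of window, unless that hides the hit" rule comes down to a little line arithmetic. Here is a minimal Python sketch of it (not macro code); line_of and top_line_for are hypothetical helper names, window_rows is an assumed count of visible text lines, and the result is the 1-based line number that would be handed to scroll_to_line.

```python
def line_of(text, pos):
    """1-based line number of character position pos."""
    return text.count("\n", 0, pos) + 1


def top_line_for(record_line, hit_line, window_rows):
    """Line to place at the top of the pane so that the hit stays visible."""
    if hit_line < record_line + window_rows:
        return record_line             # record top and hit both fit in the pane
    return hit_line - window_rows + 1  # otherwise put the hit on the bottom row
```

For a long record this gives up on showing the top of the record and shows the hit instead, which matches the stated preference.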

Syntax Highlighting to Show Hits

Doubtful, but maybe: add a hidden character near each hit, highlight words with that hidden character, and delete that hidden character later. (When? When the search for the next record starts? What if there isn't one? On the first keystroke after the search? How?) Nope, see rangesets (below), which seem like a much better solution.

Update: Rangesets

Rangesets may be the key to highlighting the found text. The way forward might be to add each word as found to a rangeset, change the background color if the find (all match) is successful, and delete the rangeset when any of these happens: the next search is started, the user enters input, or ??? (In any event, a rangeset hanging around is not the end of the world, unlike some extra character appended to each found word.) I would have to modify the search to find all hits within a record, and need to think about whether to do that on the first pass, or to do a second pass only if the record does have hits for each term.
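
The "second pass" idea can be sketched in Python (a model only, not macro code): collect the position of every occurrence of every term in a record, and return nothing if some term is missing. hit_ranges is a hypothetical name, terms are treated as non-empty literals, and offset is assumed to be the record's start position in the file, so that the ranges come out in document coordinates, ready to be added to a rangeset.

```python
def hit_ranges(record, terms, offset=0):
    """Return sorted (start, end) ranges of all term hits, or [] if any term is missing."""
    ranges = []
    for term in terms:
        found_any = False
        pos = record.find(term)
        while pos != -1:
            found_any = True
            ranges.append((offset + pos, offset + pos + len(term)))
            pos = record.find(term, pos + 1)
        if not found_any:
            return []  # not an all-match record: nothing to highlight
    return sorted(ranges)
```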

I found the information about rangesets in the documentation for Nedit 5.5. I don't recall seeing it in 5.3, but I'll look again, as I may have overlooked it.

Range Sets

A rangeset is a set of ranges. A range is a contiguous range of characters defined by its start and end position in the document. The user can create rangesets, identified by arbitrary integers (chosen by the editor when the rangesets are created), and each range within a rangeset is identified by a numeric index, counting from 1, in the order of appearance in the text buffer. The ranges are adjusted when modifications are made to the text buffer: they shuffle around when characters are added or deleted. However, ranges within a set will coalesce if the characters between them are removed, or a new range is added to the set which bridges or overlaps others.

Using rangesets allows non-contiguous bits of the text to be identified as a group.

Rangesets can be assigned a background color: characters within a range of a rangeset will have the background color of the rangeset. If more than one rangeset includes a given character, its background color will be that of the most recently created rangeset which has a color defined.

...

rangeset_set_color( r, color )
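
The coalescing behavior quoted above (a new range that bridges or overlaps others merges with them) can be modelled in a few lines of Python. This is only an illustration of the idea, not Nedit's implementation; add_range is a hypothetical name, ranges are (start, end) pairs kept in buffer order, and treating ranges that merely touch as mergeable is an assumption.

```python
def add_range(ranges, start, end):
    """Return a new sorted list of ranges with (start, end) added and coalesced."""
    merged = []
    for s, e in sorted(ranges):
        if e < start or s > end:
            merged.append((s, e))  # no contact with the new range: keep as-is
        else:
            # Overlapping or touching: absorb into the new range.
            start, end = min(start, s), max(end, e)
    merged.append((start, end))
    return sorted(merged)
```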

Nedit as askSam

It may be possible to use Nedit almost like askSam:

One macro to:

  • select and copy record to a string
  • create a new file
  • switch focus to window for new file
  • display string / record in new window

Allow editing

2nd macro to:

  • select all (in that new window)
  • copy that record/window/string
  • switch to original window
  • select the original record (if deselected) (maybe lock that window??)
  • replace selection with new string

but pretty hoky and rather risky!
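
One way to reduce the risk is to refuse the write-back if the original record no longer matches what was copied out. A minimal Python sketch of that check follows (not macro code; extract_record and put_back are hypothetical names, and begin/end are the saved record boundaries):

```python
def extract_record(text, begin, end):
    """Copy the record out of the main text for editing elsewhere."""
    return text[begin:end]


def put_back(text, begin, end, original, edited):
    """Replace the record with its edited version, but only if it is unchanged."""
    if text[begin:end] != original:
        raise ValueError("record changed in the main file; not replacing")
    return text[:begin] + edited + text[end:]
```

In macro terms the comparison would use get_range(start, end) and the replacement replace_range(start, end, string), both listed below.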

Some Nedit commands that might be useful:

  • write_file(string, filename)
  • append_file(string, filename)
  • get_selection()
  • get_range(start, end)
  • replace_range(start, end, string)
  • replace_selection(string)
  • new()
  • close()
  • select(start, end)
  • shell_command(command, input_string)--output of command is returned as function value

Old Snippets

Not necessarily debugged

define ResumedSearch {
  if ($string_dialog_button == 2)
    Direction = "next"
  else
    Direction = "previous"  
  if (dialog("Resuming search for " Direction " in [" $Srch_strng "]", "OK", "Cancel") != 1)
    $Cancel = 1
  if ($Doc_len != $text_length) 
    dialog("Text has been edited since the origin of the search, depending on the extent of " \
           "editing, this resumed search may miss some records.")
}

define RestartSearch {
  if ($string_dialog_button == 4)
    Direction = "beginning"
  else
    Direction = "end"  
  if (dialog("Restarting search for [" $Srch_strng "] from the " Direction, "OK", "Cancel") != 1)
    $Cancel = 1
  if ($Doc_len != $text_length) 
    dialog("By restarting this search, even though the text may have been edited since the " \
           "origin of the search, this restarted search should find all records.")
}

Some of the Brainstorming

I wish I could figure out loops better, I should make note of some things that helped (and flesh them out later, I hope):

  • recognizing the "true" end condition(s): in this case, achieving either an all match, or an eof and not an all match, so then created a while loop to deal with those

Pseudocode:

while not allmatch and not eof {
  if not onematch 
    {eof = true}
  else {
    allmatch = true
    for (i=(len-1); i>0; i--) {
      if not match 
        {allmatch = false}
    } # for i=(len-1)...
  } # else (if not onematch)
} # while

I think there's another end condition in the for loop, but maybe it's picked up properly the next time through the while loop--I need to add in the various variables, set them appropriately, then look at it.

Pseudocode:

while not allmatch and not eof {
  if not onematch 
    {eof = true}
  else {
    set params for this record
    allmatch = true
    for (i=(len-1); i>0; i--) {
      if not match 
        {allmatch = false}
    } # for i=(len-1)...
  } # else (if not onematch)
} # while
# onematch: indicates we found a match for the rightmost (least common?) search term, then we'll "extract" that entire record as a string and search it for the other search terms

# allmatch: indicates we've searched one record and found matches for all the search terms

# eof: indicates we've reached the end of the file without finding either an allmatch or another onematch (record) to search

# Another verbal description: 

Check to see if this is a new search or a continuation of an old search.  

  If a new search, start at the beginning of the file.
  
  If a continued search, check global variables and start search from where last match was found.

Search for a candidate record by finding a record that contains the rightmost (least common?)
search term.  

  If none found, quit and report no (more) matches.  

  If found, then "focus" on that record (by extracting it to a string) and see if the remaining
search terms are contained.  

    If yes, report it as found and quit.  (Before quitting, update global variables to indicate
where the search ended (or should be resumed).)

    If no, update search variables (some global), then (if not eof) look for another candidate
record. 

      If eof, quit and report no (more) matches.

# Search

# (for efficiency) Search for last (least significant, hopefully rarest) search term first -- I could write something to reverse the array

# If found:
# 1. temporarily save cursor location, search backward for last RS (or bof), save location
# 2. restore temporary cursor location, search forward for next RS (or eof), save location
# 3. select that range 
# 4. search for 2nd last search term in that range, 
#    if successful search for 3rd last term, ...
#    if not successful, resume search for last term at next RS

# aside: I think I could also make this work within a selection if I saved the beginning and end point of the selection and later restored it.

# let's write the code (longish) for a two term search:

# it would be nice to reverse the arguments, so I search for [0] first, but

# I need to think about how to find the 2nd occurrence (find next) -- maybe save the address of the last found record, and 3 buttons (Literal, Regex, Next) -- also how to terminate (avoid an endless loop) at the last record, hmm, how about a restart from the beginning, or Next Forward and Next Backward

# Some other useful variables:

Doc_len = $text_length

# If Doc_len changes, editing has occurred. Options:

  • force the search to start over
  • find records starting at the end, so that as I find new items I'm in "virgin territory" (positions there haven't changed, because the editing (presumably) occurred later in the document)
  • assume that most searches involve no editing when "next" is asked for (a searcher who asks to continue presumably didn't find what he was looking for, and thus is not likely to have edited anything); hence, if editing did occur, force the search to restart
  • do a fuzzy restart (that might miss something)
  • add the difference in length to all subsequent (relevant) locations
  • recheck $Lst_succ_rec_begin and $Lst_succ_rec_end to see if it was this record that was changed, and then make adjustments
  • or ...
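
The "recheck" option might look like the following Python sketch (a model only, not macro code; revalidate_last_record is a hypothetical name). It assumes that if the saved beginning still sits on a record separator (or at the beginning of the file), the edit happened at or after that record, so only the end needs to be found again. Otherwise it gives up and the search should restart.

```python
def revalidate_last_record(text, rec_sep, saved_begin):
    """Return updated (begin, end) for the last found record, or None to force a restart."""
    at_separator = text.startswith(rec_sep, saved_begin)
    if saved_begin != 0 and not at_separator:
        return None  # positions shifted before this record: saved values are stale
    nxt = text.find(rec_sep, saved_begin + 1)
    end = len(text) if nxt == -1 else nxt
    return saved_begin, end
```

A separator that happens to land on the saved position after an earlier edit would fool this check, so it is a fuzzy test at best.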

Contributors

  • () RandyKramer - 01 Apr 2005

Topic revision: r5 - 2005-04-06 - RandyKramer
 