Why MS Word is a bad HTML editor
- Described a little bit about why MS Word is not a good HTML editor
By and large I feel that massively indented, and thus multi-level, conditional statements are not handled well by most people. You see, as you're evaluating the code you have to "put on the mind stack" each condition and bear it in mind as you successively consider each of the nestings of code. As such I always "look for ways out" of the script/function/routine that I'm in. The idea is to avoid nesting by handling error conditions and other "if this happens then we're done" cases first, often die'ing, erroring out, or returning from the script, function or subroutine. The net effect is that the rest of the code, the code that normally executes, is shifted to the left a level or two. Additionally, complicated or repeated code can be written into a well-named subroutine and called from the main or other functions when needed. To me this all makes the code a lot easier to read.
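To make the "look for ways out" idea concrete, here is a minimal sketch that writes the same routine twice: once nested, once with the exit conditions handled first. The subroutine names and the line-counting task are hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical example: both subs count the lines in a file.

# Nested version: the normal path ends up two levels deep.
sub count_lines_nested {
    my ($path) = @_;
    if (defined $path) {
        if (open my $fh, '<', $path) {
            my $count = 0;
            $count++ while <$fh>;
            close $fh;
            return $count;
        } else {
            die "Cannot open $path: $!\n";
        }
    } else {
        die "No file name given\n";
    }
}

# Flattened version: handle the "we're done" cases first,
# then the normal path runs at the left margin of the sub.
sub count_lines_flat {
    my ($path) = @_;
    die "No file name given\n" unless defined $path;
    open my $fh, '<', $path or die "Cannot open $path: $!\n";

    my $count = 0;
    $count++ while <$fh>;
    close $fh;
    return $count;
}
```

Both versions behave the same; the second just keeps fewer conditions on the "mind stack" by the time you reach the code that does the real work.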
As an example, and again, I'm not trying to pick on your script specifically and I realize that this might not be the final version, but in reality it contains only a call to GetOptions and 2 if statements. The first if statement merely calls usage if -h is specified. BTW this could be re-written as:
GetOptions (
'f=s', \$facility,
'p=i', \$port,
'a=s', \$XAID,
'w=s', \$password,
'n=s', \$projName,
'g=s', \$unixGroup,
's=s', \$serverName,
'c=s', \$cachePath,
'h', sub { usage },
) || usage;
IOW rather than define an $h variable, set it via GetOptions, and then test it afterwards only to call usage, define an anonymous subroutine that calls usage directly from GetOptions. Or, since anything other than the stated options displays usage (as per the "|| usage"), simply drop the "h" line entirely. If the user specifies -h then it's unknown and usage is displayed. Then $h can be removed, as well as the if test following GetOptions.
The next if statement tests that all of the above variables have been defined. If so then all real processing happens inside that if (and other nested if's therein). Otherwise usage is called. In following with the "looking for a reason to get out" philosophy how about:
usage unless ( defined $password && defined $XAID && defined $projName && defined $serverName && defined $unixGroup && defined $cachePath && defined $port && defined $facility );
Basically this says, "if any of these are not defined, call usage", and since usage doesn't return, everything inside the old if statement can be shifted to the left. Other opportunities exist to continue to apply this philosophy of "getting out while you can" and reducing the nesting level of the program. For example, inside the old if statement you check to see if the login to GPDB successfully made you an administrator. If not, you're gonna stop the script right there. So then how about something like:
my $login = gpdb_login ($XAID, $password);
error "$XAID is not an administrator", 1 if $login !~ /administrator/;
gpdb_login returns a string that will contain "administrator" if you were able to login as an administrator. Here I use error, a routine that I often use (in fact I'd like to publish it to TI as a common utility) that writes an error message to STDERR and optionally exits if the second parameter is not 0.
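A minimal sketch of what such an error routine could look like, assuming only the behavior described above (message to STDERR, exit when the second parameter is non-zero). The "ERROR:" prefix and the stand-in $login string are assumptions for illustration, not the actual utility.

```perl
use strict;
use warnings;

# Hypothetical sketch of an error() helper; the real routine may differ.
# Prints the message to STDERR and exits with $status when $status is non-zero.
sub error {
    my ($msg, $status) = @_;
    $status //= 0;                       # default: report the error but keep going
    print STDERR "ERROR: $msg\n";        # the "ERROR: " prefix is an assumption
    exit $status if $status != 0;
    return;
}

# Because error() is defined above this point, it can be called without parentheses.
my $login = 'logged in as administrator';   # stand-in for gpdb_login's result
error "user is not an administrator", 1 if $login !~ /administrator/;
```

Note that the paren-less call style only works when the subroutine is defined (or predeclared) before the line that uses it.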
The Perl debugger is one of those valuable tools that, surprisingly, few Perl coders seem to know well. Here are some quick tips on using the Perl debugger. First, a few explanations of the commands I tend to use:
- s: Single step. Step to the next statement, stepping into any subroutines (where the source file is known and accessible).
- n: Step over. If the line contains a call to a subroutine then this will step over that subroutine.
- r: Return from subroutine. Use this if, say, you accidentally stepped into a subroutine or you just want to return to the caller.
- R: Rerun. Start your Perl script again in the debugger with all the parms you started with.
- q: Quit.
- p <variable or expression>: Print the contents of a variable or expression. Expressions can be Perl expressions, including calls to subroutines. You can, for example, do: p "There are " . scalar(@foo) . " lines in foo"
- x <variable or expression>: Like p above, except that p prints a hash reference as something like HASH(0x...) whereas x formats out the contents. Also, x prints "undef" for things that are undefined, while p prints nothing for them.
- l (ell): List the next windowSize lines (see below). Use "l <n>", where <n> is a line number, to list that line.
- v <n>: View lines around <n>.
- V <package>: List the variables in <package> (e.g. V MyModule lists the variables in MyModule).
- f <filename>: File. Switch to another file (e.g. f MyModule and the debugger switches to viewing MyModule.pm).
- c <n>: Continue to line <n>. If <n> is not specified then just continue until the next break point or the end of the script. Continue is like setting a temporary break point that disappears when you hit the line.
- b <n> <condition>: Breakpoint. Set a break point at line <n>. The condition is optional: b <n> $name eq "Donna" breaks at line <n> only if $name is "Donna" (evaluated when the debugger gets to line <n>).
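If you want something small to try these commands on, a throwaway script along these lines will do. The script, its file name and its variable names are hypothetical; the comments list one possible sequence of debugger commands to type, not output you should expect to see verbatim.

```perl
use strict;
use warnings;

# Hypothetical practice script. Save as practice.pl and run: perl -d practice.pl
# One sequence to try at the DB<n> prompt:
#   b total                  set a break point at the start of total()
#   c                        continue until that break point is hit
#   n                        step over the line that fills %items
#   x \%items                dump the hash, formatted
#   p scalar keys %items     print the value of an expression
#   r                        return from total() to its caller
#   q                        quit

sub total {
    my (%items) = @_;
    my $sum = 0;
    $sum += $_ for values %items;
    return $sum;
}

my %prices = (apple => 3, pear => 5, plum => 2);
my $sum = total(%prices);
print "Total: $sum\n";
```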
Also, at the Perl db prompt you can type in any Perl. So, for example, I often work out regex's that way. I'll be debugging a Perl script and stepping up to something like:
10==> if (/(\d*).*\s+/) {
11 print "match!\n";
12 $x = $1;
13 }
Then I'll type in stuff like:
DB<10> if (/(\d*).*\s+/) { print "1 = $1\n"; } else { print "No match!\n"; }
No match!
DB<11>
Then I can use the command history to edit and re-enter that Perl if statement, changing the regex until it works correctly. (Emacs key bindings work for me in the Perl debugger if I run set -o emacs at the shell before starting it.) This way I know I got the right regex. I copy and paste the new, tested regex from the debugging session into my code, then use "R" to restart the script in the debugger.
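The same trial-and-error loop can be sketched as a tiny script, which also shows why a pattern like the one above is worth testing: \d* is allowed to match zero digits. The helper name, the sample string and the second candidate pattern are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first capture group if $text matches $re,
# or undef if it does not match at all.
sub first_capture {
    my ($re, $text) = @_;
    return $text =~ $re ? $1 : undef;
}

my $sample = 'order 42 shipped';     # made-up test input

# The original pattern, plus a variant that requires at least one digit.
my @candidates = (qr/(\d*).*\s+/, qr/(\d+).*\s+/);

for my $re (@candidates) {
    my $got = first_capture($re, $sample);
    if (defined $got) {
        print "$re: 1 = '$got'\n";
    } else {
        print "$re: No match!\n";
    }
}
```

With this sample, the first pattern matches but captures an empty string, because \d* is satisfied by zero digits at the start of the string; the second pattern captures 42.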
Or you can, say, call an arbitrary subroutine in your script:
DB<2> b Rexec::ssh
DB<3> p Rexec::ssh
Rexec::ssh(/view/cmdt_x0062320/vobs/cmtools/src/misc/GPDB/bin/../../../../lib/perl/Rexec.pm:60):
60: my $self = shift;
DB<<4>>
The "p Rexec::ssh" says to print the result of the following expression. The expression is a function call into the Rexec module for the subroutine ssh. Since we just set a break point there in the previous debug command, we break at the start of that subroutine and can then debug it. Note you don't want to "c Rexec::ssh" because that would continue the actual execution of your script and only stop at Rexec::ssh if that routine was actually called. Voilà, you just forcefully caused the Perl interpreter to branch to this routine!
Another thing I'll frequently do is set or change variables to see how the code would proceed if the variables were correct (or perhaps incorrect, to test error conditions). So let's say we have forced execution of the subroutine Log, as above:
42 sub Log {
43:==> my $msg = shift;
44 print "$msg\n";
DB<23> s
main::Log(EvilTwin.pl:45): print "$msg\n";
DB<24>$msg = "Now I set msg to something I want it to be"
DB<25>s
Now I set msg to something I want it to be
main::Log(EvilTwin.pl:47): return;
DB<25>
There are all sorts of good reasons to examine (p $variable) and set ($variable = "new value") variables during debugging.
Finally put the following into ~/.perldb:
parse_options ("windowSize=23");
This sets the window size to 23 so that "l" lists the next 23 lines.