" /> Status for Andrew DeFaria: November 2006 Archives

November 29, 2006

Perl Style

A little bit about Perl Style

By and large I feel that massively indented, and thus multi-level, conditional statements are not handled well by most people. You see, as you're evaluating the code you have to "put on the mind stack" each condition and bear it in mind as you successively consider each of the nestings of code. As such I always "look for ways out" of the script/function/routine that I'm in. The idea is to avoid nesting by handling error conditions and other "if this happens then we're done" cases first, often die'ing, erroring out or returning from the script, function or subroutine. The net effect is that the rest of the code, the code that normally executes, is shifted to the left a level or two. Additionally, complicated or repeated code can be written into a well-named subroutine and called from the main or other functions when needed. To me this all makes the code a lot easier to read.
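A minimal sketch of the "look for ways out" idea follows. The routine count_lines and its -1 error value are made up for illustration. Each reason to leave is handled first, so the code that normally executes is never nested inside an if:

```perl
use strict;
use warnings;

# Hypothetical routine: count the non-blank lines of a file.
# Returns -1 if the file cannot be processed.
sub count_lines {
  my ($file) = @_;

  return -1 unless defined $file;   # nothing to do
  return -1 unless -f $file;        # not a regular file

  open my $fh, '<', $file
    or return -1;                   # cannot read it

  # The code that normally executes sits at the left margin
  my $count = 0;

  while (my $line = <$fh>) {
    next if $line =~ /^\s*$/;       # get out of this iteration early too
    $count++;
  }

  close $fh;

  return $count;
}
```

The same routine written with nested ifs would put the while loop three levels deep.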

As an example, and again, I'm not trying to pick on your script specifically and I realize that this might not be the final version, but in reality it contains only a call to GetOptions and 2 if statements. The first if statement merely calls usage if -h is specified. BTW this could be re-written as:

GetOptions (
  'f=s',      \$facility,
  'p=i',      \$port,
  'a=s',      \$XAID,
  'w=s',      \$password,
  'n=s',      \$projName,
  'g=s',      \$unixGroup,
  's=s',      \$serverName,
  'c=s',      \$cachePath,
  'h',        sub { usage },
) || usage;

IOW rather than define an $h variable, set it via GetOptions and then test it afterwards only to call usage, define an anonymous subroutine that calls usage directly from GetOptions. Or, since anything other than the stated options displays usage (as per the "|| usage"), simply drop the "h" line entirely. If the user specifies -h then it's unknown and usage is displayed. Then $h can be removed as well as the if test following GetOptions.

The next if statement tests that all of the above variables have been defined. If so then all real processing happens inside that if (and other nested if's therein). Otherwise usage is called. In keeping with the "looking for a reason to get out" philosophy how about:

usage unless (
  defined $password   &&
  defined $XAID       &&
  defined $projName   &&
  defined $serverName &&
  defined $unixGroup  &&
  defined $cachePath  &&
  defined $port       &&
  defined $facility
);

Basically this says, "if any of these are not defined call usage" and since usage doesn't return everything inside the old if statement can be shifted to the left. Other opportunities exist to continue to apply this philosophy of "getting out while you can" and reducing the nesting level of the program. For example, inside the old if statement you check to see if the login of GPDB successfully made you an administrator. If not you're gonna stop the script right there. So then how about something like:

my $login = gpdb_login ($XAID, $password);

error "$XAID is not an administrator", 1 if $login !~ /administrator/;

gpdb_login returns a string that will contain "administrator" if you were able to login as an administrator. Here I use error, a routine that I often use (in fact I'd like to publish it to TI as a common utility) that writes an error message to STDERR and optionally exits if the second parameter is not 0.
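For reference, a routine of that shape could look something like the sketch below. This is a guess based only on the description above, not the actual routine. The "ERROR: " prefix in particular is an assumption.

```perl
use strict;
use warnings;

# Sketch of an error routine: write the message to STDERR and exit
# with $status if $status is given and is not 0.
sub error {
  my ($msg, $status) = @_;

  $status ||= 0;

  # The "ERROR: " prefix is an assumption made for this sketch
  print STDERR "ERROR: $msg\n";

  exit $status if $status != 0;

  return;
}
```

Because the sub is defined before it is used, it can be called without parentheses, which is what lets the statement modifier form above read naturally.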

November 28, 2006

More GPDB fixes

  • Added CSS class for input fields
  • Changed Project names to be drill down links
  • Fixed some bugs in add project. Now checking that project name and owning group are not blank
  • Added Port Ranges to Sites. As implemented it's just a text field with no real enforcement. I think the plan is to eventually augment the DesignSync creation script to check the port ranges from the GPDB Site table
  • Augmented Create Site to accommodate newly added fields. Still need to implement the handling of multiple domains per site. Should a check be implemented to ensure that a domain name is not duplicated at any other site?
  • Implemented checking of IP Ranges to ensure they are valid IP addresses and that the lower range is lower than the upper range. Still need to check that these ranges do not overlap any other IP ranges.
  • Changed several drop downs to properly sort the items. For example, user names are now sorted by at least first name, sites are sorted by site name, etc.
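The IP range checks described above can be sketched as follows. This is only an illustration, assuming dotted-quad IPv4 addresses. The names ip2int, valid_range and ranges_overlap are hypothetical and not taken from the GPDB code. The overlap test is the part noted above as still to be done.

```perl
use strict;
use warnings;

# Convert a dotted-quad IPv4 address to an integer. Returns undef if
# the string is not a valid address.
sub ip2int {
  my ($ip) = @_;

  return undef unless defined $ip;
  return undef unless $ip =~ /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/;

  my @octets = ($1, $2, $3, $4);

  return undef if grep { $_ > 255 } @octets;

  return ($octets[0] << 24) | ($octets[1] << 16) | ($octets[2] << 8) | $octets[3];
}

# True if both ends are valid addresses and the lower end is below
# the upper end.
sub valid_range {
  my ($lower, $upper) = @_;

  my $lo = ip2int ($lower);
  my $hi = ip2int ($upper);

  return 0 unless defined $lo && defined $hi;

  return $lo < $hi ? 1 : 0;
}

# True if two ranges, each already checked with valid_range, share
# at least one address.
sub ranges_overlap {
  my ($lower1, $upper1, $lower2, $upper2) = @_;

  return (ip2int ($lower1) <= ip2int ($upper2)
       && ip2int ($lower2) <= ip2int ($upper1)) ? 1 : 0;
}
```

Comparing the addresses as integers avoids the trap of comparing them as strings, where "10.0.0.9" would sort after "10.0.0.10".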

November 27, 2006

More tables, fields and bug fixes

  • Added PDB_ALIASES and PDB_MIRRORS to GPDB
  • Expanded VOB_TAG to accommodate larger VOB_TAGS
  • Added more fields to PDB_SITES to hold login information that used to be kept in the gpdb_site_list.txt
  • Fixed bug with not displaying/updating phone number

November 24, 2006

Added CSS to GPDB

  • Added CSS to GPDB and re-oriented the web page to use them. This will allow us to control this much better
  • Changed sites page to handle multiple domains per site

November 21, 2006

gpdb_add_vob.pl/gpdb_add_project.pl

  • Created gpdb_add_vob.pl. This script combs through the Clearcase registry and then adds the information to GPDB. It uses a heuristic to attempt to determine what project this vob is associated with. There are outstanding issues regarding ownership of DesignSync, Clearcase and Project objects.
  • Changed gpdb_add_project.pl to handle multiple domains per site

November 20, 2006

Multiple Domains per site

  • Added PDB_SITE_DOMAINS_MAP to GPDB. This is needed as a mapping table to allow multiple domains per site

November 16, 2006

Sites again/Cron problems

  • Met with Bill and Michael regarding changes to GPDB database. Michael believes we should stick with site codes and change convertdb to do the translation from UK site names to site codes
  • Spoke with Larry regarding site codes and he brought up the point that this would not work with the Embassy model and other external partnerships. As such we are back to site names...
  • Found that cron has a very limited PATH causing gpdb_add_project.pl to fail under cron
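One common way around cron's limited PATH is for the script to extend PATH itself before running any external commands. The sketch below assumes that approach. The helper name extend_path and the directories listed are only examples, not what gpdb_add_project.pl actually does.

```perl
use strict;
use warnings;

# Return a PATH string with any directories from @dirs that are not
# already present appended to the end.
sub extend_path {
  my ($path, @dirs) = @_;

  my @parts = grep { length } split /:/, (defined $path ? $path : '');
  my %seen  = map { $_ => 1 } @parts;

  for my $dir (@dirs) {
    push @parts, $dir unless $seen{$dir}++;
  }

  return join ':', @parts;
}

# Near the top of a script that cron will run (example directories):
$ENV{PATH} = extend_path ($ENV{PATH}, '/usr/local/bin', '/usr/bin', '/bin');
```

The alternative is to set PATH in the crontab itself, where the cron implementation supports that, or to use full pathnames for every command.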

November 14, 2006

gpdb_add_project cronned

  • Looked into sites, site codes and differences between UK projects/domains and the current implementation of GPDB
  • Cronned gpdb_add_project

Cronned gpdb_add_project

Donna Ducharme wrote:

GPDB is not current. This project CC2630_DS is in the dallas site registry but is not in GPDB. Have we missed something here?

Short answer (AKA Executive Summary)

It's there now. gpdb_add_project.pl wasn't regularly run. Now it is.

Long answer (AKA Engineering Notes)

gpdb_add_project.pl, the script that attempts to add any new DesignSync projects to GPDB, is not run on a regular basis. Why is this? Well because in the past this was simply cron'ed into somebody's crontab. Then the guy left and this broke. The right way to do this is to eliminate dependencies on any employee or, as the case was here, contractor. We've been working on that.

First step was to get gpdb_add_project.pl functioning again. Being as it attempts to do all sites and that it used to do this by using rsh, this broke when the guy left. Why? Well because he was rsh'ing as himself and his account naturally, and rightfully, got disabled when he left. So David Kitch set about to create a generic user, gpdb, and get passwordless rsh login rights to all of the appropriate servers at the sites. Recently that was completed.

Meantime I was busy recoding gpdb_add_project.pl. It had been put into Clearcase and use strict and use warnings were strapped on, but the code as checked in would not even compile! First task was to resolve the problems use strict and use warnings introduced.

Next was to resolve the incorrect utilization of rsh that gpdb_add_project.pl was doing. It would do an rsh command and check the return status thinking that it was the return status of the command that rsh remotely executed. That is not the case! Rsh returns the status code of the rsh command itself, not the command remotely executed. Additionally it's pretty inefficient to constantly establish communications with another system via rsh, do one single command then tear down the whole remote channel only to do yet another rsh command shortly afterwards. Additionally we wish to move to the more secure and better ssh method in the future.
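One well known way to recover the remote command's status is to have the remote shell echo it after the command and then parse it out of the output. The sketch below illustrates that technique only. It is not how gpdb_add_project.pl or Rexec.pm does it. The name remote_run and the marker string are made up, and it assumes an sh style shell on the far end (csh style shells use $status rather than $?).

```perl
use strict;
use warnings;

# Run $cmd through the transport given as an array ref, for example
# ['rsh', 'somehost'], and return (status of $cmd, output lines).
sub remote_run {
  my ($transport, $cmd) = @_;

  my $marker = '__STATUS__';   # hypothetical marker string

  # Ask the remote shell to report the status of $cmd on a final line
  open my $pipe, '-|', @$transport, "$cmd; echo $marker\$?"
    or return;

  my @lines = <$pipe>;

  close $pipe;

  chomp @lines;

  my $status;

  # Strip the marker and status from the end of the output
  if (@lines && $lines[-1] =~ s/$marker(\d+)$//) {
    $status = $1;
    pop @lines if $lines[-1] eq '';
  }

  return ($status, @lines);
}
```

For example, remote_run (['rsh', 'somehost'], 'ls /no/such/dir') would hand back the status of ls rather than that of rsh. The same code can be tried locally by passing ['sh', '-c'] as the transport.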

So I wrote Rexec.pm, a Perl module that creates an object and a connection to a remote system [1]. It has additional functionality to allow not only rsh access, but ssh and even telnet. It can accept another username/password to attempt access with. While passwordless login is preferred, passwords are also handled. Expect is utilized to drive this. Also this gracefully degrades in that if a specific protocol is not specified then ssh is tried first, followed by rsh then finally telnet. Also, the connection is held in the object so that multiple command execution is quick and efficient. Finally it reliably returns the output from the remote command as well as its status. It handles different shell styles (basically csh style shells and sh style shells) but must be informed ahead of time of which shell to expect. Downside: it only handles "standard" prompts (generic users should always be configured with a standard/default prompt).

  1. Why didn't I use a CPAN module for this? Many reasons: It offered nothing above what gpdb_add_project.pl already did - i.e. it returned the status of rsh not the command rsh did. Also it didn't support being instantiated and remaining active for multiple commands. Finally it didn't support any other protocol than rsh so there was no preference for ssh and graceful degradation.
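The graceful degradation described above boils down to trying each protocol in order of preference until one connects. The sketch below shows just that loop. It is not Rexec.pm's real interface: connect_with_fallback and the table of connector code refs are hypothetical stand-ins for whatever actually opens the connection.

```perl
use strict;
use warnings;

# Try each protocol in order and return the name of the first one that
# connects, along with its connection. $connectors maps a protocol
# name to a code ref that returns a connection or undef on failure.
sub connect_with_fallback {
  my ($host, $connectors, @protocols) = @_;

  # Default order of preference: most secure first
  @protocols = qw(ssh rsh telnet) unless @protocols;

  for my $protocol (@protocols) {
    my $connect = $connectors->{$protocol}
      or next;

    my $connection = $connect->($host);

    return ($protocol, $connection) if defined $connection;
  }

  return;   # nothing worked
}
```

A caller that insists on a specific protocol simply passes it as the only entry in the list.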

With Rexec.pm in place I set out to make gpdb_add_project.pl aware of it and to utilize it. Meantime David was hard at work getting gpdb passwordless login available at all sites. About a week or two ago gpdb_add_project.pl successfully interrogated the final site, Manchester, and updated GPDB of all known DesignSync projects at that time.

And just today I worked with Michael Tisdel to gain access to a gpdb login to Cashew, the system which will house gpdb's crontab and execute such scripts from there. So, IOW, I just got the ability to automate this and have done so.

Still we have no official release mechanism for gpdb_add_project.pl yet so I just copied it to cashew:~gpdb and it's running from there. I had to additionally copy the Rexec.pm module there as well as the gpdb_site_list.txt which drives which sites to explore. This needs to change when we officially release gpdb_add_project.pl.

There is much more work to do with gpdb_add_project.pl. Reporting is poor (it creates logs in ~gpdb/gpdb_add_project_logs), it does not properly handle nor alert people when a server found in DesignSync is not reachable (we have a GPDB CQ request for that one) and from what I can tell its checking of IP ranges is not correct.

I've added the following to gpdb's crontab on cashew:

# Andrew@DeFaria.com: Run gpdb_add_project.pl. This will add any new projects
# from DesignSync to GPDB. Note that this runs from right here (~gpdb). This
# is wrong and should be changed to run it from the "standard place". Problem
# is - there is no "standard place" (yet).
#
# Also note that this script logs to ~/gpdb_add_project_logs. I'm not sure
# who will read this or clean this up...
00 00 * * * gpdb_add_project.pl -s gpdb_site_list.txt

Again, there's lots of new stuff, enhancements and changes that gpdb_add_project.pl will have to support as we update GPDB to better support Clearcase and the changes to the database that that requires.

End of "everything you wanted to know about gpdb_add_project.pl but were afraid to ask"....

November 7, 2006

UK -> GPDB

  • Working on convertdb, a Perl script to read the UK database and convert it into GPDB
  • Finished code to convert users. Using de to get info about a user to populate the GPDB fields
  • Worked out problems with sites and got sites being added to GPDB from the UK database
  • Got adding of PDB_CLEARCASE records preliminarily done

convertdb

Started coding a Perl script to attempt to convert the UK database to GPDB. Got the Users done then worked out the Site records. The Projects are harder because they contain both site and project as a pair as well as vob/view information. The goal is to get as much of the conversion done as I can as well as flush out issues. Then, at some point, change mkview_linked and mkvob_db to use this new database.

Apparently mkview_linked and mkvob_db utilize this UK database mainly for the purposes of obtaining defaults or a template of information for the creation of views or vobs. There are site-wide defaults, project defaults and finally command line overrides.

GPDB requirements for mkview_linked and mkvob_db

I scanned the scripts mkview_linked and mkvob_db in order to determine what exactly they need from GPDB. Here's what I found. Below, field names are bold and table names are both bold and italic.

mkview_linked

  • Obtains the site name in a table called Sites where the domain field = the domain of the current host
  • Gets all of the projects and streams for this site
  • Obtains the following fields from a Projects table where the site = site and the project = "_default":
    • site
    • project
    • vob_host
    • vob_path
    • vob_remote
    • pool_postfix
    • msdos_mode
    • snap_to
    • db_check
    • snap_notify
    • default_view
    • umask
    • group
    • view_host: Used in -host for mkview
    • view_path: Used in -hpath and -gpath for mkview ($view_path/$username/$view.vws)
    • view_remote: If present then the view storage is placed remotely. Note this is Unix specific.
    • msdos_view: If present then a -tmode msdos style view is created.
    • config_spec: Path to a file containing this project's default config spec that is then set as the new view's config spec [1]
    • notify
    • vob_only
    • cachesize
    • snapshot
    • snapshot_workdir
    • stream
  • It then obtains the same field list from a Projects table where the site = site and the project = the project name. The idea here is that _default is the base and the project record then overrides those defaults.
  • It then does some parameter substitutions for the field values above. Occurrences of %USER% will be substituted with the current user, %PROJECT% the current project and strings of the form %SCRIPT()% will run the script defined between the (). The result of that will be the substitution value.

Notes

  1. Variable substitution happens on each config spec line too. Occurrences of %USER% are replaced with the username (%LCUSER% a lower case username and %UCUSER% an upper case username), %VIEW% with the current view (similarly %LCVIEW% and %UCVIEW%), %PROJECT% -> current project and %SITE% -> site (with LC/UC versions too).
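The token substitution described above can be sketched in a few lines of Perl. This is an illustration of the technique, not the code from mkview_linked. The name substitute and the sample values are made up, and the %SCRIPT()% form is deliberately left out.

```perl
use strict;
use warnings;

# Replace %NAME%, %LCNAME% and %UCNAME% tokens in $text using the
# values in the hash ref $values (keys such as USER, VIEW, PROJECT and
# SITE). Tokens with no matching key are left alone.
sub substitute {
  my ($text, $values) = @_;

  $text =~ s{(%(LC|UC)?([A-Z]+)%)}{
    my ($token, $case, $name) = ($1, $2, $3);

    !exists $values->{$name} ? $token
    : !defined $case         ? $values->{$name}
    : $case eq 'LC'          ? lc ($values->{$name})
    :                          uc ($values->{$name});
  }gex;

  return $text;
}

# Example with made-up values
my %values = (USER => 'Andrew', VIEW => 'Main', PROJECT => 'gpdb', SITE => 'Dallas');

print substitute ('/views/%LCUSER%/%PROJECT%', \%values), "\n";
```

Handling the LC and UC prefixes in the pattern means only the base names need to be stored in the hash.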

mkvob_db

  • Obtains the site name in a table called Sites where the domain field = the domain of the current host
  • Obtains all fields from the Projects table for this site
  • Creates a $project_homes hash for each project/site combination with all fields from the Projects table
  • Creates a @valid_projects array for each project.
  • For each project, fills in each field that is not defined with the value from the site's _default project record
  • Performs the same substitution as described above on the various fields.
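The "_default is the base, the project record overrides" behavior can be sketched as a simple hash merge. This is a sketch under the assumption that each record is available as a hash ref keyed by field name. The name apply_defaults and the sample values are hypothetical.

```perl
use strict;
use warnings;

# Return a new record in which every field that is missing or
# undefined in the project record takes its value from the _default
# record for the site.
sub apply_defaults {
  my ($default, $project) = @_;

  my %merged = %$default;

  for my $field (keys %$project) {
    $merged{$field} = $project->{$field} if defined $project->{$field};
  }

  return \%merged;
}

# Example with made-up field values
my $record = apply_defaults (
  {project => '_default', umask => '002', vob_host => 'defaulthost'},
  {project => 'myproj',   umask => undef, view_host => 'viewhost'},
);

print "$_ = $record->{$_}\n" for sort keys %$record;
```

Command line overrides could be layered on the same way by calling apply_defaults again with the merged record as the base.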

November 3, 2006

Convertdb

  • Converted UK Users -> GPDB
  • Converted UK Sites -> GPDB