Status for Andrew DeFaria: June 25, 2006 - July 1, 2006 Archives

June 27, 2006

Salira Vob Corruption

  • Cleaned up Multisite Packets
  • Cleaned up sons-sc-cc:/Windows/temp and sons-clearcase salira vob cleartext pools due to disk space crunch
  • Ran dbcheck on salira vob to fix corruption
  • Tested changing mastership of a test branch

Time spent: 7 hours

Cleaning up Multisite Packets

First order of business was to clean up, as much as possible, the multisite packets that reside in the shipping bays for both sons-clearcase and sons-sc-cc. As per my prior work, there seem to be huge sync packets to process, which takes time. I wanted to attempt a chmaster on an older branch to see how that change gets from sons-clearcase -> sons-sc-cc. Part of the chmaster involves informing the other replica of the change, which happens through the normal multisite syncreplica. If the bays are full of huge packets then I need to process them first.

One problem I hit was running out of space on sons-sc-cc. Normally this is not a problem as there is enough space on the C drive where the vobs reside, but with these huge packets going back and forth I was running out of space. Cleaned up some space and attempted to import all packets on sons-sc-cc. I also attempted to scrub the cleartext pool on sons-clearcase, which has grown to 4 gig! The cleartext pool is a caching mechanism, and since Clearcase can reconstruct the cleartext at any time I figured I could save 4 gig.
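A quick way to avoid running out of disk mid-import is to compare the size of the packets waiting in a shipping bay against the free space on that drive. The following is only a minimal sketch: the function name `packets_fit` and the `headroom` factor are hypothetical, and the bay directory has to be supplied by the caller since its location is site-specific.

```python
import os
import shutil


def packets_fit(bay_dir, headroom=2.0):
    """Compare pending packet size in bay_dir with free disk space.

    headroom is an assumed safety factor (not a ClearCase setting):
    the import is considered safe only if free space is at least
    headroom times the total packet size.
    Returns (total_packet_bytes, free_bytes, fits).
    """
    total = 0
    for entry in os.scandir(bay_dir):
        # Only count regular files directly in the bay directory.
        if entry.is_file():
            total += entry.stat().st_size
    free = shutil.disk_usage(bay_dir).free
    return total, free, free >= total * headroom
```

Running this before each import would at least flag the case where the packets in the bay are larger than what the drive can absorb.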

Testing chmaster

Confirmed that I cannot check out, and check back in, an element on the rel_1.0 branch from a view on sons-sc-cc. I then attempted to transfer mastership of the rel_1.0 branch -> sons-sc-cc but received the following error:

[ccadmin] sons-clearcase:ct chmaster SantaClara brtype:rel_1.0@\\salira
cleartool: Error: Branch type "rel_1.0" has branches (with default mastership) that have outstanding checkouts.

Actually there are still checkouts on the rel_1.0 branch in, for example, the view YXiu_view_desktop (e.g. salira/neopon/build/makefile).

Ran dbcheck on salira vob to fix corruption

10:40 Pm: Decided to give up on the testing of chmaster and get the vob fixed. Locked salira vob. Started copy of db

10:43 Pm: Started keybuild procedure. Keybuild failed with:

db_VISTA Version 3.20
Key File Build Utility
Copyright (C) 1985-1990 Raima Corporation, All Rights Reserved

initializing key file: vob_db.k01
initializing key file: vob_db.k02
initializing key file: vob_db.k03
initializing key file: vob_db.k04
processing data file: vob_db.d01, total records = 3555277
 record:       9000
 record:      19000
 record:      29000
 record:      39000
 record:      49000
 record:      59000
 record:      69000
 record:      79000
 record:      89000
 record:      99000
 record:     109000
 record:     119000
 record:     129000
 record:     139000
 record:     149000
 record:     159000

keybuild failed with an exit code of 58. Ran keybuild again... This seems to be going better... Did d01 file. Proceeded to work on the d02 file then (11:07 Pm):

record:   863000
*** db_VISTA database error -901 - system error
Bad read 863475 863474
processing data file vob_db.d02, total records = 1
 record: 1
key file rebuild completed

Hmmm... Doesn't seem like the key file rebuild was really completed. I wonder... Should I try again? Trying again...
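Since keybuild can print "key file rebuild completed" even after reporting a database error, it may be worth checking the captured output rather than trusting the last line. Below is a minimal sketch, assuming the output has been saved to a string; the function name `keybuild_ok` is hypothetical and the patterns are taken only from the messages shown in this log.

```python
import re

# Matches lines such as "*** db_VISTA database error -901 - system error".
ERROR_RE = re.compile(r"\*\*\* db_VISTA database error (-?\d+)")
DONE_TEXT = "key file rebuild completed"


def keybuild_ok(output, exit_code=0):
    """Return (ok, error_codes) for captured keybuild output.

    ok is True only if the exit code is 0, the completion message is
    present, and no db_VISTA error line appears anywhere in the output.
    """
    errors = [int(code) for code in ERROR_RE.findall(output)]
    completed = DONE_TEXT in output
    return (exit_code == 0 and completed and not errors), errors
```

With the second run above, this would report a failure with error code -901 despite the "completed" message.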

Third time's a charm, they say! Keybuild ran to completion, but for a while it was touch and go as sons-clearcase was not responding. Now, however, I can import the packets that were stuck... Well, most of them:

Applied sync. packet sync_SantaClara_26-Jun-06.02.00.01_5308 to VOB \\sons-clearcase\VOBs\salira.vbs
Multitool.exe: Error: Database identifier (dbid) not found in database: "\salira".
Multitool.exe: Error: Could not get oplog entry with order:2886884 from replica:
China with oplog_id:376595: reference to non-existent ClearCase object.
Multitool.exe: Error: Could not check oplog entry for divergence: reference to non-existent ClearCase object.
Multitool.exe: Error: Cannot apply sync. packet sync_China_26-Jun-06.16.32.42_3292_1 to VOB replica \\sons-clearcase\VOBs\salira.vbs: reference to non-existent ClearCase object

Damn. Ran syncreplica -import again and everything got processed. I'm glad it's processed but I can't help but wonder why I hit these errors...

June 26, 2006

dbcheck

  • Ran dbcheck on salira vob

Time spent: 2 hours

Frank W O'Keefe wrote:

Hello Andrew,

For the error: 06/23/06 07:48:04 db_server(10104): Error: db_server.exe(10104): Error: Database identifier 427883 not found in "../db__obj.c" line 731.

This could possibly mean there is an issue with the VOBs database. Unfortunately I cannot determine which VOB this is for? I would need you to run a "dbcheck" on the VOB that is reporting this error. Unfortunately I was seeing this error many times in the logs so I cannot tell for which VOB it is reporting this on.

(10104) in the error is the process id that is/was running. This may help in finding the VOB.

I'm pretty sure I know the vob in question - their main vob (\salira).

The following URL is to the instructions on running dbcheck. http://www-1.ibm.com/support/docview.wss?uid=swg21122748

I tried following that by using the method of lock vob, copy the vob database files, unlock vob, dbcheck the copy. Every time I got a -4 error so I went back to do lock vob, dbcheck, unlock vob.
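The lock vob, dbcheck, unlock vob sequence can be wrapped so that the vob is unlocked even if dbcheck blows up partway through. This is a minimal sketch, not the procedure from the technote: the function name `check_vob_db` and the injectable `run` parameter are hypothetical, the dbcheck command line is whatever the caller passes in, and it is assumed the script is started from the vob's db directory.

```python
import subprocess


def check_vob_db(vob_tag, dbcheck_cmd, run=subprocess.run):
    """Lock the vob, run dbcheck, and always unlock afterwards.

    vob_tag is the vob tag (for example r"\salira"); dbcheck_cmd is the
    full dbcheck command as a list of arguments. run defaults to
    subprocess.run and can be swapped out for a dry run.
    """
    lock = ["cleartool", "lock", f"vob:{vob_tag}"]
    unlock = ["cleartool", "unlock", f"vob:{vob_tag}"]
    run(lock, check=True)  # stop right away if the lock fails
    try:
        # dbcheck may exit non-zero when it finds errors, so do not raise.
        return run(dbcheck_cmd, check=False)
    finally:
        run(unlock, check=True)  # runs even if dbcheck raised
```

Passing a `run` function that only prints the commands gives a dry run of the sequence without touching the vob.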

I was surprised to see some stuff come out on stderr:

[ccadmin] sons-clearcase:/apps/Rational/ClearCase/etc/utils/dbcheck -r1 -a -k -p8192 vob_db > C:\\cygwin\\tmp\\dbcheck.txt

Processing delete chain:  75 nodes on delete chain.
Processing nodes:
+++....

Eventually it finished stating:

Database consistency check completed

169 errors were encountered in 167 records/nodes
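To keep track of whether the error count changes between dbcheck runs, the summary line can be pulled out of the saved output. A minimal sketch follows, assuming the output was redirected to a text file as above; the function name `dbcheck_summary` is hypothetical and the pattern only covers the wording seen in this run.

```python
import re

# Only matches the exact wording seen here; other wordings dbcheck may
# use (for example for a clean database) are not covered.
SUMMARY_RE = re.compile(r"(\d+) errors were encountered in (\d+) records/nodes")


def dbcheck_summary(text):
    """Return (error_count, record_count) from dbcheck output, or None."""
    match = SUMMARY_RE.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Returning None when the line is missing keeps "no summary found" distinct from a parsed count.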

Also, I am going to send you a URL to a technote about this PC's heap size. I see messages indicating that you may need to adjust the heap settings for this host.

http://www-1.ibm.com/support/docview.wss?uid=swg21142584

Depending on the dbcheck output, we may need to get a copy of the VOB's db directory but I rather hold off on that request until we see what the dbcheck reports.

I've attached the dbcheck output.