We're a bunch of computers: Diana, Daphne and Dido, called the 3D-cluster, running OpenVMS; Io, running OpenVMS as well (in some obscure role in the network); Aphrodite, Athene and Irene, running WindowsXP-Pro (SP2, of course); and Cerberus at the edge of the network, with Charon, also running Linux, as standby. SYSMGR takes care of us.

Wednesday, August 31

31-Aug-2005 - continued

Finding out connection loss
Checked the logfile on Charon, this is all it tells of last night between midnight and the first access today:

Aug 31 00:02:32 - INFO: 2 ntpdate[25317]: no servers can be used, exiting
Aug 31 00:02:32 - INFO: Error executing ntpdate, result code = 256
Aug 31 09:11:17 - kernel: IP fw-in deny eth0 UDP L=352 S=0x00 I=27161 F=0x0000 T=255

There is a hole between midnight and, say, 9:00. Really not ANY indication that the connection was dropped. No reaction yet on the support site! (There is a new version available: 0.3.4, with some enhancements. But is it worthwhile to download and install? Can the current configuration be used or must everything be set up again?) Anyway: cleaned Charon's disk: all logs that have been transferred to Diana can be deleted.
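A hole like this one can be found without reading the log by eye. Below is a minimal sketch in Python, assuming the log lines start with the classic 15-character syslog timestamp shown above; the function name `find_gaps`, the one-hour threshold and the default year are my own choices, not something the router software provides.

```python
from datetime import datetime, timedelta

def find_gaps(lines, year=2005, threshold=timedelta(hours=1)):
    """Return (previous, current) timestamp pairs further apart than threshold."""
    gaps = []
    previous = None
    for line in lines:
        # Syslog lines start with 'Mon DD HH:MM:SS' (15 characters) and carry
        # no year, so the year has to be supplied by the caller (assumption).
        try:
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if previous is not None and stamp - previous > threshold:
            gaps.append((previous, stamp))
        previous = stamp
    return gaps

if __name__ == "__main__":
    # The three lines quoted from Charon's log above.
    sample = [
        "Aug 31 00:02:32 - INFO: 2 ntpdate[25317]: no servers can be used, exiting",
        "Aug 31 00:02:32 - INFO: Error executing ntpdate, result code = 256",
        "Aug 31 09:11:17 - kernel: IP fw-in deny eth0 UDP L=352 S=0x00 I=27161 F=0x0000 T=255",
    ]
    for start, end in find_gaps(sample):
        print(f"no log entries between {start} and {end} ({end - start})")
```

A gap only says that nothing was logged, not why; it still needs to be matched against the ISP's maintenance window by hand.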

IO setup
Io needed some setup: operator.log is now disabled; since all messages are sent to Diana, there is currently no need for it. It seems that in case Diana is down, operator.log is re-enabled automatically - to be checked in documentation. This prevents the system disk (which is only 1.2 Gb in size) from being occupied by just logs. Perhaps not the right solution but it works for the moment.
Also BIND has been set up - two MASTER databases can now run in the cluster, but that may require some extra work on Diana as well. Set up the database as a client one, to synchronize with Diana.


No connection....
This morning, none of the websites was accessible from the Internet, and Telnet failed as well. Checked the ISP site: the network operator has scheduled maintenance between 01:00 and 04:00, causing loss of connectivity for some time. Charon cannot handle this - for some reason - and causes a block... It wasn't communicated by email - no message last night!
Called ISP, it could make sense. Suggested to create a webservice so you could prepare, automatically. Will take it!
...but Ok after 11:00...
Tried at 13:00 and the connection now exists. The first mail message seems to have been received at 11:03, so I expect the router has worked fine since 11:00. Try to find out how often it will retry - lease-time? Time_to_live? Ask at the router support site.
...No bogus comments...
It seems some people like to "amuse" others with unsolicited ads. Haha...No more, at least, not automated.


Delayed maintenance
Complaints of HERA's users: machine is slow, applications crashing (most used applications are Internet Explorer, MediaPlayer and MSNMessenger). Well, it's not the fastest, newest and most fancy box, but it should work. So some time was spent defragmenting all disks (disk partitions) and updating the AntiVirus. Time was up: still need to scan for malware.
But it proved a good thing to do.
Little check on Diana and Io, checked mail via webinterface, and a slight check on logs. No issue whatsoever, apart from the obvious (relay-attempts, store attempts on Anonymous FTP, POST attempts on Apache) that all fail.

Sunday, August 28


Io started
Enabled power line, so IO has been started and added into the cluster (VMS 8.2 but patches still to apply). Done a single patch on Diana, rebooted so cluster IP address moved to IO - and to get it back to Diana, had to reboot IO as well. IP isn't cluster-aware - unless there is a lot extra to be done outside the cluster....
Installed Apache 2.0 on IO to test Pivot (PHP) on that - but it won't run on VMS 8.2: wrong OS version. Well, Apache 2.0 is once again the example of not properly ported (Open Source) software: version dependency and incompatibility on application level is not exactly VMS-like! Anyway, it's in the release notes that 7.3-2 is the latest version of VMS where it is supported. So be it, hopefully it's better with 2.1...(1.3 is, so quite likely it will be).
So VMS 7.3-2 is a requirement on Io. A number of options: there is a disk available so installation is an option, but it takes some time. Thinking of booting it from the disk that holds the backup of Diana's system disk, but then network configuration data like nodename need to be changed. And I'll lose the backup, which is NOT a good idea. Another possibility is to boot IO as a satellite from Diana, but Io's console needs to be set up somewhat differently.
Or wait until Apache 2.1 is available...Should be released soon - but HOW soon?

Friday, August 26


Web update flaws smashed
Redone the holiday picture slide pages and republished them. Now the links to the originals do work - the vast majority won't fit on the screen, but they can now be downloaded. Perhaps rescale them to fit (1024x768) and prevent download, but there are so many (about 320 to do!).
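Rescaling some 320 pictures to fit 1024x768 is a job for a script rather than for hand work. The arithmetic is the only subtle part, so here is a small Python sketch of just that: it computes the target size while keeping the aspect ratio and never enlarges a picture. The function name `fit_within` and the sample sizes are made up for illustration; the actual resizing would still need an image tool of choice.

```python
def fit_within(width, height, max_width=1024, max_height=768):
    """Scale (width, height) down to fit the box, keeping the aspect ratio."""
    # The side that overshoots the most determines the scale factor;
    # capping at 1.0 means small pictures are left at their own size.
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))

if __name__ == "__main__":
    # Hypothetical original sizes; the real ones are not recorded here.
    for size in [(2048, 1536), (1536, 2048), (800, 600)]:
        print(size, "->", fit_within(*size))
```

Portrait pictures end up limited by the 768 height, so they come out much narrower than 1024.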

Still need power to fire up the cluster members Io and Daphne, and set up the fourth one (that's an Alpha-NT machine at the moment, so the console needs to be changed as well). Install Java 1.5 on any of these, and test XMPP server and clients? Set up Apache and SMTP anyway (backup systems), and DNS and DHCP - in case these are needed.

Time to install the new router/firewall (low power) so Charon is free for other tasks: could become a Linux or FreeBSD box...(no idea what to use it for, at the moment)

Thursday, August 25


Holiday pictures published
Pictures of this year's holiday have been processed, main page adapted, and all the pages copied to the web. Alas - something was wrong: the wrong main page (20-jul-2005...) and missing links to the original big pictures. The latter was easy to solve, but now the links refer to the pages themselves. Redo the publishing again tomorrow.
Java 1.5 beta downloaded; maybe the Jive XMPP server (AKA Jabber) is to be installed, as a test. This is a Java application requiring Java 1.5.

Monday, August 22


Busy reorganizing, found some space that is hopefully cooler for the servers: cleared out a storage room, making room for at least the flat systems (Io and Daphne, and the yet-to-configure system. The latter may become a satellite, not sure yet). Perhaps Diana is to be added to the spot - there is enough room - and all storage units. All is installed but the power. Probably the Internet router is to be placed there as well, and the switch that holds it all together. Need short cables then, and a switch to connect the console to all machines. What about a terminal server? Digital Networks has a fine box (Conx4) but it's possibly quite costly.
Put the Holiday pictures on Diana, but there is no link yet, since there is a problem with the index pages, and the day's text has to be added.

Monday, August 15


No Pivot....
After consultation with another person that has Pivot on the system it turned out that there might be a problem with PHP on SWS 1.3, that has been solved in 2.0, and was not backported to 1.3 - since that would require a lot of extra work in that version. And due to the requirement of stream_lf files for this version, it has not been, and will not be installed on Diana.
It also showed in the Apache logs: Access violation. Not pinpointed to PHP but very, very likely.
Sad, but that's the end of Pivot. At least, for the moment, because the follow-up of 2.0 - handily named 2.1 - does not have the requirement and can therefore be used. Let's say: Pivot installation is postponed.
....Patches and new Tomcat
Latest patches installed on Diana, and the new Tomcat. Had to uninstall the old version first...
Rebooted - for the patches....
Tomcat will now start in batch! At least - it appears in SHOW SYSTEM. But now Apache fails to start, because file access is denied: mod_jk.conf is owned by SYSTEM and world has no access. Ran Tomcat's CONFIG script: current user = owner is SYSTEM. No wonder: installed that way (and the whole tree has been removed). So changed that to APACHE$WWW and had all file ownership set - which takes some time, it's a large directory tree - and restarted Tomcat. It's faster than 2.0, it seems, and it uses Java 1.4-2. Good!
Apache is now started as well, but access to the Tomcat examples is slow...Perhaps because it's the first access? In the end, it does show up, and the example pages also. But just under port 8080. So there is still something to do, it is different from the old version!
Given the time now (after midnight, and with a full working day ahead), that will have to wait a while.

Thursday, August 11


Pivot again
Found some help with Pivot - someone who has it up and running on VMS. Had some more to change: .txt, .lib. and .10_skin. had to be changed, and a directory reference. But it didn't work out right: didn't come up right at all....
Used a method to have files the way they are: multiple dots, so no changes in the files. Now the whole thing doesn't work at all anymore! That is: not directly: first a page cannot be found, but using GO on the address bar moves to the right page. Weird....
The test weblog does reference a level back now, so it doesn't come up at all!
Will work on it tomorrow - if possible.


PIVOT issues - some investigation...
Done some investigation on the Pivot trouble. The page just stops; today the next Sunday isn't even displayed. Tried to add some output but that led to an even shorter display: just 1, 2 and 3... The search page is Ok, however!
Submitted question on ITRC, see what comes out...

No further things - just updated the XP machines, and have to download a number of VMS patches - job for the weekend.

Friday, August 5


Maintenance only
Zipped all of July's operator logs stored on the webdisk (for checking off-line) into OPENJUL2005.ZIP and removed them, which freed some space on the webdisk. Removed the originals on SYS$MANAGER, which freed some space on SYS$SYSDEVICE. Removed the 8.2 ISO, which freed an even bigger amount.
Cycled the APACHE logfiles, thanks to ITRC that gave a hint, already known but never practiced. Zipped the old logs and moved them over to the PC for analysis. Jugh. Deleted them as well, freeing even more space on the system disk.
Scanned all logfiles on SYS$SYSDEVICE, found nothing weird. O yes, some new attempts to dump things on the system (failed), break it down (failed) or get in unauthorized (failed). Indeed, all failed. Did someone expect otherwise? (I enjoyed my holiday!)
Something more to scan:
The SMTP service logfile contains entries of attempts to reach non-existing users. Ok, might be valid (typos are easily made) but it can be a symptom of illegal access as well. So these need to be scanned as well.
Same for the APACHE access_log file. Not just scanning, it needs to be cycled as well, and this is so easy:
$ @sys$manager:apache$config flush ! to write buffers to disk
$ @sys$manager:apache$config new ! to create new logfiles
Next, analyze access_log for valid accesses and attempted abuse. So need another scanner, and a log analyzer on VMS (NO JAVA stuff, please). Write my own one, to filter abuse, perhaps?
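As a first idea of what such a home-made filter could look like, here is a rough sketch in Python (no Java, as promised). It assumes access_log is written in Apache's Common Log Format, and it simply treats every 4xx/5xx response as a candidate for abuse, which is a crude assumption of mine; the name `summarize` and the sample lines (with documentation-only addresses) are invented for illustration.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "request" status bytes
CLF = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "([^"]*)" (\d{3}) \S+')

def summarize(lines):
    """Count accepted and rejected requests per client host."""
    ok, rejected = Counter(), Counter()
    for line in lines:
        match = CLF.match(line)
        if not match:
            continue  # not a CLF line; ignored in this sketch
        host, status = match.group(1), int(match.group(3))
        # Assumption: 4xx/5xx responses are candidates for abuse, the rest
        # counts as valid access. A real filter needs finer rules.
        (rejected if status >= 400 else ok)[host] += 1
    return ok, rejected

if __name__ == "__main__":
    # Invented sample lines, not taken from the real access_log.
    sample = [
        '192.0.2.10 - - [05/Aug/2005:10:15:00 +0200] "GET /index.html HTTP/1.1" 200 5120',
        '198.51.100.7 - - [05/Aug/2005:10:16:12 +0200] "POST /cgi-bin/formmail.pl HTTP/1.0" 404 -',
        '198.51.100.7 - - [05/Aug/2005:10:16:13 +0200] "POST /cgi-bin/formmail.pl HTTP/1.0" 404 -',
    ]
    ok, rejected = summarize(sample)
    for host, count in rejected.most_common():
        print(f"{host}: {count} rejected request(s)")
```

Hosts that show up only in the rejected counter are the first ones to look at; a typo in a URL by a regular visitor will land there too, so the result is a hint and not a verdict.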
Pivot trouble:
Looked into this. Found that the calendar on the testpage isn't completed, and viewed the source:

That ends, in the middle of the calendar, because the HTML file just ends:

And statistics crashes, cannot find a file or directory, but that's minor. Blogs should work.
Need a few 9Gb disks, know where to get them but have to tickle the person to think about it. He might be on holiday now.