Thursday, 9 September 2010

Manchester new hardware

We are in the process of installing the new hardware. I knew it was going to be compact, but it is one thing to read on paper that a 2U unit has 48 CPUs and can replace 24 of the old 1U machines, and another to see it. The grandiosity of the old 900-machine cluster is half gone: 20 of the new machines replace 450 of the old ones in terms of cores. Our new little jewels. :)

The first 3 rows in each rack are the computing nodes; the machines at the bottom are the storage units. Storage has also become unbelievably compact and cheap. When we bought the DELL cluster, 500TB was an enormous amount and extremely expensive if organised in proper data servers, which is why we tried to use the WNs' disks. The new storage provides 540TB of usable space, fits in nine 4U machines and is considered commodity computing nowadays. Well... almost. ;)

Saturday, 4 September 2010

How to enable Atlas squid monitoring

Atlas has started to monitor squids with mrtg.

mrtg uses snmp, so to enable the monitoring you need a squid instance compiled with --enable-snmp. The CERN binaries are already compiled with that option; the default squid that comes with the SL5 OS might not be, and neither might your site's centralised squid service. You don't need snmpd or snmptrapd (net-snmp rpm) running to make it work.
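One way to check the compile option is to look at the configure options that `squid -v` prints. The helper below is only a sketch: the function name has_snmp_support is made up for this post, and it assumes the output of `squid -v` is piped to it on stdin.

```shell
# Hypothetical helper (name invented for illustration).
# Reads the output of `squid -v` on stdin and succeeds if the
# configure options include --enable-snmp.
# Usage: squid -v | has_snmp_support && echo "snmp support compiled in"
has_snmp_support() {
    # -e is needed so grep does not treat the pattern as an option
    grep -q -e '--enable-snmp'
}
```

If your squid binary is not in the PATH (a site-specific install, for example), call it with its full path instead.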

Once you are sure the binary is compiled with the right options and that port 3401 is not blocked by any firewall, you need to add these lines to squid.conf:

acl SNMPHOSTS src localhost
acl SNMPMON snmp_community public
snmp_access allow SNMPMON SNMPHOSTS
snmp_access deny all
snmp_port 3401

Again, if you are using the CERN rpms and the default frontier configuration, you might not need to do that, as there are already ACL lines for the monitoring.
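To see whether your configuration already carries snmp lines before adding new ones, something like the following might help. It is a rough sketch: the function name snmp_directives is invented here, and the path to squid.conf is whatever your installation uses (the /etc/squid/squid.conf in the usage comment is an assumption).

```shell
# Hypothetical helper (name invented for illustration).
# Prints every non-comment line of a squid.conf-style file that
# mentions snmp (ACLs, snmp_access rules, snmp_port).
# Usage: snmp_directives /etc/squid/squid.conf   # path is an assumption
snmp_directives() {
    # Drop commented-out lines first, then match snmp case-insensitively
    grep -v '^[[:space:]]*#' "$1" | grep -i 'snmp'
}
```

If it prints nothing, the file has no active snmp settings and the lines above need to be added.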

Reload the configuration

service squid reload

Test it

snmpwalk -v2c -Cc -c public localhost:3401 .

You should get something similar to this:

SNMPv2-SMI::enterprises.3495. = INTEGER: 206648
SNMPv2-SMI::enterprises.3495. = INTEGER: 500136
SNMPv2-SMI::enterprises.3495. = Timeticks: (23672459) 2 days, 17:45:24.59

snmpwalk is part of the net-snmp-utils rpm.
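If you want to script the test, a crude check is to see whether the snmpwalk output contains at least one "OID = value" line. This is just a sketch under that assumption; the function name snmp_reply_ok is made up, and it expects the snmpwalk output on stdin.

```shell
# Hypothetical helper (name invented for illustration).
# Reads snmpwalk output on stdin and succeeds if at least one line
# looks like "OID = TYPE: value", i.e. squid answered the query.
# Usage: <the snmpwalk command from the "Test it" step> 2>&1 | snmp_reply_ok
snmp_reply_ok() {
    grep -q ' = '
}
```

It only tells you that squid replied locally; it says nothing about whether the remote monitoring can reach port 3401.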

It takes a while for the monitoring to catch up. Don't expect an immediate response.

Additional information on squid/snmp can be found here

NOTE: If you are also upgrading, pay attention to the fact that in the latest CERN rpms the init script might try to regenerate your squid.conf.