The power of systems monitoring

DSW recently set up the SPMA (Systems Performance Monitoring & Alerting) system for one of our customers who, to be honest, didn’t quite understand what they would get out of it. After the system had been running for less than a week, it showed some strange peaks in CPU and network usage.

DSW investigated and found that the peaks were caused by a brute-force attack trying to guess the customer’s Administrator password. The systems were safe because DSW had already advised the customer to disable the Administrator account and use a different administrator account for their day-to-day admin work.

The graphs produced for the customer also included server uptime, which showed that the server had been running for over 200 days. DSW questioned this with the customer because it was a Windows server and we would have expected a monthly reboot for patching. This highlighted an issue with the WSUS server at the hosting company.

What the customer learned is that SPMA shows them the normal profile of their standard system usage and highlights unusual usage that should be investigated.
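As a rough illustration of the idea of a normal profile versus unusual usage, the sketch below flags readings that sit far from the average of a series. This is not how SPMA itself works. It is a simple standard-deviation check, and the function name, threshold and sample CPU figures are all made up for the example.

```python
from statistics import mean, stdev


def find_unusual(samples, threshold=3.0):
    """Return (index, value) pairs lying more than `threshold` standard
    deviations away from the mean of `samples`."""
    if len(samples) < 2:
        return []  # not enough data to describe a "normal" profile
    baseline = mean(samples)
    spread = stdev(samples)
    if spread == 0:
        return []  # every reading is identical, so nothing stands out
    return [
        (i, value)
        for i, value in enumerate(samples)
        if abs(value - baseline) / spread > threshold
    ]


# Hypothetical hourly CPU readings in percent; the 95 is an injected peak.
cpu = [12, 14, 11, 13, 12, 15, 13, 95, 12, 14, 13, 12]
for index, value in find_unusual(cpu, threshold=2.5):
    print(f"sample {index}: {value}% CPU looks unusual")
```

A real deployment would probably compare like with like (for example the same hour on previous days) rather than a single flat average, but the principle is the same.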

BIND 9.8 monitoring with Cacti

After many hours trawling through many sites and trying a few templates, it has become apparent that there is a distinct lack of good statistics graphs for BIND 9 in Cacti.

So watch this space for a set of Cacti templates for BIND 9. In the meantime, if you have any suggestions for what you would like to see, please leave a comment on this post and we will endeavour to add as many of them as possible.

The current plan is to create some very simple and configurable scripts and templates. Testing will be done on BIND 9.8 running on Debian 7 with Cacti 0.8.7. Graphs will include:

  • Incoming Queries
  • Outgoing Queries
  • Network I/O
  • Memory
  • Zone Statistics
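To give a flavour of what such a script might look like, here is a minimal sketch rather than the finished template. It assumes the statistics come from the text file that `rndc stats` writes, with sections marked by `++ Name ++` lines followed by `count name` lines, and that Cacti is fed space-separated `name:value` pairs. The sample dump and function names are invented for illustration, and counters repeated under several views are simply added together.

```python
import re

# "++ Incoming Queries ++" style section headers (not the "+++ ... +++" banner).
SECTION = re.compile(r"^\+\+ (.+?) \+\+$")
# "     120 A" style counter lines: a number followed by a counter name.
COUNTER = re.compile(r"^(\d+)\s+(.+)$")


def parse_named_stats(text):
    """Return {section: {counter_name: value}} from a statistics dump."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = SECTION.match(line)
        if header:
            current = header.group(1)
            sections.setdefault(current, {})
            continue
        counter = COUNTER.match(line)
        if counter and current is not None:
            name = counter.group(2).strip().replace(" ", "_")
            value = int(counter.group(1))
            sections[current][name] = sections[current].get(name, 0) + value
    return sections


def cacti_line(counters):
    """Format counters as the 'name:value name:value' line Cacti reads."""
    return " ".join(f"{name}:{value}" for name, value in sorted(counters.items()))


# Made-up sample in the assumed layout of a statistics dump.
SAMPLE = """\
+++ Statistics Dump +++ (1412345678)
++ Incoming Queries ++
                 120 A
                  30 AAAA
                   7 MX
++ Outgoing Queries ++
[View: default]
                  15 A
--- Statistics Dump --- (1412345678)
"""

stats = parse_named_stats(SAMPLE)
print(cacti_line(stats["Incoming Queries"]))
```

In practice the script would read the real statistics file instead of `SAMPLE`, and one data input method per section would keep the Cacti templates simple.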

Resilient DNS Infrastructure

DSW are currently rebuilding the DNS infrastructure of one of their partners. After the recent ShellShock vulnerability, our partner decided that, as well as patching their servers, they would also like to refresh them and upgrade to the latest version. They also asked DSW to provide better performance monitoring and alerting on the new environment.

This project is now underway. The new servers will be both virtual and physical, will be located in two different data centers, will use three different Internet Service Providers and will be deployed over two domain names. This will create a DNS infrastructure with no single point of failure.
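The "no single point of failure" claim can be checked mechanically against a server inventory. The sketch below is only an illustration: the server names, data centers, ISPs and domains are hypothetical, and the check simply asks whether losing any one platform type, data center, ISP or domain would take out every name server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameServer:
    name: str
    platform: str      # "virtual" or "physical"
    data_center: str
    isp: str
    domain: str


def single_points_of_failure(
    servers, attributes=("platform", "data_center", "isp", "domain")
):
    """Return (attribute, value) pairs whose loss leaves no server running."""
    problems = []
    for attr in attributes:
        for value in sorted({getattr(s, attr) for s in servers}):
            survivors = [s for s in servers if getattr(s, attr) != value]
            if not survivors:
                problems.append((attr, value))
    return problems


# Hypothetical inventory matching the description: virtual and physical
# servers, two data centers, three ISPs and two domain names.
fleet = [
    NameServer("ns1", "physical", "dc-a", "isp-1", "example.net"),
    NameServer("ns2", "virtual", "dc-a", "isp-2", "example.org"),
    NameServer("ns3", "physical", "dc-b", "isp-3", "example.net"),
    NameServer("ns4", "virtual", "dc-b", "isp-1", "example.org"),
]

print(single_points_of_failure(fleet))      # nothing shared by all servers
print(single_points_of_failure(fleet[:2]))  # both of these sit in dc-a
```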

The servers will be built with Debian 7 and will run BIND 9.8. Performance monitoring will be provided by Cacti and will be based on the currently available BIND 9.7 templates, which will need upgrading to fit the newer version of BIND. Statistics will be produced for the name servers as well as for selected individual domain names.

Banking Divestment Project

DSW are currently helping a large bank with a divestment project with an estimated budget of £1.3bn. Our responsibilities on the project include designing the extraction and transport of 2 million accounts and their associated data to the target bank. The volume of data to be transported is in excess of 50TB.
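To put those figures in perspective, a quick back-of-the-envelope calculation helps. The 50TB and 2 million accounts come from the project. The 1 Gbit/s link speed is purely an assumption for illustration, and decimal units (1TB = 10^12 bytes) are assumed throughout.

```python
TOTAL_BYTES = 50 * 10**12   # 50TB, treated as a lower bound
ACCOUNTS = 2_000_000


def average_bytes_per_account(total_bytes, accounts):
    """Average volume of data carried per account."""
    return total_bytes / accounts


def transfer_days(total_bytes, link_bits_per_second, efficiency=1.0):
    """Days needed to move the data over a link at the given efficiency."""
    seconds = total_bytes * 8 / (link_bits_per_second * efficiency)
    return seconds / 86_400


per_account_mb = average_bytes_per_account(TOTAL_BYTES, ACCOUNTS) / 10**6
print(f"Average per account: {per_account_mb:.0f} MB")

# Assumed 1 Gbit/s link running flat out; real throughput would be lower.
print(f"Transfer time at 1 Gbit/s: {transfer_days(TOTAL_BYTES, 10**9):.1f} days")
```

Even under these generous assumptions the raw transfer alone runs to several days, which is why the transport design matters as much as the extraction.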

DSW Website hack

It has come to our attention that our website was recently hacked and a subdomain was created for rbscardservices.
The malicious code has been removed and the subdomain has been deleted.

New Infrastructure Design & Implementation

DSW are currently designing and implementing a completely new infrastructure for one of Scotland’s top independent schools. The design is based on the following:

  • 10 x HP ProLiant DL360p G8, 2 x 6 core CPU, 96GB RAM
  • 2 x HP StoreVirtual 2730 SAN
  • VMware vSphere 5.1U1
  • VMware vCenter Server 5.1U1
  • VMware View 5
  • Windows Server 2012
  • Windows Server 2008 R2
  • Exchange Server 2013
  • SharePoint Server 2013
  • System Center 2012
  • SQL Server 2012

Disaster Recovery Design

DSW are currently working with a Government Agency in Scotland to update and test their Disaster Recovery Strategy. Have you ever asked yourself the question “What would my company/organisation do if our Computer Room/Data Center stopped working for any reason?”