Thursday 23 January 2014

Network Guru Fails at Monitoring his own Bandwidth

Do I come on here a lot to bemoan the seeming incompetence of others? I ask because I have just had an annoying 20 minutes talking to one of our "Network Experts". There's a server box running Ubuntu (it's actually a virtual machine), and it was being blamed for crippling the network on the host machine, which also has to run several Windows Server 2008 virtual machines.

For some reason this came across my desk, because someone has decided that although I can't advise the company on Linux server usage or adoption, I can be called in to sort out the problems other fools create.

So, this server was running, and the administrator chap was sat there with PuTTY connected to it, saying there was too much network traffic coming from the box.

Looking at the process list, the machine is doing nothing. When I ask for one of its users to be made to use it, the CPU usage shows a tiny flitter and then the activity is over. It appears to me the virtual machine host is waking the VM, using it, and closing it.

The processes being used are simple webservices, so there should be no network traffic unless someone is using the server.

So, I ask... "What makes you think this machine is causing the issue?" Very smugly the admin presented me with an A4 sheet with two graphs. One shows all his servers running with the Ubuntu machine present, and there's a whole load of traffic going on... The next is the same graph with the Ubuntu box removed, and there's nearly no network traffic.

Then within PuTTY connected to the box he opens iptraf...


And he points to the statistics and declares, "See, this is running multiple kilobytes a second"...

Let's recap... "This machine is crippling our network usage on this box".... "using up kilobytes a second"... If any of you don't grasp how absurd this is, please stop reading now, because the guy was clearly serious. He's not that old, he's younger than me, so surely he realises kilobytes a second on a modern gigabit backbone is nothing? No, he seriously wants the Ubuntu box silent.
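To put a rough number on that absurdity, here is a minimal Python sketch of the arithmetic. The admin only said "multiple kilobytes a second", so the 5 KB/s figure below is purely my assumption for illustration, as is treating the link as a nominal 1 Gbit/s with decimal kilobytes.

```python
# Rough arithmetic only: how much of a gigabit link do a few KB/s use?
GIGABIT_BITS_PER_SECOND = 1_000_000_000  # nominal 1 Gbit/s link (assumed)


def link_utilisation_percent(kilobytes_per_second,
                             link_bits_per_second=GIGABIT_BITS_PER_SECOND):
    """Return the percentage of the link consumed by the given rate."""
    bits_per_second = kilobytes_per_second * 1000 * 8  # decimal kilobytes
    return 100.0 * bits_per_second / link_bits_per_second


# Assumed example rate; the real iptraf figure was only "multiple kilobytes".
example_rate_kb = 5
print(f"{example_rate_kb} KB/s uses "
      f"{link_utilisation_percent(example_rate_kb):.4f}% of the link")
```

Even if the real figure were ten times my guess, it would still be a tiny fraction of one percent of the link.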

So, what's causing this tiny trickle of bandwidth... Of course, he's connected to this machine with PuTTY...

"Close this window and take me to the machine in the server room"

In we go, and from the console plugged into the machine I run iptraf... look what we see now...


Yes, it's zero...

While running his measurements and checks he was connected to the machine over the network, so he had been measuring the bandwidth of his own administration session, and apparently the kilobytes per second that took were too much for him. I feel like this guy should be thrown from the building, not be earning far more than me in the "superior" position of "network guru god" which he holds...

So, knowing the machine is idle, I wonder what causes the bandwidth when it is in use. With iptraf still running, I call up the user and ask him to get working again... and sure enough there are little trickles of bandwidth from the Ubuntu box, no spikes, no major blocks of transmission. Hence after a few minutes I conclude the Ubuntu box, though active, is only taking a tiny amount of the total bandwidth available to the machine, and so I start to look at the other machines...

And with just perfmon on the two Windows Server 2008 boxes, with the user operating his end, I can see that the Windows boxes are spiking their network traffic. Even with just Task Manager you can see one box taking 20% of the total available to it, and the other over 45%... This is a gigabit connection, and 20% of it is taken for maybe 30-50 seconds each minute and then goes quiet, then the 45% hog is there for maybe 10 seconds of each minute.
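For a sense of scale, here is a back-of-the-envelope Python sketch of what those percentages mean in megabytes. It assumes a nominal 1 Gbit/s link (125 MB/s), takes the low end of the 30-50 second burst, and reads the second figure as roughly 10 seconds per minute; these are eyeballed numbers, not measurements.

```python
# Back-of-the-envelope only: data moved per minute by a bursty talker.
LINK_MEGABYTES_PER_SECOND = 1_000_000_000 / 8 / 1_000_000  # 125 MB/s nominal


def megabytes_per_minute(utilisation_percent, busy_seconds_per_minute):
    """Data moved in one minute at the given utilisation and burst length."""
    rate = LINK_MEGABYTES_PER_SECOND * utilisation_percent / 100
    return rate * busy_seconds_per_minute


# Eyeballed figures: 20% for ~30 s and 45% for ~10 s of each minute.
print("20% box:", megabytes_per_minute(20, 30), "MB per minute")
print("45% box:", megabytes_per_minute(45, 10), "MB per minute")
```

On those assumptions each Windows box shifts hundreds of megabytes a minute, which is a different league from a trickle measured in kilobytes per second.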

They're scheduled tasks: as the users input items into their programs, the programs process them and send out instructions and e-mails, and update other databases. The trickle of data I/O to the Ubuntu machine is just such a feed. The Ubuntu box sucks/queries data from MS SQL Server on one of the machines and squirts it into MySQL, where a very old (like 8 year old) unmaintained program written in C with MySQLConnector picks it up and uses sendmail to e-mail a load of people. Such notifications take maybe 0.25 seconds and run every 5 minutes, and the amount of data is small. Looking at the physical name of the Ubuntu box, "e-mailer relay", I think it's safe to say it's not using a lot of bandwidth and the other machines are to blame.

How I educate the "network expert"... This is my next task.
