NNSquad - Network Neutrality Squad


[ NNSquad ] Re: How far can we trust SNMP metrics?



On Dec 10, 2007, at 5:21 PM, Lauren Weinstein wrote:

In the context of using SNMP for data collection, do we really have
any idea about the reliability of the stats derived from popular
routers?

Well, I know for a fact that the SNMP statistics on "carrier class" routers are fairly accurate. I don't know whether that holds for all of the consumer broadband devices, though.


Devices whose SNMP accuracy I have tested:
Juniper (M and T series only) -- perfect.
Cisco routers -- basically correct, some models more than others. Cisco is somewhat notorious for not counting properly...
Netscreen -- newer code seems to be perfect.
Force10 -- perfect.
Foundry -- basically perfect.
Black Diamond -- perfect.


Many of the distributed architecture boxes suffer from issues where the data plane sends stats to the control plane (CP) on some sort of periodic basis -- if you poll before the forwarding engine has pushed the stats up to the CP, you get funny numbers, but averaged out over a longer period you get more sensible answers.
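To illustrate the averaging point, here is a rough Python sketch. The sample values are invented for illustration (a steady 8 Mbps flow polled every 10 seconds, with one poll landing before the forwarding engine had pushed its stats up), and the function names are my own. It assumes the counter does not wrap within the window.

```python
def interval_rates(samples):
    """Per-interval bit rates from a list of (timestamp_s, octets) pairs."""
    return [
        (c1 - c0) * 8 / (t1 - t0)
        for (t0, c0), (t1, c1) in zip(samples, samples[1:])
    ]


def window_rate(samples):
    """Average bit rate across the whole window (first to last sample)."""
    (t0, c0), (t1, c1) = samples[0], samples[-1]
    return (c1 - c0) * 8 / (t1 - t0)


# Invented readings: the poll at t=20 sees a stale counter, and the
# missing octets show up in the following poll instead.
samples = [(0, 0), (10, 10_000_000), (20, 15_000_000), (30, 30_000_000)]

per_interval = interval_rates(samples)  # 8 Mbps, 4 Mbps, 12 Mbps
overall = window_rate(samples)          # 8 Mbps
```

The individual intervals swing between 4 and 12 Mbps, with one of them above the true rate, while the rate over the whole window comes out at the real 8 Mbps.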


I have not really tested the consumer broadband routers, but I have access to an Ixia and can do so sometime if needed.




For example, my servers are under very heavy load right now due to interest in the Rogers story. My own SNMP stats are steadily showing both "maximum" and "last" upstream values that significantly exceed the provisioned ceilings on the circuit. This suggests that I'm seeing calculation/buffering artifacts, and these could be matters of significant concern affecting the use of SNMP metrics for the project.

A few questions:

Is this snmpd running locally on the box? Or are you polling a device? If it is local snmpd, what distribution, version, etc? If you are polling a device, what is it?
What OIDs are you monitoring? (ifHCInOctets?)
SNMP V1 or V2c?
How often are you polling? If you have a 100Mbps Ethernet that is pegged, you will wrap a 32-bit counter in a little under 6 minutes -- unless you poll more often than that you will get very odd numbers (a 1Gbps link will wrap in around 34 seconds!).
What software are you using for the polling / traffic calculations? Are you sure the software is aware of the polling interval (if you have separate poller and stats processes)?
And last question -- are you certain that the provisioned limit is correct? I have seen people who purchased a rate-limited service (e.g. 20Mbps on a DS3) and then discovered that the provider had set the rate-limiter incorrectly or forgotten to rate-limit at all...
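The counter-wrap arithmetic in the polling question above can be checked with a short Python sketch. The function names are my own, and the delta calculation assumes at most one wrap between polls, which is exactly the assumption that fails when the polling interval is longer than the wrap time.

```python
def wrap_seconds(link_bps, counter_bits=32):
    """Seconds for an octet counter to wrap at a sustained line rate."""
    return (2 ** counter_bits) * 8 / link_bps


def octet_delta(prev, curr, counter_bits=32):
    """Octets counted between two readings, assuming at most one wrap."""
    return (curr - prev) % (2 ** counter_bits)


def rate_bps(prev, curr, interval_s, counter_bits=32):
    """Bit rate implied by two counter readings taken interval_s apart."""
    return octet_delta(prev, curr, counter_bits) * 8 / interval_s


fast_ethernet_wrap = wrap_seconds(100e6)  # about 343.6 s, a little under 6 minutes
gigabit_wrap = wrap_seconds(1e9)          # about 34.4 s

# A reading that went "backwards" because the counter wrapped once:
wrapped = octet_delta(4_294_967_000, 500)  # 796 octets
```

If the counter wraps more than once between polls, the modular delta silently under-reports, and a poller that does not handle the wrap at all sees a negative or enormous delta. The 64-bit counters (ifHCInOctets and friends) take vastly longer to wrap at these link speeds.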


I'll see about hooking some consumer devices up to an Ixia or SmartBits and validating the counters sometime.

W

Comments? Thanks.

--Lauren--
NNSquad Moderator