[rtg] Fwd: Results with RTG
Drew Weaver
drew.weaver at thenap.com
Mon Mar 9 15:23:21 EDT 2009
YMMV,
But from what we have seen, the NetFlow implementation is completely different across the 6500, 7600, and GSR 12000. On the 7600/6500 it depends on which line cards and supervisors you have (the data actually differs depending on the cards). It is flighty enough that it is not worth the hassle.
-Drew
-----Original Message-----
From: rtg-bounces at lists.grdata.com [mailto:rtg-bounces at lists.grdata.com] On Behalf Of Matt Simerson
Sent: Monday, March 09, 2009 3:12 PM
To: rtg at lists.grdata.com
Subject: Re: [rtg] Fwd: Results with RTG
On Mar 9, 2009, at 5:28 AM, Drew Weaver wrote:
> NetFlow is a bit trickier to deal with than SNMP data
Wow. I respectfully disagree. There are approximately 613 ways SNMP
data collection can fail. Different brands of switching equipment,
differences among different models of the same brand of switching
hardware, 32- vs. 64-bit counters, port up/downgrades, router/switch
port failures, moving physical hardware to different switch ports (and
making sure the billing system knows to collect data from partial
billing period 1 from Port X and data from partial period 2 from Port
Y), failures in external systems that map physical ports to billing
accounts, traffic inbound to your network that never arrives at the
hardware port (because of other failures), and on and on, etc... Not
to mention needing a moderately competent DBA on staff to deal with
the gigabytes of RTG polling data, which takes up the same amount of
space for EVERY port, regardless of whether or not it's passing traffic.
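To illustrate just one of those failure modes, the 32- vs. 64-bit counter issue, here is a minimal sketch of how a poller might compute the delta between two octet counter samples when the counter has wrapped. The function names are hypothetical and are not taken from RTG. The sketch assumes at most one wrap between polls and cannot tell a wrap from a counter reset (for example, after a reboot).

```python
def counter_delta(previous, current, bits=32):
    """Octets transferred between two samples of a wrapping counter.

    Assumes the counter wrapped at most once between the two polls.
    """
    modulus = 1 << bits
    return (current - previous) % modulus


def seconds_to_wrap(link_bps, bits=32):
    """Time for an octet counter to wrap at a sustained line rate."""
    return (1 << bits) * 8 / link_bps


# A 32-bit counter that wrapped between two polls:
print(counter_delta(4294967000, 500))      # 796 octets, not a negative number
# How quickly a 32-bit octet counter wraps on a saturated 1 Gbps port:
print(round(seconds_to_wrap(1_000_000_000), 1))
```

On a saturated gigabit port a 32-bit octet counter wraps in roughly half a minute, so with a multi-minute polling interval several wraps can go unnoticed. That is why the 64-bit counters matter on fast ports.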
As compared to NetFlow.
With NetFlow, you assign the client their IP addresses. You can move
the IPs around your data center or even to another data center with no
maintenance required. Your border routers feed NetFlow data into a
collector which aggregates it and shoves the aggregated data into a
database.
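As a rough sketch of that aggregation step (this is not our collector, and every name in it is made up for illustration), assume flow records have already been decoded into (source IP, destination IP, byte count) tuples and that IP assignments are kept as a mapping from CIDR block to account:

```python
import ipaddress
from collections import defaultdict


def aggregate_flows(flows, assignments):
    """Sum flow bytes per account.

    flows:       iterable of (src_ip, dst_ip, byte_count) tuples
    assignments: dict mapping a CIDR string to an account name (hypothetical)
    """
    networks = [(ipaddress.ip_network(cidr), account)
                for cidr, account in assignments.items()]

    def account_for(addr):
        ip = ipaddress.ip_address(addr)
        for net, account in networks:
            if ip in net:
                return account
        return None  # address is not assigned to any client

    totals = defaultdict(int)
    for src, dst, nbytes in flows:
        # A set, so a flow between two IPs of the same account counts once.
        accounts = {account_for(src), account_for(dst)} - {None}
        for account in accounts:
            totals[account] += nbytes
    return dict(totals)


assignments = {"192.0.2.0/28": "acct-a", "198.51.100.0/24": "acct-b"}
flows = [
    ("192.0.2.5", "203.0.113.9", 1000),    # outbound from acct-a
    ("203.0.113.9", "192.0.2.5", 4000),    # inbound to acct-a
    ("198.51.100.7", "203.0.113.1", 250),  # outbound from acct-b
    ("203.0.113.1", "203.0.113.2", 99),    # not a client address, ignored
]
print(aggregate_flows(flows, assignments))
```

Because billing is keyed on the IP assignment rather than on a switch port, moving a server to another port or another data center changes nothing in this mapping.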
I can't speak for most NetFlow systems because we wrote our own
collector. And it is EXTREMELY accurate, especially for billing
purposes. I periodically validate it against RTG data that we collect,
as well as empirical tests like downloading a DVD ISO and making sure
my account reflects 4.7GB of data transfer.
NetFlow data is also FAR more useful for managing network events,
including DDoS attacks, routing problems, and other such things. SNMP
data typically has a two polling period wait between when a network
event starts and your network techs have visibility into the problem.
Ten minutes is a LONG time. Because of how NetFlow works, major events
show up in near-real time, making your NetFlow data a far more useful
diagnostic tool.
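One way to see the delay, as a sketch with assumed numbers: an SNMP-derived rate is an average over the whole polling interval, so an event that starts late in an interval is diluted in the first sample and only shows at full size in the next one. The function name is hypothetical. The 5-minute interval is an assumption, and the 10 Mbps baseline and 800 Mbps spike are borrowed from the example earlier in this thread.

```python
def averaged_rates(poll_interval_s, event_start_s, baseline_bps, event_bps, polls):
    """Average bit rate a poller would report for each polling interval."""
    rates = []
    for i in range(polls):
        start, end = i * poll_interval_s, (i + 1) * poll_interval_s
        # Seconds of this interval during which the event was active.
        event_seconds = max(0, end - max(start, event_start_s))
        base_seconds = poll_interval_s - event_seconds
        bits = base_seconds * baseline_bps + event_seconds * event_bps
        rates.append(bits / poll_interval_s)
    return rates


# Event starts 4 minutes into the first 5-minute interval.
for rate in averaged_rates(300, 240, 10_000_000, 800_000_000, 3):
    print(f"{rate / 1e6:.0f} Mbps")
```

The first poll after the event starts reports 168 Mbps, and the full 800 Mbps only appears at the second poll, close to ten minutes after the first interval began.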
The fundamental difference between SNMP collection versus NetFlow is
that NetFlow only accounts for data that travels to/from your NetFlow
enabled routers (typically your border routers). If your clients are
passing traffic back and forth among their servers, you probably won't
be charging them for that with NetFlow. ARP traffic will never pass
your border routers, so they won't be billed for that. Because of
those differences, NetFlow billing will always be a percent or two
smaller than SNMP collected data.
When you have a small number of ports to monitor, SNMP-based collection
is easier to manage. When you have many network ports, NetFlow is far,
far, far easier, and the couple of percent lost in bandwidth revenue is
more than compensated for by the time systems staff won't spend managing
SNMP port data.
Matt
> as you might notice that every different level of product from Cisco
> supports it differently (even within the same netflow versions).
> With that amount of varying support for it, wouldn't it be a bit
> hard to trust that as an overall method of calculating billing
> utilization? Of course it is nice to be able to separate the local
> traffic from the external traffic for billing purposes but that is
> pretty much the only benefit of NetFlow over SNMP for billing.
>
> thanks,
> -Drew
>
>
> -----Original Message-----
> From: rtg-bounces at lists.grdata.com [mailto:rtg-bounces at lists.grdata.com] On Behalf Of Matt Simerson
> Sent: Sunday, March 08, 2009 10:56 PM
> To: rtg at lists.grdata.com
> Subject: Re: [rtg] Fwd: Results with RTG
>
>
> If you *must* monitor and bill traffic based on the physical switch/
> router/network ports, RTG is the best game in town. But the included
> scripts are barely adequate.
>
> When I was at Layered Tech in 2006-7, they had outgrown the very basic
> scripts included with RTG. I wrote RTG::Report (on CPAN) which greatly
> simplified the process of extracting billing data, custom reports, as
> well as adding new polling targets (the dist includes a better
> target_maker.pl). My version was also much more efficient, reducing
> the time needed to generate reports from days to hours.
>
> But there is a better way to manage bandwidth data. It is called
> NetFlow, and it's rapidly becoming an industry standard. If you can
> use it, you should. LT employees have discussed the move from RTG to
> NetFlow data numerous times. When I was there, we couldn't because the
> core routers were not capable. They have since been upgraded and I
> wouldn't be surprised if LT moved away from RTG.
>
> As far as better-looking graphs, fire up your favorite programming
> language and design one that you think looks great. I recently did
> that using perl and GD::Graph for a project I just finished. The data
> source for that is, you can probably guess, NetFlow.
>
> Matt
>
> On Mar 8, 2009, at 7:30 PM, Harry Marcson wrote:
>
>> We are looking to integrate RTG into our system, because it seems to
>> be the only option for our setup. We are looking to poll about 100
>> switches, connecting about 2000 servers.
>>
>> Our current RTG testing results for 1 switch polling showed us that
>> RTG:
>>
>> 1- Seems to have memory leaks. This is not news after reading through
>> the mailing list, but I was wondering how people are surviving with
>> this, especially the ones with the bigger setups. Did you apply any
>> specific patch such as yahoo rtg (yrtg) or your own fixes?
>> 2- Has really boring and bad-looking graphs. Good graphs would be
>> ones like those from Cacti, for example. If anyone has a better
>> rtgplot.cgi or improved graphing code, please do share it!
>> 3- Bugs in the graphs. A simple example is a server that got an
>> 800Mbps DDoS attack and got nullrouted. RTG did not move the line
>> to near 0Mbps when it crashed. The end result is that, once the server
>> was restored a few hours later, the graph line went down to the
>> actual usage of about 10Mbps, but showed 800Mbps of usage during
>> the nullroute.
>> 4- Makes matching the actual switch port with the RTG device id a
>> bit of a hassle. Has anyone come up with a solution for that?
>> 5- Lacks automation when it comes to discovering new devices. Adding
>> switch IPs is an easy one, but how to run rtg targetmaker.pl every
>> time a new server is added to a switch, as well as restarting the RTG
>> process, is a more difficult one.
>>
>> I am also very interested in knowing whether any big companies are
>> utilizing RTG for their bandwidth monitoring or billing. I saw in
>> the mailing list that Layeredtech, Gnax, Savvis are some examples.
>> Any others that would like to convince us that RTG is worth it in
>> the long run?
>>
>> Harry
>>
>> _______________________________________________
>> RTG mailing list
>> RTG at lists.grdata.com
>> http://lists.grdata.com/mailman/listinfo/rtg
>