[rtg] Fwd: Results with RTG
John Von Essen
john at quonix.net
Mon Mar 9 15:52:18 EDT 2009
Not to get too far OT, but there are implications to using Netflow
for billing.
The biggest issue is that SNMP counters reflect an exact bit count for
the interface, and an exact 5-minute average rate.
With Netflow, you are "interpolating" the rates from those sampling
points. If you are off by a little, and your customer is polling SNMP
themselves, you open yourself up to problems. For example, they call and
say they did 7Mbps at the 95th percentile, but got billed for 8Mbps.
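
To make the 95th percentile point concrete, here is a minimal sketch of how that figure is commonly derived from 5-minute SNMP octet-counter readings. The function names and the "discard the top 5% of samples" convention are assumptions for illustration only; billing systems differ in the exact method, and none of this is taken from RTG.

```python
def rates_mbps(counters, interval_s=300):
    """Turn successive octet-counter readings into per-interval Mbps averages."""
    return [
        (later - earlier) * 8 / interval_s / 1e6
        for earlier, later in zip(counters, counters[1:])
    ]


def percentile_95(rates):
    """Sort the samples, ignore the top 5%, and return the highest one left."""
    if not rates:
        raise ValueError("need at least one rate sample")
    ordered = sorted(rates)
    # Integer form of ceil(0.95 * n) - 1, to avoid floating-point rounding.
    index = (95 * len(ordered) + 99) // 100 - 1
    return ordered[index]
```

Because the result is a single sample near the top of the sorted list, a small systematic error in the flow-derived samples moves the billed figure directly, which is how a 7Mbps versus 8Mbps disagreement can come about.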
On Mar 9, 2009, at 3:12 PM, Matt Simerson wrote:
>
> On Mar 9, 2009, at 5:28 AM, Drew Weaver wrote:
>
>> NetFlow is a bit trickier to deal with than SNMP data
>
> Wow. I respectfully disagree. There are approximately 613 ways SNMP
> data collection can fail. Different brands of switching equipment,
> differences among different models of the same brand of switching
> hardware, 32 -vs- 64 bit counters, port up/downgrades,
> router/switch port failures, moving physical hardware to different switch
> ports (and making sure the billing system knows to collect data
> from partial billing period 1 from Port X and data from partial
> period 2 from Port Y), failures in external systems that map
> physical ports to billing accounts, traffic inbound to your network
> that never arrives at the hardware port (because of other
> failures), and on and on, etc... Not to mention needing a
> moderately competent DBA on staff to deal with the gigabytes of RTG
> polling data, which takes up the same amount of space for EVERY
> port, regardless of whether or not it's passing traffic.
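
The 32 -vs- 64 bit counter item in the list above can be sketched as follows. This is an illustrative example, not RTG code: the function names are hypothetical, and it assumes at most one wrap between polls and no counter reset (a device reboot looks the same as a wrap to this arithmetic).

```python
def counter_delta(previous, current, bits=32):
    """Octets transferred between two polls, allowing for a single counter wrap."""
    modulus = 1 << bits
    return (current - previous) % modulus


def seconds_to_wrap(rate_bps, bits=32):
    """How long an octet counter of the given width lasts at a sustained bit rate."""
    return (1 << bits) * 8 / rate_bps
```

At a sustained 1 Gbps, `seconds_to_wrap(1e9)` comes out to roughly 34 seconds, far shorter than a 5-minute polling interval, so a 32-bit counter can wrap several times between polls and the delta becomes meaningless. A 64-bit counter avoids the problem where the hardware supports it.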
>
> As compared to NetFlow.
>
> With NetFlow, you assign the client their IP addresses. You can
> move the IPs around your data center or even to another data center
> with no maintenance required. Your border routers feed NetFlow data
> into a collector which aggregates it and shoves the aggregated data
> into a database.
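
As a rough sketch of the aggregation step described above, the following sums flow octets per client based on assigned IP blocks. It is not the collector Matt describes: the account names, the address blocks, and the `(src, dst, octets)` record shape are all hypothetical, and a real collector would first decode NetFlow export packets.

```python
import ipaddress
from collections import defaultdict

# Hypothetical mapping of billing accounts to their assigned address blocks.
ASSIGNMENTS = {
    "client-a": [ipaddress.ip_network("192.0.2.0/28")],
    "client-b": [ipaddress.ip_network("198.51.100.16/29")],
}


def owner_of(address):
    """Return the account an address is assigned to, or None if unassigned."""
    ip = ipaddress.ip_address(address)
    for account, networks in ASSIGNMENTS.items():
        if any(ip in network for network in networks):
            return account
    return None


def aggregate(flows):
    """Sum octets per account from (src, dst, octets) flow records."""
    totals = defaultdict(lambda: {"in": 0, "out": 0})
    for src, dst, octets in flows:
        src_owner = owner_of(src)
        dst_owner = owner_of(dst)
        if src_owner:
            totals[src_owner]["out"] += octets
        if dst_owner:
            totals[dst_owner]["in"] += octets
    return dict(totals)
```

Since accounting is keyed on the address rather than the switch port, moving a server to another port or data center does not change which account the traffic lands in.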
>
> I can't speak for most NetFlow systems because we wrote our own
> collector. And it is EXTREMELY accurate, especially for billing
> purposes. I periodically validate it against RTG data that we
> collect, as well as empirical tests like downloading a DVD ISO and
> making sure my account reflects 4.7GB of data transfer.
>
> NetFlow data is also FAR more useful for managing network events,
> including DDoS attacks, routing problems, and other such things.
> SNMP data typically has a two polling period wait between when a
> network event starts and your network techs have visibility into
> the problem. Ten minutes is a LONG time. Because of how NetFlow
> works, major events show up in near-real time, making your NetFlow
> data a far more useful diagnostic tool.
>
> The fundamental difference between SNMP collection versus NetFlow
> is that NetFlow only accounts for data that travels to/from your
> NetFlow enabled routers (typically your border routers). If your
> clients are passing traffic back and forth among their servers, you
> probably won't be charging them for that with NetFlow. ARP traffic
> will never pass your border routers, so they won't be billed for
> that. Because of those differences, NetFlow billing will always be
> a percent or two smaller than SNMP collected data.
>
> When you have a small number of ports to monitor, SNMP-based
> collection is easier to manage. When you have many network ports,
> NetFlow is far, far, far easier, and the couple % loss in bandwidth
> revenue is more than compensated for by the time systems staff won't
> be spending managing SNMP port data.
>
> Matt
>
>> as you might notice that every different level of product from
>> Cisco supports it differently (even within the same netflow
>> versions). With that amount of varying support for it, wouldn't it
>> be a bit hard to trust that as an overall method of calculating
>> billing utilization? Of course it is nice to be able to separate
>> the local traffic from the external traffic for billing purposes
>> but that is pretty much the only benefit of NetFlow over SNMP for
>> billing.
>>
>> thanks,
>> -Drew
>>
>>
>> -----Original Message-----
>> From: rtg-bounces at lists.grdata.com
>> [mailto:rtg-bounces at lists.grdata.com] On Behalf Of Matt Simerson
>> Sent: Sunday, March 08, 2009 10:56 PM
>> To: rtg at lists.grdata.com
>> Subject: Re: [rtg] Fwd: Results with RTG
>>
>>
>> If you *must* monitor and bill traffic based on the physical
>> switch/router/network ports, RTG is the best game in town. But the
>> included scripts are barely adequate.
>>
>> When I was at Layered Tech in 2006-7, they had outgrown the very
>> basic scripts included with RTG. I wrote RTG::Report (on CPAN), which
>> greatly simplified the process of extracting billing data, building
>> custom reports, and adding new polling targets (the dist includes a
>> better target_maker.pl). My version was also much more efficient,
>> reducing the time needed to generate reports from days to hours.
>>
>> But there is a better way to manage bandwidth data. It is called
>> NetFlow, and it's rapidly becoming an industry standard. If you can
>> use it, you should. LT employees have discussed the move from RTG to
>> NetFlow data numerous times. When I was there, we couldn't because
>> the core routers were not capable. They have since been upgraded and
>> I wouldn't be surprised if LT moved away from RTG.
>>
>> As far as better-looking graphs, fire up your favorite programming
>> language and design one that you think looks great. I recently did
>> that using perl and GD::Graph for a project I just finished. The data
>> source for that is, you can probably guess, NetFlow.
>>
>> Matt
>>
>> On Mar 8, 2009, at 7:30 PM, Harry Marcson wrote:
>>
>>> We are looking to integrate RTG into our system, because it seems to
>>> be the only option for our setup. We are looking to poll about 100
>>> switches, connecting about 2000 servers.
>>>
>>> Our current RTG testing results for 1 switch polling showed us that
>>> RTG:
>>>
>>> 1- Seems to have memory leaks. That is not news after reading through
>>> the mailing list, but I was wondering how people are surviving with
>>> this, especially those with bigger setups. Did you apply any
>>> specific patch such as Yahoo RTG (yrtg) or your own fixes?
>>> 2- Has really boring and bad-looking graphs. Good graphs would be
>>> the ones from Cacti, for example. If anyone has a better
>>> rtgplot.cgi or improved graphing code, please do share it!
>>> 3- Has bugs in the graphs. A simple example is a server that was hit
>>> by an 800Mbps DDoS attack and got nullrouted. RTG did not move the
>>> line down near 0Mbps when the server crashed. The end result is that,
>>> once the server was restored a few hours later, the graph line went
>>> down to the actual usage of about 10Mbps, but the graph still showed
>>> 800Mbps of usage for the duration of the nullroute.
>>> 4- Makes matching the actual switch port with the RTG device id a
>>> bit of a hassle. Has anyone come up with a solution for that?
>>> 5- Lacks automation when it comes to discovering new devices. Adding
>>> switch IPs is easy, but running rtg targetmaker.pl every time a new
>>> server is added to a switch, as well as restarting the RTG process,
>>> is more difficult.
>>>
>>> I am also very interested in knowing whether any big companies are
>>> utilizing RTG for their bandwidth monitoring or billing. I saw in
>>> the mailing list that Layeredtech, Gnax, and Savvis are some examples.
>>> Any others that would like to convince us that RTG is worth it in
>>> the long run?
>>>
>>> Harry
>>>
>>> _______________________________________________
>>> RTG mailing list
>>> RTG at lists.grdata.com
>>> http://lists.grdata.com/mailman/listinfo/rtg
>>
>