[rtg] Results with RTG
Brandon Ewing
nicotine at warningg.com
Mon Mar 9 10:46:44 EDT 2009
On Mon, Mar 09, 2009 at 02:58:52AM +0100, Harry Marcson wrote:
> We are looking to integrate RTG into our system, because it seems to be the
> only option for our setup. We are looking to poll about 100 switches,
> connecting about 2000 servers.
>
> Our current RTG testing results for 1 switch polling showed us that RTG:
>
> 1- Seems to have memory leaks. That is not news after reading through the
> mailing list, but I was wondering how people are surviving with this,
> especially the ones with the bigger setups. Did you apply any specific patch
> such as Yahoo RTG (yrtg) or your own fixes?
Since switching to the .8 (and then .9) poller, I've never had a problem
with memory leaks. .8 includes a patch that fixes target hashing, so HUPs
correctly cause it to pick up new and drop old targets.
Honestly, the biggest reason to move to .8 is the inclusion of the rate
column - I'm more than happy to pay a few cycles on the polling end to save
countless cycles during reporting, even if the plotter STILL hasn't been
re-written to take advantage of the column.
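To illustrate what the rate column saves: without it, every report has to
divide each counter delta by the time since the previous sample. Below is a
rough Python sketch of that calculation. The (timestamp, byte delta) row
layout and the function name are my own assumptions for illustration, not
RTG's actual schema or code.

```python
def rates_from_deltas(samples):
    """Compute bits per second for each sample after the first.

    samples: list of (unix_time, byte_delta) tuples sorted by time, where
    byte_delta is the counter increase since the previous sample.
    This row layout is an assumption for illustration only.
    """
    rates = []
    for (prev_t, _), (t, delta) in zip(samples, samples[1:]):
        elapsed = t - prev_t
        if elapsed <= 0:
            continue  # skip duplicate or out-of-order timestamps
        rates.append((t, delta * 8 / elapsed))
    return rates
```

With a rate stored at poll time, a report can read the value directly
instead of repeating this for every row it touches.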
> 2- Has really boring and bad-looking graphs. Good graphs would be the ones
> that are from Cacti, for example. If anyone has a better rtgplot.cgi or
> improved graphing code, please do share it!
Even if you're still using the .74 poller, switch to the .8 or .9 (CVS)
plotter. Sure, some bugs remain (mostly weird characters in legend labels),
but many of them have been fixed, and the additional features such as
aggregation across different RIDs are very useful. Note that if you do
switch, the PHP and Perl scripts included with RTG become worthless until
re-written to support the new format.
> 3- Bugs in the graphs. A simple example is a server that got an 800Mbps DDoS
> attack and was nullrouted. RTG did not move the line down to 0Mbps when
> that happened. The end result is that once the server was restored a few
> hours later, the graph line dropped to the actual usage of about 10Mbps,
> but showed 800Mbps of usage for the whole nullroute period.
You might want to investigate the -z option to rtgpoll to force it to insert
0s in the database.
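The effect is easy to see in a sketch. If no rows exist for a stretch of
time, a plotter that joins consecutive points draws a straight line across
the gap at the last known value. Filling the missing polling slots with
zeros makes the line drop instead. The Python below only illustrates the
idea with a made-up (timestamp, value) layout; it is not RTG code.

```python
def fill_gaps_with_zeros(samples, interval):
    """Return samples with (t, 0) rows inserted for every missed interval.

    samples: list of (unix_time, value) tuples sorted by time.
    interval: expected polling interval in seconds.
    """
    if not samples:
        return []
    filled = [samples[0]]
    for t, value in samples[1:]:
        expected = filled[-1][0] + interval
        # Insert a zero row for each polling slot that has no data.
        while expected < t:
            filled.append((expected, 0))
            expected += interval
        filled.append((t, value))
    return filled
```

For example, with a 300 second interval, samples at t=0 and t=900 gain zero
rows at t=300 and t=600.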
> 4- Makes matching the actual switch port with the RTG device id a bit of a
> hassle. Has anyone come up with a solution for that?
Good front-ends and scripting are common solutions to this. Also, if you
can, make sure that interface ID persistence is enabled on your network
devices.
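As a sketch of the scripting approach: build a lookup keyed on switch name
and port name that returns the RTG identifiers. The rows stand in for
whatever a query against your RTG database returns. The column order (rid,
router name, iid, interface name) and both function names are assumptions,
so adjust them to your schema.

```python
def build_port_index(rows):
    """Map (router_name, interface_name) -> (rid, iid).

    rows: iterable of (rid, router_name, iid, interface_name) tuples,
    an assumed layout for illustration.
    """
    index = {}
    for rid, router, iid, ifname in rows:
        # Lower-case the key so lookups are case-insensitive.
        index[(router.lower(), ifname.lower())] = (rid, iid)
    return index


def lookup_port(index, router, ifname):
    """Return (rid, iid) for a switch port, or None if unknown."""
    return index.get((router.lower(), ifname.lower()))
```

A front-end can then translate "sw01 port Gi0/1" into the ids needed to
build a plot URL without anyone looking them up by hand.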
> 5- Lacks automation when it comes to discovering new devices. Adding switch
> IPs is an easy one, but how to run rtg targetmaker.pl every time a new
> server is added to a switch, as well as restarting the RTG process, is a more
> difficult one.
cronjobs + HUP: regenerate the target file from cron, then send rtgpoll a
HUP so it picks up the new targets.
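A rough Python sketch of a script that cron could run for this. It runs a
target-generation command and, only if that succeeds, sends SIGHUP to the
poller. The command you pass in and the PID file location are placeholders
you would have to supply for your own install; I am not claiming any
particular path or that your poller writes a PID file at all.

```python
import os
import signal
import subprocess
from pathlib import Path


def read_pid(pid_file):
    """Return the integer PID stored in pid_file."""
    return int(Path(pid_file).read_text().strip())


def regenerate_and_hup(targetmaker_cmd, pid_file, kill=os.kill):
    """Rebuild the target file, then signal the poller to re-read it.

    targetmaker_cmd: argument list for the target generator (placeholder).
    pid_file: file holding the poller's PID (assumed to exist).
    kill: injectable for testing; defaults to os.kill.

    Returns the PID that was signalled. Raises CalledProcessError if the
    generator fails, in which case no signal is sent.
    """
    subprocess.run(targetmaker_cmd, check=True)
    pid = read_pid(pid_file)
    kill(pid, signal.SIGHUP)
    return pid
```

Checking the generator's exit status before signalling means a failed run
does not make the poller reload a half-written target file.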
> Harry
--
Brandon Ewing (nicotine at warningg.com)