[rtg] RTG database pruning.
Matt Simerson
matt at layeredtech.com
Sun Mar 2 10:49:05 EST 2008
Hello RTGers,
I'm new to RTG so I beg forgiveness should I ramble too far off into
the bushes.
We have a fairly large RTG installation with about 23,000 ports being
monitored. We have a number of polling machines that poll and feed
data back into a central MySQL database. For the most part, this works
pretty well. However, we have a bit of a problem in that our database
size it getting out of hand. We're up to 184GB of bandwidth data and
running out of disk space.
As you can imagine, having that much data causes problems, such as
taking a really long time to generate reports. We replicate the
databases to a mysql slave and offload all the reporting to it. To
give you an idea of what I mean by slow; I just wrote a report that
exports a CSV file containing every interface and how much bandwidth
it passed during a given time range. I chose CSV because our bean
counters can easily import the billing system. The network team and
managers can click the email attachment and open it in Excel.
To generate that report for a one month period takes about 26 hours to
complete. On AMD Opterons with 8GB RAM. I have heavily tuned MySQL to
perform adequately under this workload. Tuning helped. A lot. But it
can only go so far.
So, we're faced with the choice of buying some really expensive
hardware to move RTG onto or find some way to reduce the size oft the
databases. We really don't want to simply discard old data.
I've done a bit of work with rrdtool in the past and got the bright
idea that I could implement some data aggregation. Rather than do
averaging like rrdtool does, I could reduce the data set significantly
by writing a script that condensed older bandwidth data. I chose to
condense an hours worth of bandwidth data into a single record which
reduces the record count by one twelfth.
I wrote a perl script that loops over every interface. It finds the
oldest record and then generates a list of 1 hour intervals. Then it
queries the database for all the records in each 1 hour interval and
generates a summary entry. After successfully writing the summary
entry into the database, it deletes all the entries that comprised
it. Here's an example:
range: 20070228220000 - 20070228230000 - 12 records.
id: 609 count: 15418917957 time: 2007-02-28 22:03:04
id: 609 count: 15322162907 time: 2007-02-28 22:08:04
id: 609 count: 16916402934 time: 2007-02-28 22:13:00
id: 609 count: 15361727173 time: 2007-02-28 22:17:59
id: 609 count: 15906038566 time: 2007-02-28 22:23:09
id: 609 count: 16745419536 time: 2007-02-28 22:28:01
id: 609 count: 16628650088 time: 2007-02-28 22:33:02
id: 609 count: 15940862962 time: 2007-02-28 22:38:00
id: 609 count: 15126799678 time: 2007-02-28 22:42:59
id: 609 count: 16288752555 time: 2007-02-28 22:47:59
id: 609 count: 14757762970 time: 2007-02-28 22:52:59
id: 609 count: 14268752126 time: 2007-02-28 22:58:01
INSERT INTO ifOutOctets_5 (id,counter,dtime) VALUES
(609,188682249452,20070228225959);
DELETE FROM ifOutOctets_5 WHERE dtime='2007-02-28 22:03:04' AND
id=609 AND counter=15418917957;
DELETE FROM ifOutOctets_5 WHERE dtime='2007-02-28 22:08:04' AND
id=609 AND counter=15322162907;
DELETE FROM ifOutOctets_5 WHERE dtime='2007-02-28 22:13:00' AND
id=609 AND counter=16916402934;
DELETE FROM ifOutOctets_5 WHERE dtime='2007-02-28 22:17:59' AND
id=609 AND counter=15361727173;
DELETE FROM ifOutOctets_5 WHERE dtime='2007-02-28 22:23:09' AND
id=609 AND counter=15906038566;
DELETE FROM ifOutOctets_5 WHERE dtime='2007-02-28 22:28:01' AND
id=609 AND counter=16745419536;
DELETE FROM ifOutOctets_5 WHERE dtime='2007-02-28 22:33:02' AND
id=609 AND counter=16628650088;
DELETE FROM ifOutOctets_5 WHERE dtime='2007-02-28 22:38:00' AND
id=609 AND counter=15940862962;
DELETE FROM ifOutOctets_5 WHERE dtime='2007-02-28 22:42:59' AND
id=609 AND counter=15126799678;
DELETE FROM ifOutOctets_5 WHERE dtime='2007-02-28 22:47:59' AND
id=609 AND counter=16288752555;
DELETE FROM ifOutOctets_5 WHERE dtime='2007-02-28 22:52:59' AND
id=609 AND counter=14757762970;
DELETE FROM ifOutOctets_5 WHERE dtime='2007-02-28 22:58:01' AND
id=609 AND counter=14268752126;
I test ran this on one interface and then previewed the changes shown
in RTG.
BEFORE:
and AFTER:
As you can see, the transfer amounts are still calculated and shown,
but none of the rate info. I read through the PHP code, which calls
rtgplot.cgi and then I perused the rtgplot.h and rtgplot.c files
trying to figure out what I'd have to alter to get lines to print. I
didn't get very far.
Anyone got an idea how I might be able to accomplish this? The data
consolidation script is one I'd be happy to contribute back to the RTG
community if others might find it useful. Unless rtgplot.cgi can be
coerced into printing this aggregated data, its usefulness is somewhat
limited.
Matt Simerson
Unix Automation Developer
Email: matt at layeredtech.com
Phone: 214.564.6085
Layered Technologies, Inc.
On-Demand Utility Computing & Hosting Solutions
Learn more>> http://www.layeredtech.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.grdata.com/pipermail/rtg/attachments/20080302/03c43c4c/attachment-0001.html
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rtgplot.cgi.png
Type: image/png
Size: 10064 bytes
Desc: not available
Url : http://lists.grdata.com/pipermail/rtg/attachments/20080302/03c43c4c/attachment-0002.png
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rtgplot-2.cgi.png
Type: image/png
Size: 2697 bytes
Desc: not available
Url : http://lists.grdata.com/pipermail/rtg/attachments/20080302/03c43c4c/attachment-0003.png
More information about the RTG
mailing list