[rrs-users] Re: rrs variable-length months

Mark.Cotner at cox.com Mark.Cotner at cox.com
Mon Apr 18 16:07:34 GMT 2005

Hey Jim,
I'm stuck.  Here's the dilemma.

I don't want to compete with your project and I have very specific needs to achieve the volume of data we're expecting.  However if you feel like our requirements are driving your project where it doesn't need to go, then maybe it would make sense to spawn a sister project.  The sister project could be for MySQL only so that we don't complete in the same space.  My initial intention is to help you with your project and port it later, when MySQL is ready for such a port.

I guess the best answer is to present our requirements and let you be the judge.  Please bear in mind that I'm not trying to get you to do all of our work for us.  I'm planning on doing the lion share of the changes, IF you're comfortable with them, and WITH your permission.  One really nice aspect to all of this is no special permission from mgmt or legal is required if we're helping with an existing open source project, provided of course you want the help.

So here goes . . . these are our requirements . . . .

The mechanics of the archive mechanism need to be simple and well defined.  (Your basic KISS clause)

Support snowflake tables or derivatives of derivatives.

Derivative tables can optionally be passed through the archive mechanism for chunking(see def of chunking later).

Tables should be broken into chunks for easy rolloff and to ease administration burden on large tables.  Chunk size will be specified on each data stream by the user.  DELETE while not forbidden in the architecture is strongly advised against.  Union All Views will be placed in front of the archives for easy querying.  A typical cleanup mechanism would be ALTER VIEW to remove tables about to be dropped, DROP TABLE.  To add tables to the archive CREATE TABLE, ALTER VIEW.  On PG tables optionally could be clustered on an index provided by the user.

Rollups can take the form of any DB function AGG or otherwise.  Field names will be derivative of aggfn_origname where aggfn = aggregate function and orig name is the original column name.

For SNMP counters, and text fields a state table will be maintained that tracks the raw data input.  Only derivative data from this table will be archived.

Text fields can optionally be normalized within the system, simply by the user checking a box during stream specification.

I have to design this system and your project is very similar.  It would be great if we can work together on this, and we can provide some coder support to your stuff, but we will completely understand if you don't like the direction we're proposing.  I think you'll find it scales very well and you can throw tons of data at it.  We have a non-procedureal-SQL version of this working at Cox and have had excellent performance from the system.  One hurdle we had to jump over real quick when the system was initially designed was to get rid of delete statements where possible.  We currently are using merge tables in MySQL to do the equivalent of union all views, but union all views will give us additional flexibility.

Your comments on all of this are very welcome!


-----Original Message-----
From: rrs-users-bounces at rrs.decibel.org on behalf of Tom Flavel
Sent: Mon 4/18/2005 2:44 AM
To: Jim C. Nasby
Cc: rrs-users at rrs.decibel.org
Subject: [rrs-users] Re: rrs variable-length months
On 18/04/2005 01:08:02, Jim C. Nasby wrote:
> I'm cc'ing rrs-users at rrs.decibel.org, since others might be interested
> in this. You might want to subscribe.

Have done

> On Thu, Apr 14, 2005 at 06:11:53AM +0100, Tom Flavel wrote:
> > Hi,
> > 
> > I'm looking to log bandwidth and similar types of data, and RRS looks
> > like it'll do roughly what I need.
> > 
> > My data comes from SNMP: I wrote a function for pgsql which goes and
> > gets the data (internally it finds the IPs and passwords for SNMP, etc
> > from other tables), and materialises it as a table, so you can "SELECT
> > getsnmpdata()" and be presented with the various rows. This is entirley
> > irrelevant, but I thought you may be interested.
> Actually, that's an interesting approach. I'd like to include your code
> if I can. I guess it's time for a demo directory or something...

Certinaly. I'll be releasing the entire system when it's ready, but you
can have just that code, if you want.

That approach has advamtages for security for things like credit card
transactions, too - SELECT doatransaction() would return success or failure,
and do the actual transaction server-side. This means the table itself
dosent need to GRANT SELECT, so there's no way a credit card number could
ever be retrieved over the connection. (I have some SECURITY DEFINER
function which returns the numbers in the form "***********xxx")

> > Now, I'd like to take 95th percentiles of the data per-month. Month
> > lengths are different per-month. My thoughts are, that if the RRD has
> > more than enough rows, I can just ignore the latter couple for shorter
> > months (right?).
> Actually, 'last month' isn't a very accurate label. It should really be
> last 30 days. Right now, the only way could could actually do a real
> 'last month' would be to pull directly from the source data one month at
> a time, which wouldn't be terribly useful.

See below, re source_status

> If you take a look at the rrs table for the last month entry, it's
> defined to keep 168 buckets. Each bucket is 4 hours long, although the
> time_per_bucket field is only used by the code if the rrs doesn't have a
> parent. If you want to know how long each bucket *really* is, you need
> to trace back through the rrs's parents until you come to the top level.
> In this case, the 'last month' rrs (rrs_id 6 normally) feeds off rrs_id
> 4 (last day), which feeds off rrs_id 1 (last hour), which is the top
> level. Each bucket in rrs_id 1 is 1 minute long. Each bucket is rrs_id 4
> is the aggregate of 30 buckets of rrs_id 1, or 30 minutes. Each bucket
> in rrs_id 6 is the aggregate of 8 buckets of rrs_id 4. 8 * 30 minutes =
> 4 hours.

That's how I thought it worked; thanks for clarifying

> Eventually I'd like to have the ability to do an actual last month, but
> I haven't figured out any good way to do it.

See below, re source_status

> How were you thinking of doing a 95th percentile calculation? I think
> that would be an interesting example as well.

I was hoping to make an aggregate function. I think I can have it store
each number as an array, in the current state for the aggregate, and
then sort and return the 5%th element when it's finished. I'm a little
unsure about how to access arrays via postgresql's SAPI, though, or if
they're intended to be sorted.

As far as I understand, an array is arranged consecutivley in memory, so
I think qsort() will suffice.

Maybe inserting in order would be better, or maybe just using the state
for storing the OID of some tempoary table would turn out more effecient;
suggestions welcome.

> > I assume RRS keeps the number of the most recently updated row
> > somewhere, but I cant for the life of me find it... or does the buckets
> > approach (which I am not entirley sure I follow) mean something entirley
> > unrelated is happening there?
> Take a look at the source_status table. It's got the last end time run
> for each source and rrs. You can use that info to find the specific
> bucket.

I thought simply LIMITing the retrieved data by however many days are in the
given month (which would need to be passed in as some parameter, which means
a special-case function prototype :/), assuming the data has enough rows for
all possible month lengths?

Or, if we can just SELECT WHERE start...end, then great. I'm still not
entirley sure how that's structured, or where I'm supposed to be
looking, though.

> > Any pointers are much appreciated, or advice for other approaches.
> > 
> > 
> > (I'll submit patches when I'm done, if you want them: I'm guessing
> > somebody else may want this ability, too)
> BTW, something you should be aware of in version 0.4. It's got code to
> throttle updates when it's running behind. Previously it could take
> hours or more for an update to happen if it was far behind. 0.4 will
> limit how much data is processed. But the problem is that if you add a
> new RRS to an existing system, the update of that new RRS won't work
> properly and it will miss a lot of data. If you need this functionality
> you should probably stick to 0.3 for now (or help me fix the update :P).

Thanks for the warning... I didn't realise it'd miss data for
newly-added RRSes. I'll be creating and removing them frequently, as
customers join and leave (hopefully not many of the latter :)

Ideally, I'd like to fix that, if I can. My plpgpsql skills are
non-existant, though: everything I've written is either pure SQL or C.

rrs-users mailing list
rrs-users at rrs.decibel.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/ms-tnef
Size: 7483 bytes
Desc: not available
Url : http://lists.decibel.org/pipermail/rrs-users/attachments/20050418/2bf6b0ba/attachment-0001.bin

More information about the rrs-users mailing list