[rrs-users] Re: rrs variable-length months

Jim C. Nasby decibel at rrs.decibel.org
Tue Apr 19 21:43:24 GMT 2005


On Mon, Apr 18, 2005 at 12:07:34PM -0400, Mark.Cotner at cox.com wrote:
> Tables should be broken into chunks for easy rolloff and to ease administration burden on large tables.  Chunk size will be specified on each data stream by the user.  DELETE while not forbidden in the architecture is strongly advised against.  Union All Views will be placed in front of the archives for easy querying.  A typical cleanup mechanism would be ALTER VIEW to remove tables about to be dropped, DROP TABLE.  To add tables to the archive CREATE TABLE, ALTER VIEW.  On PG tables optionally could be clustered on an index provided by the user.

What you're describing sounds like table partitioning. There's currently
a project (http://pgfoundry.org/projects/bizgres/) to add this kind of
functionality to PostgreSQL, but it's in the formative stages. In any
case, I'm not sure if table partitioning is directly applicable to RRS;
it seems more like a complimentary technique.

> Rollups can take the form of any DB function AGG or otherwise.  Field names will be derivative of aggfn_origname where aggfn = aggregate function and orig name is the original column name.
> 
> For SNMP counters, and text fields a state table will be maintained that tracks the raw data input.  Only derivative data from this table will be archived.
> 
> Text fields can optionally be normalized within the system, simply by the user checking a box during stream specification.

These features sound something like a cross between materialized views
and RRS. Basically, a materialized view takes a view definition and
backs it with an actual table, which makes retreiving data much faster
in some cases.

> I have to design this system and your project is very similar.  It would be great if we can work together on this, and we can provide some coder support to your stuff, but we will completely understand if you don't like the direction we're proposing.  I think you'll find it scales very well and you can throw tons of data at it.  We have a non-procedureal-SQL version of this working at Cox and have had excellent performance from the system.  One hurdle we had to jump over real quick when the system was initially designed was to get rid of delete statements where possible.  We currently are using merge tables in MySQL to do the equivalent of union all views, but union all views will give us additional flexibility.

I'm definately interested in collaborating. It sounds like your needs
probably go beyond what I'd envisioned for RRS, which might be OK. My
intention for RRS was to essentially eliminate the need for keeping the
backing data around (which BTW you generally can't do if you use
materialized views). It sounds like what you're looking for is a more
typical data warehouse with a star-schema, where you have aggregate
tables 'hanging off' of a very large 'fact table'. If that's the case,
RRS might be able to fit the bill (at least if you're aggregating in the
time domain), though more general OLAP tools (such as what Bizgress will
incorporate) might be more useful. If, on the other hand, you don't have
any need to keep the raw base data around, RRS or something like it is
probably the only way to go.
-- 
Jim C. Nasby, Database Consultant               decibel at decibel.org 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"


More information about the rrs-users mailing list