POV-Ray : Newsgroups : povray.off-topic : Database Questions : Re: Database Questions Server Time
3 Sep 2024 11:23:17 EDT (-0400)
  Re: Database Questions  
From: Darren New
Date: 8 Mar 2011 12:30:43
Message: <4d7667c3$1@news.povray.org>
Tom Austin wrote:
> 1.  Is the practice of creating separate databases for each 'job' 
> considered good or bad?  Is this answer different based on using MS 
> Access .mdb files vs using a MySQL Database engine?

As others have pointed out, if you're ever going to want to do any sort of 
cross-job statistics (like knowing if the same person worked on two 
particular jobs, etc) then you need to have one database.

The only time I've seen multiple databases with identical schemas used that 
was *not* a f'up, it was because it was a software-as-a-service and each 
customer got their own database and essentially managed it themselves. I.e., 
if you can't imagine installing each database at a separate company, you 
probably want one database.


> 2.  How is creating references across databases viewed?
>       Example - one database with definitions of data with many other 
> databases referencing the definitions.

You really can't do that. That's exactly why you put everything in one 
database.

That said, I think some databases let you make cross-database references, 
but ... really, just don't do that.


> 3.  Normalizing vs Flattening?  At what point is a database 'too' 
> normalized and does it matter?  Do you have some thoughts on joins using 
> text data types instead of integer numbers?

Each table that does not necessarily have any references to its rows is 
representing a separate real-world entity. For example, you might have a 
customer, a sale, etc. Subsidiary tables such as individual lines of a sale, 
might be separate in tables, but they won't represent a separate entity. 
(You can usually tell because the row in the subsidiary table will have an 
arbitrary primary key (if the PK isn't just a combination of FKs) *and* a FK 
that's exactly the PK of what it's a subsidiary table to.) That, IME, is 
approximately the right way to think about it and break it down. Build the 
DB as tables of real-world objects and events, then hook them together 
appropriately. You'll almost automatically be "normalized enough". The time 
you get denormalized is when you start thinking about how you're going to be 
processing the data before you think about the actual data, and you start 
doing joins in your head and encoding them into the schema.

Another good way of working it is to see if there's any combination of 
columns that can be a primary key that isn't just some arbitrary number. I 
built an entire database designed to track people's financial accounts (on 
behalf of a bunch of *our* customers) for a stored-value web site that 
didn't have a single auto-increment field in it. (Granted, our customer 
supplied the identifier for *their* customer, so without that I would have 
needed one auto-increment field.) But everything else was keyed off that. 
The credit card number was the primary key for the credit card. The sale's 
primary key was the credit card number and the UTC timestamp of when we made 
the sale. The end-user's PK was our customer's PK concatenated with the 
identifier we had assigned to our customer. In other work, places were keyed 
by the three-letter code of the airport closest to the place (i.e., that's 
how we identified cities), radio stations were identified by their location 
and frequency, etc.  It's not always possible to find such a key, but to the 
extent you can, it tells you a whole bunch about the meaning of the record. 
And if you use that and find it too slow, *then* it's easy to change it to a 
simple integer identifier.

As for text, I've found that a very good trick for readability is that a 
CHAR(4) will index just as fast as an INT.  So if you want the database to 
be both fast and readable with codes, you use codes like 'SALE', 'RFND', 
'VOID', 'CUST', 'MNGR', 'EMPL', etc.  Then, if you want, you can have 
another table mapping these codes into a varchar(250) or something to put in 
drop-down lists.

If you're talking about text or integer for other things, ask if there's any 
possibility the text will ever change. Is it the name of a company or 
person? A phone number or zip code? If so, it needs to be in a table and not 
part of any key.

> We are developing a new database system to take care of some severe 
> shortcomings of the old system.

What are the shortcomings?

> Our data has a lot of common 'choose from' items - such as the owner of 
> the pole.
> The data consists of pole specific information that doesn't change 
> (owner, height, etc) and information that will likely change (the height 
> of cables on the pole).

These last two should probably be in separate tables that you join at 
runtime. (Especially if you're using MySql, where differnet "engines" give 
you different capabilities.)

-- 
Darren New, San Diego CA, USA (PST)
  "How did he die?"   "He got shot in the hand."
     "That was fatal?"
          "He was holding a live grenade at the time."


Post a reply to this message

Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.