POV-Ray : Newsgroups : povray.off-topic : Database Questions : Re: Database Questions Server Time
3 Sep 2024 11:25:55 EDT (-0400)
  Re: Database Questions  
From: Invisible
Date: 8 Mar 2011 10:01:52
Message: <4d7644e0$1@news.povray.org>
On 08/03/2011 02:21 PM, Tom Austin wrote:

> 1. Is the practice of creating separate databases for each 'job'
> considered good or bad?

That's going to depend on the database engine in question. (Different 
engines have different ideas about what constitutes a "database".) It's 
also going to depend on your idea of what a "job" is - and also what you 
actually want to use the data *for*.

For example, if each job is a separate database, you're going to have a 
lot of fun trying to compute (for example) the average number of days 
spent working on a job. If all the jobs were in a single database, this 
would be a trivial SQL query, but if each job is a separate database, 
you now have a logistical nightmare figuring out how to open and query 
all the jobs, one at a time.

But then again, if you never need to do this kind of computation, the 
fact that it's really difficult to do becomes irrelevant.

> 2. How is creating references across databases viewed?
> Example - one database with definitions of data with many other
> databases referencing the definitions.

Again, it varies by engine as to how hard it is to set up, how reliable 
it is, and any other gotchas involved.

> 3. Normalizing vs Flattening? At what point is a database 'too'
> normalized and does it matter?

Usually the /only/ reason for denormalising a database is performance. 
So you could say a database is "too normalised" if it's taking too long 
to answer your queries. You might perhaps also argue that a database is 
"too normalised" if it has become too complicated to understand - but 
normally you won't be interacting with tables directly, you'll put 
something pretty over the top to make it "look simple".

Q: Why do we even care about normalising?
A:

- If the data is not normalised, you are duplicating data in some sense.
- If data is duplicated, it is extremely likely that the duplicates will 
eventually get out of sync with each other.
- When this happens, you now have conflicting information, and no idea 
which version (if any!) is actually correct.

That said, denormalisation can work. If you have /automated/ systems to 
keep all the copies in sync, it can be reliable. But now you are making 
your job "more complicated". Usually, a fully normalised set of tables 
means less things to go wrong. (But it also means more tables, 
typically, which may make the thing harder to understand...)

> Do you have some thoughts on joins using text data types instead of
> integer numbers?

If those are *supposed* to be integer numbers, you should store them as 
such. If you don't, you're just asking for trouble some day.

On the other hand, if those are *codes* then they probably should be 
text. E.g., telephone numbers, invoice numbers, product codes, anything 
like that, probably wants to be text.

Ask yourself this: are 7 and 007 the same thing, or different things? If 
you want them to be the same, you want integers. If you want them to be 
different, it's text.

Don't even talk to be about storing dates as anything that isn't a date...

> Our core business is surveying utility/telephone poles to put new fiber
> optic cables on them. We propose changes to the pole (moving cables) and
> prepare reports/applications for our client to get on the pole.
>
> We collect and store our data in databases.
> In the past we have used separate MS Access .mdb databases for each 'job'.
> We are developing a new database system to take care of some severe
> shortcomings of the old system.
> Our data has a lot of common 'choose from' items - such as the owner of
> the pole.
> The data consists of pole specific information that doesn't change
> (owner, height, etc) and information that will likely change (the height
> of cables on the pole).

MS Access is a poor excuse for a real database system, but it *does* 
have the distinct advantage of being /simple/. An Access database is 
just a file. You can manage it as such. A database using a "real" 
database engine such as Oracle or PostgreSQL or whatever usually 
involves configuring server daemon processes and installing multiple 
files and setting up authentication and so on. This is far more secure 
and provides much better performance, but it's also far more complicated.

What our main project system does is this: The "jobs" that we are 
currently working on are held in an Oracle database running on a big 
powerful DB server. When a job is completed, the job is "archived" into 
an Access database file. If we ever want to go back and do something 
with the job, it is "dearchived" back into the Oracle database.

Of course, we have [expensive] software that handles all of this 
automatically, needless to say... You have to balance how much 
functionality you actually need against how hard it's going to be to set 
up and run all this stuff. Just like any other computer project.


Post a reply to this message

Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.