I am planning to write a Windows-based C++ application to retrieve and
analyse some data. I imagine my data being stored in some sort of database
table, with each row having perhaps 10 pieces of data (some text, some
integer) plus a unique ID (this is generated for me). I expect an absolute
maximum of 1000 rows of data to be added per day, so after a year or so we're
talking a few hundred thousand rows in this "table", not more than a
million.
The type of analysis I will need to do will be fairly straightforward, like
taking averages and sums of each bit of data over all (or subsets of) rows,
and maybe even some simple filtering but nothing fancy.
My question is, should I be looking to use some external database engine to
do the backend work here for me, or can I get away with just using STL
containers like "set" or something, with a struct that holds my data?
Anything else I should think about before deciding which way to go?
If a database engine will be better, any recommendations of which one? An
easy to install/learn one would be better, and it must be free to distribute
with my program and of course easily accessible from C++. I know a *tiny*
bit of SQL but I think I could learn what I need quite quickly.
And if you think I can just do it in C++ without using an engine, is using
"set" from STL the best way? What is a practical limit to the size (in MB)
of a set I should be manipulating/loading/saving? I'm thinking even a
million rows times e.g. 256 bytes (roughly 256 MB) should be fine for an
application nowadays to load/save and work with in memory?
Any other thoughts?
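
For the in-memory route, here is a minimal sketch of what the struct-plus-container approach could look like. The Record layout and the helper names (sum_if, average_if, estimated_bytes) are made up for illustration and are not from any library. It uses std::vector rather than std::set, on the assumption that the analysis mostly scans all rows rather than looking rows up by key.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record layout; the field names are illustrative only.
struct Record {
    std::uint32_t id;     // unique ID (generated elsewhere)
    std::string   label;  // stands in for the text fields
    int           value;  // stands in for the integer fields
};

// Sum of `value` over all rows for which keep(row) is true.
template <typename Pred>
long long sum_if(const std::vector<Record>& rows, Pred keep) {
    long long total = 0;
    for (const Record& r : rows) {
        if (keep(r)) total += r.value;
    }
    return total;
}

// Average of `value` over the rows for which keep(row) is true.
// Returns 0.0 when no row matches, to avoid dividing by zero.
template <typename Pred>
double average_if(const std::vector<Record>& rows, Pred keep) {
    long long total = 0;
    std::size_t count = 0;
    for (const Record& r : rows) {
        if (keep(r)) {
            total += r.value;
            ++count;
        }
    }
    return count ? static_cast<double>(total) / static_cast<double>(count) : 0.0;
}

// Rough payload estimate: rows * bytes per row, ignoring container overhead.
constexpr std::uint64_t estimated_bytes(std::uint64_t rows,
                                        std::uint64_t bytes_per_row) {
    return rows * bytes_per_row;
}
```

With these, estimated_bytes(1000000, 256) gives 256,000,000 bytes, which is about 244 MiB before any per-string or container overhead. A filter is just a lambda passed as keep, for example one that compares r.label against a fixed string.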
scott wrote:
> The type of analysis I will need to do will be fairly straightforward,
> like taking averages and sums of each bit of data over all (or subsets
> of) rows, and maybe even some simple filtering but nothing fancy.
This can easily be done inside the database with SQL. There are so
called aggregate functions like sum(), avg(), etc.
> My question is, should I be looking to use some external database engine
> to do the backend work here for me, or can I get away with just using
> STL containers like "set" or something, with a struct that holds my data?
Doing as much as possible inside the database will be faster.
> If a database engine will be better, any recommendations of which one?
> An easy to install/learn one would be better, and it must be free to
> distribute with my program and of course easily accessible from C++.
Then give PostgreSQL a try. It has a C++ API too:
http://www.postgresql.org/docs/6.5/static/libpqplusplus.htm
So long,
Bonsai
> Then give PostgreSQL a try. It has a C++ API too:
>
> http://www.postgresql.org/docs/6.5/static/libpqplusplus.htm
That looks quite nice. It seems that libpq++ has been superseded by libpqxx,
which doesn't seem to work very well on Windows, but I used the standard C
API instead and have got it working OK so far. It also looks like I should be
able to do a silent install of PostgreSQL during installation of my app,
which is cool.
I just need to revise some SQL now and first of all work out how to create
databases and tables from my code.
Thanks!
scott <sco### [at] scottcom> wrote:
> The type of analysis I will need to do will be fairly straightforward, like
> taking averages and sums of each bit of data over all (or subsets of) rows,
> and maybe even some simple filtering but nothing fancy.
While I know only a little SQL, that sounds to me like it could be
perfectly doable with a database engine and its SQL frontend directly.
SQL was developed precisely to be a fast and easy way to retrieve and
filter data from a database, and to perform diverse operations on that data.
"Stored procedure" is probably a term which will give you useful information
about this whole subject.
If you need to output the data in some special format, after it has been
processed, I really don't know what tools, if any, SQL offers for this.
However, if all you need is to simply retrieve some information (filtered
and processed by the SQL server itself) from the database and then write
it in some special format, perhaps a language which has direct support for
interacting with SQL servers could be easier than C++. (I'm thinking about
PHP, which is somewhat similar to C++ in syntax, and thus more easily
approachable by a C++ programmer than more exotic languages.)
--
- Warp
scott wrote:
> I just need to revise some SQL now and first of all work out how to
> create databases and tables from my code.
You do it with SQL. Probably something along the lines of

    CREATE DATABASE mystuff;
    CREATE TABLE mytable (
        somenumber INTEGER NOT NULL,
        somestring VARCHAR(50) DEFAULT '',
        PRIMARY KEY (somenumber)
    );

Look up those keywords.
You'll also need to figure out how to connect in the first place. Grep the
manual for "connection string", which is usually how it's described. It
tells basically the address of the server (IP & port, usually) plus user
plus password.
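
As a rough illustration of what such a connection string looks like for PostgreSQL's C client library (libpq), here is a sketch that builds the keyword=value form that PQconnectdb() accepts. The ConnParams struct and both function names are invented for this example. Quoting every text value is a simplification: libpq only requires single quotes for empty values or values containing spaces, but always quoting is also valid.

```cpp
#include <string>

// Hypothetical bundle of connection settings; not part of libpq.
struct ConnParams {
    std::string host;
    int         port;
    std::string dbname;
    std::string user;
    std::string password;
};

// Wrap a value in single quotes, backslash-escaping any embedded
// single quote or backslash, as the keyword=value format requires.
std::string quote_conn_value(const std::string& v) {
    std::string out = "'";
    for (char c : v) {
        if (c == '\'' || c == '\\') out += '\\';
        out += c;
    }
    out += '\'';
    return out;
}

// Build a "host=... port=... dbname=... user=... password=..." string.
std::string make_conninfo(const ConnParams& p) {
    return "host=" + quote_conn_value(p.host) +
           " port=" + std::to_string(p.port) +
           " dbname=" + quote_conn_value(p.dbname) +
           " user=" + quote_conn_value(p.user) +
           " password=" + quote_conn_value(p.password);
}
```

The resulting string would be passed as make_conninfo(p).c_str() to PQconnectdb(). Other engines use their own formats, so check the manual for whichever one you pick.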
Warp wrote:
> If you need to output the data in some special format, after it has been
> processed, I really don't know what tools, if any, SQL offers for this.
SQL doesn't offer too many tools as such, i.e., not in a portable way.
Particular engines might give you the option of generating XML or CSV or
some such. If you can generate XML, you could run it thru an XSLT script and
output pretty much whatever format you need without writing anything but
shell scripts, if you bashed it hard enough with a baseball bat (or with
bash). If your language has the equivalent of system() and you don't mind
being kludgey, you could put the queries on the command line and output to a
CSV and parse that up.
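
If you do go the CSV route, the parsing side is small. Below is a sketch of a one-line CSV splitter, assuming the common convention of double-quoted fields with a doubled quote ("") standing for a literal quote. The function name split_csv_line is made up, and it does not handle quoted fields that span several lines.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split one CSV line into fields. Commas inside double quotes are kept
// as part of the field, and "" inside a quoted field becomes a single ".
std::vector<std::string> split_csv_line(const std::string& line) {
    std::vector<std::string> fields;
    std::string cur;
    bool in_quotes = false;
    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (in_quotes) {
            if (c == '"') {
                if (i + 1 < line.size() && line[i + 1] == '"') {
                    cur += '"';  // escaped quote
                    ++i;         // skip the second quote of the pair
                } else {
                    in_quotes = false;  // closing quote
                }
            } else {
                cur += c;
            }
        } else if (c == '"') {
            in_quotes = true;  // opening quote
        } else if (c == ',') {
            fields.push_back(cur);  // field separator
            cur.clear();
        } else {
            cur += c;
        }
    }
    fields.push_back(cur);  // last field (possibly empty)
    return fields;
}
```

All fields come back as strings, so numeric columns still need converting (std::stoi, std::stod) afterwards.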
You could also use metakit (from equi4.com) if you want a nice embeddable
database that isn't SQL (it's not even normalized). Just a plug for a great
product. It's the sort of thing a lot of small wiki servers use for their
database back end. A good replacement for Access. Lots of interfaces from
lots of languages. Not bad if you don't need a lot of relational joins and
such.
You can also use "SQL Server Everywhere", which is Microsoft's SQL server in
a form that you link it into your program. There's another one called "star
db" or "db star" or "SQL*" or something, that's from Sun I think, that isn't
windows-specific. This will save you the hassle of installing SQL servers
everywhere you want this stuff to run, if that's a problem for you. One less
thing to go wrong.
Don't confuse "embedded SQL server" with "embedded SQL". The first is a SQL
server that runs in the same process and address space as the client. The
second is a way of writing SQL statements directly in the source file of
your program and having a preprocessor turn it into function calls to the
client library.
PHP is OK for doing database stuff, but it's kludgey, so if you expect the
system to grow, or other people to work on it, it's probably something to
stay away from unless you need it for some other reason anyway. It's a way
to quickly hack together web front ends, and its use as a desktop tool is poor.
--
Darren New, San Diego CA, USA (PST)
"Ouch ouch ouch!"
"What's wrong? Noodles too hot?"
"No, I have Chopstick Tunnel Syndrome."
Darren New wrote:
> scott wrote:
>> I just need to revise some SQL now and first of all work out how to
>> create databases and tables from my code.
>
> You do it with SQL.
Incidentally, for history buffs, that was one of the three or four
revolutionary amazing-at-the-time things that the relational model
introduced: the metadata of the data is accessible as data. That hit the
data world the way von Neumann architecture hit the electrical engineering
world. :-)
(The others being a mathematically rigorous foundation, a lack of pointers,
and separation of modeling concerns from efficiency concerns, all of which
were astoundingly revolutionary at the time.)
--
Darren New, San Diego CA, USA (PST)
"Ouch ouch ouch!"
"What's wrong? Noodles too hot?"
"No, I have Chopstick Tunnel Syndrome."
Darren New wrote:
> Incidentally, for history buffs, that was one of the three or four
> revolutionary amazing-at-the-time things that the relational model
> introduced: the metadata of the data is accessible as data. That hit the
> data world the way von Neumann architecture hit the electrical
> engineering world. :-)
>
> (The others being a mathematically rigorous foundation, a lack of
> pointers, and separation of modeling concerns from efficiency concerns,
> all of which were astoundingly revolutionary at the time.)
The way I heard it, before databases came along, if you had a dozen
programs that all accessed the same pot of data, they all had to
understand the same file format. And if you *changed* that file format,
all your programs broke - usually by silently producing gibberish
instead of real data. Then you'd have to go modify them all one by one
to fix them.
The story goes that one company changed their file format so that some of
their bills to customers contained gibberish which actually leaked
information about their supplier relationships. A competitor got hold of
this information, managed to figure out what the gibberish was, and
managed to gain a competitive advantage.
(Nice story, but sounds kinda made-up to prove a point to me...)
The thing about a database is... the DBMS knows how to read the data.
And if you change how the data is stored (e.g., add a new field, change
a table from heap-organised to index-organised, etc.), as long as the
DBMS still knows how to read it, the client applications don't need to
*care*, and they don't break.
(Hell, if you add new fields or change the type of existing ones, you
can even create a "view" for the old apps to work off. They'll never
know the difference!)
At least, that's the PoV I got. I wasn't there, so...
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*
Orchid XP v8 wrote:
> The way I heard it, before databases came along, if you had a dozen
> programs that all accessed the same pot of data, they all had to
> understand the same file format. And if you *changed* that file format,
> all your programs broke - usually by silently producing gibberish
> instead of real data. Then you'd have to go modify them all one by one
> to fix them.
Well, yes.
> The thing about a database is... the DBMS knows how to read the data.
> And if you change how the data is stored (e.g., add a new field, change
> a table from heap-organised to index-organised, etc.), as long as the
> DBMS still knows how to read it, the client applications don't need to
> *care*, and they don't break.
Well, no. That's true of RDBMSs, which is why they were revolutionary. It
isn't true of database engines that were around before the relational model.
> (Hell, if you add new fields or change the type of existing ones, you
> can even create a "view" for the old apps to work off. They'll never
> know the difference!)
Yes. That was the amazingly cool thing about relational databases that you
didn't get out of CODASYL databases or hierarchical databases or whatever.
--
Darren New, San Diego CA, USA (PST)
"Ouch ouch ouch!"
"What's wrong? Noodles too hot?"
"No, I have Chopstick Tunnel Syndrome."
> You do it with SQL. Probably something along the lines of
>
>     CREATE DATABASE mystuff;
>     CREATE TABLE mytable (
>         somenumber INTEGER NOT NULL,
>         somestring VARCHAR(50) DEFAULT '',
>         PRIMARY KEY (somenumber)
>     );
>
> Look up those keywords.
OK thanks that's certainly a good start, I've never worked with creating
this stuff from scratch before.
> You'll also need to figure out how to connect in the first place. Grep the
> manual for "connection string", which is usually how it's described. It
> tells basically the address of the server (IP & port, usually) plus user
> plus password.
Got that all working now - at some later date I will need to figure out how
to install postgres silently with all the correct options so my app will run
without having to explain to the user how to install it. And then cope with
what happens when postgres is already installed...
> Warp wrote:
> > If you need to output the data in some special format, after it has
> > been
> > processed, I really don't know what tools, if any, SQL offers for this.
>
> SQL doesn't offer too many tools as such, i.e., not in a portable way.
Most of the "results" I will just be using to display tables or graphs
within my app.
> You could also use metakit (from equi4.com) if you want a nice embeddable
> database that isn't SQL (it's not even normalized). Just a plug for a
> great product.
Will take a look, thanks.
> You can also use "SQL Server Everywhere", which is Microsoft's SQL server
> in a form that you link it into your program.
Ditto.
> PHP is OK for doing database stuff, but it's kludgey, so if you expect the
> system to grow, or other people to work on it, it's probably something to
> stay away from unless you need it for some other reason anyway. It's a way
> to quickly hack together web front ends, and its use as a desktop tool is
> poor.
There are certain other parts of my code that need to run pretty quickly
(some simulation stuff), and as I already have a lot of that in C++ I think
my best bet is to find the best database backend to use from C++.
What I might do is write a wrapper class for the database parts of my code
that will initially just use simple STL structures to store/retrieve/filter
data and then later I can write code to use a db engine instead once I get
it working with small datasets.
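
A rough sketch of that wrapper idea is below. Everything in it (Row, DataStore, MemoryStore, the method names) is a placeholder invented for illustration. The filter is deliberately a plain value range rather than an arbitrary callback, on the assumption that it would later have to be translated into a SQL WHERE clause.

```cpp
#include <string>
#include <vector>

// Hypothetical row type; the real one would have around ten fields.
struct Row {
    int         id;
    std::string text;
    int         value;
};

// The interface the rest of the app would code against.
class DataStore {
public:
    virtual ~DataStore() = default;
    virtual void add(const Row& row) = 0;
    // All rows with lo <= value <= hi, in insertion order.
    virtual std::vector<Row> select_value_between(int lo, int hi) const = 0;
    // Average of `value` over all rows; 0.0 if the store is empty.
    virtual double average_value() const = 0;
};

// First implementation: everything lives in a std::vector.
class MemoryStore : public DataStore {
public:
    void add(const Row& row) override { rows_.push_back(row); }

    std::vector<Row> select_value_between(int lo, int hi) const override {
        std::vector<Row> out;
        for (const Row& r : rows_) {
            if (r.value >= lo && r.value <= hi) out.push_back(r);
        }
        return out;
    }

    double average_value() const override {
        if (rows_.empty()) return 0.0;
        long long total = 0;
        for (const Row& r : rows_) total += r.value;
        return static_cast<double>(total) / static_cast<double>(rows_.size());
    }

private:
    std::vector<Row> rows_;
};
```

A database-backed class would later derive from the same DataStore interface and issue queries instead of looping, so the calling code would only need to construct a different object.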
scott wrote:
> What I might do is write a wrapper class for the database parts of my
> code that will initially just use simple STL structures to
> store/retrieve/filter data and then later I can write code to use a db
> engine instead once I get it working with small datasets.
Heh. That's pretty much exactly what the LINQ stuff in the .NET packages
does. An XML file looks like a SQL server at the source-code level - you just
build a different db connection object at the beginning.
If you're going to be filtering your own values anyway, look into the
non-SQL databases like metakit. The primary advantage of the SQL databases
is you can write expressions in SQL that manipulate an entire table (like
finding the max, min, average and sum of a whole column) in one statement.
If you're going to do that yourself anyway, it's probably easier to have the
code linked into your app than it is to try to install Postgres, but only if
it's not already installed *or* if you have permission to use it, etc. :-)
--
Darren New, San Diego CA, USA (PST)
"Ouch ouch ouch!"
"What's wrong? Noodles too hot?"
"No, I have Chopstick Tunnel Syndrome."