scott wrote:
> So I take it using SELECT DISTINCT X ... is better than using SELECT X
> ... GROUP BY X for what I want to do?
I don't know. I'd say it's more "correct" in some abstract theoretical
sense, since the only reason you're grouping is to get unique rows. And it
works in the case you're trying. :-)
If you wanted to know how many different types of events each experiment
yielded, you'd do something like
select expid, count(distinct eventid) from exp group by expid
or something like that. (Not gonna fire up SQL to try it.)
But you see what I'm saying there.
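A quick way to sanity-check both points, using SQLite's in-memory engine as a stand-in for SQL CE (the table and column names exp, expid, eventid are taken from the thread; the data is made up):

```python
import sqlite3

# In-memory SQLite database standing in for SQL CE.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE exp (expid INTEGER, eventid INTEGER)")
con.executemany("INSERT INTO exp VALUES (?, ?)",
                [(1, 10), (1, 10), (1, 20), (2, 30), (2, 30)])

# Both ways of getting unique expids return identical results.
distinct = con.execute(
    "SELECT DISTINCT expid FROM exp ORDER BY expid").fetchall()
grouped = con.execute(
    "SELECT expid FROM exp GROUP BY expid ORDER BY expid").fetchall()
print(distinct == grouped)          # True

# How many *different* event types each experiment yielded:
# COUNT(DISTINCT eventid), not plain COUNT(eventid).
per_exp = con.execute(
    "SELECT expid, COUNT(DISTINCT eventid) FROM exp GROUP BY expid"
).fetchall()
print(per_exp)                      # [(1, 2), (2, 1)]
```

Note the DISTINCT inside the COUNT: plain COUNT(eventid) would count duplicate events too.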
--
Darren New, San Diego CA, USA (PST)
My fortune cookie said, "You will soon be
unable to read this, even at arm's length."
scott wrote:
> When I run SELECT COUNT(*) FROM (SELECT X ... GROUP BY X) on my test data
> set (about 100k eventID rows) it runs in 0.25s. That's a bit slow, but hmm,
> ok.
>
> If I change the SQL statement to SELECT COUNT(*) FROM (SELECT DISTINCT X ...)
> it takes 11.27s. Huh?
I'm guessing when you do a SELECT DISTINCT, it's internally creating a
temporary table.
> Also, with the DISTINCT method, as I repeatedly execute the query the
> database size seems to have grown from 7MB to 14MB then to 31MB... HUH?
Do you have an option to compact the database again and see if it shrinks?
Did you perchance close the connection to the database between queries so it
frees up temporary tables? (I'm just guessing here.)
Do these questions not even make sense for the compact edition? :-)
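A hypothetical reproduction of the comparison, in SQLite rather than SQL CE (VACUUM is SQLite's analogue of compacting the database; the 10-expid distribution is invented):

```python
import sqlite3
import time

# ~100k rows spread over 10 expids, as a rough stand-in for the test set.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE exp (expid INTEGER, eventid INTEGER)")
con.executemany("INSERT INTO exp VALUES (?, ?)",
                [(i % 10, i) for i in range(100_000)])
con.commit()  # VACUUM cannot run inside an open transaction

def timed(sql):
    t0 = time.perf_counter()
    (n,) = con.execute(sql).fetchone()
    return n, time.perf_counter() - t0

n1, t1 = timed("SELECT COUNT(*) FROM (SELECT expid FROM exp GROUP BY expid)")
n2, t2 = timed("SELECT COUNT(*) FROM (SELECT DISTINCT expid FROM exp)")
print(n1, n2)                   # both 10: the queries agree on the answer
print(f"{t1:.4f}s {t2:.4f}s")   # relative timings depend on the engine

con.execute("VACUUM")  # rebuild the database, releasing freed pages
```

Whether the DISTINCT form is slower, and whether compacting reclaims the space, depends entirely on the engine; this only shows how to measure it.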
> Do you have an option to compact the database again and see if it shrinks?
> Did you perchance close the connection to the database between queries so
> it frees up temporary tables? (I'm just guessing here.)
>
> Do these questions not even make sense for the compact edition? :-)
Yes, they make sense, but I haven't tried that, because the speed penalty
seems bad anyway, so I won't use the DISTINCT method for now.
> I don't know. I'd say it's more "correct" in some abstract theoretical
> sense, since the only reason you're grouping is to get unique rows. And it
> works in the case you're trying. :-)
>
> If you wanted to know how many different types of events each experiment
> yielded, you'd do something like
> select expid, count(distinct eventid) from exp group by expid
> or something like that. (Not gonna fire up SQL to try it.)
>
> But you see what I'm saying there.
Yeah, I see what you're saying. I'll have to work on importing more
realistic data, as my test data is a bit artificial. But the speed so far
seems fine, and I'm pretty confident that SQL CE will be capable of doing
everything I need. If I had known it was this easy to set up and work with
in .NET, I might have used it for some other projects too, rather than
struggling with filtering and sorting arrays and other data containers!
scott wrote:
> Yes, they make sense, but I haven't tried that, because the speed penalty
> seems bad anyway, so I won't use the DISTINCT method for now.
What indexes have you defined? If you define an index over expid, you may
see a speed increase. 11 seconds seems a bit long for any query on a table
with indexes.
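A sketch of checking what the index buys you, again in SQLite as a stand-in for SQL CE (whose plans we can't inspect here). EXPLAIN QUERY PLAN shows how the DISTINCT query is executed with and without an index on expid:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE exp (expid INTEGER, eventid INTEGER)")

def plan():
    # Join the plan's detail column into one readable string.
    steps = con.execute(
        "EXPLAIN QUERY PLAN SELECT DISTINCT expid FROM exp").fetchall()
    return " / ".join(step[-1] for step in steps)

# Without an index: a full table scan plus a temp B-tree for DISTINCT.
print(plan())

con.execute("CREATE INDEX idx_expid ON exp (expid)")

# With the index: a covering index scan, no temp structure needed.
print(plan())
```

The exact wording of the plan varies by SQLite version, but the before/after difference (temp B-tree vs. covering index) is the point.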
> What indexes have you defined? If you define an index over expid, you may
> see a speed increase. 11 seconds seems a bit long for any query on a table
> with indexes.
I had an index on expid already, but looking at it again I think the test
case I used was artificial, because almost all of the data came from just
one expid. Querying for that one, where 99% of the records shared its ID,
took the 11 seconds, but querying for other expids (which each made up less
than 1% of the table) was far faster, around 0.1 seconds. I guess if the
intermediate result somewhere consists of almost the whole table it causes
some slowdown...
scott wrote:
> I guess if the intermediate result somewhere
> consists of almost the whole table it causes some slowdown...
That it will do, yes! I suspect DISTINCT and GROUP BY are implemented
differently, with GROUP BY accumulating results as it scans and DISTINCT
post-processing them.
You definitely want a realistic data set before you worry about speed.