I have to solve this at work.
- There are N individuals. N is unknown (thousands).
- There are M groups. M is known (hundreds).
- Every individual belongs to at least 1 group and at most K groups. K is
known (tens).
- The exact number of individuals L for each group is known. It ranges from
1 to several hundreds.
- The proportion P of individuals that belong to each possible number of
groups is known. For instance, we know that 50% of the individuals belong to
only one group, 20% belong to 2 groups, and so on.
- We have sampled the population for 12 of the larger groups and found the
ratio [actual number of distinct individuals belonging to the 12 groups]/[sum
of the group sizes (L) for the 12 groups] to be 80%.
With all the parameters above, is it possible to estimate the size of N?
Of course, the maximum estimate is sum(L). Empirically, I'd say that a
better estimate is 0.8 x sum(L), but I have the (also empirical) feeling
that the actual number is much, much lower.
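
One way to reason about this, as a rough sketch only: an individual that belongs to k groups is counted k times in sum(L), so if P is exact and covers the whole population, sum(L) = N x (average number of groups per individual), and N follows by division. The function name `estimate_n`, the dict layout for P, and the numbers below are made up for illustration; the sketch does not use the 80% sampled ratio.

```python
def estimate_n(group_sizes, proportions):
    """Estimate N from the group sizes L and the membership proportions P.

    group_sizes: list with one size L per group.
    proportions: dict mapping k (number of groups) to the fraction of
                 individuals that belong to exactly k groups.
    Assumption: the proportions are exact and sum to 1.
    """
    total = sum(proportions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    # Average number of groups an individual belongs to.
    mean_memberships = sum(k * p for k, p in proportions.items())
    # Each individual is counted once per group it belongs to in sum(L).
    return sum(group_sizes) / mean_memberships


# Made-up example: 3 groups, individuals in 1, 2 or 3 groups.
sizes = [300, 200, 100]
props = {1: 0.5, 2: 0.2, 3: 0.3}
n_hat = estimate_n(sizes, props)
print(round(n_hat, 1))
```

If the 50%/20%/... proportions are only approximate, the result inherits that uncertainty, so it is probably best read as a first-order estimate rather than an exact count.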
(The real-life problem deals with bibliometry: individuals are scientific
articles and groups are keywords.)
G.