Orchid XP v8 wrote:
> Now I have a vague recollection of somebody at uni telling us that ext2 
> works by storing 4096 file entries in an inode. Once you have more files 
> than that, the inode stores a pointer to another inode. This inode 
> contains not file pointers, but inode pointers. So at this level of 
> indirection, you can have up to 4096 * 4096 files.
Not precisely.
Even in a virtually empty directory (or file, for that matter), the 
actual data is not stored in the inode itself; instead, in the simplest
case an inode will refer to up to a dozen data blocks, which in turn 
store the actual data. Thus, data is limited to e.g. 48 kB on a file 
system with 4 kB data blocks.
For the many cases where this should be insufficient, three more data 
block pointers are available, which are treated specially:
If required, the first of the three special pointers is used to 
reference a data block holding an additional list of pointers to actual 
data blocks - i.e. it would reference quite a bunch of additional data 
blocks via one level of indirection. Sticking to the 4 kB data block 
example and 4-byte block pointers, that would give another 1024 data 
blocks (1036 in total), for a bit above 4 MB of data in total. (Note 
that the 12 direct data block pointers are still used in this case.)
If even more data blocks are required, the second special pointer is 
used in a similar way, except that it references even more data blocks 
via two levels of indirection, adding another 1024x1024 data blocks for 
a bit above 4 GB of data in total. Ultimately, the third special block 
pointer would also be used in pretty much the same way, except that it 
would use three levels of indirection, for some 4 TB of addressable 
data in total (though other ext2 limits cap a single file at 2 TB).
(Note that the data blocks holding the indirection tables are not 
inodes; that would be a waste of resources, as inodes (a) store a host 
of other information aside from the data block pointers, and (b) are a 
rather precious resource in themselves, as only a fixed number of them 
exists.)
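To double-check the arithmetic, here is a small sketch of my own. It 
assumes 4-byte block pointers and 12 direct pointers per inode, as 
described above; the function names are made up for illustration.

```python
POINTER_SIZE = 4      # bytes per block pointer (assumption: 32-bit block numbers)
DIRECT_POINTERS = 12  # direct data block pointers held in the inode


def addressable_blocks(block_size):
    """Cumulative data blocks reachable with 0, 1, 2 and 3 levels of indirection."""
    pointers_per_block = block_size // POINTER_SIZE
    total = DIRECT_POINTERS
    totals = [total]
    for level in (1, 2, 3):
        # Each extra level multiplies the fan-out by pointers_per_block.
        total += pointers_per_block ** level
        totals.append(total)
    return totals


def addressable_bytes(block_size):
    """Same as addressable_blocks, but expressed in bytes of data."""
    return [blocks * block_size for blocks in addressable_blocks(block_size)]


if __name__ == "__main__":
    labels = ("direct only", "single indirect", "double indirect", "triple indirect")
    for label, size in zip(labels, addressable_bytes(4096)):
        print(f"{label:>16}: {size:,} bytes")
```

This only counts what the pointer scheme can address; it ignores any 
other limit the file system imposes on file size.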
Obviously, the math you heard at university doesn't really work out, as 
it ignores that inodes and indirection blocks have different capacities 
for block pointers. What's more, block size may vary from 1 kB to 8 kB, 
and directory entry sizes even vary at run-time depending on filename 
length.
On a sample directory on my Linux system, I figured that one data block 
(4 kB) contains roughly 75 directory entries.
With direct blocks only, this would give me a bit short of 1000 entries.
One level of indirection would give me roughly 78,000 entries.
Second level gives me roughly 79 million.
Third level gives me some 80 billion.
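A rough way to estimate these figures is to count the data blocks 
reachable at each level and multiply by the entries per block. The 
sketch below is my own illustration; the 75 entries per block is just 
the value observed on my sample directory, and 4-byte block pointers 
are an assumption.

```python
ENTRIES_PER_BLOCK = 75                 # observed on one sample directory, not a constant
BLOCK_SIZE = 4096                      # bytes per data block
POINTERS_PER_BLOCK = BLOCK_SIZE // 4   # assumption: 4-byte block pointers
DIRECT_POINTERS = 12                   # direct pointers in the inode


def max_entries(levels):
    """Estimated directory entries using up to `levels` levels of indirection."""
    blocks = DIRECT_POINTERS + sum(
        POINTERS_PER_BLOCK ** n for n in range(1, levels + 1)
    )
    return blocks * ENTRIES_PER_BLOCK


if __name__ == "__main__":
    for levels in range(4):
        print(f"{levels} level(s) of indirection: {max_entries(levels):,} entries")
```

Longer filenames mean fewer entries per block, so the real numbers 
will differ from directory to directory.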
File operations will be bogged down long before that, however, as the 
directory entries are stored within the data blocks as an unordered 
linked list, which needs to be traversed for each single file lookup.
Not to mention that with so many files, the directory information alone 
would occupy 1/4 of the maximum file system capacity of ext2 (at a block 
size of 4 kB), which does not leave much room for actual data per file :-P
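To illustrate why lookups get slow, here is a minimal sketch of a 
linear scan through one directory block. It assumes the usual ext2 
on-disk entry layout (4-byte inode number, 2-byte record length, 
1-byte name length, 1-byte file type, then the name); pack_entries and 
lookup are hypothetical helpers of mine, not anything from the kernel.

```python
import struct

# Entry header: inode (u32), rec_len (u16), name_len (u8), file_type (u8), little-endian.
HEADER = struct.Struct("<IHBB")


def pack_entries(entries, block_size=4096):
    """Hypothetical helper: build one directory block from (inode, name) pairs."""
    block = bytearray(block_size)
    offset = 0
    for i, (inode, name) in enumerate(entries):
        raw = name.encode()
        rec_len = (HEADER.size + len(raw) + 3) & ~3   # pad record to 4 bytes
        if i == len(entries) - 1:
            rec_len = block_size - offset             # last record spans the rest
        HEADER.pack_into(block, offset, inode, rec_len, len(raw), 0)
        block[offset + HEADER.size:offset + HEADER.size + len(raw)] = raw
        offset += rec_len
    return bytes(block)


def lookup(block, name):
    """Walk the records one by one; return (inode or None, records visited)."""
    raw = name.encode()
    offset = 0
    visited = 0
    while offset + HEADER.size <= len(block):
        inode, rec_len, name_len, _file_type = HEADER.unpack_from(block, offset)
        if rec_len < HEADER.size:
            break                                     # empty or corrupt block
        visited += 1
        start = offset + HEADER.size
        if inode != 0 and block[start:start + name_len] == raw:
            return inode, visited
        offset += rec_len                             # follow the "link" to the next record
    return None, visited


if __name__ == "__main__":
    block = pack_entries([(11, "a.txt"), (12, "b.txt"), (13, "c.txt")])
    print(lookup(block, "c.txt"))
    print(lookup(block, "missing"))
```

A failed lookup has to visit every record in every block of the 
directory, which is what hurts with very large directories.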
> I vaguely recall that
> if you exhaust this, it goes to a third level of indirection. But I 
> can't remember whether it stops there.
Definitely so. Though it's probably not much of an issue in practice :-) 
For a block size of 8kB, that would actually be sufficient to reference 
more blocks than the file system as a whole can handle :-)
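A quick check of that claim, as a sketch of my own, assuming 4-byte 
block pointers and therefore at most 2^32 distinct block numbers:

```python
MAX_BLOCK_NUMBERS = 2 ** 32  # assumption: block numbers are 32-bit values


def triple_indirect_blocks(block_size, pointer_size=4):
    """Data blocks reachable through the triple-indirect pointer alone."""
    return (block_size // pointer_size) ** 3


if __name__ == "__main__":
    for block_size in (4096, 8192):
        reachable = triple_indirect_blocks(block_size)
        verdict = "exceeds" if reachable > MAX_BLOCK_NUMBERS else "fits within"
        print(f"{block_size} B blocks: {reachable:,} blocks, {verdict} 2^32")
```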
> Of course, given the source this information came from, it could be 
> completely bogus. ;-)
Well, it's not too far off the mark. Unless that person is expected to 
train you to become Linux Gurus :-)