To all Linux gurus here: can anybody tell me how many files can be stored in
a Linux directory without performance degradation? Or is there no limit for
directory entries on Linux file systems?
On a web server I have about a thousand product shots in a single directory
(web shop demands it) - how many more before Linux stops liking me? ;-)
TC wrote:
> On a web server I have about a thousand product shots in a single directory
> (web shop demands it) - how many more before Linux stops liking me? ;-)
That depends more on the file system than the kernel. If you're using ext2
or ext3 (I think both have it) there's a flag you can turn on that says to
build extra indexes for directories. Otherwise, I think they're linear.
--
Darren New, San Diego CA, USA (PST)
I ordered stamps from Zazzle that read "Place Stamp Here".
Well, I use ext2. If I understand ext2 correctly, you may have 32k
subdirectories before Linux throws in the towel. Would it crash? Or be graceful
about it? I surely will not try it out...
However, I cannot find anything about the number of files I can store in a
single directory. I assume that this number is limited by diskspace only.
But it is no good to assume. And since I know no Linux guru personally, but am
pretty sure quite a lot of them can be found here, I asked the question. I would
hate to delve through tons of technical documentation.
I did not know about the indexing flag, though, thank you. Maybe I'll find
more on it.
It's always a surprise what can or cannot be done with or to a filesystem if
you take a closer look. I really hate the ADS on Windows NTFS, for
instance - I find this an abomination. ;-)
"Darren New" <dne### [at] sanrrcom> schrieb im Newsbeitrag
news:4ab025ff$1@news.povray.org...
> TC wrote:
>> On a web server I have about a thousand product shots in a single
>> directory (web shop demands it) - how many more before Linux stops liking
>> me? ;-)
>
> That depends more on the file system than the kernel. If you're using ext2
> or ext3 (I think both have it) there's a flag you can turn on that says to
> build extra indexes for directories. Otherwise, I think they're linear.
>
> --
> Darren New, San Diego CA, USA (PST)
> I ordered stamps from Zazzle that read "Place Stamp Here".
TC wrote:
> Well, I use ext2. If I understand ext2 correctly, you may have 32k
> subdirectories before Linux throws in the towel.
I never heard of that. It's perhaps an artifact of how many links you can
have to one file, since ".." in a subdirectory links back to the parent
directory.
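To see that link-count mechanism in action, here's a rough sketch (Python; ext-style file systems assumed - some other file systems report directory link counts differently):

import os, tempfile

# An empty directory has two links ("." plus its entry in the parent);
# every subdirectory's ".." adds one more - that's the counter a
# 32k-subdirectory limit would be running into.
base = tempfile.mkdtemp()
print(os.stat(base).st_nlink)        # 2 for an empty directory

for i in range(5):
    os.mkdir(os.path.join(base, f"sub{i}"))

print(os.stat(base).st_nlink)        # now 7 = 2 + 5 subdirectories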
> Would it crash? Or be graceful
> about it? I surely will not try it out...
Now you have me curious enough to try it. ;-)
> However, I cannot find anything about the number of files I can store in a
> single directory. I assume that this number is limited by diskspace only.
Most likely, since I think ext is in many ways similar to the original v7
directory layout that BSD replaced the API for.
> But it is no good to assume. And since I know no linux guru but am pretty
> sure here are quite a lot of them to be found, I asked the question. I would
> hate to delve through tons of technical documentation.
Every time I've been foolish enough to build a system like that, I've put a
layer of subdirectories in between, so that file abcdefghijk.txt would be
stored in /stuff/abc/def/ghi/jk.txt or some such.
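In case it helps, a minimal sketch of that kind of layout (Python; the function name and slice widths are made up for illustration, nothing standard):

import os

def sharded_path(root: str, filename: str, depth: int = 3, width: int = 3) -> str:
    # Turn "abcdefghijk.txt" into root/abc/def/ghi/jk.txt, as in the example
    # above; short names simply end up with fewer nesting levels.
    stem, ext = os.path.splitext(filename)
    dirs = [stem[i*width:(i+1)*width] for i in range(depth) if stem[i*width:(i+1)*width]]
    rest = stem[depth*width:]
    leaf = (rest + ext) if rest else ((dirs.pop() + ext) if dirs else filename)
    return os.path.join(root, *dirs, leaf)

print(sharded_path("/stuff", "abcdefghijk.txt"))   # /stuff/abc/def/ghi/jk.txt

You'd just os.makedirs() the directory part before writing each file.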
> I did not know about the indexing flag, though, thank you. Maybe I'll find
> more on it.
I found it thru yast, but I'm sure you can turn it on and off with tune2fs
or whatever it's called. Apparently, "-O dir_index" passed to mkfs.ext3 will
do the trick, but I think you can turn it on with an fsck as well.
> It's always a surprise what can or cannot be done with or to a filesystem if
> you take a closer look. I really hate the ADS on Windows NTFS, for
> instance - I find this an abomination. ;-)
Whyfor? Every file system nowadays has something like this, including Linux.
And it's pretty useful and a logical extension of how NTFS organizes files
anyway. What don't you like about it?
--
Darren New, San Diego CA, USA (PST)
I ordered stamps from Zazzle that read "Place Stamp Here".
TC wrote:
> To all Linux gurus here: can anybody tell me how many files can be stored in
> a Linux directory without performance degradation? Or is there no limit for
> directory entries on Linux file systems?
Now I have a vague recollection of somebody at uni telling us that ext2
works by storing 4096 file entries in an inode. Once you have more files
than that, the inode stores a pointer to another inode. This inode
contains not file pointers, but inode pointers. So at this level of
indirection, you can have up to 4096 * 4096 files. I vaguely recall that
if you exhaust this, it goes to a third level of indirection. But I
can't remember whether it stops there.
Of course, given the source this information came from, it could be
completely bogus. ;-)
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*
TC wrote:
> Well, I use ext2. If I understand ext2 correctly, you may have 32k
> subdirectories before Linux throws in the towel. Would it crash? Or be graceful
> about it? I surely will not try it out...
> However, I cannot find anything about the number of files I can store in a
> single directory. I assume that this number is limited by diskspace only.
WIYF (http://en.wikipedia.org/wiki/Ext2#File_system_limits):
"If the number of files in a directory exceeds 10000 to 15000 files, the
user will normally be warned that operations can last for a long time
unless directory indexing is enabled. The theoretical limit on the
relevant for practical situations."
You may also want to check out info on the tune2fs utility, which I'd
expect to be the tool to use if you wanted to enable directory indexing. But
of course you'd need full control over the file system for that, so if
your shop is hosted on a shared server, you may be out of luck.
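If you do have that control, a quick way to see whether dir_index is already on is to look at the "Filesystem features:" line that tune2fs -l prints. A throwaway sketch (Python; the device name is just a placeholder, and reading the superblock needs root):

import subprocess

out = subprocess.run(["tune2fs", "-l", "/dev/sda1"],
                     capture_output=True, text=True, check=True).stdout
features = next(line for line in out.splitlines()
                if line.startswith("Filesystem features:"))
print("dir_index enabled?", "dir_index" in features)

Enabling it afterwards would be tune2fs -O dir_index, followed by an e2fsck -fD
run on the unmounted volume to build indexes for the directories that already
exist - as far as I recall, anyway.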
> It's always a surprise what can or cannot be done with or to a filesystem if
> you take a closer look. I really hate the ADS on Windows NTFS, for
> instance - I find this an abomination. ;-)
I think the main problem with those is that other file systems don't
have them. They're a pretty neat way of tagging files with additional
information.
Orchid XP v8 wrote:
> Now I have a vague recollection of somebody at uni telling us that ext2
> works by storing 4096 file entries in an inode. Once you have more files
> than that, the inode stores a pointer to another inode. This inode
> contains not file pointers, but inode pointers. So at this level of
> indirection, you can have up to 4096 * 4096 files.
Not precisely.
Even in a virtually empty directory (or file, for that matter), the
actual data is not stored in the inode itself; instead, in the simplest
case an inode will refer to up to a dozen data blocks, which in turn
store the actual data. Thus, data is limited to e.g. 48 kB on a file
system with 4 kB data blocks.
For the many cases where this should be insufficient, three more data
block pointers are available, which are treated specially:
If required, the first of the three special pointers is used to
reference a data block holding an additional list of pointers to actual
data blocks - i.e. it would reference quite a bunch of additional data
blocks via one level of indirection. Sticking to the 4 kB data block
example, that would give another 1024 data blocks (1036 in total), for a
bit above 4 MB of data in total. (Note that the 12 direct data block
pointers are still used in this case.)
If even more data blocks are required, the second special pointer is
used in a similar way, except that it references even more data blocks
via two levels of indirection, adding another 1024x1024 data blocks for
a bit above 4 GB of data in total. Ultimately, the third special block
pointer would also be used in pretty much the same way, except that it
would use three levels of indirection, good for roughly 4 TB in principle
(ext2 itself caps file size at about 2 TB with 4 kB blocks).
(Note that the data blocks holding the indirection tables are not
inodes; that would be a waste of resources, as inodes (a) store a host
of other information aside from the data block pointers, and (b) are a
rather precious resource in themselves, as only a fixed number of them
exists.)
Obviously, the math you heard at university doesn't really work out, as
it ignores that inodes and indirection blocks have different capacities for
block pointers. What's more, block size may vary from 1 kB to 8 kB, and
directory entry sizes even vary at run-time depending on filename length.
On a sample directory on my Linux system, I figured that one data block
(4 kB) contains roughly 75 directory entries.
With direct blocks only, this would give me a bit short of 1,000 entries.
One level of indirection raises that to roughly 78,000 entries.
The second level takes it to about 80 million.
The third level takes it to about 80 billion.
File operations will be bogged down long before that, however, as the
directory entries are stored within the data blocks as an unordered
linked list, which needs to be traversed for each single file lookup.
Not to mention that with so many files, the directory information alone
would occupy 1/4 of the maximum file system capacity of ext2 (at a block
size of 4 kB), which does not leave much room for actual data per file :-P
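For anyone who wants to redo that arithmetic, a quick back-of-the-envelope script (Python; the assumptions are 4 kB blocks, 4-byte block pointers - i.e. 1024 pointers per indirection block - and the ~75 entries per block from my sample directory):

BLOCK = 4096                 # block size in bytes
PTRS = BLOCK // 4            # pointers per indirection block
ENTRIES_PER_BLOCK = 75       # observed directory entries per block

direct = 12                  # direct block pointers in the inode
single = direct + PTRS       # + one level of indirection
double = single + PTRS ** 2  # + two levels
triple = double + PTRS ** 3  # + three levels

for name, blocks in [("direct", direct), ("single", single),
                     ("double", double), ("triple", triple)]:
    print(f"{name:>6}: {blocks:>13,} blocks = "
          f"{blocks * BLOCK / 2**20:>14,.1f} MiB, "
          f"~{blocks * ENTRIES_PER_BLOCK:,} directory entries")

That reproduces the ~4 MB / ~4 GB / ~4 TB steps and the roughly 900 / 78,000 / 80 million / 80 billion entry counts above.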
> I vaguely recall that
> if you exhaust this, it goes to a third level of indirection. But I
> can't remember whether it stops there.
Definitely so. Though it's probably not much of an issue in practice :-)
For a block size of 8kB, that would actually be sufficient to reference
more blocks than the file system as a whole can handle :-)
> Of course, given the source this information came from, it could be
> completely bogus. ;-)
Well, it's not too far off the mark. Unless that person is expected to
train you to become Linux Gurus :-)
clipka wrote:
> "If the number of files in a directory exceeds 10000 to 15000 files, the
> user will normally be warned that operations can last for a long time
> unless directory indexing is enabled.
I hate to mention this, but if you put 100,000 files in a directory and then
delete them all, operations on the directory will still be slow. An 'ls' on
the empty directory, or even an rmdir, can take several minutes as the
machine scans thru the directory making sure it is indeed empty before
deleting it.
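If anyone wants to reproduce that on a scratch file system, a crude experiment would look something like this (Python; shrink N first, and don't run it anywhere you care about):

import os, time, tempfile

# Fill one directory with N files, delete them all, then time a listing of
# the now-empty directory. On ext2/ext3 without dir_index the directory
# file never shrinks, so the scan can still be slow; exact timings depend
# on caching, the disk and the file system.
N = 100_000
d = tempfile.mkdtemp()

for i in range(N):
    open(os.path.join(d, f"f{i:06d}"), "w").close()
for i in range(N):
    os.remove(os.path.join(d, f"f{i:06d}"))

t0 = time.time()
os.listdir(d)
print(f"listing the emptied directory took {time.time() - t0:.2f} s")
os.rmdir(d)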
> I think the main problem with those is that other file systems don't
> have them.
Nonsense. Every modern operating system has them. Macs have had them since
400K floppies were the norm. NTFS has always (as far as I remember) had
them. Newer Linux file systems have them (altho IIRC they're sometimes
organized more as tag/value pairs) - JFS, XFS, Reiser, ZFS, etc. They call
them "Extended attributes" under Linux. Not surprisingly, they're used
similarly to how NTFS uses the streams. (Huh. According to wiki, even FAT
supports them if you use the right kind on NT.)
The biggest problem is that POSIX doesn't support them, so implementors
aren't sure how to build a non-proprietary interface to them that will be
accepted.
http://en.wikipedia.org/wiki/Extended_file_attributes
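For what it's worth, the Linux flavour is even exposed directly in Python these days - a tiny sketch (needs a file system and mount that allow user.* attributes; tmpfs may refuse them depending on the kernel):

import os, tempfile

# Attach, list and read a user-namespace extended attribute on Linux.
# Raises OSError ("Operation not supported") where user.* xattrs aren't
# available.
fd, path = tempfile.mkstemp()
os.close(fd)

os.setxattr(path, "user.comment", b"product shot, front view")
print(os.listxattr(path))                  # ['user.comment']
print(os.getxattr(path, "user.comment"))   # b'product shot, front view'

os.remove(path)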
--
Darren New, San Diego CA, USA (PST)
I ordered stamps from Zazzle that read "Place Stamp Here".
Orchid XP v8 wrote:
> TC wrote:
>> To all Linux gurus here: can anybody tell me how many files can be
>> stored in a Linux directory without performance degradation? Or is
>> there no limit for directory entries on Linux file systems?
>
> Now I have a vague recollection of somebody at uni telling us that ext2
> works by storing 4096 file entries in an inode.
Heh. You're confusing free space, used space, and i-nodes (which are
descriptors for files).
> Of course, given the source this information came from, it could be
> completely bogus. ;-)
It is.
--
Darren New, San Diego CA, USA (PST)
I ordered stamps from Zazzle that read "Place Stamp Here".
>> Of course, given the source this information came from, it could be
>> completely bogus. ;-)
>
> Well, it's not too far off the mark. Unless that person is expected to
> train you to become Linux Gurus :-)
No. It was a first course in filesystems. I imagine the guy picked ext2
because it was easy to look up the reference material. (We never, ever
used anything that actually had ext2 on it...)
While we're on the subject... NTFS has an optimisation where "small"
files are stored in the same block as the directory entry. (Saves
seeking and wasting half a disk block.) Does ext2 have any optimisations
for small files?
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*