What's a good file structure for a voluminous file set?

Each file will be small, for example a YouTube video ID such as rBsjnucePnk coupled with an IP address. I expect to have about a million of these little files, and I'm curious how to structure the file system.

I'm guessing it would be good to have subdirectories split by ranges, each containing further subdirectories split by ranges. How many levels of subdirectories should I have, and roughly how many entries should each subdirectory hold?

Thank you.

1 Reply

How big are the files? Is the content fairly static (infrequent updates)…so largely read-only?

If they are very small (say, 4 KB or so), you might be better off not using the file system at all. Path lookups get more expensive as the directory tree gets deeper.

It might be better to store the file content in a database, using the file name as the database key, and to focus your efforts on optimizing your software for search (partial keys, regexes on both key and value, etc.).

There are two options for doing this that come to mind:

- SQLite3
- Berkeley DB (BDB)

Of the two, I prefer SQLite3 because it has an SQL interface which makes software development a snap. I’ve also used it a lot more than BDB.
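As a rough sketch of the key/value approach, assuming Python's built-in `sqlite3` module: the table name `entries`, its columns, and the helper functions are made up for illustration, and the IP address in the usage note is a placeholder from a documentation range.

```python
import sqlite3

# Hypothetical schema: one row per former "file", keyed by the video ID.
# ":memory:" keeps the sketch self-contained; use a path such as
# "entries.db" to persist the data on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS entries ("
    "  video_id   TEXT PRIMARY KEY,"
    "  ip_address TEXT NOT NULL"
    ")"
)


def put(video_id, ip_address):
    """Insert a record, or overwrite the existing one with the same key."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO entries (video_id, ip_address) VALUES (?, ?)",
            (video_id, ip_address),
        )


def get(video_id):
    """Return the stored IP address for a key, or None if the key is absent."""
    row = conn.execute(
        "SELECT ip_address FROM entries WHERE video_id = ?", (video_id,)
    ).fetchone()
    return row[0] if row else None


def find_by_prefix(prefix):
    """Partial-key search. GLOB is case-sensitive, unlike SQLite's default LIKE."""
    return conn.execute(
        "SELECT video_id, ip_address FROM entries "
        "WHERE video_id GLOB ? ORDER BY video_id",
        (prefix + "*",),
    ).fetchall()
```

With this in place, `put("rBsjnucePnk", "203.0.113.7")` followed by `get("rBsjnucePnk")` gives back the stored address, and `find_by_prefix("rBsj")` lists every key starting with that prefix. The whole data set lives in a single database file instead of a million small ones.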

Both are free and well supported. SQLite is also cross-platform, so if you need to move your file elsewhere (another system where SQLite is supported, e.g., Mac or Windows), you can do that. Since BDB relies on hardware/machine-architecture information to increase its performance, its files are not portable: you have to do a dump/restore to move your file to a new machine.

Performance might become an issue for you, however. You can mitigate that by putting an in-memory cache in front of the database, using Redis or something similar.
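To illustrate the caching idea without needing a running Redis server, here is a small sketch that swaps in Python's `functools.lru_cache` as an in-process stand-in. It is not Redis, only the same read-through pattern; the table `entries`, the function `cached_get`, and the cache size are assumptions for illustration.

```python
import sqlite3
from functools import lru_cache

# Hypothetical table holding one row per record, keyed by video ID.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries (video_id TEXT PRIMARY KEY, ip_address TEXT NOT NULL)"
)
conn.execute("INSERT INTO entries VALUES (?, ?)", ("rBsjnucePnk", "203.0.113.7"))
conn.commit()


@lru_cache(maxsize=100_000)  # assumed size; tune to the memory you can spare
def cached_get(video_id):
    """Read-through lookup: the database is queried only on a cache miss."""
    row = conn.execute(
        "SELECT ip_address FROM entries WHERE video_id = ?", (video_id,)
    ).fetchone()
    return row[0] if row else None
```

Repeated calls with the same key are served from memory, and `cached_get.cache_info()` reports the hit and miss counts. Misses for absent keys are cached too, so any code that writes to the table should call `cached_get.cache_clear()` afterwards (or use a finer-grained invalidation scheme) to avoid serving stale results.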

— sw

P.S., you can use MySQL for this too but you'll introduce the overhead of running a database server on your Linode or remotely…and the hassle in the management thereof. If this solution appeals to you, you might look into Linode's managed database product.
