|
- There are two main components of a filer: directories and files.
- My previous approach was to use a sequence number to generate the directoryId.
- However, this is not scalable. The id generation itself is a bottleneck.
- It needs careful locking and deduplication checking to get a directoryId.
- In a second design, each directory is deterministically mapped to a version 3 UUID,
- which uses MD5 to hash a tuple of <uuid, name> into the UUID.
- However, this UUID3 approach is logically the same as storing the full path.
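A minimal sketch of the UUID3 design, to show why it carries the same information as the full path. The root namespace ROOT_UUID and the helper directory_uuid are hypothetical names chosen for illustration; the notes do not say which root UUID was used.

```python
import uuid

# Hypothetical root namespace for the directory tree (an assumption;
# any fixed UUID would work as long as it never changes).
ROOT_UUID = uuid.UUID(int=0)


def directory_uuid(path: str) -> uuid.UUID:
    """Derive a directory id by chaining uuid3(parent_uuid, name) down the path."""
    current = ROOT_UUID
    for name in path.strip("/").split("/"):
        if name:  # skip the empty component produced by "/" itself
            current = uuid.uuid3(current, name)  # MD5 of <parent uuid, name>
    return current
```

The id of a directory depends on every ancestor name, so renaming any ancestor changes the ids of everything below it, just as it would change their full paths.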
- Storing the full path is the simplest design.
- The separator is a special byte, 0x00.
- When writing a file:
- <file parent full path, separator, file name> => fileId, file properties
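A possible encoding of this key, as a sketch only. The function names entry_key and split_key are hypothetical, and UTF-8 is an assumed path encoding; the only detail taken from the notes is the 0x00 separator between the parent path and the name.

```python
from typing import Tuple

SEPARATOR = b"\x00"  # the special separator byte from the design


def entry_key(parent_path: str, name: str) -> bytes:
    """Build the KVS key <parent full path, separator, name>."""
    return parent_path.encode("utf-8") + SEPARATOR + name.encode("utf-8")


def split_key(key: bytes) -> Tuple[str, str]:
    """Recover (parent full path, name) from a key built by entry_key."""
    parent, _, name = key.rpartition(SEPARATOR)
    return parent.decode("utf-8"), name.decode("utf-8")
```

This relies on 0x00 never appearing inside a path or file name, which holds for the usual file name rules.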
- For folders:
-   The filer breaks the directory path into folders.
-   for each folder:
-     if it is not in cache:
-       check whether the folder is created in the KVS, if not:
-         set <folder parent full path, separator, folder name> => directory properties
-     if no permission for the folder:
-       break
-
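The folder walk above could look roughly like this. It is a sketch under assumptions: the KVS is modelled as a plain dict, the cache as a set of known folder paths, and can_access is a hypothetical permission callback; the stored properties are placeholders.

```python
SEPARATOR = b"\x00"


def folder_key(parent: str, name: str) -> bytes:
    """Key layout: <folder parent full path, separator, folder name>."""
    return parent.encode("utf-8") + SEPARATOR + name.encode("utf-8")


def ensure_folders(dir_path, kvs, cache, can_access):
    """Walk dir_path top-down, creating missing folder entries in kvs.

    Returns True if every folder is accessible, False if a permission
    check stops the walk early.
    """
    parent = "/"
    for name in [part for part in dir_path.split("/") if part]:
        full = parent.rstrip("/") + "/" + name
        if full not in cache:
            key = folder_key(parent, name)
            if key not in kvs:
                kvs[key] = {"mode": 0o755}  # placeholder directory properties
            cache.add(full)
        if not can_access(full):
            return False  # the "break" in the outline above
        parent = full
    return True
```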
- The filer caches the most recently used folder permissions with a TTL.
- So any folder permission change can take up to one TTL interval to take effect.
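A minimal sketch of such a cache, to make the staleness window concrete. The class name TTLCache and its methods are hypothetical, and size-based eviction of the least recently used entries is left out for brevity. The clock is injectable so the expiry behaviour can be checked without sleeping.

```python
import time


class TTLCache:
    """Cache folder permissions; each entry expires ttl_seconds after it is stored."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # folder path -> (permission, expires_at)

    def put(self, path, permission):
        self._entries[path] = (permission, self._clock() + self._ttl)

    def get(self, path):
        """Return the cached permission, or None if it is missing or expired."""
        item = self._entries.get(path)
        if item is None:
            return None
        permission, expires_at = item
        if self._clock() >= expires_at:
            del self._entries[path]  # expired: force a fresh read from the KVS
            return None
        return permission
```

Until an entry expires, get keeps returning the old permission, which is the delay described above.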
- When listing the directory:
- do a prefix scan, using (the folder full path + separator) as the prefix
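A sketch of the listing step, assuming the KVS can iterate keys in sorted byte order. Here that is modelled with a sorted Python list and bisect; list_directory is a hypothetical name.

```python
import bisect

SEPARATOR = b"\x00"


def list_directory(sorted_keys, dir_path):
    """Return the names of the direct children of dir_path.

    sorted_keys must be a list of keys in ascending byte order.
    """
    prefix = dir_path.encode("utf-8") + SEPARATOR
    start = bisect.bisect_left(sorted_keys, prefix)  # first key >= prefix
    names = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # left the contiguous range of matching keys
        names.append(key[len(prefix):].decode("utf-8"))
    return names
```

Because the separator follows the parent's full path, the scan returns only direct children: entries in deeper folders have a different byte at that position and fall outside the prefix.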
- The downsides:
- 1. Renaming a folder requires recursively processing all sub folders and files.
- 2. Moving a folder requires recursively processing all sub folders and files.
- So these operations are not allowed if the folder is not empty.
- Allowed operations:
- 1. Rename a file
- 2. Move a file to a different folder
- 3. Delete an empty folder
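The allowed operations each touch a single key, which is why they stay cheap. A rough sketch, again modelling the KVS as a dict; move_file, delete_empty_folder, and key_for are hypothetical names, and the emptiness check would be a prefix scan in a real store rather than a pass over all keys.

```python
SEPARATOR = b"\x00"


def key_for(parent: str, name: str) -> bytes:
    return parent.encode("utf-8") + SEPARATOR + name.encode("utf-8")


def move_file(kvs, src_parent, src_name, dst_parent, dst_name):
    """Rename or move one file by re-keying its single entry."""
    old = key_for(src_parent, src_name)
    new = key_for(dst_parent, dst_name)
    if new in kvs:
        raise FileExistsError(dst_name)
    kvs[new] = kvs.pop(old)  # raises KeyError if the source does not exist


def delete_empty_folder(kvs, parent, name):
    """Delete a folder entry, refusing if any child key exists."""
    full = parent.rstrip("/") + "/" + name
    prefix = full.encode("utf-8") + SEPARATOR
    # Stand-in for a prefix scan: any key under <full path + separator> is a child.
    if any(key.startswith(prefix) for key in kvs):
        raise OSError("folder is not empty")
    del kvs[key_for(parent, name)]
```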
|