  1. There are two main components of a filer: directories and files.
  2. My previous approach was to use a sequence number to generate each directoryId.
  3. However, this is not scalable. The id generation itself is a bottleneck.
  4. It needs careful locking and deduplication checking to get a directoryId.
  5. In a second design, each directory is deterministically mapped to a version 3 UUID,
  6. which uses MD5 to hash a tuple of <uuid, name> into the resulting UUID.
  7. However, this UUID3 approach is logically the same as storing the full path.
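To make the equivalence concrete, here is a minimal sketch of the UUID3 mapping using Python's standard uuid module. ROOT_ID and directory_id are hypothetical names used only for illustration, and treating the parent directory's id as the <uuid> part of the tuple is an assumption. The resulting id is a pure function of the full path, so it carries no more information than the path itself.

```python
import uuid

# Hypothetical id for the root directory; any fixed UUID would work.
ROOT_ID = uuid.UUID(int=0)

def directory_id(path: str) -> uuid.UUID:
    """Chain uuid3 over each path component, starting from the root id.

    Assumption: the <uuid, name> tuple is <parent directory id, folder name>.
    """
    current = ROOT_ID
    for name in path.split("/"):
        if name:  # skip empty components from leading or doubled slashes
            current = uuid.uuid3(current, name)
    return current
```

The same path always yields the same id, and the id changes whenever any component of the path changes, which is exactly the behaviour of a full-path key.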
  8. Storing the full path is the simplest design.
  9. The separator is a special byte, 0x00.
  10. When writing a file:
  11. <file parent full path, separator, file name> => fileId, file properties
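A minimal sketch of the file write, assuming paths are UTF-8 strings and using a plain dict as a stand-in for the real KVS. The names entry_key, write_file and kvs are hypothetical.

```python
SEPARATOR = b"\x00"  # the special separator byte from the design

kvs = {}  # stand-in for the real key-value store

def entry_key(parent_path: str, name: str) -> bytes:
    """Build the key <parent full path, separator, name>."""
    return parent_path.encode("utf-8") + SEPARATOR + name.encode("utf-8")

def write_file(parent_path: str, name: str, file_id: str, properties: dict) -> None:
    """Store fileId and file properties under the full-path key."""
    kvs[entry_key(parent_path, name)] = (file_id, properties)
```

For example, write_file("/home/user", "a.txt", "3,01637037d6", {"mode": 0o644}) stores one entry under the key b"/home/user\x00a.txt". The fileId value here is made up for the example.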
  12. For folders:
  13. The filer breaks the directory path into folders.
  14. for each folder:
  15.     if it is not in cache:
  16.         check whether the folder is created in the KVS, if not:
  17.             set <folder parent full path, separator, folder name> => directory properties
  18.     if no permission for the folder:
  19.         break
  20. The filer caches the most recently used folder permissions with a TTL.
  21. So any folder permission change can take up to one TTL interval to take effect.
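The folder steps and the TTL cache could be sketched as follows. Everything here is an assumption made for illustration: the 60 second TTL, the dict-based kvs and folder_cache, and a permission model where a folder's properties hold a single "owner" field.

```python
import time

SEPARATOR = b"\x00"
CACHE_TTL_SECONDS = 60.0  # assumed value; the design does not fix a TTL

kvs = {}           # stand-in for the real key-value store
folder_cache = {}  # key -> (properties, expiry time)

def folder_key(parent_path: str, name: str) -> bytes:
    return parent_path.encode("utf-8") + SEPARATOR + name.encode("utf-8")

def ensure_folders(dir_path: str, user: str, now=None) -> bool:
    """Walk dir_path folder by folder, creating missing folders.

    Returns False at the first folder the user has no permission for.
    """
    now = time.monotonic() if now is None else now
    parent = "/"
    for name in [p for p in dir_path.split("/") if p]:
        key = folder_key(parent, name)
        cached = folder_cache.get(key)
        if cached is None or cached[1] <= now:
            # Cache miss or expired entry: consult the KVS, create if absent.
            props = kvs.get(key)
            if props is None:
                props = {"owner": user}  # assumed default properties
                kvs[key] = props
            folder_cache[key] = (props, now + CACHE_TTL_SECONDS)
        else:
            props = cached[0]
        if props["owner"] != user:  # hypothetical permission check
            return False
        parent = parent.rstrip("/") + "/" + name
    return True
```

If the properties of a folder are replaced in the KVS, calls made before the cached entry expires still see the old permissions, which is the TTL delay described above.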
  22. When listing the directory:
  23. prefix scan using (the folder full path + separator) as the prefix
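A minimal sketch of the listing, assuming the KVS iterates keys in sorted byte order. A sorted Python list stands in for it here, and list_directory is a hypothetical name.

```python
import bisect

SEPARATOR = b"\x00"

def list_directory(sorted_keys: list, dir_path: str) -> list:
    """Return the names of the direct children of dir_path via a prefix scan."""
    prefix = dir_path.encode("utf-8") + SEPARATOR
    # Jump to the first key that is >= the prefix, then read while it matches.
    start = bisect.bisect_left(sorted_keys, prefix)
    names = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        names.append(key[len(prefix):].decode("utf-8"))
    return names
```

Only direct children match, because an entry deeper in the tree, such as b"/a/b\x00z.txt", has "/" rather than the separator right after "/a".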
  24. The downsides:
  25. 1. Renaming a folder requires recursively processing all sub folders and files.
  26. 2. Moving a folder requires recursively processing all sub folders and files.
  27. So these operations are not allowed if the folder is not empty.
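To illustrate why these operations are expensive, the sketch below collects every key that embeds a given folder path in its parent-path part. keys_to_rewrite is a hypothetical helper, and scanning a plain list of keys is a stand-in for the real KVS.

```python
SEPARATOR = b"\x00"

def keys_to_rewrite(keys: list, folder_path: str) -> list:
    """Return every key whose parent path is folder_path or lies below it.

    The folder's own entry under its parent must be rewritten too, but that
    is a single key and is not included here.
    """
    target = folder_path.encode("utf-8")
    affected = []
    for key in keys:
        parent, _, _name = key.partition(SEPARATOR)
        if parent == target or parent.startswith(target + b"/"):
            affected.append(key)
    return affected
```

The result grows with the size of the whole subtree, whereas a file rename touches exactly one key.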
  28. Allowed operations:
  29. 1. Rename a file
  30. 2. Move a file to a different folder
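The allowed operations each touch a single key. A minimal sketch, assuming a dict as the KVS; move_file, delete_folder and has_children are hypothetical names, and the full scan in has_children stands in for a prefix scan.

```python
SEPARATOR = b"\x00"

def entry_key(parent_path: str, name: str) -> bytes:
    return parent_path.encode("utf-8") + SEPARATOR + name.encode("utf-8")

def has_children(kvs: dict, folder_path: str) -> bool:
    """True if any entry has folder_path as its parent."""
    prefix = folder_path.encode("utf-8") + SEPARATOR
    return any(key.startswith(prefix) for key in kvs)

def move_file(kvs: dict, src_parent: str, src_name: str,
              dst_parent: str, dst_name: str) -> None:
    """Rename or move a file by rewriting its one key."""
    kvs[entry_key(dst_parent, dst_name)] = kvs.pop(entry_key(src_parent, src_name))

def delete_folder(kvs: dict, parent_path: str, name: str) -> None:
    """Delete a folder entry, refusing if the folder is not empty."""
    folder_path = parent_path.rstrip("/") + "/" + name
    if has_children(kvs, folder_path):
        raise ValueError("folder is not empty")
    del kvs[entry_key(parent_path, name)]
```

A rename is move_file with the same parent and a new name; a move is move_file with a different parent. Checking direct children is enough for the emptiness test, since every sub folder has its own entry under its parent.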
  31. 3. Delete an empty folder