5. Space provisioning

The Sync Appliance computes user quotas in a way designed to simplify the administrator’s job: each user is assigned a single global quota, and no space is left unaccounted for. In other words, version history, snapshots, and similar data are internal to the user account, and the system strives to ensure that a user does not take up more space than the quota provisions.

There are, however, some elements to keep in mind when considering the storage needs.

5.1. Storage backend properties

Some properties of the storage backend cause the actual storage needs to differ from the precise space attributed to the users. Most of the time, these properties in aggregate tend to lower the space required, but this depends on the exact workload. It is possible to construct pathological workloads that exert artificial pressure on the storage backend.

5.1.1. Deduplication and transparent compression

File contents are deduplicated across different users, and compressed automatically by the storage backend when appropriate.

This means that in actual use, the required storage might be less than the sum of the users’ respective quotas.

Within a user account, as far as the quota usage is concerned:

  • multiple files with the same contents within the same project only count once
  • files across different projects with the same contents count once per project
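
The two counting rules above can be illustrated with a minimal sketch. It assumes that files are identified by a content hash and that quota usage is a simple sum of sizes; the function name `quota_usage` and the tuple layout are hypothetical and are not part of the Appliance.

```python
def quota_usage(files):
    """Sum file sizes, charging identical contents once per project.

    `files` is an iterable of (project, content_hash, size_bytes) tuples.
    This layout is an assumption made for illustration only.
    """
    charged = set()  # (project, content_hash) pairs already counted
    total = 0
    for project, content_hash, size_bytes in files:
        key = (project, content_hash)
        if key not in charged:
            charged.add(key)
            total += size_bytes
    return total


files = [
    ("alpha", "h1", 100),
    ("alpha", "h1", 100),  # same contents, same project: not charged again
    ("beta", "h1", 100),   # same contents, other project: charged once more
    ("beta", "h2", 50),
]
print(quota_usage(files))  # 250
```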

5.1.2. Metadata and alternate representations

Some files, such as images, are processed by the Appliance, and alternate representations of them – such as thumbnails or lower-resolution versions – might be saved. As of version 1.19.1, the space taken up by these is not counted against the users’ quotas.

Note

Such alternate representations are generated only for certain files, such as very large images, and take up a small fraction (usually around 5%) of the original file size.

5.1.3. Filesystem block size

The local storage backend stores each file as one or more blobs, each kept in a separate file on the host’s filesystem. As a result, the actual space taken by a file is a multiple of the filesystem block size (usually 4 KB), which means that files smaller than a block still take up at least one full block.

In actual usage with real-life workloads, the average file size is well over the block size (usually somewhere between 50 KB and 1 MB), so the aggregate lost space is a small fraction of the overall needs and is easily compensated for by deduplication and transparent compression.
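
The rounding effect can be estimated with a short calculation. The sketch below assumes a 4096-byte block size and one blob per file, and it ignores compression and filesystem metadata; `on_disk_size` and `overhead_ratio` are hypothetical helper names.

```python
BLOCK_SIZE = 4096  # assumed filesystem block size (4 KB)


def on_disk_size(size_bytes, block_size=BLOCK_SIZE):
    """Round a size up to the next multiple of the block size."""
    blocks = (size_bytes + block_size - 1) // block_size  # integer ceiling
    return blocks * block_size


def overhead_ratio(size_bytes, block_size=BLOCK_SIZE):
    """Fraction of extra space lost to block rounding for one non-empty file."""
    return (on_disk_size(size_bytes, block_size) - size_bytes) / size_bytes


print(on_disk_size(100))      # 4096: a tiny file still occupies a full block
print(on_disk_size(4097))     # 8192: one byte over a block costs two blocks
print(on_disk_size(200_000))  # 200704: only 704 bytes lost on a 200 KB file
```

Under these assumptions, the relative loss shrinks quickly as file size grows, which is why it matters little for typical workloads.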

5.2. User quotas

The Sync Appliance allows two kinds of users:

  • regular users
  • guest users with restricted functionality

Regular users can perform ad-hoc project collaboration including file syncing with external colleagues by inviting them as guest users. Guest users are subject to a number of restrictions – refer to Guest user restrictions – so as to ensure they have no significant effect on the storage needs.

5.2.1. Quota attribution

For regular users, the space taken up by a project is shared amongst all the users with access to it.

Quota usage is computed differently for guest users:

  • only the space needed by their default Æoncase project counts against their quota
  • the space taken by other projects they are given access to counts against the quota of all the regular users that participate in them
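
The attribution rules can be sketched as follows. This is an illustration under a stated assumption, not the Appliance's implementation: it assumes that every regular user with access to a project is charged the project's full size, which is only one reading of "shared". If the cost is split between participants instead, the per-user figures would differ. The function name and the data layout are hypothetical.

```python
def usage_by_user(projects, guests):
    """Compute the space charged to each user.

    `projects` maps a project name to (size_bytes, set of member names).
    `guests` maps a guest user's name to the name of their default project.
    Assumption: each regular member is charged the project's full size.
    """
    usage = {}
    for project, (size_bytes, members) in projects.items():
        for user in members:
            usage.setdefault(user, 0)
            if user in guests:
                # Guests are charged only for their own default project.
                if guests[user] == project:
                    usage[user] += size_bytes
            else:
                # Regular users are charged for every project they can access.
                usage[user] += size_bytes
    return usage


projects = {
    "design": (500, {"alice", "bob", "gina"}),
    "gina-default": (20, {"gina"}),
}
guests = {"gina": "gina-default"}
usage = usage_by_user(projects, guests)
print(usage["alice"], usage["bob"], usage["gina"])  # 500 500 20
```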

5.3. Space recovery

Users can recover space by purging:

  • projects
  • a number of files including all their past revisions
  • a single file revision

When there are no references left to a particular data blob, it is marked for garbage collection (GC). An internal service in the Appliance periodically removes data that is no longer needed, freeing space in the storage backend.
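
The mark-then-collect behaviour described above can be pictured with a small reference-counting sketch. The class `BlobStore` and its methods are hypothetical and make no claim about how the Appliance tracks references internally; the sketch only shows why purging a revision does not free space immediately, and why a blob shared by several revisions survives until the last one is purged.

```python
class BlobStore:
    """Toy model: count references per blob and collect unreferenced blobs later."""

    def __init__(self):
        self.refs = {}           # blob id -> number of references
        self.pending_gc = set()  # blobs marked for garbage collection

    def add_ref(self, blob_id):
        self.refs[blob_id] = self.refs.get(blob_id, 0) + 1
        self.pending_gc.discard(blob_id)  # referenced again before collection

    def drop_ref(self, blob_id):
        self.refs[blob_id] -= 1
        if self.refs[blob_id] == 0:
            self.pending_gc.add(blob_id)  # marked, but not yet removed

    def collect(self):
        """Periodic pass: remove marked blobs and return their ids."""
        freed = sorted(self.pending_gc)
        for blob_id in freed:
            del self.refs[blob_id]
        self.pending_gc.clear()
        return freed


store = BlobStore()
store.add_ref("b1")     # first revision uses blob b1
store.add_ref("b1")     # a second revision shares the same blob
store.add_ref("b2")
store.drop_ref("b1")    # one revision purged: b1 is still referenced
store.drop_ref("b2")    # last reference gone: b2 is marked for GC
print(store.collect())  # ['b2']
```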