Hostwinds Tutorials


Table of Contents


Why is my Object Storage larger than my data?
In summary:
Why is Object Storage Smaller Than My Data?
Related Web Hosting Tutorials:

Why is my Object Storage bigger (or smaller) than my data?

Tags: Cloud Servers,  Web Hosting 

If you have the Hostwinds Cloud Backup service, within a few days you will probably notice that the Object Storage in use is a different size than the data being backed up.

Why is my Object Storage larger than my data?

For Shared and Business hosting, this is easily explained: each daily backup is a copy of the entire cPanel account. If you have a Shared account with 100MB of website/email/database data, your Cloud Backups will grow by 100MB each day until your retention limit is hit, and will then stay at 100MB x the number of days stored. You can reduce the number of days stored if you would rather pay less for storage and don't think you'll need as many backups.
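As a rough illustration of that arithmetic, here is a minimal sketch. The function name and the 30-day retention value are assumptions made for the example, not Hostwinds settings, and it assumes the account size stays constant.

```shell
# Estimate the Object Storage used by full daily backups of a cPanel account.
# usage: shared_backup_size_mb ACCOUNT_MB DAYS_ELAPSED RETENTION_DAYS
shared_backup_size_mb() {
    account_mb=$1
    days_elapsed=$2
    retention_days=$3
    # Only the most recent RETENTION_DAYS backups are kept.
    if [ "$days_elapsed" -gt "$retention_days" ]; then
        stored=$retention_days
    else
        stored=$days_elapsed
    fi
    echo $((account_mb * stored))
}

shared_backup_size_mb 100 7 30    # prints 700
shared_backup_size_mb 100 45 30   # prints 3000
```

Once the retention limit is reached, the total stops growing unless the account itself grows.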

For a VPS or Dedicated Server, however, the answer is a little more complicated. The backup software (restic) doesn't make a complete copy of the server each day, but its backups are also not the traditional "full backup weekly, incremental backup daily" system you might be familiar with. Restic takes a 'snapshot' each day but only stores de-duplicated data. Once the oldest snapshot is older than the retention period (the default is 60 days, which is used as the example for the rest of this article), restic removes and purges that snapshot. This is not the same as deleting the 'oldest full backup'; it only throws away the record of how the files looked more than 60 days ago.

For example, if you have "today.txt" that is automatically updated with today's date each day, restic will have 60 versions of it stored. When the oldest snapshot is removed, the oldest version is thrown away, but you can still restore the file as it was in any snapshot from the last 60 days. Likewise, if you have "start.txt" that records the server's date and never changes, it is always kept, its data is stored only once, and restoring it from any snapshot will give the same data.
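The idea can be sketched with a toy content-addressed store, in which each distinct file content is saved once under its SHA-256 hash. This is only an illustration of de-duplication, not restic's actual repository format; the function name, dates, and temporary directory are made up for the example.

```shell
# Toy content-addressed store: identical content is only ever stored once.
store_file() {
    store_dir=$1
    file=$2
    hash=$(sha256sum "$file" | cut -d' ' -f1)
    # Copy only if this exact content has not been stored before.
    [ -e "$store_dir/$hash" ] || cp "$file" "$store_dir/$hash"
}

work=$(mktemp -d)
mkdir "$work/store"
echo "server started" > "$work/start.txt"

# Simulate three daily snapshots: today.txt changes, start.txt does not.
for day in 2021-06-01 2021-06-02 2021-06-03; do
    echo "$day" > "$work/today.txt"
    store_file "$work/store" "$work/start.txt"
    store_file "$work/store" "$work/today.txt"
done

# Three versions of today.txt plus a single copy of start.txt.
stored_count=$(ls "$work/store" | wc -l)
echo "objects stored: $stored_count"
```

Three snapshots of two files produce four stored objects rather than six, because the unchanged file is never stored twice.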

If you have a large database of products that's not updated often, it won't contribute much more to the backups than the size of the database. On the other hand, a database of users, forum posts, etc., that changes every day, hour, or minute will contribute greatly to the size of a restic backup in Object Storage, even if the overall database size doesn't grow quickly.

Let's take a look at a real server. These examples are for Linux, but the ideas are the same for Windows. One big difference is that the Windows backup takes several snapshots per day, one for each directory in C:\, so on Windows pay attention to the dates of the snapshots and not the total number of them.

Here we have a fresh Linux VPS with 1.5GB used in the storage:

After taking the first backup, Object Storage shows about the same 1.5GB:

What happens if we add about 1.1GB of data and run a new backup?

Don't worry about the OpenSSL command. It's just an easy way to generate a random file we can easily edit later.
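The exact command from the walkthrough isn't reproduced here, but a file like this can be generated along the following lines. This is a sketch: the function name, byte count, and path are assumptions chosen to give roughly 1.1GB of text in 64-character lines.

```shell
# Write NUM_BYTES of random data, base64-encoded, to FILE.
# usage: make_sample_file FILE NUM_BYTES
# The base64 output is plain text in 64-character lines, so it is easy
# to edit with ordinary text tools later.
make_sample_file() {
    openssl rand -base64 "$2" > "$1"
}

# 805306368 random bytes (768MiB) encode to roughly 1.1GB of text,
# about 16.8 million lines. The path is only an example:
# make_sample_file /root/sample.txt 805306368
```

Random data is used because it cannot be compressed or de-duplicated, so the growth in the backup reflects the full size of the file.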

The object storage has grown by about 1.1GB:

Let's make a simple edit to the file, replacing some of the text at the beginning (but not changing the file size):
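One way to make such an edit without changing the file's length is to overwrite the first few bytes in place. The function name, path, and replacement text below are hypothetical.

```shell
# Overwrite the first bytes of FILE with TEXT, leaving the rest untouched.
# usage: overwrite_start FILE TEXT
# conv=notrunc stops dd from truncating the file after the written bytes.
overwrite_start() {
    printf '%s' "$2" | dd of="$1" conv=notrunc 2>/dev/null
}

# Example (the path is only an illustration):
# overwrite_start /root/sample.txt "EDITED"
```

The file keeps its size as long as the replacement text is shorter than the file itself.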

A new backup doesn't take up much more space because we only made one small change. Restic breaks files into 'blobs' between 512KB and 8MB, so it only has to store one more 'blob' for this difference.

A more complicated edit, replacing every 'QQ' in the file with 'zz', will cause a lot more new blobs to be stored, however:

The file is still the same size, but the backup size has grown significantly.

This changed about 250,000 of the 16 million lines in the file. Even a 1.5% change, when it is spread throughout the whole file, touches nearly every blob, so restic has to store almost the entire file again (about 1.1GB of new data in the summary below).
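A replacement like this can be sketched as follows; the function names are made up for the example. If the file is base64 text in 64-character lines (an assumption about how it was generated), each line has 63 adjacent character pairs, each matching 'QQ' with probability 1/64², so roughly 63/4096, or about 1.5%, of lines are expected to contain a match.

```shell
# Count the lines of FILE that contain the fixed string PATTERN.
# usage: count_matching_lines FILE PATTERN
count_matching_lines() {
    grep -cF -- "$2" "$1"
}

# Replace every "QQ" with "zz" in FILE. Both strings are two bytes long,
# so the file size does not change.
replace_qq() {
    sed 's/QQ/zz/g' "$1" > "$1.new" && mv "$1.new" "$1"
}

# Example (the path is only an illustration):
# count_matching_lines /root/sample.txt QQ
# replace_qq /root/sample.txt
```

Because the matches are scattered evenly, very few blobs escape unchanged, even though only a small fraction of the bytes differ.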

And, of course, deleting the file frees up a lot of space on the drive.

But a fresh backup doesn't shrink the Object Storage size, because the earlier snapshots still reference the deleted file. After all, one of the big reasons to have backups is to recover from accidental (or malicious) deletion of data.

We can manually 'forget' a snapshot and 'prune' the data associated with it. For example, this is a snapshot that had one of the versions of the 1.1GB file.
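With the restic command line, this is done with `restic forget` followed by `restic prune`. The sketch below assumes the repository location and password are already available to restic (for example through the RESTIC_REPOSITORY and RESTIC_PASSWORD environment variables). The function name, the RESTIC override variable, and the snapshot ID are placeholders for illustration.

```shell
# Remove one snapshot, then delete any data that no remaining snapshot uses.
# usage: forget_and_prune SNAPSHOT_ID
# RESTIC can be set to point at a different restic binary (hypothetical
# convenience for this example).
forget_and_prune() {
    "${RESTIC:-restic}" forget "$1"
    "${RESTIC:-restic}" prune
}

# List the snapshots to find the ID, then remove one.
# The ID below is a placeholder:
# restic snapshots
# forget_and_prune 1a2b3c4d
```

Forgetting a snapshot only removes its record; the space is reclaimed when prune deletes the blobs that are no longer referenced.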

And the backup storage size shrinks appropriately:

The Hostwinds Cloud Backup scripts automatically 'forget' and 'prune' each time they run, keeping one snapshot per day for the number of days specified in /root/.restic_var or C:\Windows\System32\restic_repo.ps1.
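A retention policy of this kind can be expressed with restic's `--keep-daily` option. The following is a sketch of the general technique, not the contents of the Hostwinds scripts; the function name and the RESTIC override variable are assumptions.

```shell
# Keep the most recent snapshot for each of the last DAYS days that have
# snapshots, forget the rest, and prune the data they no longer need.
# usage: apply_retention DAYS
apply_retention() {
    "${RESTIC:-restic}" forget --keep-daily "$1" --prune
}

# Example using the 60-day default mentioned in this article:
# apply_retention 60
```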

In summary:

| Action | VPS storage size | Object Storage size |
|---|---|---|
| Initial | 1.5GB | 1.421GB |
| 1.1GB file generated | 2.6GB | 2.512GB |
| Single line changed | 2.6GB | 2.513GB |
| "QQ" -> "zz" | 2.6GB | 3.604GB |
| 1.1GB file deleted | 1.5GB | 3.604GB |
| Snapshot deleted | 1.5GB | 2.513GB |

A single small change adds almost nothing to the backup, but many small changes scattered through a file, and of course any big change, greatly increase the amount stored.

Why is Object Storage Smaller Than My Data?

There are occasions where the storage used on the disk may be larger than the backup data. Our backup scripts automatically exclude directories like /tmp and /var/tmp in Linux and the Recycle Bin in Windows. If you 'delete' a file in Windows, it goes to the Recycle Bin, and you then don't empty the Recycle Bin for 60 days, the file still takes up disk space, but every snapshot that contained it has expired. As a result, your Object Storage may be smaller than the space used on the C:\ drive.

I've placed a 260MB version of sample.txt in /tmp in Linux, then run a backup:

Simply put, the backup is smaller than the space used because not all directories are backed up.

The excluded directories in Linux are:

/dev,/media,/mnt,/proc,/run,/sys,/tmp,/var/tmp,/var/log,/backup,/home/virtfs
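With restic, directories are typically skipped using the `--exclude` option. The sketch below turns the comma-separated list above into those options; it illustrates the technique and is not the actual Hostwinds backup script. The function name and the choice of `/` as the backup target are assumptions.

```shell
# Turn a comma-separated directory list into restic --exclude options.
# usage: exclude_args "DIR1,DIR2,..."
exclude_args() {
    args=""
    old_ifs=$IFS
    IFS=,
    for dir in $1; do
        args="$args --exclude $dir"
    done
    IFS=$old_ifs
    # Drop the leading space before printing.
    echo "${args# }"
}

EXCLUDES="/dev,/media,/mnt,/proc,/run,/sys,/tmp,/var/tmp,/var/log,/backup,/home/virtfs"

# Print the backup command that would result, without running it.
echo "restic backup / $(exclude_args "$EXCLUDES")"
```

Anything under an excluded path, such as the 260MB file in /tmp, counts toward disk usage but never reaches Object Storage.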

And in Windows, restic backs up non-hidden directories that are 'ClientAccessable', so directories like c:\$Recycler and files like c:\pagefile.sys don't get backed up.

Hopefully, this helps explain the discrepancies between the size of your data and the size of your backups.

Related Web Hosting Tutorials:

Written by Hostwinds Team  /  June 5, 2021