Technical Note - Disk Quotas

Summary

Student and guest ECS accounts have a disk quota associated with them. This document outlines what the quotas are, what happens when you go over your quota and how to keep your usage below your quota.

Details

Quota Sizes

The current (as of mid-September 2023) default disk quotas on ECS systems are:
              1st Year   2nd Year   3rd Year   4th Year+   Thesis   Guests
Disk Quota    1.5 GB     3 GB       6 GB       12 GB       24 GB    8 GB

Although staff accounts have unlimited disk quota, we do monitor usage, and large increases may result in a polite email asking if the increase was intentional and/or avoidable.

Exceeding Your Quota

When you have exceeded your disk quota you will receive an email each night telling you how far over your quota you are. You should reduce your usage to below your quota within a few days; otherwise administrative action, such as temporary suspension of your login account, may be taken. If that happens, to get it reactivated you will need to see a member of the school's system administration team to explain your disk usage and why you didn't reduce it.

Disk Quota Exceeded Unexpectedly

You may be surprised to receive an email saying you have exceeded your disk quota before you have started using your ECS login in a particular trimester or academic year. The two most likely explanations for this are:

  • Your account was over its disk quota when you last used it. When your account became suspended we would have stopped sending you warning emails, since you can't do anything about it when you can't log on to our systems. But once it was reactivated our disk quota system would start complaining to you again.
  • Your account now has a different type with a lower disk quota. For example, you were previously enrolled in mostly 2nd year courses but with a single 3rd year course, and you are now only enrolled in 2nd year courses.

A similar situation to the latter could arise within a trimester if you withdraw from all higher level courses resulting in a reduced quota. Then you may find yourself exceeding your quota even though you didn't increase your usage.

Requesting a Larger Quota

In general requests for larger disk quota from students enrolled in courses at 300-level or below will not be granted. If we feel that the current quota limits are unreasonably impacting on those students' use of our systems we will increase the quota for all students.

For students doing 400-level projects or theses, if your work requires a larger disk quota you can ask your supervisor to request this. Similarly, guest users should ask the school staff member who arranged for them to have an ECS account. The request should be made by emailing jobs@ecs.vuw.ac.nz. The school system administrators may wish to talk to you before increasing your quota to ensure that you are using disk resources in a "sensible" way and also to make sure that there are not alternative options (for example, see the following section describing /local/scratch). But if there is a genuine need for a larger quota there should be no problem granting it.

Using /local/scratch on ArchLinux Workstations

All of the ECS ArchLinux workstations have a large area (typically 200GB) of temporary ("scratch") disk available for use by the person using that workstation. Files stored in /local/scratch do not count towards your disk quota so storing files there may be an alternative to requesting a larger disk quota. There are some advantages to doing this, but there are also disadvantages that you should be aware of.

The main advantage is the large amount of free space that is typically available. Also the disk is local to the workstation so accessing it is much (probably several orders of magnitude) faster than a network volume such as the one your home file system is stored on.

Conversely, because the disk is local to each workstation it is only available to programs running on that workstation. That means that if you don't mostly use the same workstation /local/scratch may not be very convenient. Also, be aware that the contents of /local/scratch on our lab workstations may be removed at any time without notice. We would usually check with the owners of files in /local/scratch on workstations in staff or graduate offices before doing anything with them.

A related issue is that even if you do always use the same workstation, you may have data that needs to be processed by a program that is only available on another computer, so /local/scratch may still not be the best option.

A final disadvantage of /local/scratch is that, unlike the home file systems kept on our file servers, the contents are not backed up each night. So if a workstation disk were to suffer a hardware fault, or you accidentally removed files you wanted, you would be unable to recover your data. You should therefore only use /local/scratch for data that you don't mind losing, or that you can easily recreate. Some examples of appropriate /local/scratch usage include:
  • Large read-only data sets, collections of documents or source code archives that can easily be downloaded from the Internet.
  • Large amounts of output from a program that you can easily/quickly run again if you need to reproduce the same output.
  • Uncompressed copies of files that compress well, so you can keep the much smaller compressed copies safely in your home directory but use uncompressed copies in /local/scratch for easy access/analysis.
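
The last example above could be sketched as follows. This is only an illustration: the function name unpack_to_scratch is made up, it assumes the file was compressed with gzip, and the per-user directory under /local/scratch shown in the usage note is an assumption rather than an ECS convention.

```shell
# Hypothetical helper: unpack a gzip-compressed file into a scratch
# directory, leaving the compressed copy where it is.
# Usage: unpack_to_scratch <file.gz> <scratch-directory>
unpack_to_scratch() {
    src=$1
    dest_dir=$2
    # Create the destination if it does not exist yet.
    mkdir -p "$dest_dir" || return 1
    # gzip -dc writes the uncompressed data to standard output,
    # so the original .gz file is not modified or removed.
    gzip -dc "$src" > "$dest_dir/$(basename "$src" .gz)"
}
```

For instance, unpack_to_scratch ~/data/results.csv.gz /local/scratch/$USER (both paths are placeholders) would leave the small compressed file in your home directory and put the large uncompressed copy on the local disk.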

Tips On Reducing Your Disk Usage

Note that files you have moved to your Wastebin/Trash still count towards your usage; only when you right click on the Wastebin or Trash icon on your desktop and select the "Empty" option are the files deleted. So remember to empty your Trash periodically.

Programs that can help you find where you are using most of your disk quota on our ArchLinux based workstations include the GUI based qdirstat and the command line program du.

The easiest to use is qdirstat, available in the System section of the K menu. It shows you visually how much space your files and directories are using and also provides a convenient user interface that allows you to delete selected files/directories. On the Windows servers (somes and ward) and our Windows 10 Desktop machines the equivalent program is WinDirStat.

Note that one issue with (at least the UNIX version of) this tool is that it counts the exact number of bytes that each file uses rather than the number of disk blocks (which on our systems are 4KB or 4096 bytes). So a file that is 4097 bytes in size is actually taking up 8192 bytes (2 x 4096) on disk. Thus the overall total reported by qdirstat may be an underestimate of your total usage. For this reason, or simply because you prefer command line tools to the graphical interface of qdirstat, from a shell window you can type du -d1 -m in any of your directories and you will see the disk usage (in megabytes, rounded up) of each subdirectory directly within it, followed by the total for the directory itself. You can consult the du man page for more details.
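
The block rounding described above can be worked through with a little shell arithmetic. This is a sketch only: the function name on_disk_bytes is made up for illustration, and it assumes the 4096-byte block size mentioned in the text.

```shell
# Hypothetical helper: round a file size in bytes up to whole disk blocks.
# The block size defaults to the 4096 bytes quoted above.
on_disk_bytes() {
    size=$1
    block=${2:-4096}
    # Integer ceiling division gives the number of blocks;
    # multiplying by the block size converts back to bytes.
    echo $(( (size + block - 1) / block * block ))
}

on_disk_bytes 4097   # prints 8192
on_disk_bytes 4096   # prints 4096
```

The difference matters most when you have very many small files, since each one can waste up to almost a full block.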

If you omit the -d1 option the command lists the total size of every directory under the current directory, recursively. This will produce much more output but it provides information on every directory in an entire directory tree rather than just the summary of the top-level directories.

Running du as above may reveal files or directories that start with "." that use large amounts of disk space. Such files/directories are usually hidden from you by the normal file browsing/directory listing tools because they contain system configuration information that you normally wouldn't want to see. You should be cautious about removing hidden files or any files contained in a hidden directory unless you understand their purpose. Without them parts of the system may not work correctly. One exception to this rule on our KDE systems is .local/share/Trash, which is where files you have "deleted" via the Dolphin file manager are put. As long as you are sure you won't want to recover previously deleted files it should be safe to "empty your trash" as described above or by using the rm command line tool.
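
Before emptying the trash you may want to check how much space it is actually using. A minimal sketch, assuming the .local/share/Trash location named above; the function name trash_usage is made up for illustration, and it only reports usage without deleting anything.

```shell
# Hypothetical helper: report the disk usage of the trash directory in
# megabytes. An alternative location can be given as the first argument.
trash_usage() {
    trash_dir=${1:-$HOME/.local/share/Trash}
    if [ -d "$trash_dir" ]; then
        # -s prints a single total, -m reports it in megabytes.
        du -s -m "$trash_dir"
    else
        echo "No trash directory found at $trash_dir"
    fi
}
```

Running trash_usage with no arguments checks the default location in your home directory.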

When using du to see where your usage is, you would typically run it first in your top-level directory. You could run du -d1 -m | sort -nr | head to get a list of your top 10 disk usage consumers. You would then change directory into each of the listed directories that you wanted to investigate further and repeat the process until you have found why that directory is so large.
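
The drill-down process above can be wrapped in a small shell function so that you don't have to change directory each time. This is a sketch, not an ECS-provided tool: the name top_usage and its arguments are made up for illustration.

```shell
# Hypothetical helper: list the largest entries one level below a directory.
# Usage: top_usage [directory] [count]
top_usage() {
    dir=${1:-.}
    count=${2:-10}
    # -d1 limits the report to one level and -m reports megabytes;
    # sort -nr puts the biggest first and head keeps the top few.
    # Errors from unreadable directories are discarded.
    du -d1 -m "$dir" 2>/dev/null | sort -nr | head -n "$count"
}
```

You might start with top_usage ~ and then run it again on whichever subdirectory looks suspicious, for example top_usage ~/Documents 5 (a placeholder path). Note that one of the top lines is the total for the directory you asked about, so the number of subdirectories shown is one fewer than the count.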

Tips for users of Eclipse

Eclipse is a disk hog. Here are some things to do to reduce its usage.

  1. Don't use multiple workspaces - rather use multiple projects within the one workspace. There is a whole lot of metadata overhead per workspace and no particular gain.
  2. Occasionally run eclipse -clean. This cleans up a bunch of old cached info at the expense of a slightly longer startup time while it's doing it.
  3. Remove old profiles. Eclipse writes multiple profiles into
    ~/.eclipse/org.eclipse.platform_*/p2/org.eclipse.equinox.p2.engine/profileRegistry/
    and doesn't delete old ones. You only need the most recent file in that directory - delete the others (this can get lots of space back).
  4. Reduce the values set in Preferences > General > Workspace > Local History. Once there, you can find three options: days to keep files, maximum entries per file and maximum file size. With the default settings you can end up keeping many copies of every file you edit, when typically you aren't going to reference any but the most recent backups.
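
For tip 3, it can help to see which files are candidates for removal before deleting anything. A minimal sketch: the function name list_old_entries is made up for illustration, it assumes file names without newlines, and it only prints names rather than removing files.

```shell
# Hypothetical helper: print every entry in a directory except the most
# recently modified one. Nothing is deleted.
# Usage: list_old_entries <directory>
list_old_entries() {
    dir=$1
    # ls -t sorts newest first; tail -n +2 skips that newest entry.
    ls -t "$dir" | tail -n +2
}
```

Point it at the directory that holds the old profiles and check the output carefully; only once you are happy with the list should you remove those files yourself with rm.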