CRC Storage Resources


The following table gives an overview of the available file system resources on the CRC Linux Clusters.

For each file system the table lists: Purpose; Segment of the Linux cluster; File system type and full name; How the user should access the files; Space available; Approx. aggregated bandwidth; Backup; Lifetime and deletion strategy; Remarks.

 Globally accessible Home and Project Directories

User's Home Directories
  • Segment of the Linux cluster: all
  • File system type and full name: AFS - crc.nd.edu, /afs/crc.nd.edu/user/first/netid
  • Access: $HOME
  • Space available: 100 GB - 2 TB volume
  • Approx. aggregated bandwidth: up to 70 - 85 MB/sec per node; approximately 200 MB/sec aggregated using multiple nodes
  • Backup: in general, daily backup by CRC; currently a one-year retention limit; online backup volumes (YESTRDAY) can be easily accessed within a day
  • Lifetime and deletion strategy: expiration of affiliation with Notre Dame
  • Remarks: AFS quotas apply. Best performance is obtained by running jobs on multiple nodes in separate directories, due to AFS callbacks. Refer to the CRC storage policy for the Tier 1, 2, 3 allocation policy.

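Group Directories
  • Segment of the Linux cluster: all
  • File system type and full name: AFS - crc.nd.edu, /afs/crc.nd.edu/group/
  • Access: directly
  • Space available: 100 GB - 10 TB volume
  • Approx. aggregated bandwidth: up to 70 - 85 MB/sec
  • Backup: in general, daily backup by CRC; currently a one-year retention limit; online backup volumes (YESTRDAY) can be easily accessed within a day
  • Lifetime and deletion strategy: expiration of affiliation with Notre Dame
  • Remarks: AFS quotas apply. Refer to the CRC storage policy for the Tier 1, 2, 3 allocation policy.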

 Pseudo-temporary File Systems
Please use the scratch area that is most appropriate for the system you work on

Panasas High Performance Parallel scratch (/pscratch file system); departmental volume available on request
  • Segment of the Linux cluster: all
  • File system type and full name: /pscratch/netid
  • Access: directly, using either the NFS gateway (panasas-nfs) or the proprietary Panasas panfs client
  • Space available: 500 GB - 10 TB volume
  • Approx. aggregated bandwidth: 60 - 70 MB/sec per node; approximately 3000 MB/sec aggregated using multiple nodes (50)
  • Backup: none
  • Lifetime and deletion strategy: currently, deletion by request when the filesystem hits the high watermark; in the future, sliding window file deletion. No guarantee for data integrity.
  • Remarks: refer to the CRC storage policy for the Tier 1, 2, 3 allocation policy. The focus is on performance, not reliability.

Local File Systems

Node-local temporary /scratch filesystem
  • Segment of the Linux cluster: all
  • File system type and full name: local disks; /scratch is a link to /tmp
  • Access: directly; shared with other users on the node
  • Space available: X2100 - dcopt, SC1435 - Maginn: 40 GB; X2200 - ddcopt: 100 GB; HP DL160 - d6copt: 100 GB
  • Approx. aggregated bandwidth: SC1435 - Maginn: 30 - 40 MB/sec; Sun X2200 - ddcopt: 100 - 120 MB/sec; HP DL160 - d6copt: 50 - 60 MB/sec
  • Backup: none
  • Lifetime and deletion strategy: on batch nodes, files are deleted if older than 4 weeks; users are encouraged to remove files at the end of a run; /tmp is cleared at reboot. On login nodes, files are removed if older than 4 weeks.
  • Remarks: users may find contention with inconsiderate users





CRC User AFS file system and quotas

The quota for $HOME can be checked with the command "quota". (fs listquota also works, but may report negative values due to a 32-bit wrap on the file servers.) Jobs may die due to insufficient space if the quota is exceeded. Note that writing a single file can be enough to exceed the quota.
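A negative figure from fs listquota can be turned back into a plausible value with a little arithmetic. The sketch below assumes the wrap is a plain signed 32-bit overflow and that the true value is below 2^32; neither is stated explicitly above, and the function name unwrap32 is made up for this illustration.

```shell
# Hypothetical helper: undo a signed 32-bit wrap on a reported value.
# Assumption: the true value is below 2^32, so adding 2^32 to a
# negative reading recovers it. Non-negative readings pass through.
unwrap32() {
    if [ "$1" -lt 0 ]; then
        echo $(( $1 + 4294967296 ))
    else
        echo "$1"
    fi
}

# Example: a reading of -2000000000 corresponds to 2294967296
unwrap32 -2000000000
```

When in doubt, the output of "quota" remains the safer source.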



/pscratch file system and quotas
The quota for /pscratch/netid can be obtained by emailing a request to CRCsupport@nd.edu.
CRC has requested user support for this functionality. Users are sent email at their netid@nd.edu address when their "soft" quota has been reached (450 GB by default). Writing can continue until the "hard" quota is reached (500 GB by default). Email is sent again when the hard quota is reached; at that point jobs may die due to insufficient space.
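The soft and hard limits can be pictured as two thresholds. The following is a minimal sketch using the default values quoted above; the function name pscratch_state is hypothetical, and the du-based measurement in the comment is only an approximation of whatever accounting the file system itself uses.

```shell
# Hypothetical helper: classify usage against the /pscratch limits.
# Arguments: used space in GB, soft limit in GB (default 450),
# hard limit in GB (default 500). Whole numbers only.
pscratch_state() {
    used=$1
    soft=${2:-450}
    hard=${3:-500}
    if [ "$used" -ge "$hard" ]; then
        echo "hard quota reached"
    elif [ "$used" -ge "$soft" ]; then
        echo "soft quota reached"
    else
        echo "ok"
    fi
}

# Possible use on the cluster (GNU du; slow on large trees):
#   used_gb=$(du -s --block-size=1G /pscratch/netid | cut -f1)
#   pscratch_state "$used_gb"
pscratch_state 470
```

A volume with non-default limits can be checked by passing the limits explicitly, for example pscratch_state 470 900 1000.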






To prevent overflow of the large-scale storage areas, the CRC may implement various deletion strategies in the future.
Currently the CRC uses a manual high watermark deletion method; in the future it might use sliding window
file deletion. Please note that users will be notified in advance of possible changes.



File deletion and data integrity issues
  • For a given file or directory, the exact time of deletion is unpredictable.
  • The normal tar -x command preserves the modification time of the original file instead of using the time when the archive is unpacked, so unpacked files may become some of the first candidates for deletion. Use tar -mx if required, or run touch on a single file, or run
     
    find mydir -exec touch {} \;
    on a directory tree mydir.
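The effect of tar -m described in the list above can be tried out in a throwaway directory. This is a sketch only; the file and directory names are placeholders, and it makes no claim about how the CRC deletion scripts actually inspect timestamps.

```shell
# Work in a temporary directory so nothing real is touched
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a small archive containing a file with an old modification time
mkdir mydir
echo data > mydir/result.dat
touch -t 200001010000 mydir/result.dat
tar -cf archive.tar mydir
rm -r mydir

# Reference timestamp taken just before unpacking
touch stamp
sleep 1

# -m: set the modification time to the time of extraction
tar -xmf archive.tar

# Files that are newer than the reference stamp
fresh=$(find mydir -type f -newer stamp)
echo "$fresh"
```

Without the m flag the unpacked file would keep its year-2000 timestamp and would not show up in the final listing.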
Because of the deletion strategies described in the subsections below, and because CRC cannot guarantee the same level of data integrity for the high performance file system as for $HOME, the CRC urges you to copy, transfer, or archive your files from the pseudo-temporary disks and scratch areas to a safe home and/or group directory area.
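One way to follow this advice is to pack a finished run into a single archive in a backed-up area. The sketch below is illustrative only: save_results and run42 are made-up names, and a plain cp -r or rsync would serve equally well.

```shell
# Hypothetical helper: pack a scratch directory into a compressed
# archive in a safe location and check that the archive is readable.
# Arguments: source directory, destination directory.
# Prints the path of the archive on success.
save_results() {
    src=$1
    dest=$2
    archive="$dest/$(basename "$src").tar.gz"
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    tar -tzf "$archive" > /dev/null || return 1
    echo "$archive"
}

# Possible use on the cluster (run42 is a placeholder):
#   save_results /pscratch/netid/run42 "$HOME"
```

Keep the AFS quota in mind when choosing the destination, since a large archive counts against it.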
High watermark deletion

When the fill level of the file system exceeds a limit (typically between 70% and 75%), files are deleted, starting with the oldest and largest files, until the fill level drops to between 60% and 75%. The precise values may vary.
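To get a feeling for which of your own files would be affected first, they can be listed by age and size. How age and size are weighed against each other is not specified above; the sketch below simply assumes oldest first, with size breaking ties, and the function name is hypothetical.

```shell
# Hypothetical helper: list files under a directory, oldest first;
# files with the same modification time are listed largest first.
# Output columns: mtime (seconds since epoch), size in bytes, path.
# Requires GNU find for -printf.
deletion_candidates() {
    find "$1" -type f -printf '%T@ %s %p\n' | LC_ALL=C sort -k1,1n -k2,2nr
}

# Possible use on the cluster:
#   deletion_candidates /pscratch/netid | head -n 20
```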

Sliding window file deletion

Files and directories older than a set age, typically 180 days, are removed from the disk area; the interval may be shortened if the fill-up rate becomes very high. This deletion mechanism is invoked once a day.
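Files that would fall outside such a window can be found ahead of time with find. The sketch assumes that age is judged by modification time, which the tar note earlier on this page suggests but does not confirm; stale_files is a made-up name.

```shell
# Hypothetical helper: list files whose modification time is more
# than N days in the past (default 180). Note that find rounds ages
# down to whole days, so +180 matches files at least 181 days old.
stale_files() {
    find "$1" -type f -mtime +"${2:-180}"
}

# Possible use on the cluster:
#   stale_files /pscratch/netid
#   stale_files /pscratch/netid 90    # if the window is shortened
```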