ssh bracken@login.rc.colorado.edu
Your password is: PPPPDDDDDD (PPPP = your PIN, DDDDDD = OTP number)
https://www.rc.colorado.edu/services/storage/filesystemstorage
~ - limited storage but snapshotted regularly, so you can use this for code
/projects/$USER - 256 GB of storage per user, snapshotted regularly, use this for storing data
/lustre/janus_scratch/$USER - intended for parallel IO from jobs, not for long-term storage; has some usage restrictions
Add the following line to ~/.my.bash_profile or type it at the prompt:
module load slurm
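A sketch of adding that line to the profile so it takes effect at every login, assuming ~/.my.bash_profile is the file RC sources as described above; the grep guard just keeps repeated runs from duplicating the line:

```shell
# Append 'module load slurm' to ~/.my.bash_profile, but only if it is
# not already there, so running this twice does not duplicate the line.
profile="$HOME/.my.bash_profile"
grep -qxF 'module load slurm' "$profile" 2>/dev/null ||
  echo 'module load slurm' >> "$profile"
```

If the file does not exist yet, grep fails quietly and the echo creates it.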
Make a work directory
mkdir testing
cd testing
Test job script. Copy all of the following into a file named testjob_submit.sh:
#!/bin/bash
# Lines starting with #SBATCH are treated by bash as comments, but interpreted by slurm
# as arguments.
#
# Set the name of the job
#SBATCH -J test_job
#
# Set a walltime for the job. The time format is HH:MM:SS - In this case we run for 5 minutes.
#SBATCH --time=0:05:00
#
# Select one node
#
#SBATCH -N 1
# Select one task per node (similar to one processor per node)
#SBATCH --ntasks-per-node 1
# Set output file name with job number
#SBATCH -o testjob-%j.out
# Use the janus-debug QOS
#SBATCH --qos=janus-debug
# The following commands will be executed when this script is run.
echo The job has begun
echo Wait one minute...
sleep 60
echo Wait a second minute...
sleep 60
echo Wait a third minute...
sleep 60
echo Enough waiting. Job completed.
# End of example job shell script
Submit the job:
sbatch testjob_submit.sh
Check on the job (the sample output below happens to be from a different, two-node job):
squeue -u $USER
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
183359 janus get_clus bracken R 0:04 2 node[0433-0434]
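squeue compacts the node list (node[0433-0434] above). On the cluster, `scontrol show hostnames` expands such lists properly; for the simple single-range form shown here, a plain-bash sketch of what that expansion does:

```shell
# Expand a simple Slurm node range like node[0433-0434] into individual
# hostnames. Handles one prefix and one range only; real node lists can
# be more complex, so prefer `scontrol show hostnames` on the cluster.
nodelist='node[0433-0434]'
prefix=${nodelist%%\[*}                    # node
range=${nodelist#*\[}; range=${range%\]}   # 0433-0434
start=${range%-*}; end=${range#*-}
width=${#start}
expanded=""
for ((i = 10#$start; i <= 10#$end; i++)); do   # 10# avoids octal parsing of 0433
  host=$(printf '%s%0*d' "$prefix" "$width" "$i")
  echo "$host"
  expanded="$expanded$host "
done
```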
Show detailed information about a specific job:
scontrol show job <jobid>
UserId=bracken(1000397) GroupId=brackenpgrp(1000397)
Priority=602 Nice=0 Account=ucb00000307 QOS=janus-debug
JobState=COMPLETING Reason=None Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=1 ExitCode=0:0
RunTime=00:00:10 TimeLimit=00:01:00 TimeMin=N/A
SubmitTime=2014-09-11T13:16:54 EligibleTime=2014-09-11T13:16:54
StartTime=2014-09-11T13:16:55 EndTime=2014-09-11T13:17:05
PreemptTime=None SuspendTime=None SecsPreSuspend=0
Partition=janus AllocNode:Sid=janus-compile3:17163
ReqNodeList=(null) ExcNodeList=(null)
NodeList=node[0433-0434]
BatchHost=node0433
NumNodes=2 NumCPUs=24 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
Socks/Node=* NtasksPerN:B:S:C=12:0:*:* CoreSpec=0
MinCPUsNode=12 MinMemoryNode=0 MinTmpDiskNode=0
Features=(null) Gres=(null) Reservation=(null)
Shared=0 Contiguous=0 Licenses=(null) Network=(null)
Command=/home/bracken/testing/job.sh
WorkDir=/home/bracken/testing
StdErr=/home/bracken/testing/output-testjob-183359.out
StdIn=/dev/null
StdOut=/home/bracken/testing/output-testjob-183359.out
The QOSes for all other Research Computing resources are the following:
First, follow this tutorial for setting up Rmpi.
Test job script
#!/bin/bash
## job.sh example testing Rmpi
## you should see output that has 2 different node names
# Set the name of the job
#SBATCH -J get_cluster_names
# Set a walltime for the job. The time format is HH:MM:SS
#SBATCH --time=00:00:30
# Select two nodes
#SBATCH -N 2
# Select twelve tasks per node (one per core)
#SBATCH --ntasks-per-node 12
# Set output file name with job number
#SBATCH -o output-testjob-%j.out
# Use the janus-debug QOS
#SBATCH --qos=janus-debug
#SBATCH --mail-type=ALL #Type of email notification- BEGIN,END,FAIL,ALL
#SBATCH --mail-user=cameron.bracken@colorado.edu #Email to which notifications will be sent
nodes=2
ppn=12
np=$(($nodes*$ppn))
# Get OpenMPI in our PATH. openmpi_ipath and openmpi_ib
# can also be used if running over those interconnects.
module load openmpi/openmpi-1.6.4_gcc-4.8.1_ib
`which mpirun` -n 1 `which R` --vanilla --slave <<EOF
library(parallel)
cl <- makeCluster($np,type="MPI")
clusterCall(cl, function() Sys.info()[c("nodename","machine")])
clusterCall(cl, runif, $np)
stopCluster(cl)
mpi.quit()
EOF
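One detail worth noting in the script above: the heredoc delimiter EOF is unquoted, so the shell expands $np before R ever sees the text. A sketch of the same mechanism with cat standing in for R:

```shell
# With an unquoted heredoc delimiter, the shell performs variable
# expansion on the body, so the interpreter receives a literal 24
# where the script says $np.
np=24
expanded=$(cat <<EOF
cl <- makeCluster($np, type="MPI")
EOF
)
echo "$expanded"
```

Quoting the delimiter (<<'EOF') would suppress this and pass $np through verbatim.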