Installation¶
System requirements¶
Hardware¶
IDEAL will run CPU-intensive simulations. [GateRTion] is single threaded, so in order to get results within a clinically reasonable time frame, many instances of GateRTion need to run in parallel on sufficiently fast CPUs. While this can in principle be achieved with a single machine, in the following we assume a typical setup with a small/medium size cluster.
Submission Node¶
At least 4 cores with a clock speed greater than 2GHz.
At least 16 GiB of RAM.
Local disk space for the operating system and HTCondor, 100 GiB should be sufficient.
Shared disk: see below.
Calculation Nodes¶
In total at least 40 cores (preferably 100-200) with a clock speed greater than 2GHz.
At least 8 GiB of RAM per core [1].
Local disk space for the operating system and HTCondor, 100 GiB should be sufficient.
Network access to/from submission node¶
The submission node should be accessible by the user, or be connected with an external server that functions as the user interface. To this end, the submission node should be connected to a reasonably fast internal network that allows access to a shared directory system (typically CIFS) or HTTPS connections with at least one other server. The recommended data upload and download speed is 1 Gbit/second or faster.
Software¶
Operating System¶
For all cluster nodes: Linux. Any major modern distribution (e.g. [Ubuntu] 18.04 or later) should work.
Python¶
Python [Python3] version 3.6 or later should be installed on all nodes.
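A quick sanity check of the Python version can be run on each node, for instance:

```shell
# Quick check: the default python3 must be at least version 3.6
python3 -c 'import sys; assert sys.version_info >= (3, 6), sys.version'
```

If the assertion fails, the interpreter's version string is printed as the error message.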
Submission node: virtualenv and pip are used to install modules that are not part of the standard library.
In case the IDEAL cluster is not directly connected to the internet, the intranet should contain a repository that is accessible by the submission node and provides up-to-date releases of the python modules listed in the section “Installing necessary python modules” below.
HTCondor¶
IDEAL relies on the [HTCondor] cluster management system for running many simulations in parallel [2]. Any recent release (e.g. 8.6.8) should work well. All major Linux distributions provide HTCondor as a standard package. The full documentation of HTCondor can be found on the HTCondor web page. To install:

sudo apt update
sudo apt install htcondor

Below, some of the specific details for configuring and running HTCondor are described. These are meant as guidance; the optimal configuration may depend on the details of the available cluster.
Configuration¶
Each (submit or calculation) node has HTCondor configuration files stored under /etc/condor/. The /etc/condor/condor_config file contains the default settings of a subset of all configurable options. This file should not be edited, since any edits may be overwritten by OS updates. The settings below may be added either to the /etc/condor/condor_config.local file, or in a series of files /etc/condor/config.d/NNN_XXXXXX, where NNN are numbers (to define the order) and XXXXXX are keywords that help you remember what kind of settings are defined in them.
The options described below are important for running IDEAL. The values of the settings are sometimes used in the definition of other settings, so be careful with the order in which you add them.
The configuration can be identical for all nodes, except for the daemon settings.
- Condor host
The submit node should be the “condor host”, which is declared by setting CONDOR_HOST to the IP address of the submit node:

CONDOR_HOST = w.x.y.z
- Enable communication with other nodes
The simplest way to configure this is to just enable communication (“allow write”) for each node with all nodes in the cluster, including the node itself. The ALLOW_WRITE setting is a comma-separated list of all hostnames and IP addresses. For ease of reading, the nodes can be added one by one, like this:

ALLOW_WRITE = $(FULL_HOSTNAME), $(IP_ADDRESS), 127.0.0.1, 127.0.1.1
ALLOW_WRITE = $(ALLOW_WRITE), submit_node_hostname, w.x.y.z
ALLOW_WRITE = $(ALLOW_WRITE), calc_node_hostname, w.x.y.z
ALLOW_WRITE = $(ALLOW_WRITE), calc_node_hostname, w.x.y.z
ALLOW_WRITE = $(ALLOW_WRITE), calc_node_hostname, w.x.y.z
- Which daemons on which nodes
This is the only item that requires different configuration for submit and calculation nodes.
For the submit node:
DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, SCHEDD, GANGLIAD
For the calculation nodes:
ALLOW_NEGOTIATOR = $(CONDOR_HOST) $(IP_ADDRESS) 127.*
DAEMON_LIST = MASTER, STARTD, SCHEDD
- Network and filesystem
Make sure to configure the correct ethernet port name and the full host name of the submit node:

BIND_ALL_INTERFACES = True
NETWORK_INTERFACE = ethernet_port_name
CUSTOM_FILE_DOMAIN = submit_node_full_hostname
FILESYSTEM_DOMAIN = $(CUSTOM_FILE_DOMAIN)
UID_DOMAIN = $(CUSTOM_FILE_DOMAIN)
- Resource limits
HTCondor should try to use all CPU power, but refrain from starting jobs if the disk space, RAM or swap usage exceeds safe thresholds:

SLOT_TYPE_1 = cpus=100%,disk=90%,ram=90%,swap=10%
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1_PARTITIONABLE = True
- Resource guards
Define what to do when some already running job exceeds its resource limits:

MachineMemoryString = "$(Memory)"
SUBMIT_EXPRS = $(SUBMIT_EXPRS) MachineMemoryString
MachineDiskString = "$(Disk)"
SUBMIT_EXPRS = $(SUBMIT_EXPRS) MachineDiskString
SYSTEM_PERIODIC_HOLD_memory = MATCH_EXP_MachineMemory =!= UNDEFINED && \
  MemoryUsage > 1.0*int(MATCH_EXP_MachineMemoryString)
SYSTEM_PERIODIC_HOLD_disc = MATCH_EXP_MachineDisk =!= UNDEFINED && \
  DiskUsage > int(MATCH_EXP_MachineDiskString)
SYSTEM_PERIODIC_HOLD = ($(SYSTEM_PERIODIC_HOLD_disc)) || ($(SYSTEM_PERIODIC_HOLD_memory))
SYSTEM_PERIODIC_HOLD_REASON = ifThenElse(SYSTEM_PERIODIC_HOLD_memory, \
  "Used too much memory", ""), ifThenElse(SYSTEM_PERIODIC_HOLD_disc, \
  "Used too much disk space","Reason unknown")
MEMORY_USED_BY_JOB_MB = ResidentSetSize/1024
MEMORY_EXCEEDED = ifThenElse(isUndefined(ResidentSetSize), False, ( ($(MEMORY_USED_BY_JOB_MB)) > RequestMemory ))
PREEMPT = ($(PREEMPT)) || ($(MEMORY_EXCEEDED))
WANT_SUSPEND = ($(WANT_SUSPEND)) && ($(MEMORY_EXCEEDED)) =!= TRUE
WANT_HOLD = ( $(MEMORY_EXCEEDED) )
WANT_HOLD_REASON = \
  ifThenElse( $(MEMORY_EXCEEDED), \
  "$(MemoryUsage) $(Memory) Your job exceeded the amount of requested memory on this machine.",\
  undefined )
- Miscellaneous
##########################################
COUNT_HYPERTHREAD_CPUS = FALSE
START = TRUE
SUSPEND = FALSE
PREEMPT = FALSE
PREEMPTION_REQUIREMENTS = FALSE
KILL = FALSE
ALL_DEBUG = D_FULLDEBUG D_COMMAND
POOL_HISTORY_DIR = /var/log/condor/condor_history
KEEP_POOL_HISTORY = True
MaxJobRetirementTime = (1 * $(MINUTE))
CLAIM_WORKLIFE = 600
MAX_CONCURRENT_DOWNLOADS = 15
MAX_CONCURRENT_UPLOADS = 15
To start HTCondor, run:

sudo condor_master

(sudo is not needed in a virtual machine). After changing the configuration, apply it with:

condor_reconfig

NOTE: condor_master needs to be started only once, on EACH MACHINE belonging to the cluster.

To check that condor is running and that all machines are correctly included in the cluster, the user can run:

condor_status
ROOT¶
Any release with major release number 6 should work.
Geant4¶
GATE-RTion requires Geant4 version 10.03.p03, compiled without multithreading.
GATE-RTion¶
GATE-RTion [GateRTion] is a special release of Gate [GateGeant4], dedicated to clinical applications in pencil beam scanning particle therapy. If all nodes run the same hardware, then this can be compiled and installed once on the shared disk of the cluster and then be used by all nodes. If the different cluster nodes have different types of CPU then it can be good to compile Geant4, ROOT and Gate-RTion separately on all nodes and install it on the local disks (always under the same local path).
After installation, a short shell script should be created that can be “sourced” in order to set up the shell environment
for running Gate
, including the paths not only of Gate
itself, but also of the Geant4 and ROOT libraries and data sets.
For instance:
source "/usr/local/Geant4/10.03.p03/bin/geant4.sh"
source "/usr/local/ROOT/v6.12.06/bin/thisroot.sh"
export PATH="/usr/local/GATE/GateRTion-1.0/bin:$PATH"
IDEAL installation¶
Installing the IDEAL scripts¶
For the current 1.0rc release, IDEAL is obtained by cloning from GitLab or unpacking a tar ball provided at the IDEAL source repository: https://gitlab.com/djboersma/ideal. In a future release (1.0), we hope that the code can simply be installed with pip install ideal
(which would then also perform some of the post-install steps). The code should be installed on the shared disk of the IDEAL cluster. The install directory will be referred to in this manual as the “IDEAL top directory”. The IDEAL top directory has the following contents:
Name | Type | Description
---|---|---
bin | Folder | Executable scripts
cfg | Folder | System configuration file(s)
docs | Folder | Source file for this documentation
ideal | Folder | Python modules implementing the IDEAL functionality
gpl-3.0.txt | File | Open Source license, referred to by the LICENSE file
LICENSE | File | Open Source license
RELEASE_NOTES | File | Summary of changes between releases
The first_install.py script¶
IDEAL will not function correctly immediately after a clean install (cloning it from GitLab or extracting it from a tar ball). Right after the install, it is recommended to run the bin/first_install.py script. This script will attempt to create a minimal working setup:

- Some additional python modules are installed (using virtualenv) in a so-called “virtual environment” named venv.
- Folders for commissioning data (definitions of beam lines, CTs, phantoms), logs, temporary data and output are created.
- The available resources and the simulation preferences are specified in a “system configuration” file cfg/system.cfg in the IDEAL install directory.
The script tries to perform all the trivial steps of the installation. Simple examples of a beam line model, CT protocols and a phantom are provided. These examples are hopefully useful to give an idea of where and how you should install your own beam models, CT protocols and phantoms. The details are described in the Commissioning chapter.
This script is supposed to be run after all previous steps have been performed. Specifically:

- A Linux cluster is available, running the same OS on all nodes (e.g. Ubuntu 18.04), with a fast shared disk that is accessible by all cluster nodes and has at least 200 GiB of free space.
- Geant4, ROOT and GateRTion are all installed on the shared disk. A gate_env.sh shell script is available to configure a shell environment (source /path/to/gate_env.sh) such that these are usable. Specifically, the Gate --version command should return the version blurb corresponding to “GateRTion 1.0”.
- HTCondor is installed and configured.
- Python version 3.6 or newer and virtualenv are installed.
The first_install.py script will check these assumptions, but the checks are not exhaustive.
The minimum input for the script is the file path of the gate_env.sh script. It is recommended to also give the name of the clinic (with the -C option). Many more options are available; see the script's --help output.
Installing necessary python modules¶
The installation step described in this section is performed by the first_install.py script.
If you did not run the first_install.py
script, then please read the rest of this section.
IDEAL needs several external python modules that are not included in a default
python installation. In order to avoid interference with python module needs
for other applications, the preferred way of installing these modules is using
a virtual environment called venv
in the IDEAL top directory. This may be
done using the following series of commands (which may be provided in an
install script in a later release of IDEAL) in a bash shell after a cd
to
the IDEAL top directory:
virtualenv -p python3 --prompt='(IDEAL 1.0) ' venv
source ./venv/bin/activate
pip install filelock htcondor itk matplotlib numpy pydicom
pip install python-daemon python-dateutil scipy
deactivate
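If virtualenv is not available on the submission node, a sketch of an alternative using the standard-library venv module (included with Python 3.6 and later) is given below; note that its prompt option takes the prompt text directly, unlike virtualenv's:

```shell
# Create the virtual environment with the standard-library venv module
python3 -m venv --prompt 'IDEAL 1.0' venv
# Activate and deactivate it once to verify that it works
. ./venv/bin/activate
deactivate
```

The pip install commands from above can then be run inside the activated environment as before.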
(The modules ipython, Sphinx and PyQt5 are optional. The first enables interactive, python-based analysis, the second enables you to generate these docs yourself, and the third enables the somewhat clunky sokrates.py GUI interface.)
If you decide to install the virtual environment under a different path, then you need to edit the bin/IDEAL_env.sh script to use the correct path in the source /path/to/virtualenv/bin/activate line, or to remove that line altogether.
Installing additional python modules¶
You can of course add extra modules with pip. There are three modules in particular that might be desirable when working with IDEAL:

- ipython: a python command line program, which can be useful for debugging (e.g., to query DICOM files using the pydicom module)
- Sphinx: enables you to generate these docs yourself (cd docs; make html).
- PyQt5: enables running the sokrates.py GUI. It’s a bit clunky, but some users like it.
In a fresh shell, cd
to the IDEAL install directory and then run:
source ./venv/bin/activate
pip install ipython Sphinx PyQt5
deactivate
Alternatively, in a shell in which you already ran source bin/IDEAL_env.sh, you can directly run pip install ipython Sphinx PyQt5.
Set up the data directories¶
Like the virtual environment, this installation step is performed by the first_install.py script. IDEAL needs a couple of folders to store logging, temporary data and output, respectively. In a bash shell, after a cd to the IDEAL top directory, do:
mkdir data
mkdir data/logging
mkdir data/workdir
mkdir data/output
mkdir data/MyClinicCommissioningData
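The same directories can also be created in a single command with mkdir -p, which additionally succeeds harmlessly if some of them already exist:

```shell
# Create all data subdirectories in one step (idempotent; -p also
# creates the parent "data" directory if it is missing)
mkdir -p data/logging data/workdir data/output data/MyClinicCommissioningData
```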
The subdirectories of data
are described in more detail below.
logging¶
The logging directory is where all the debugging-level output will be stored. In case something goes wrong, these log files may help to investigate the problem. When you report issues to the developers, it can be useful to attach the log file(s).
workdir¶
The workdir
directory will contain a subfolder for every time you use IDEAL
to perform a dose calculation. The unique name of each subfolder is composed
of the user’s initials, the name and/or label of the plan and a time stamp of
when you submitted the job. The subfolder will contain all data needed to run
the GATE simulations, preprocess the input data and postprocess the output
data:
- The GATE directory with the mac scripts and data needed to run the simulations.
- Temporary output, saved every few minutes, from all condor subjobs running this simulation.
- Files that are used or generated by HTCondor for managing all the jobs.
- Three more IDEAL-specific log files, namely: preprocessor.log, postprocessor.log and job_control_daemon.log.
The temporary data can take up a lot of space, typically a few dozen GiB, depending on the number of voxels in the CT (after cropping it to a minimal bounding box containing the “External” ROI and the TPS dose distribution) and on the number of cores in your cluster. After a successful run, the temporary data is archived in compressed form, for debugging analysis in case errors happened or if there are questions about the final result.
Note
When an IDEAL job runs unsuccessfully, the temporary data is NOT automatically compressed/archived, since the user may want to investigate. Do not forget to delete or compress these data after the investigation has concluded, to avoid inadvertently filling up the disk too quickly.
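For example, after the investigation has concluded, a failed job's work directory could be compressed and removed along these lines (the directory name below is hypothetical, standing in for a real job subfolder):

```shell
# Hypothetical failed-job work directory (stand-in for a real one)
JOB=data/workdir/jdoe_myplan_failed
mkdir -p "$JOB" && touch "$JOB/placeholder.log"
# Compress the directory, then remove the original to free disk space
tar czf "$JOB.tar.gz" -C "$(dirname "$JOB")" "$(basename "$JOB")"
rm -rf "$JOB"
```

The mkdir/touch lines only create the stand-in data for this sketch; in practice the directory already exists.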
Even after compressed archiving, the job work directory still occupies up to a few GiB per plan, which adds up when running IDEAL routinely for many plans [3].
output¶
The output directory will contain a subfolder for each IDEAL job, using the same naming scheme as for the work directories.
In IDEAL’s system configuration file the user (with
admin/commissioning role) can define which output will actually be saved, e.g.
physical and/or effective dose, DICOM and/or MHD. This output directory serves
to store the original output of the IDEAL job. If the path of a second output
directory is given in the system configuration file, then
the job output subfolder will be copied to that second location (e.g. on a CIFS
file share, where it can be accessed by users on Windows devices).
Commissioning Data Directory¶
In the example above, MyClinic could be replaced by the name of your particle therapy clinic. If you are a researcher who studies plans from multiple different clinics, you may want to create a commissioning data directory for each clinic. This directory will contain the commissioning data for your particle therapy clinic. The details are laid out in the commissioning chapter.
Footnotes