Setting up the ATG Prescreen
Going to the VirtualFlow Working Directory
To set up the workflow, go to the folder VFVS/tools. This is the working directory of VirtualFlow, from which all commands that run the software are started. Enter the following command to go to this folder:
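A minimal sketch of this step, assuming the VFVS repository sits in your current directory (the relative path is an assumption; adjust it to wherever VFVS was cloned):

```shell
# Assumption: VFVS is a subfolder of the current directory.
cd VFVS/tools || echo "VFVS/tools not found: adjust the path to your VFVS checkout" >&2
```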
Preparing the all.ctrl Configuration File
From the tools folder, go to the templates folder. Enter the following command:
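The step above can be sketched as follows, assuming you are currently in VFVS/tools:

```shell
# Run from VFVS/tools; the message only appears if the folder is missing.
cd templates || echo "templates folder not found: are you in VFVS/tools?" >&2
```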
In the templates folder, there is a file called all.ctrl. Enter the following command to make sure it is there:
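A plain directory listing is enough for this check; all.ctrl should appear among the listed files:

```shell
# Lists the contents of the current folder (expected to be VFVS/tools/templates).
ls
```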
The file all.ctrl needs to be edited according to the cluster/batch system that will be used. Two batch systems are supported: Slurm and AWS Batch. This tutorial focuses on AWS Batch.
To edit the all.ctrl file on the command line, use a command-line editor. Two common editors are nano and vim. To use vim, run vim all.ctrl from the templates folder.
This command will show the contents of the all.ctrl file within the command line interface. To continue editing the all.ctrl file in vim, please refer to other resources to learn specific vim editor commands. The following instructions require you to be comfortable with editing a file using a command-line editor.
All the options for editing are documented in the all.ctrl file itself. Edit the following portions of the preconfigured file:
object_store_data_bucket: Bucket name for the input collection data. In this tutorial, we use the Enamine REAL Space with 69B molecules. To obtain access to this bucket, request access at https://virtual-flow.org/real-space-2022q12 (the form is at the bottom of the page). After registration, you will receive an email with the access information, typically within one working day. Note that an AWS account is needed to access the library; if you do not have one, you will need to create one first. Access to the library is free, and there are no costs associated with downloading or accessing it via AWS.
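After saving the file, one quick way to confirm the edit is to print the line that sets the option. This is only a sketch: it assumes that each option in all.ctrl starts at the beginning of its own line, and that the command is run from the templates folder.

```shell
# Prints the line(s) that set object_store_data_bucket, with line numbers.
# Assumption: the option name starts at the beginning of a line in all.ctrl.
grep -n '^object_store_data_bucket' all.ctrl || echo "option not found (or all.ctrl is missing)" >&2
```

If the output still shows an empty or placeholder value, the bucket name from the access email has not been entered yet.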
Preparing the workflow Folder
Once all.ctrl has been edited, return to the tools directory. Enter the following command:
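Since templates is a subfolder of tools, moving one level up is sufficient (this assumes you are still in VFVS/tools/templates):

```shell
# From VFVS/tools/templates, move one level up to VFVS/tools.
cd ..
```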
To prepare the workflow and output-files folders, enter the following command:
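A sketch of this step, using the vfvs_prepare_folders.py script named later in this section. The surrounding check is an addition for illustration and not part of VFVS; it only reports when the command is run from the wrong directory.

```shell
# Run from VFVS/tools.
if [ -x ./vfvs_prepare_folders.py ]; then
    ./vfvs_prepare_folders.py
else
    echo "vfvs_prepare_folders.py not found: run this from VFVS/tools" >&2
fi
```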
Warning: If you have previously set up the workflow and output-files folders in this directory, the above command will report that they already exist. If you are sure you want to delete the existing data, run the command with --overwrite.
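For reference, the overwrite variant might look like the sketch below. It deletes the existing workflow data, so run it only after backing up anything you still need. The existence check is an illustrative addition, not part of VFVS.

```shell
# Run from VFVS/tools. WARNING: discards the previously prepared folders.
if [ -x ./vfvs_prepare_folders.py ]; then
    ./vfvs_prepare_folders.py --overwrite
else
    echo "vfvs_prepare_folders.py not found: run this from VFVS/tools" >&2
fi
```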
Note that when you run the above command, the workflow is set up using the state of all.ctrl and todo.all at that time. Changes made to those files in the tools/templates folder after this point will not be used by the workflow unless vfvs_prepare_folders.py is run again before the workflow is started.
VFVS can process billions of ligands, and to process them efficiently it is helpful to segment the work into smaller chunks. A work unit is a segment of work that contains many 'subjobs', which are the actual execution elements. Each subjob typically contains about 60 minutes' worth of computation.
After preparing the folders in the previous section, the workflow can be initiated with the following sequence of commands. First, enter the following command:
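The command itself is not preserved in the text above. The sketch below assumes it is the vfvs_prepare_workunits.py script in the VFVS tools folder; verify the script name against your checkout before relying on it.

```shell
# Run from VFVS/tools. Assumption: the work-unit script is vfvs_prepare_workunits.py.
if [ -x ./vfvs_prepare_workunits.py ]; then
    ./vfvs_prepare_workunits.py
else
    echo "vfvs_prepare_workunits.py not found: check the script name and directory" >&2
fi
```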
Pay attention to how many work units are generated. The final line of output of the above command (displayed in the command line interface) will provide the number of work units generated.
Then we need to prepare the Docker images:
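As with the previous step, the command is not preserved in the text above. The sketch below assumes it is the vfvs_build_docker.py script in the VFVS tools folder, and that Docker is installed and running on the machine; treat both as assumptions to verify.

```shell
# Run from VFVS/tools. Assumption: the image build script is vfvs_build_docker.py.
if [ -x ./vfvs_build_docker.py ]; then
    ./vfvs_build_docker.py
else
    echo "vfvs_build_docker.py not found: check the script name and directory" >&2
fi
```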