OpenCRAVAT can be used on Amazon Web Services in two ways. The first is an AMI that contains an up-to-date version of the package and almost all annotators. The second is a CloudFormation workflow that uses the AMI to annotate variants in S3 buckets.


Finding the AMI

The AMI is a public community image in the us-east-1 region called OpenCRAVAT-version (for example OpenCRAVAT-1.7.1). Here is a link.
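
If the link is unavailable, the image can also be located programmatically. The following is a minimal sketch, assuming the boto3 library is installed and AWS credentials are configured. The helper names build_image_query and find_opencravat_amis are hypothetical and not part of OpenCRAVAT. The name pattern and region are taken from the description above.

```python
def build_image_query(version=None):
    """Build keyword arguments for EC2 describe_images.

    Matches public images named OpenCRAVAT-<version>. When no version is
    given, a wildcard matches every image following that naming scheme.
    """
    name_pattern = f"OpenCRAVAT-{version}" if version else "OpenCRAVAT-*"
    return {
        "Filters": [
            {"Name": "name", "Values": [name_pattern]},
            {"Name": "is-public", "Values": ["true"]},
        ]
    }


def find_opencravat_amis(version=None):
    """Return sorted (name, image_id) pairs for matching AMIs in us-east-1."""
    import boto3  # imported lazily so the query builder works without boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    response = ec2.describe_images(**build_image_query(version))
    return sorted((img["Name"], img["ImageId"]) for img in response["Images"])
```

Calling find_opencravat_amis("1.7.1") would query for the example image named above. Matching on name alone does not verify who published an image, so check the owner of any result before launching it.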

Using the AMI

The AMI runs CentOS 7 and has two 300 GB volumes: a root/job volume and a data volume mounted at /mnt/ssd. Log in as the default user, centos. Modules are located at /mnt/ssd/oc, and config files are at /usr/local/lib/python3.6/site-packages/cravat/conf.

Instance Sizing

We recommend a machine with at least 4 cores and 4 GB of RAM. Expect to use approximately 300 MB of RAM per 1 million variants. OpenCRAVAT runs faster with more cores because parts of the annotation process run in parallel. For best performance, either use a Provisioned IOPS SSD for the modules volume or choose an instance with an ephemeral SSD and move the modules to it.
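
The memory guideline above can be turned into a quick sizing estimate. This is a rough sketch, not an official sizing tool: it applies the 300 MB per million variants figure linearly and uses the 4 GB recommendation as a floor. The function names are illustrative, and the estimate ignores memory used by the operating system and other processes.

```python
import math

MB_PER_MILLION_VARIANTS = 300  # guideline quoted above
MIN_RAM_GB = 4                 # recommended minimum quoted above


def estimate_ram_mb(num_variants):
    """Approximate RAM in MB used to annotate num_variants variants."""
    return MB_PER_MILLION_VARIANTS * num_variants / 1_000_000


def recommended_ram_gb(num_variants):
    """Round the estimate up to whole GB, never below the 4 GB minimum."""
    needed_gb = math.ceil(estimate_ram_mb(num_variants) / 1024)
    return max(MIN_RAM_GB, needed_gb)
```

For example, 20 million variants works out to about 6000 MB, so an instance with at least 6 GB of RAM would be a reasonable starting point under these assumptions.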


The OpenCRAVAT CloudFormation template can be used to automatically annotate files in S3. When run, it will create an analysis instance from the AMI, pull an input file from S3, annotate it, and place the results in S3.