We use Docker as a containerisation technology to ensure reproducibility and version control, and to decouple the applications from the underlying architecture. As well as running on the CyVerse infrastructure, the Docker images can be downloaded so that you can run the software on your own machine if you choose to do so. If you are a developer, writing your own Dockerfile and making the resulting image available in the public registry is a great first step towards making your software available to the thousands of CyVerse users.
Running jobs this way requires you to register for a CyVerse account (it’s free, and you only need one for both the US and UK systems!). You can get a CyVerse login here. There are two different types of applications in the CyVerse ecosystem: Agave apps and Discovery Environment apps.
Currently all applications running in the EI cluster are Agave applications. This allows you to run jobs in a variety of ways:
- Using the Discovery Environment: this is usually the preferred way, as the Discovery Environment provides you with a friendly graphical interface with additional functionality to manage your data, save your analyses and pipelines, and share jobs and data with collaborators.
Unfortunately, although the Discovery Environment usually integrates very well with the Agave API, its graphical interface will not let you select multiple files, even when the application allows it. If this affects you, you can temporarily(*) rely on one of the following methods.
- You can use the Agave API from the command line (here’s the CLI) and submit jobs by writing your own JSON file. This can be a bit more tedious, but we provide example JSON files on GitHub as a guideline.
- Using the CyVerse UK web interface: this allows you to submit multiple files. It also shows only the applications running at EI, and it’s fully integrated with the UK Data Store (so you need to contact us to be added to the list of users before running your analyses).
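As a rough illustration of the command-line route, the sketch below builds a job request in Python and prints it as JSON. The top-level keys (name, appId, archive, inputs, parameters) follow the general shape of an Agave job request, but the app ID, the input name and the agave:// file locations are hypothetical placeholders; the real values depend on the application you want to run, so treat the example JSON files on GitHub as the authoritative guide.

```python
import json


def build_job_request(name, app_id, input_key, file_uris, archive=True):
    """Return a job request as a dict.

    Giving an input a list of locations is how several files are
    passed to an application that accepts more than one.
    """
    if not file_uris:
        raise ValueError("at least one input file is required")
    return {
        "name": name,
        "appId": app_id,
        "archive": archive,
        "inputs": {input_key: list(file_uris)},
        "parameters": {},
    }


job = build_job_request(
    name="multi-file-example",
    app_id="example-app-1.0",   # hypothetical app ID
    input_key="input_files",    # hypothetical input name defined by the app
    file_uris=[
        # hypothetical storage system and paths
        "agave://example-storage-system/username/sample_1.fastq",
        "agave://example-storage-system/username/sample_2.fastq",
    ],
)

# Serialise the request; save this text as job.json before submitting.
job_json = json.dumps(job, indent=2)
print(job_json)
```

Once saved to a file, the request can be handed to the CLI's job submission command (jobs-submit -F job.json in the Agave CLI; check the CLI help for the exact options in your version).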
I’m a Developer…
If you are a developer, CyVerse is a good platform for sharing your software and allowing the research community to use it and reproduce analyses. You are welcome to add your own tools, or you can contact us for help. If you need help getting to grips with setting up your own workflows, please get in touch.
Collaborator VMs
If you require additional computational power or a full Linux environment for development and analyses, we can provide you with a custom Virtual Machine hosted in our private cloud.
Please contact us to discuss your requirements and what kind of support/capacity you need.
Data Store
We are working on a full federation of our UK Data Store with the Data Store in the US. In the meantime, you can store your data in our UK Data Store, which gives you geographical advantages in access speed and helps if you have legal requirements that dictate your data must remain within UK/EU jurisdiction. Our UK Data Store uses the iRODS data management system in the same way as the US version, and the best way to get started is to get comfortable with icommands (see some instructions on how to set it up). The settings in the docs point you to the US, but you can get in touch and we’ll help you change them to point to the UK. iRODS gives us full integration between the systems, together with fast and reliable data transfer and accessibility across the zones, so you’ll be able to see and use your data in the Discovery Environment in the same way whether it is housed in the UK or the US.
Finally, we have a new Data Commons collection at /iplant/home/shared/uk_data_commons, a place where you can store and share your data publicly, for example in connection with a publication. This allows other users of CyVerse to access and analyse your data immediately, which aids reproducibility and open science. Please contact us if you want to store your data openly in the UK Data Commons.
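To make the icommands settings change mentioned above more concrete, here is a minimal sketch of what switching between two iRODS servers amounts to. It assumes an iRODS 4.x client, which reads its settings from ~/.irods/irods_environment.json using keys such as irods_host, irods_port, irods_user_name and irods_zone_name. Every host name, zone name and user name below is a placeholder, not a real CyVerse setting; get in touch for the actual UK values.

```python
import json


def build_irods_environment(host, zone, user, port=1247):
    """Return the core client settings; 1247 is the default iRODS port."""
    return {
        "irods_host": host,
        "irods_port": port,
        "irods_user_name": user,
        "irods_zone_name": zone,
    }


def point_to(settings, host, zone):
    """Return a copy of the settings aimed at a different server and zone."""
    updated = dict(settings)
    updated["irods_host"] = host
    updated["irods_zone_name"] = zone
    return updated


# Placeholder values standing in for the settings given in the docs.
docs_env = build_irods_environment(
    host="irods-us.example.org",
    zone="exampleZoneUS",
    user="your_username",
)

# Placeholder values standing in for the UK settings we would send you.
uk_env = point_to(docs_env, host="irods-uk.example.org", zone="exampleZoneUK")

# This text is what would be saved as ~/.irods/irods_environment.json.
print(json.dumps(uk_env, indent=2))
```

Only the host and zone change between the two configurations; the user name stays the same, since one CyVerse account covers both the US and UK systems.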