How Regeneron scaled its cloud to support millions of samples a year

In 2014, the Regeneron Genetics Center demonstrated how cloud infrastructure can rapidly transform a drug discovery program by enabling the generation, delivery and analysis of genetic results at unprecedented speeds. Six years and two orders of magnitude later, scale is no longer measured in numbers of samples but in the diversity of the RGC’s collaborator network, the successful programs of industry partners, and the unrealized opportunities of the world’s largest genetic resource.

A research ecosystem like the RGC demands an equally complex and flexible compute infrastructure. Internal target discovery programs leverage full-stack Enterprise Applications to browse, combine, and iterate data and results that drive novel insights like GPR75. Methods innovation in association testing (GloWGR) and variant identification (DeepVariant) requires interoperable platforms deploying best-in-class technology like Databricks and NVIDIA GPUs. Most critically, these data nucleate a hybrid industry/academic community that requires the RGC and its partner DNAnexus to manage hundreds of collaborators with a full spectrum of use cases, ensuring participant confidentiality through unparalleled international compliance, protecting partner IP through autonomous research and billing environments, and driving the value of genetic data through cost-efficient sharing of data, tools, and results.

In this webinar you will learn:
  • The evolution of RGC’s hybrid cloud computing model to support millions of samples per year
  • Expanding collaboration and funding models to include flexible bilateral, multi-party, and consortium arrangements that scale the partner network
  • Rapid deployment of custom cloud security environments for time-sensitive collaborations like COVID-19
  • Democratizing the data by not favoring labs with large on-premises infrastructure
  • Providing data management and analysis capabilities to accelerate RGC partners’ ability to work with the data generated by the RGC
Watch the webinar on demand
William Salerno

Senior Director of Genome & Sequencing Informatics, Regeneron Genetics Center

Will Salerno is the Sr. Director of Genome and Sequencing Informatics at the Regeneron Genetics Center, where his production group leads all primary and secondary genomic analysis across the RGC’s hybrid-cloud infrastructure. Processing more than 500k samples a year, the RGC collaborates with more than 100 academic and industry partners with a focus on improving population and phenotypic diversity in genomic medicine. Active projects include finalizing the UK Biobank 500k Exome data release, one-million-scale joint genotyping, and the creation of IP-bounded collaborator environments. Prior to joining the RGC, Will led Next-Gen Sequencing Informatics for the Human Genome Sequencing Center at Baylor College of Medicine.

Arsalan Arif

Founder & CEO, Endpoints News

Arsalan Arif is a news media entrepreneur who set out in 2015 to build his vision of an independent biotech news company at Endpoints News.