May 23, 2018

OICR’s Cancer Genome Collaboratory wins 2018 OpenStack Superuser award for contributions to the cancer research community

Based on popular vote and review by the Superuser Editorial Advisory Board, OICR’s Cancer Genome Collaboratory team has won the 2018 OpenStack Vancouver Summit Superuser Award. The Award recognizes OICR’s use of OpenStack, an open-source software platform for cloud computing, to enable cancer research worldwide. Previous winners of the Superuser Award include AT&T, CERN and Comcast.

“We’re proud to be recognized by the greater research community that we support,” Vincent Ferretti, Director and Senior Principal Investigator, Genome Informatics at OICR, says. “OpenStack has helped us contribute to the cancer research community in Ontario, across Canada and internationally.”

October 25, 2016

From our Annual Report: Constructing the cloud

OICR’s reputation as a leader in managing and analyzing big data has grown over the past year as the Institute has worked with private and public partners to bring more genomic and health data to the cloud.

May 2, 2016

OICR joins the Collaborative Cancer Cloud

On March 31, Intel Corporation and the Knight Cancer Institute at Oregon Health & Science University announced that two more leading cancer centres have joined the Collaborative Cancer Cloud (CCC): Dana-Farber Cancer Institute and OICR.

The CCC is a distributed precision medicine analytics platform that allows institutions to securely share and analyze large amounts of data while also preserving patient privacy and security. The CCC will make it easier, faster and more affordable to determine how genes interact to cause disease in individual patients.

February 19, 2016

ICGC and Amazon Web Services are bringing more genomic health data to researchers in the cloud

The International Cancer Genome Consortium (ICGC) took another major step into the cloud last month, joining forces with Amazon Web Services (AWS) to bring 1,200 encrypted whole genome sequences to more researchers worldwide.

This means that one of the world’s largest collections of cancer genome data is now more easily accessible to qualified researchers, potentially accelerating the development of new treatments for cancer patients.

December 3, 2015

The call for a cloud commons in genomics

The amount of genomics data produced today is enormous, and challenges in accessing that data increasingly hinder scientists’ ability to perform their research. There is a growing consensus that storing genomic data in the cloud makes it far more accessible for research than storing it locally. In a major shift earlier this year, the National Institutes of Health (NIH) lifted its nearly decade-long ban on storing genomic data in the cloud, a move seen by many as a major step toward cloud computing in genomics.

November 18, 2015

The International Cancer Genome Consortium brings more genomic health data to researchers on the Amazon Web Services Cloud

Toronto – (November 18, 2015) The International Cancer Genome Consortium (ICGC) announced today that 1,200 encrypted cancer whole genome sequences are now securely available on the Amazon Web Services (AWS) Cloud for access by cancer researchers worldwide.

The Ontario Institute for Cancer Research (OICR), which houses the ICGC’s Data Coordination Center (DCC), copied ICGC genome data onto the AWS Cloud and is providing authorized researchers with credentials to access and analyze the data using secure mechanisms. The ICGC Data Access Compliance Office has established a framework that protects the confidentiality of research participants while working to ensure that the research will benefit future cancer patients.

The newly launched initiative means one of the world’s largest collections of cancer genome data is now more easily accessible to qualified researchers, which will enhance collaboration and potentially accelerate the development of new treatments for cancer patients.

Cloud solutions have become essential to genomics research because of the vast amount of data produced by researchers and the difficulties inherent in transferring such large datasets between sites. Projects can quickly grow to several petabytes in size, with each petabyte being the equivalent of data on 223,000 DVDs. Very few institutions around the world have the capacity to download such immense datasets for analysis, and this has limited the number of researchers who can access genome projects and the scope of what can be done with the data.
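The release does not say how the DVD figure was derived. As a rough check, the sketch below assumes a single-layer DVD holding 4.7 GB (an assumption, not a figure from the release) and compares a decimal petabyte with a binary one. The function and constant names are illustrative only.

```python
# Rough storage arithmetic. The DVD capacity is an assumption,
# not a figure taken from the press release.
DVD_CAPACITY_GB = 4.7          # single-layer DVD (assumed)
PETABYTE_DECIMAL_GB = 10**6    # 1 PB  = 10^6 GB  (SI units)
PETABYTE_BINARY_GIB = 2**20    # 1 PiB = 2^20 GiB (binary units)


def dvds_per_petabyte(petabyte_size: float, dvd_size: float = DVD_CAPACITY_GB) -> int:
    """Return the number of DVDs needed to hold one petabyte, rounded."""
    return round(petabyte_size / dvd_size)


si_estimate = dvds_per_petabyte(PETABYTE_DECIMAL_GB)      # 212766
binary_estimate = dvds_per_petabyte(PETABYTE_BINARY_GIB)  # 223101

print(f"Decimal petabyte: about {si_estimate:,} DVDs")
print(f"Binary petabyte:  about {binary_estimate:,} DVDs")
```

Under these assumptions the binary estimate lands close to the 223,000 quoted above, which suggests, but does not confirm, that the figure counts a petabyte as 2^20 gigabytes.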

With cloud computing, researchers don’t need to download data. They can work with data and run experiments in the cloud, a flexible network of servers on the Internet, and access data in minutes rather than months. Data stored in the cloud has been shown to be as secure as, if not more secure than, data downloaded to local servers and hard drives. The set of 1,200 genomes now available on AWS is the first installment of ICGC data to be posted and is expected to grow severalfold over the next 12 months with the addition of data from more cancer patients.

“This initiative brings together one of the world’s largest cancer genome datasets and one of the world’s leading cloud computing providers to create a powerful new resource for cancer researchers,” said Dr. Lincoln Stein, Director of the Informatics and Biocomputing Program at the Ontario Institute for Cancer Research and Director of the ICGC’s Data Coordination Center. “Now, far more researchers will have access to ICGC data, opening up the possibility of new discoveries and new breakthroughs in cancer research.”

The Pan-Cancer Analysis of Whole Genomes (PCAWG) project of the ICGC and The Cancer Genome Atlas (TCGA) is coordinating analysis of more than 2,800 cancer genomes, and is making extensive use of AWS and the genomes stored on Amazon Simple Storage Service (Amazon S3). Each genome is being characterized through a suite of standardized algorithms, including alignment to the reference genome, uniform quality assessment, and the calling of multiple classes of somatic mutations. Scientists participating in the research projects of PCAWG are addressing a series of fundamental questions about cancer biology and evolution based on these data.

“Making this data available and usable will enable more researchers across the world to ask questions and get answers that were previously out of reach,” said Matt Wood, General Manager of Product Strategy at Amazon Web Services, Inc. “Researchers can now explore these large and diverse datasets in unconstrained ways, without having to manage large amounts of physical infrastructure. Instead, they can focus on driving their state-of-the-art research forward.”

“Cancer research is becoming increasingly data-heavy. Compiling the data, organizing the data, analyzing the data, making the data available to all researchers—these are fundamental to making further progress in cancer genome research, and we are excited at the possibilities of working with innovative cloud-based computing systems to achieve these advances,” said Peter Campbell, Head of Cancer Genetics and Genomics at the Wellcome Trust Sanger Institute, who is helping to lead the PCAWG project.

“In the next year, it is estimated that 14 million people worldwide will learn that they have cancer. In order to accelerate our understanding of this disease and ultimately provide better treatment, it is critical that we develop solutions able to meet the scale of this challenge. Co-localizing ICGC data as well as other cancer genomics data sets like The Cancer Genome Atlas with secure and scalable computation resources represents a major step forward for both researchers and patients. With ICGC data available on AWS, we utilized the Seven Bridges platform to perform variant calling on hundreds of genomes weeks faster than would have been possible using local infrastructure,” said Deniz Kural, CEO of Seven Bridges Genomics and Principal Investigator of one of three NCI-funded Cancer Genomics Cloud pilot projects.

“This effort to provide the ICGC datasets on AWS will lower the barriers currently associated with computing on thousands of genomes. Users will have the ability to quickly analyze datasets within the cloud on highly scalable infrastructure. This is a paradigm shift from the old model of slowly downloading data to a user’s local infrastructure before any meaningful work can commence,” said Brian O’Connor, Managing Director of Cloud Computing at the Ontario Institute for Cancer Research.

“The ICGC Data Access Compliance Office (DACO) has been a forerunner in providing controlled, secure, and efficient access to cancer genomic data to members of the research community. It now welcomes the opportunity to further advance research for the benefit of all cancer patients by enabling controlled cloud access to ICGC genomic data stored on AWS. Throughout the process, DACO will implement a robust governance framework to ensure a high degree of privacy protection to patients’ genetic and health data,” said Yann Joly, Data Access Officer, ICGC DACO, McGill University.

“This exciting collaboration and new use for cloud technology is the future of cancer research. Ontario is proud to be part of this initiative through the Ontario Institute for Cancer Research and we look forward to seeing this relationship help cancer patients around the world,” said Reza Moridi, Ontario’s Minister of Research and Innovation.

There are currently 89 ICGC projects underway at research institutes in Asia, Australia, Europe, North America, and South America. These projects seek to identify the genomic drivers of cancer and will help to lay the foundation for developing treatments tailored to patients’ individual needs. The Consortium leads worldwide efforts to map the genomes of both common and rare cancers and has the goal of identifying cancer-causing mutations in more than 25,000 tumours representing more than 50 types of cancer of clinical and societal importance across the globe.

The ICGC develops policies and quality control criteria to help harmonize the work of member projects located in different jurisdictions. Data produced by ICGC projects are made rapidly and freely available to qualified researchers around the world via the cloud and through the ICGC Data Coordination Center at http://dcc.icgc.org.

For more information and updates about ICGC activities, please visit the website at: www.icgc.org.

July 9, 2015

World-leading Big Data researchers call for support for more accessible and more effective storage of data in the cloud to facilitate genomics research

Improved support of cloud infrastructure is essential to the delivery of the next generation of treatments for major diseases like cancer

TORONTO, ON (July 9, 2015) Today in the journal Nature, prominent researchers from Canada, Europe and the U.S. made a powerful call to major funding agencies, asking them to commit to establishing a global genomic data commons in the cloud that could be easily accessed by authorized researchers worldwide.

This would increase access to the data for researchers, reduce the time and cost associated with transferring and storing data on local servers, and accelerate genomics research worldwide. Storing data in the cloud has been shown to be as secure as, if not more secure than, storing it locally.

With a typical university connection, it can take months to download datasets from major international projects like the International Cancer Genome Consortium (ICGC), and the hardware needed to store and process those data can also prove quite expensive.
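To see why such downloads can take months, the sketch below estimates transfer time from dataset size and link speed. The one-petabyte size, the 1 Gbit/s link and the 50% sustained efficiency are illustrative assumptions, not figures from the release, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(size_bytes: float, link_bits_per_s: float, efficiency: float = 1.0) -> float:
    """Estimate the days needed to move size_bytes over a network link.

    efficiency is the fraction of the nominal link speed actually sustained.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the interval (0, 1]")
    seconds = (size_bytes * 8) / (link_bits_per_s * efficiency)
    return seconds / SECONDS_PER_DAY


# Assumed scenario: 1 PB dataset, 1 Gbit/s link, half the nominal speed sustained.
days = transfer_days(size_bytes=1e15, link_bits_per_s=1e9, efficiency=0.5)
print(f"Estimated transfer time: {days:.0f} days")  # about 185 days
```

Even at the full nominal speed the same transfer would take roughly three months, which is consistent with the "months" described above.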

With cloud computing, a data set from a big genome project can be analyzed in days, at a fraction of the price.

The authors propose that funding agencies request that major data sets be uploaded into the cloud and that the agencies pay for their long-term storage. Data would then need to be copied only once, and researchers would have to pay only for temporary storage while their analysis was in progress. Access would be provided only to authorized researchers.

“Currently a great deal of valuable time and money is spent by researchers transferring data from a repository to their own preferred server, instead of easily and cheaply tapping into a global data commons whenever they need to,” said Dr. Lincoln Stein, Director of the Informatics and Biocomputing Program at the Ontario Institute for Cancer Research, leader of the ICGC’s Data Coordination Center in Toronto and a lead author on the paper. “We encourage a larger investment in the cloud in order to use public funds more effectively and to help accelerate the pace of genomics research.”

“Having authorized access procedures in place ensures respect for the wishes of data donors, including that their data be used safely and securely,” said Dr. Bartha Knoppers, Director of the Centre of Genomics and Policy, McGill University. “Applying the Framework for Responsible Sharing of Genomic and Health-Related Data (www.genomicsandhealth.org) is a first step in enacting the human right of citizens to benefit from scientific advances and of scientists to be recognized for their work.”

“The complexity of cancer biology means that we need huge data sets – basically, the bigger the better,” said Dr. Peter Campbell, Head of Cancer Genomics at the Wellcome Trust Sanger Institute. “We have now reached a stage where these data sets are too large to move around – cloud computing offers us the flexibility to hold the data in one virtual location and unleash the world’s researchers on it all together.”

“The amount of genomic data is growing at an amazing rate. Moving data and analysis tools to the cloud will democratize access to data and to the computational resources required to analyze that data,” said Dr. Gad Getz, Director of the Cancer Genome Computational Analysis Group at the Broad Institute of MIT and Harvard. “The expanded access will accelerate tool development, grow the population of researchers analyzing these rich data sets and ultimately increase the pace of scientific discovery. These cloud-based analysis platforms will also enable the testing of new distributed computing paradigms which expand both the scale of the analyses and the sophistication of the computational algorithms. We are now building a pilot of such a cloud platform.”

“The establishment of novel powerful cloud computing frameworks enabling us to store, share and analyze data across borders will open new perspectives in cancer research,” said Dr. Jan Korbel, group leader at the European Molecular Biology Laboratory (EMBL). “These will take into consideration developments in science and policies for the distribution and sharing of data sets as sensitive as patient genetic data ensuring a safe environment to serve the interests of both sample donors and researchers.”

Cloud computing is most widely associated with consumer products, such as storing music and photos or editing documents in real time. But in fact a great deal of research is already conducted in the cloud, safely and securely. Cloud computing is a shared resource, giving researchers access to storage and computing power as needed, instead of requiring a long-term investment in computer infrastructure. This also maximizes the use of the infrastructure, as it can be used by many researchers instead of just one.