Wednesday, October 8, 2008

The Northeast Biomedical High Performance Computing Collaborative

Today at the 2008 Harvard High Performance Computing Summit, we launched a bold new initiative - The Northeast Biomedical High Performance Computing Collaborative.

Massachusetts is a unique place that fosters collaborations. Whether it's the Clinical and Translational Science Awards or the New England Health EDI Network, we seem to be able to put aside our competitive tendencies and share intellectual property for the benefit of all.

Given our success with High Performance Computing at Harvard and the number of institutions needing biomedical computing resources in the Northeast, the formation of a New England-wide High Performance Computing collaborative makes great sense.

We assembled a national group of experts from industry, other high performance computing centers (Texas, San Diego, Germany, Virginia), and academia to discuss our vision. Here's our strawman plan:

Biomedical Informatics is at an exciting crossroads: the computational challenges facing researchers, clinicians and public health professionals now exceed the computational power typically available in an academic biomedical setting. This is exciting because it means that the advances in high performance computing from other disciplines (e.g. physics) can be brought to bear on the great challenges of life sciences, health and medical research. The opportunities to develop new therapies, monitor trends in ambulatory hospital data and catch and avert drug-related mishaps (e.g. Vioxx) are truly astounding. With the advent of the $1,000 "ome" (genotype, phenotype, labs), the capacity to analyze and predict longitudinally and in real time, as well as the ability to test hypotheses retrospectively, will challenge the computational boundaries of all biomedical research organizations. Computational power is now at the very core of our ability to rapidly advance the state of clinical care and healthcare.

As these computational challenges have risen, however, Harvard, and the Northeast in general, have not taken the lead in high performance computing. Partly because of the limited importance computing has played until recently, it has been a secondary consideration at best. As a result, Harvard and the Boston/Cambridge biomedical community as a whole are not viewed as being at the forefront of the computational side of biomedical research. Moreover, most researchers have had to resort to under-the-desk solutions run by well-meaning but ill-prepared post-docs struggling to manage growing computational needs. From the petabytes of data that will be rolling off of instruments to the millions of daily points of data from the dozens of institutions that will be participating in the new CTSA, it is imperative that the institutions in the Northeast come together to quickly remedy these deficits and move the Northeast to the forefront of HPC infrastructure in the country.

The Vision
The Northeast High Performance Computing Collaborative will be a new institution that will sit under the Harvard Medical School administratively but will have governance drawn from the participating research organizations. Under the overall direction of a rotating chairperson, the organization will be led by a board of directors and a small administrative team. The mission of the collaboration will be to "Provide and facilitate access to shared computational infrastructure for biomedical research and reporting and to create and nurture a community of shared experiences, tools and systems across the academic, biomedical, public and private sector users of biomedical HPC tools". Funded through a combination of philanthropic support, vendor dues, vendor equipment grants, grant funding and user fees, the Collaborative will conduct the following activities:
1) Host and run the NE Biomed Collaborative Research Cluster --- this 3000+ core cluster will highlight the participating vendors' technologies and will be available to any researcher affiliated with any of the participating organizations
2) Create the NE Biomed Grid --- this grid will link together clusters from participating institutions as well as existing grids such as NSFnet and Amazon EC2
3) Help to facilitate the wide-scale adoption of the CTSA-adopted credentials to create open and shared exchange and collaboration
4) Create a web-based repository of shared information and connections for the Northeast Biomed HPC community including an HPC wiki, research software mirrors, and data sharing tools
5) Make available a team of bioinformatics experts at very low cost to researchers. Vendors would fund positions within this group that would provide special expertise in their tools as well as general tools. In addition, several senior bioinformatics staff would be available on a fee basis
6) An internship program which would couple accomplished young graduate students in computer sciences and bioinformatics with researchers needing help optimizing their code for the cluster
7) Provide a research testbed for cutting edge research and technologies
8) An annual HPC outlook report which will provide data on the state of biomedical HPC
9) Conduct the annual Biomed HPC Summit

1) Provide CPU cycles to researchers
2) Provide a repository of shared tools and resources for the entire Boston based Biomed community
3) To create an open grid across all of Harvard, the Hospital affiliates and related academic institutions
4) To facilitate credentialing of individuals and access to data across institutional barriers
5) To provide a gateway to computational networks such as NSFnet
6) To allow industry-academic partnerships
7) To create a test bed within which commercial companies can test their latest technologies
8) To allow PIs access to the latest and greatest technologies
9) To provide bioinformatics support for researchers

Our next step is to refine this vision, formalize the governance of the group, and prioritize the requirements of the stakeholders. This project will be one of my most passionate causes in 2009.