Next Generation Technology to Connect Researchers
Data transfer is one of the biggest problems facing genomic researchers who need to move massive files and volumes of data. Outdated infrastructure and unreliable software have made sharing big data an expensive and time-consuming process. We caught up with Aspera’s Michael Ortega to learn how their revolutionary transfer system is helping to connect researchers across the world.
FLG: How would you describe what Aspera does?
MO: Aspera’s mission is to create next generation transport technology that allows people to reliably move large digital assets quickly and securely over wide area networks. Over the last decade, file sizes in the field of genomics have become enormous with NGS sequencers pumping out hundreds of gigabytes of data a day. That data needs to be shared around the world with researchers, clinicians, and bioinformaticians. Aspera helps move this data securely, reliably, and as quickly as possible, regardless of file size, network conditions, or distance.
If you’ve ever resorted to mailing a computer hard drive to get your data to point “B” or tried to send, share, or download large amounts of data over the internet, then you know it can be painfully slow and risky. The basic protocol for moving data on the internet is TCP (Transmission Control Protocol). Unfortunately, TCP was developed in the 1970s and is not well suited to today’s needs: moving big volumes of data over long distances, whether across the globe or to the cloud. Most life sciences organizations rely on FTP, an old TCP-based transfer technology, to move their data, so they face all the same challenges of moving big data across global networks. This is the fundamental problem that Aspera has solved to deliver performance unlike any other solution.
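To see why distance hurts TCP-based transfers, a rough back-of-the-envelope model helps. The sketch below uses the well-known Mathis et al. approximation for the steady-state throughput of a single classic, loss-based TCP connection, roughly (MSS / RTT) × (1.22 / √p). It is not from the interview and says nothing about how FASP works internally. The segment size, round-trip times, and loss rate are illustrative assumptions, and modern TCP variants can do better than this simple bound.

```python
import math


def tcp_throughput_bound_mbps(mss_bytes: int, rtt_ms: float, loss_rate: float) -> float:
    """Approximate upper bound (Mathis et al.) on single-flow TCP throughput, in Mbps.

    mss_bytes: maximum segment size in bytes (assumed value, e.g. 1460)
    rtt_ms:    round-trip time in milliseconds
    loss_rate: packet loss probability, 0 < loss_rate < 1
    """
    rtt_s = rtt_ms / 1000.0
    bytes_per_second = (mss_bytes / rtt_s) * (math.sqrt(1.5) / math.sqrt(loss_rate))
    return bytes_per_second * 8 / 1e6


# Hypothetical paths: the RTTs and the 0.1% loss rate are assumptions for illustration.
for label, rtt_ms in [("same city", 10), ("cross-country", 80), ("intercontinental", 250)]:
    bound = tcp_throughput_bound_mbps(mss_bytes=1460, rtt_ms=rtt_ms, loss_rate=0.001)
    print(f"{label:>16}: RTT {rtt_ms:>3} ms -> at most ~{bound:.1f} Mbps per connection")
```

Under these assumed values the bound falls in direct proportion to round-trip time, regardless of how much bandwidth the link has, which is consistent with the long-distance slowdowns described above.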
We do this with our patented FASP® (Fast, Adaptive and Secure Protocol) technology, which is built into the core of all our software solutions. Our FASP protocol replaces TCP-based file transfer technologies (like FTP) enabling our customers to move their data as fast as possible over the internet. Our software solutions are also very flexible, running on existing hardware so organizations don’t need to purchase new network technology. Additionally, our software can be deployed on virtually any infrastructure – on-premises, in the cloud, or in a hybrid environment.
FLG: Aspera is a prominent player and award winner in the media industry. How has that translated into your work with life science organisations?
MO: Aspera had a lot of early traction with media and entertainment companies. They were early adopters of big data, moving HD video files and images around the world. Huge video files need to be moved from remote film sets to production teams and special effects houses, and ultimately to homes and theatres. Using unreliable TCP-based solutions or costly hard drive shipments wasn’t cutting it, so Aspera was a natural solution. Today, we provide the transfer backbone for most big-name media companies, including Netflix. We even won an Emmy® as “an industry game changer” for our technology in 2013.
In life sciences, teams aren’t typically moving video files, but they experience similar challenges sharing and distributing large research and clinical data over WANs. Genomic data sets in particular have grown dramatically in size over the last decade, with next gen sequencers generating hundreds of gigabytes of data per run. And it’s not just genomic and transcriptomic data, but mass spec, imaging mass cytometry, radiology and other medical imagery, digital pathology whole slide images, and more. Advancements in research technology and high-throughput tools allow researchers to produce large volumes of incredibly rich and diverse sets of biological data that often require integration with other data types for a “systems” level analysis. Bigger data has quickly become the new norm.
The field is also unique in its collaborative nature. These massive data sets are often produced at sequencing sites or labs in one part of the country and shared with teams around the world to further research and develop new treatments. Non-profits, academics, hospitals, labs, and research teams all over the globe need to be able to share data to advance critical discovery and life-saving patient treatment. Additionally, bottlenecks are developing within IT. A lot of labs and hospitals don’t have the storage space or compute power to manage and process this growing volume of data, so they are considering fully or partly moving their large data to the cloud. Those challenges in moving data are at the core of Aspera’s mission, and solving them is what we’ve built our success on.
FLG: Which success are you most proud of in genomics and, more broadly, in healthcare & life sciences?
MO: One success that comes to mind is our work with EMBL (European Molecular Biology Laboratory). They are one of the world’s leading research institutions. Research teams all over the world send their biological samples to EMBL’s facilities in Europe for sequencing. The output from these sequencers is huge: files upwards of 30 gigabytes per sample. EMBL was using FTP and physical shipments to send this data to researchers, but it was extremely slow and unreliable, sometimes taking days to deliver. When they tested Aspera, they immediately saw transfer speeds 100 times faster while maintaining tight security. This significantly helped accelerate crucial research. Now EMBL moves up to 10,000 terabytes of genomics data a year with Aspera.
Personally, I’m also very proud of our work with UPMC (University of Pittsburgh Medical Center). They’re a big hospital system in the USA, with 60,000 employees and collaborations around the world. They have a pathology consult program with KingMed Diagnostics in China that involves researchers and physicians using a server in China to share large whole slide images. Unfortunately, transfer speeds weren’t very good, at only 2 to 3 Mbps. When they tried our solution, they immediately saw a 40x improvement in transfer speeds, and were able to leverage the patented capability of the Aspera FASP protocol to set parameters for transfers so as not to congest their network. As a result, turnaround times for patient consults were reduced to under 24 hours. That’s huge when you’re talking about patient care.
Another exciting and emerging area for us is cloud-based bioinformatics. Bluebee, a recent Aspera customer, is a commercial provider of a SaaS bioinformatics platform that accelerates the processing and analysis of large genomic data sets. Their solution provides the scalability and accessibility of the cloud, which is great when dealing with big data. However, getting this large genomic data into the cloud is a big problem. Their customers were uploading up to 360 GB of data per patient. FTP and other transfer solutions were impractical, resulting in missed SLAs (Service-Level Agreements) and delayed research. They tested Aspera and immediately saw improvements of 35%, even on the slowest networks. Aspera is now integrated into their platform, providing their customers with the fastest possible upload speeds and a seamless user experience.
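For a sense of scale, the arithmetic behind a 360 GB upload can be sketched as follows. The sustained rates are hypothetical values chosen for illustration, not figures from Bluebee or Aspera, and the calculation ignores protocol overhead and uses decimal gigabytes.

```python
def transfer_hours(size_gb: float, rate_mbps: float) -> float:
    """Hours needed to move size_gb (decimal gigabytes) at a sustained rate in megabits per second."""
    total_bits = size_gb * 1e9 * 8
    seconds = total_bits / (rate_mbps * 1e6)
    return seconds / 3600


# Assumed sustained rates in Mbps; real-world rates depend on the network and the protocol.
for rate_mbps in (10, 100, 1000):
    hours = transfer_hours(size_gb=360, rate_mbps=rate_mbps)
    print(f"360 GB at {rate_mbps:>4} Mbps sustained: about {hours:.1f} hours")
```

The takeaway is that the sustained rate a transfer actually achieves, rather than the nominal capacity of the link, determines whether a per-patient upload takes hours or days.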
FLG: This is still a field that’s getting used to the cloud; in my experience, you get some groups who embrace the new technology and understand it and the benefits of it, but there is also some hesitation around the cloud in some quarters. From where you come at it, how do you approach that kind of conversation with people who are new to it and perhaps need to have it all explained in a way that they can feel comfortable with it?
MO: Our software is very flexible, so ultimately we’re indifferent. If you want your data in the cloud that’s not a problem. If you want it in a local datacentre, we accommodate that as well. I think in the life sciences space, organizations are starting to consider the cloud because they’re hitting infrastructure limitations, be it storage or compute power, and they’re seeing bottlenecks in research or patient diagnosis, especially with NGS data. They need the turnkey scalability and elasticity of the cloud. We try to be supportive and say, “What are you trying to achieve?” We discuss how the cloud may help them, but it’s ultimately their decision.
The other concern with the cloud is security: meeting HIPAA compliance and keeping patient health information protected. 2016 was a record-setting year for HIPAA fines, so there’s a challenge here to address that’s relevant to all sides of the equation, down to the individual as a healthcare recipient. The major cloud providers have made and continue to make big investments in security to help address some of these concerns.
Data transfer is only one piece of data security, but Aspera has invested significantly to provide a highly secure solution. Our software encrypts data during transit and at rest, performs data integrity and verification checks, and provides robust user authentication and admin controls. It is imperative that our customers are able to move their sensitive and valuable data with confidence and as little risk as possible, regardless of whether the transfer is to the cloud or to a peer or data repository in another location.
FLG: Who’s your ideal customer in the life sciences space?
MO: Our ideal customer is literally any individual, academic, clinical or commercial organization that needs to securely distribute, move, and/or sync gigabytes or more of data between dispersed locations in a reliable and/or time-critical fashion.
With flexible deployment modes, we serve the individual academic researcher performing sporadic data transfers to high-performance computing centres or collaborative research data portals. We also work incredibly well for service labs, helping them accelerate turnaround times and improve control over data distribution. We have many industry partners who embed Aspera into their software or hardware technology for high-performance cloud ingest, to ultimately improve the experience for their end users.
We believe that, especially for the life sciences, we have the proven go-to solution to replace hard drive shipping and slow, unreliable TCP-based data transfers, along with flexible deployment and pricing models to accommodate a range of applications. This is a completely different and smarter way to move and distribute large data sets, and we think it could further accelerate progress across the life sciences industry. Progress can’t wait and it’s incredibly rewarding to be a part of this industry’s innovative and impactful work.