A Guide to Biodata: Analysis, Integration and Management
It has now been 16 years since the Human Genome Project sequenced the first ever human genome. In that time, as mentioned in previous editions of this report, more than 500,000 genomes have been sequenced. The resulting data has drastically raised computational resource usage and created a need for rapid innovation to offset the cost of the storage and computing hardware required to hold and use it.
This innovation has brought lower computing costs, wider availability of sophisticated technology, improved hardware efficiency and, in particular, a shift to online storage and analysis, where more space can be maintained for a fraction of the cost of in-house storage. It is now easier than ever to begin working within the genomics sphere and to use the wealth of data that for a long time remained dormant and unusable.
This third edition of what was the “Genomics Data 101” report has undergone a significant re-branding for a reason: put simply, the title no longer fits the content. The time that has passed since 2003 has seen not only the rapid advancement of genomics technology but also a sector increasingly accustomed to the foundational knowledge required to manage and use biodata. As such, this is no longer a 101. While this guide does contain all the information that individuals new to genomic data will need to know, it has been expanded and deepened considerably to include information that will aid not only newcomers to the sector but also those who already have a solid understanding of the basics. With new chapters on data discoverability, integration, compression and machine learning, this report provides, more than ever before, a comprehensive understanding of what biodata is, where it can be found and how it can best be used.
Readers will learn:
- the process of generating, gathering and analysing genomic data, from beginning to end
- the latest innovations and need-to-know resources in data storage, compression and discoverability
- the different types of AI in genomics, their applications, and the biggest privacy challenges they present
We hope, as always, that this guide proves both informative and educational, not only to beginning geneticists but to anyone who chooses to read it.