In an episode of Amazon Web Services’ This Is My Architecture, Anish Kejariwal, Senior Director of Engineering at Station X, gives a rundown of the company’s platform, GenePool. In his words, the platform is for ‘analysing, visualising, and managing genomic data.’

Using Qubole, GenePool runs Apache Hive to convert raw sequencing data into ORC files that are partitioned by chromosome and sorted by genomic coordinate. Once the data is organised, users can search it with near-real-time queries in Presto, which is also run through Qubole.
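
To illustrate why this layout helps region queries, here is a minimal in-memory sketch in Python. It is not Station X's code and does not use Hive, ORC, or Presto; the function names and the sample records are hypothetical. It only mimics the two ideas described above: data is grouped by chromosome, so a query touches a single partition, and rows within each partition are sorted by coordinate, so a range can be located by binary search rather than a full scan.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def build_partitions(records):
    """Group (chrom, pos, variant_id) records by chromosome.

    Each partition is stored as two parallel lists, sorted by position:
    one of positions and one of variant ids.
    """
    grouped = defaultdict(list)
    for chrom, pos, variant_id in records:
        grouped[chrom].append((pos, variant_id))

    partitions = {}
    for chrom, rows in grouped.items():
        rows.sort()  # sort by genomic coordinate within the chromosome
        positions = [pos for pos, _ in rows]
        variant_ids = [variant_id for _, variant_id in rows]
        partitions[chrom] = (positions, variant_ids)
    return partitions


def query_region(partitions, chrom, start, end):
    """Return variant ids on `chrom` with start <= pos <= end."""
    if chrom not in partitions:
        return []
    # Only the requested chromosome's partition is read.
    positions, variant_ids = partitions[chrom]
    # Sorted positions let us find the range bounds by binary search.
    lo = bisect_left(positions, start)
    hi = bisect_right(positions, end)
    return variant_ids[lo:hi]


# Hypothetical sample data, for illustration only.
records = [
    ("chr2", 500, "v4"),
    ("chr1", 300, "v3"),
    ("chr1", 100, "v1"),
    ("chr1", 200, "v2"),
    ("chr2", 50, "v5"),
]
partitions = build_partitions(records)
print(query_region(partitions, "chr1", 150, 300))  # ['v2', 'v3']
```

In a real columnar store the equivalent effects would come from partition pruning and from skipping blocks of rows whose coordinate range falls outside the query, but the underlying reasoning is the same.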