GENCROBAT
GENCROBAT is a computational system for efficiently transmitting the genetic data produced by high-throughput DNA sequencing equipment to cloud computing service providers, where the data is processed and turned into information. The transmission step not only moves huge volumes of sequencing data efficiently over the Internet, but also generates useful information for the later stages of analysis on the cloud side.

A simple analogy: GENCROBAT acts as an acrobat carrying DNA sequencing data from where it is produced to where it will be processed, across a thin rope, the communication channel. The volume to be transported is so large that the acrobat must cross the rope many times, so it aims to carry as much as the rope allows on each pass. To achieve this, before it starts carrying the raw data, it packs the data into special vacuum bags that squeeze the goods so that more fits into each trip. Only related items are packed into the same bag, so when a package arrives, the destination knows what is inside and where to put it. Packing takes time, but because it is done in a novel, structured way, the time spent on packing is recovered at the destination during the operations carried out there.

GENCROBAT offers an innovative way to reorganize the processing pipeline, from data generation through to the final analysis. It treats the life science center as part of the larger cloud and begins by preprocessing the data on the sender side. This preprocessing not only improves compression ratio and speed but also generates useful knowledge that helps on the cloud side.
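The vacuum-bag idea can be illustrated with a toy sketch. This is not GENCROBAT's actual method, which the text does not specify. It assumes, purely for illustration, that reads are "related" when they share their first few bases, and it uses Python's standard zlib as a stand-in compressor. The names pack_reads, unpack_reads and key_len are hypothetical.

```python
import zlib
from collections import defaultdict


def pack_reads(reads, key_len=4):
    """Group reads by a shared prefix, then compress each group separately.

    Assumption for illustration only: two reads are 'related' when their
    first key_len bases match. Each package is labelled with that key, so
    the receiver knows what is inside before decompressing it.
    """
    buckets = defaultdict(list)
    for read in reads:
        buckets[read[:key_len]].append(read)
    return {
        key: zlib.compress("\n".join(group).encode("ascii"))
        for key, group in buckets.items()
    }


def unpack_reads(packages):
    """Receiver side: restore every labelled package to its list of reads."""
    return {
        key: zlib.decompress(blob).decode("ascii").split("\n")
        for key, blob in packages.items()
    }


# Made-up example reads, not real sequencing data.
reads = ["ACGTACGTAA", "ACGTACGTAC", "TTGACCATGA", "ACGTTCGTAA", "TTGACCATGC"]
packages = pack_reads(reads)
restored = unpack_reads(packages)
```

Here the package labels play the role of the knowledge that travels with each bag: the receiver can route a package by its key without inspecting the contents first. Whether grouping actually improves the compression ratio depends on the data and the compressor, so the sketch makes no claim about it.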

Connecting sequencing centers to the cloud
The GENCROBAT platform aims to bridge the gap between sequencing centers and cloud computing services in order to improve the speed and quality of data analysis.

THE CHALLENGE
High-throughput DNA sequencing equipment produces huge volumes of data at an ever-increasing rate. While the cost of sequencing decreases by ≈5x per year, the cost of computing decreases by ≈2x at best. Very soon, interpreting omics data will cost much more than generating it.

Cloud computing services dedicated to bioinformatics seem promising as a way to keep pace with this massive generation of sequencing data, and cloud services that offer speedy sequence analysis are now widely discussed.

Despite the difficulties of processing sequencing data that stem from its volume and structure, the first problem is to transmit the data from where it is produced to where it will be processed. Typically, this means uploading the DNA sequencing data from a life science center to a remote cloud computing environment.
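The cost figures quoted in this section can be turned into a back-of-the-envelope calculation. The sketch below takes the ≈5x and ≈2x yearly decreases at face value and assumes, hypothetically, that both costs start from the same baseline of 1.0. The function names are illustrative only.

```python
def relative_cost(yearly_drop, years):
    """Cost relative to year 0 when it falls by a factor of yearly_drop each year."""
    return 1.0 / (yearly_drop ** years)


def compute_to_sequencing_ratio(years, seq_drop=5.0, compute_drop=2.0):
    """How many times more expensive computing is than sequencing after `years`.

    Assumes both costs start at the same baseline (1.0) in year 0.
    """
    return relative_cost(compute_drop, years) / relative_cost(seq_drop, years)


# Print the gap for the first few years.
for years in range(4):
    print(years, compute_to_sequencing_ratio(years))
```

Under these assumptions the gap grows by a factor of 5/2 = 2.5 every year, which is why interpretation is expected to overtake generation as the dominant cost.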
