ISO Genomic Data Standardization

GenomSys is working with a group of international experts in information theory, data compression and bioinformatics to develop an open standard specification that will be maintained and supported by ISO. Products compliant with this standard will enable customers to create truly interoperable ecosystems for genomic data storage, processing and transport. The standardization work is already under way within the ISO/IEC JTC 1/SC 29/WG 11 working group (MPEG), in official liaison with ISO/TC 276/WG 5 (Biotechnology / Data Processing and Integration).

The development of Next Generation Sequencing (NGS) technologies enables the use of genomic information as everyday practice in several fields, but the growing volume of generated data is becoming a serious obstacle to its wide adoption. The lack of an appropriate representation and efficient compression of genomic data is widely recognized as a critical factor limiting its application potential. Besides compression, which underlies any efficient processing of genomic information, there are several other requirements that the current data formats for raw and aligned data do not fulfill.

ISO/TC 276 works on standardization in the field of biotechnology processes, including analytical methods (Working Group 3) and data processing and integration (Working Group 5).

ISO/IEC JTC 1/SC 29/WG 11 (MPEG) is tasked with developing standards for the coded representation and compression of digital audio, video and related data. In its 28 years of activity, MPEG has developed many generations of video and audio compression standards.

MPEG genome compression
ISO/TC 276/WG 5 (Data Processing and Integration) and MPEG have combined their expertise and missions and are jointly developing a new compression standard intended to provide effective solutions for genomic information processing applications.

ISO/IEC is providing the framework for the development of an open genome compression standard based on the following steps:

  • An open public call to collect relevant technologies satisfying all or a subset of the identified requirements. The call was jointly issued in June 2016.
  • The assessment of the performance of the submitted technologies. This activity started in October 2016 and is ongoing.
  • The selection and integration of the best-performing technologies into a platform, called the “General Model”, for the evaluation and verification of performance and the validation of requirement fulfillment.
  • The progressive improvement of the integrated technologies by a process of open experiments (“core experiments”).
  • The approval of the standard by progressive stages as established by the official ISO/IEC procedure.
  • The publication of a normative and informative specification, in the form of text and reference software, providing the appropriate support for an open standard on which to build interoperable genomic information processing applications.

Standardization workplan
  • Draft Call for Proposals (CfP) issued – San Diego, U.S.A., 18th March 2016
  • CfP issued – Geneva, Switzerland, 3rd June 2016
  • CfP responses, deadline for submissions – 12th October 2016
  • Technology identification and definition of Core Experiments – Chengdu, China, 21st October 2016
  • General Model 0 and preliminary Working Draft – January 2017
  • Committee Draft – October 2017
  • Draft International Standard – October 2018
  • Final Draft International Standard – April 2019