Section: Application Domains
Application fields
Our methods are applied in several fields of molecular biology.
Our main application field is marine biology, as it cuts across issues in integrative biology, dynamical systems and sequence analysis. Our main collaborators work at the Station Biologique de Roscoff. We are strongly involved in the study of brown algae: the meneco, memap and memerge tools were designed to carry out a complete reconstruction of metabolic networks for non-benchmark species [77], [64]. On the same application model, the pattern discovery tool protomata learner, combined with supervised bi-clustering based on formal concept analysis, allows for the classification of sub-families of specific proteins [61]. The same tool also allowed us to gain a better understanding of cyanobacteria proteins [3]. At the larger level of 4D structures, classification techniques have also allowed us to introduce new methods for the characterization of viruses in marine metagenomic samples [18]. Finally, in dynamical systems, we use asymptotic analysis (tool pogg) to decipher the initiation of sea urchin translation [49]. We are currently involved in two new applications in this domain: the team participates in an Inria Project Lab program with the Biocore and Ange Inria teams, focused on the understanding of green micro-algae; and we are involved in deciphering phytoplankton variability at the systems biology level, in collaboration with the Station Biologique de Roscoff (ANR Samosa).
In microbiology, our main concern is the understanding of bacteria living in extreme environments, mainly in collaboration with the bioinformatics group at Universidad de Chile (funded by CMM, CRG and Inria-Chile). In order to elucidate the main characteristics of these bacteria, we develop efficient methods to identify the main groups of regulators involved in their specific response to their living environment. To that end, we use constraint-based modeling and combinatorial optimization. The integrative biology tools meneco, bioquali, ingranalysis, shogen and lombarde were designed in this context [6]. In parallel, in collaboration with Ifremer (Brest), we have conducted similar work to decipher protein-protein interactions within archaebacteria [75]. Our sequence analysis tool (logol) allowed us to build and maintain a very expressive CRISPR database [10], [48].
Similarly, in animal biology, our goal is to propose methods to identify regulators of very complex phenotypes related to nutritional issues. In collaboration with researchers from the Inra/Pegase and Inra/Igeep laboratories, we develop methods to distinguish the response of cows, chickens or pigs to different diets or treatments [40] and to characterize upstream transcriptional regulators of such a response [53], with relevant applications in pigs [24], [37]. The pattern matching tool logol also allows for a fine identification of transcription factor motifs [63], [48]. Constraint-based programming also allows us to decipher regulators of reproduction for pea aphids [69], [92]. Semantic-based analysis was useful for interpreting differences in gene expression in pork meat [67].
We are less involved in biomedical applications, as the models and data studied in this application field are well informed and rather data-driven. In collaboration with Institut Curie, we have studied the Ewing Sarcoma regulation network to test the capability of our tool bioquali to accurately correct and predict the behavior of a large-scale network [45]. Our ongoing studies in this field focus on the exhaustive learning of discrete dynamical networks that match experimental data, as a case study for modeling experimental design with constraint-based approaches. To that end, we collaborate with J. Saez Rodriguez's group at EBI [89] and N. Theret's group at Inserm/Irset (Rennes) [42]. The dynamical system tools caspo and cadbiom were designed within these collaborations. Ongoing studies focus on the understanding of the metabolism of xenobiotics (mecagenotox program) and the filtering of sets of regulatory compounds within large-scale signaling networks (TGFSysBio project).