I have worked with FAIR data since shortly after the seminal Wilkinson et al. (2016) publication on the usability and accessibility of scientific data. My experience spans ontology curation, standardization and validation of data reporting, harmonization of machine- and human-readable terms for cross-omics data integration, and the broader application of FAIR concepts.


Findable, Accessible, Interoperable, and Reusable (FAIR) data is relatively new as a formal concept, but scientists have applied its underlying practices for decades: linking transcriptomic with proteomic expression data, combining multiple independent runs of an identical assay for statistical analysis, and more, all through common terminologies. I have worked for the Library of Integrated Network-based Cellular Signatures (LINCS) and Illuminating the Druggable Genome (IDG) programs, establishing data reporting standards based on ontological controlled vocabularies, coupled with automated validation of assay and reagent data, manual curation of data reports for later integration, and more. Whether you need help establishing experimental FAIR practices and data standards, from planning through reporting, or finding additional data relevant to your group's work, I can help.

  • BioPortal-centric ontology design, publication, and maintenance, along with use of those ontologies as controlled vocabularies for data standardization and schema design
  • Extensive use of JSON-LD and XML for storing and working with hierarchical ontology and metadata terms
  • Assay and reagent data curation, integration, database management, ETL, standards development and enforcement, and establishment of the Data Submission System (DSS) software platform for LINCS
  • Established semantically defined metadata standard specifications and the Resource Submission System (RSS) software platform for IDG, and enforced those standards
  • Integrated and analyzed data from FDA/EPA Tox21 project reporter gene assays through chemical structure machine learning, deep metadata annotation analysis, and cross-dataset virtual screening
  • Enrichment of analytical product development through extract-transform-load (ETL) of high-quality, ontology-centric gene and protein expression localization data, gold-standard compound curation, client report generation, and contract work supervision
  • Project manager for a contract to curate small-molecule indications, side effects, targets, identifiers, interactions, and more
  • Curator for the BioAssay Ontology (BAO), Drug Target Ontology (DTO), and RegenBase, along with establishing the Ontology for Stem Cell Investigation (OSCI)

Publications

  • Cooper, D. J., Koleti, A., O’Connor, M. J., Willrett, D., Chung, C., Graybeal, J., Musen, M., & Schürer, S. FAIR LINCS Metadata powered by the CEDAR Modular Framework and Validation. In prep.
  • Cooper, D. J., Koleti, A., Vidovic, D., Stathias, V., & Schürer, S. Formalization of LINCS metadata standards through JSON schema organization and ontological definition. In prep.
  • Clarke, D. J. B., Wang, L., Jones, A., Wojciechowicz, M. L., Torre, D., Jagodnik, K. M., Cooper, D., … Ma’ayan, A. (2019). FAIRshake: Toolkit to Evaluate the FAIRness of Research Digital Resources. Cell Systems. Cell Press. https://doi.org/10.1016/j.cels.2019.09.011
  • Cooper, D. J., & Schürer, S. (2019). Improving the Utility of the Tox21 Dataset by Deep Metadata Annotations and Constructing Reusable Benchmarked Chemical Reference Signatures. Molecules, 24(8), 1604. Retrieved from https://www.mdpi.com/1420-3049/24/8/1604
  • He, Y., Duncan, W. D., Cooper, D. J., Hansen, J., Iyengar, R., Ong, E., … Diehl, A. D. (2019). OSCI: standardized stem cell ontology representation and use cases for stem cell investigation. BMC Bioinformatics, 20(5), 180. https://doi.org/10.1186/s12859-019-2723-7
  • Callahan, A., Danzi, M. C., Zunino, G., Cooper, D. J., Shah, N. H., Visser, U., Bixby, J. L., & Lemmon, V. P. Elucidating effects of nerve injury on gene expression using RegenBase, a knowledge base of spinal cord injury biology. Proceedings of the 19th Bio-Ontologies Special Interest Group. Publication and oral presentation, 7-8-2016; presented by A. Callahan.

Invited Talks

  • “Stem cell and cell line domain tutorial” and “Standardization and integration of stem cell line information in the national LINCS research consortium”, Workshop on Ontologies for Stem Cell and Stem Cell Line Cells, oral presentations, April 2018, Ann Arbor, MI
  • “LINCS-CEDAR: Collaboration, Integration, and Validation”, National Cancer Institute software integration demonstration of the Center for Enhanced Data Annotation and Retrieval (CEDAR) platform, March 2018, Stanford, CA (remote presentation and demo from Miami)

Contact for more info