Large-scale -omics data solutions for local and cloud-based database storage, efficient access to public repositories, quality control, commercial and academic licensing information, analysis pipeline structuring, and more!


I have over 10 years of experience in large-scale data access, curation, quality control, and analysis, including single-cell and population NGS, proteomics, phenotypic screening analysis, and database design and implementation.

  • Analysis of multiple NGS data modalities using R (Bioconductor) and Python, including population and single-cell RNA-seq for variant calling, functional genomics, and basic transcriptomics, along with data and pipeline QC and integration of results
  • Cloud computing and database storage with GCP, Azure, and AWS
  • Data architecture, indexing, and search design and implementation with Solr, Hadoop, and REST
  • PostgreSQL and SQLite relational database design, management, data curation and validation, and ETL using Aquadata, pgAdmin, and Pipeline Pilot
  • Project tracking with GitHub, Trello, JIRA, and Basecamp for issues, code publication, etc.
  • Natural language processing of PubMed-published data in Python to identify specific candidate datasets for analysis projects

Publications

  • Stathias, V., Turner, J., Koleti, A., Vidovic, D., Cooper, D., Fazel-Najafabadi, M., . . . Schürer, S. C. (2019). LINCS Data Portal 2.0: next generation access point for perturbation-response signatures. Nucleic Acids Research. doi:10.1093/nar/gkz1023
  • Cooper, D. J., & Schürer, S. (2019). Improving the Utility of the Tox21 Dataset by Deep Metadata Annotations and Constructing Reusable Benchmarked Chemical Reference Signatures. Molecules, 24(8), 1604. Retrieved from https://www.mdpi.com/1420-3049/24/8/1604
  • Stathias, V., Koleti, A., Vidović, D., Cooper, D. J., Jagodnik, K. M., Terryn, R., . . . Schürer, S. C. (2018). Sustainable data and metadata management at the BD2K-LINCS Data Coordination and Integration Center. Scientific Data, 5, 180117.
  • Keenan AB, Jenkins S, Jagodnik K, . . . Cooper DJ, . . . Pillai A. The Library of Integrated Network-Based Cellular Signatures NIH Program: System-Level Cataloging of Human Cells Response to Perturbations (2018). Cell Systems, 6(1):13-24
  • Koleti A, Terryn R, . . . Cooper DJ, . . . Schürer SC. Data Portal for the Library of Integrated Network-based Cellular Signatures (LINCS) program: integrated access to diverse large-scale cellular perturbation response data (2018). Nucleic Acids Res, 46(D1):D558-D566
  • Cooper DJ, Zunino G, Bixby JL, Lemmon VP. Phenotypic screening with primary neurons to identify drug targets for regeneration and degeneration (2017). Mol Cell Neurosci, 80:161-169

Contact me for more information.