The Common Fund Data Ecosystem (CFDE) has created a flexible system of data federation that enables researchers to discover datasets from across the US National Institutes of Health Common Fund without requiring that data owners move, reformat, or rehost those data. This system is centered on a catalog that integrates detailed descriptions of biomedical datasets from individual Common Fund Programs' Data Coordination Centers (DCCs) into a uniform metadata model that can then be indexed and searched from a centralized portal. This Crosscut Metadata Model (C2M2) supports the wide variety of data types and metadata terms used by individual DCCs and can readily describe nearly all forms of biomedical research data.
Proc Natl Acad Sci U S A
January 2022
Defining the structural and functional changes in the nervous system underlying learning and memory represents a major challenge for modern neuroscience. Although changes in neuronal activity following memory formation have been studied [B. F.
Database evolution is a notoriously difficult task, and it is exacerbated by the necessity to evolve database-dependent applications. As science becomes increasingly dependent on sophisticated data management, the need to evolve an array of database-driven systems will only intensify. In this paper, we present an architecture for data-centric ecosystems that allows the components to seamlessly co-evolve by centralizing the models and mappings at the data service and pushing model-adaptive interactions to the database clients.
Proc IEEE Int Conf Escience
October 2017
Creating and maintaining an accurate description of data assets and the relationships between assets is a critical aspect of making data findable, accessible, interoperable, and reusable (FAIR). Typically, such metadata are created and maintained in a data catalog by a curator as part of data publication. However, allowing metadata to be created and maintained by data producers as the data are generated, rather than waiting for publication, can have significant advantages in terms of productivity and repeatability.
Proc IEEE Int Conf Escience
October 2017
The pace of discovery in eScience is increasingly dependent on a scientist's ability to acquire, curate, integrate, analyze, and share large and diverse collections of data. It is all too common for investigators to spend inordinate amounts of time developing ad hoc procedures to manage their data. In previous work, we presented Deriva, a Scientific Asset Management System designed to accelerate data-driven discovery.