Co-built Scimagine's research data platform end-to-end (web application, backend services, and storage layer) for materials and experimental sciences. Delivered schema-flexible ingestion of heterogeneous instrument output, non-destructive hyper-annotation of raw datasets, and a unified cross-format search index, all aligned with FAIR principles so outputs stay findable, accessible, interoperable, and reusable.
The Challenge
Materials-science and experimental groups generate heterogeneous instrument output — images, spectra, point clouds, time-series, lab notebooks — in formats that vary by vendor and protocol. The result is petabyte-class data that sits dormant on institutional servers: hard to search, hard to reuse, and hard to share across teams without losing institutional control.
Our Approach
We co-built the platform end-to-end. A schema-flexible ingestion layer absorbs heterogeneous instrument output and extracts FAIR-aligned metadata while preserving each format's native structure. A hyper-annotation layer attaches structured labels on top of raw datasets without modifying them; because labels and extracted metadata feed a single index, cross-experiment query and knowledge mining run against one place rather than one store per format. Around that core sits a full-stack web platform (researcher-facing web application, REST APIs, and cloud backend services) with encryption at rest and multi-tenant isolation for institutional deployments.
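To make the non-destructive annotation idea concrete, here is a minimal sketch, not the platform's actual implementation. It assumes raw datasets are identified by a SHA-256 content hash and that labels live in a separate in-memory side store; the names `Annotation`, `AnnotationOverlay`, `annotate`, and `find` are hypothetical and chosen only for illustration.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    dataset_id: str  # content hash of the raw bytes the label refers to
    key: str
    value: str


@dataclass
class AnnotationOverlay:
    """Hypothetical side store: labels are kept apart from the raw bytes."""

    _by_dataset: dict[str, list[Annotation]] = field(default_factory=dict)

    @staticmethod
    def dataset_id(raw: bytes) -> str:
        # Assumption: a dataset is identified by the hash of its content.
        return hashlib.sha256(raw).hexdigest()

    def annotate(self, raw: bytes, key: str, value: str) -> Annotation:
        # Only the hash of the raw data is read; the data is never rewritten.
        ann = Annotation(self.dataset_id(raw), key, value)
        self._by_dataset.setdefault(ann.dataset_id, []).append(ann)
        return ann

    def find(self, key: str, value: str) -> list[str]:
        # Cross-format query: return the ids of all datasets carrying the label.
        return sorted(
            ds
            for ds, anns in self._by_dataset.items()
            if any(a.key == key and a.value == value for a in anns)
        )


# Two datasets in different formats, labelled through the same overlay.
spectrum = b"wavelength,intensity\n400,0.12\n"
image = b"\x89PNG placeholder bytes"

overlay = AnnotationOverlay()
overlay.annotate(spectrum, "sample", "TiO2")
overlay.annotate(image, "sample", "TiO2")
overlay.annotate(image, "instrument", "SEM")

hits = overlay.find("sample", "TiO2")
```

In this sketch a single query on the label returns both the spectrum and the image, even though their formats differ, and the raw bytes are untouched. A production system would presumably persist the overlay and back `find` with a real search index rather than a dictionary scan.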
Results
Research data that previously sat siloed becomes searchable, annotated, and AI-ready while staying under the originating institution's control. The platform serves individual researchers and institutional tenants under an open-source-first strategy aligned with the Leiden Declaration on FAIR Digital Objects.
