PID4NFDI Technical Service Components Documentation

An overview of the technical building blocks that make up the PID4NFDI service portfolio

This page documents the individual technical service components that together form the PID4NFDI Coordination Hub. Each component addresses a distinct aspect of PID adoption, metadata quality, and infrastructure interoperability within NFDI.

Contents

PID4NFDI Coordination Hub Website
PID Selection Tool
PID Cookbook
PIDINST Search
ePIC Prefix Registration
B2INST
Data Type Registry
PID Meta Resolver
CAT – Compliance Assessment Toolkit
DataCite Schema 4.6 Namespace
Crosswalks for Semantic Interoperability
Early Lifecycle Semantic Integration
Zotero Library

PID4NFDI Coordination Hub Website

The PID4NFDI Coordination Hub is the central technical and strategic component of the PID4NFDI project. It is realized primarily through a dedicated website that serves as the entry point to PID4NFDI's services and knowledge base. The platform consolidates technical, organizational, and strategic measures to enhance metadata quality and interoperability across the NFDI landscape. Through the website, repository managers, infrastructure providers, and researchers gain direct access to a modular service portfolio that includes technical tools, governance guidelines, best practices, and training materials such as FAQs and toolkits. With continuous technical maintenance and editorial development, the Hub functions as a consolidated knowledge base that fosters exchange between NFDI consortia. It promotes the standardized use of PIDs across disciplines by integrating concrete use cases and technical advancements, such as expanded access to PID-referenced information (e.g. the Cookbook).

PID Selection Tool

The PID Selection Tool is an interactive decision-support tool designed to help repository and infrastructure managers identify which PID system best fits their specific use case and integration requirements. It guides users through short statements across four thematic sections – Persistence and Costs, Purpose, Metadata & Interoperability, and Technical Setup and Training – each reflecting a typical consideration when selecting a PID service. Users rate the importance of each statement, and the tool then compares their preferences against expert evaluations to produce a ranked visual overview of suitable PID services. Currently, the tool covers four object-related PID services: DataCite DOI, ePIC Handles (GWDG), URN:NBN, and ARK. While broadly applicable, the tool is primarily oriented toward the German research landscape and the NFDI context.
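The comparison of user preferences against expert evaluations can be pictured as a weighted average. The following is a minimal sketch in Python, not the tool's actual scoring logic (the tool itself runs as JavaScript in the browser). The service names, statement keys, expert values, and the function `rank_services` are all hypothetical placeholders chosen for illustration.

```python
# Hypothetical expert evaluations: how well each service satisfies each
# statement on a 0-1 scale. Names and numbers are invented placeholders,
# not the ratings used by the PID Selection Tool.
EXPERT_SCORES = {
    "Service A": {"low_cost": 0.4, "rich_metadata": 1.0, "easy_setup": 0.7},
    "Service B": {"low_cost": 0.8, "rich_metadata": 0.5, "easy_setup": 0.6},
    "Service C": {"low_cost": 0.9, "rich_metadata": 0.3, "easy_setup": 0.5},
    "Service D": {"low_cost": 1.0, "rich_metadata": 0.4, "easy_setup": 0.4},
}


def rank_services(importance, expert_scores=EXPERT_SCORES):
    """Return (service, score) pairs sorted from best to worst fit.

    `importance` maps each statement to the weight the user assigned to it.
    The score is the importance-weighted mean of the expert values.
    """
    total = sum(importance.values())
    if total == 0:
        raise ValueError("at least one statement needs a non-zero importance")
    ranking = []
    for service, scores in expert_scores.items():
        weighted = sum(importance[s] * scores.get(s, 0.0) for s in importance)
        ranking.append((service, weighted / total))
    return sorted(ranking, key=lambda item: item[1], reverse=True)


# A user who cares mostly about metadata richness.
user_importance = {"low_cost": 1, "rich_metadata": 5, "easy_setup": 2}
ranking = rank_services(user_importance)
for service, score in ranking:
    print(f"{service}: {score:.2f}")
```

With these placeholder values, the service with the strongest metadata rating ranks first. A different weighting or aggregation scheme would be equally plausible, and the sketch makes no claim about which one the tool uses.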

Technically, the tool is implemented as a client-side HTML and JavaScript application, embedded as a page within the PID4NFDI website, which is built using the Hugo static site generator with the HugoBlox (Research) theme and hosted via GitHub Pages. All processing occurs in the browser without any backend, ensuring a lightweight and privacy-friendly user experience. Development and maintenance take place in a version-controlled GitHub repository.

PID Cookbook

The PID4NFDI Cookbook is a training resource to support researchers, data stewards, and infrastructure providers in understanding and implementing PIDs in research workflows. Its purpose is to explain what PIDs are, why they are essential for FAIR research data management, and how different PID systems can be selected and integrated in practice. The Cookbook is maintained as a version-controlled documentation project in a GitHub repository, built using Sphinx as a static documentation generator, and automatically compiled and hosted via Read the Docs.

PIDINST Search

PIDINST Search is a Flask-based web application designed to support the discovery and exploration of research instruments registered with a PID. It is hosted at the TIB as part of the homepage of the PIDINST working group. The tool was developed as a prototype for integrating emerging PID resource types into PID-based knowledge graphs. It brings together instrument records from DataCite and ePIC (via B2INST) within a single, unified search interface. As well as providing direct PID lookup (including prefix search), it enables users to perform free-text searches in key metadata fields defined by the PIDINST metadata schema. An integrated statistics section allows for further analysis of search result sets using interactive line graphs and frequency tables, with filtering options for examining subsets of the data. Development and maintenance take place in a version-controlled GitLab repository.

ePIC Prefix Registration

The prefix registration service provides NFDI consortia with Handle-based PID prefixes via the ePIC consortium, issued by GWDG as PID provider. Consortia can request test or production prefixes through an online form, enabling quick onboarding into PID workflows. A prefix defines a namespace under which a consortium or organization can assign its own PIDs. During the PID4NFDI project runtime, free test prefixes and a limited number of free production prefixes (each prefix allows registration of up to 50,000 PIDs per year) are available. For long-term or larger-scale use beyond these limits, dedicated contracts with GWDG ensure sustainable operation. The service is embedded in the broader PID4NFDI offering and is supported by guidance such as the PID4NFDI Cookbook.
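To illustrate what assigning PIDs under a prefix means in practice: a Handle consists of a prefix and a locally chosen suffix, separated by a slash, and can be resolved through the global Handle proxy at hdl.handle.net. The sketch below only builds identifier strings and resolver URLs. It does not register anything, which in a real workflow would go through the PID provider's API. The prefix `21.T00000` and the function names are hypothetical.

```python
import uuid

# Global Handle proxy used to resolve Handles in a browser or HTTP client.
HANDLE_PROXY = "https://hdl.handle.net/"


def build_handle(prefix, suffix=None):
    """Combine a prefix and a suffix into a Handle string.

    If no suffix is given, a random UUID is used so that suffixes do not
    collide. This only builds the string; it does not register the PID.
    """
    if not prefix or "/" in prefix:
        raise ValueError("a Handle prefix must be non-empty and contain no '/'")
    suffix = suffix or str(uuid.uuid4())
    return f"{prefix}/{suffix}"


def resolver_url(handle):
    """Return the URL under which a registered Handle would resolve."""
    return HANDLE_PROXY + handle


TEST_PREFIX = "21.T00000"  # hypothetical prefix, for illustration only
handle = build_handle(TEST_PREFIX, "sample-0001")
print(handle)
print(resolver_url(handle))
```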

B2INST

B2INST is a service offered by EUDAT and run by GWDG that supports the registration of research instruments with persistent identifiers (ePIC PIDs). The service is available free of charge to the NFDI community. It can be used by both institutions and individual researchers, either via a web user interface or an API. The service supports manual registration of single instruments as well as automated assignment of PIDs for large numbers of instruments through institutional workflows that integrate the B2INST API. B2INST implements the PIDINST metadata schema and allows communities to extend this schema with additional metadata fields relevant to their specific use cases. B2INST is built on InvenioRDM v13.0.

Data Type Registry

The Data Type Registry (DTR) is a service for defining, managing, and publishing machine-actionable "PID information types" that can be used in PID records. It allows communities to register types (such as standard metadata attributes or technical properties), assign them PIDs, and combine them into more complex type structures. The registered PID types can be used to generate JSON schemas via the associated Type API. These schemas can then be used to validate PID-related metadata in other systems. The service will soon be offered as part of the PID4NFDI Coordination Hub to support harmonised but flexible handling of PID information across NFDI services and domains. The DTR has been developed as a component of the European FAIRCORE4EOSC project and uses the Cordra software, which provides a Handle-based digital object repository and API framework.
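As an illustration of how a generated schema might be used downstream, the sketch below checks a PID record against a small JSON Schema. The schema, its property names, and the `validate` function are hypothetical. They are not output of the Type API. The function covers only the `required` and `type` keywords, so a production system would more likely rely on a full JSON Schema validator library.

```python
# Hypothetical schema of the kind a type registry could generate.
# Property names are invented for illustration.
SCHEMA = {
    "type": "object",
    "required": ["digitalObjectType", "dateCreated"],
    "properties": {
        "digitalObjectType": {"type": "string"},
        "dateCreated": {"type": "string"},
        "sizeInBytes": {"type": "integer"},
    },
}

# Mapping from JSON Schema type names to Python types.
JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validate(record, schema):
    """Return a list of error messages; an empty list means the record passed.

    Only the `required` and per-property `type` keywords are checked.
    """
    errors = []
    for key in schema.get("required", []):
        if key not in record:
            errors.append(f"missing required property: {key}")
    for key, rules in schema.get("properties", {}).items():
        if key not in record:
            continue
        value = record[key]
        expected = rules["type"]
        # bool is a subclass of int in Python, so reject it for numeric types.
        if isinstance(value, bool) and expected in ("integer", "number"):
            ok = False
        else:
            ok = isinstance(value, JSON_TYPES[expected])
        if not ok:
            errors.append(f"{key}: expected {expected}")
    return errors


good = {"digitalObjectType": "21.T00000/abc", "dateCreated": "2025-01-01", "sizeInBytes": 1024}
bad = {"digitalObjectType": "21.T00000/abc", "sizeInBytes": "1024"}
print(validate(good, SCHEMA))
print(validate(bad, SCHEMA))
```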

PID Meta Resolver

The PID Meta Resolver (PIDMR) is a technical service that provides a unified resolution API for multiple PID systems and providers. It accepts a PID as input, detects the corresponding PID system or provider, and exposes standardised resolution methods to retrieve the PID's landing page, associated metadata, or, where supported, the referenced digital object itself. The service offers both a web user interface and machine-accessible APIs, allowing integration into data processing pipelines and other research infrastructures. It has been developed as a component of the European FAIRCORE4EOSC project and will soon be integrated into the PID4NFDI Coordination Hub as a service offering for the NFDI community.
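The detection step can be sketched with pattern matching on the identifier syntax. The example below is a simplified illustration, not the PIDMR implementation. The regular expressions are rough heuristics, the function name `detect_pid_system` is hypothetical, and the sample identifiers are placeholders. DOIs are tested before generic Handles because every DOI is syntactically also a Handle.

```python
import re

# Ordered list of (system, pattern). Each pattern captures the bare PID
# without an optional scheme or resolver prefix. The patterns are heuristics
# for illustration and do not cover every valid identifier.
PID_PATTERNS = [
    ("doi", re.compile(r"^(?:doi:|https?://(?:dx\.)?doi\.org/)?(10\.\d{4,9}/\S+)$", re.I)),
    ("urn:nbn", re.compile(r"^(urn:nbn:[a-z]{2}[:\-]\S+)$", re.I)),
    ("ark", re.compile(r"^(ark:/?\d{5,}/\S+)$", re.I)),
    ("handle", re.compile(r"^(?:hdl:|https?://hdl\.handle\.net/)?(\d[\w.]*/\S+)$", re.I)),
]


def detect_pid_system(pid):
    """Return (system, bare_pid) for a recognized PID, or None otherwise."""
    pid = pid.strip()
    for system, pattern in PID_PATTERNS:
        match = pattern.match(pid)
        if match:
            return system, match.group(1)
    return None


for candidate in [
    "https://doi.org/10.1234/example",
    "21.T00000/sample-0001",
    "urn:nbn:de:0000-example",
    "ark:/12345/x0example",
    "not a pid",
]:
    print(candidate, "->", detect_pid_system(candidate))
```

Once the system is known, a resolver can dispatch to the matching provider to fetch the landing page, the metadata, or the object itself. That dispatch step depends on each provider's API and is left out here.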

CAT – Compliance Assessment Toolkit

The Compliance Assessment Toolkit (CAT) provides machine-actionable assessment of PID-related services against the EOSC PID policy. It formally encodes the policy requirements and allows for automated compliance testing via APIs, complemented by user interfaces for interactive evaluation and reporting. CAT was developed as part of the European FAIRCORE4EOSC project. The PID4NFDI project therefore does not operate this service but recommends it for compliance testing. An NFDI PID policy is planned for the next PID4NFDI project phase. It would then be integrated into CAT, enabling future assessments of NFDI services against both the EOSC and NFDI PID policies.

DataCite Schema 4.6 Namespace

DataCite Linked Data is a GitHub-based project that publishes a staged linked-data representation of parts of the DataCite Metadata Schema 4.6. Developed as a semantic layer around the existing schema, it provides resolvable JSON-LD resources for schema classes, properties, and controlled vocabularies, together with reusable JSON-LD contexts, a manifest of published resources, and bundled distribution files in JSON-LD, Turtle, and RDF/XML.

The project is designed to make the meaning of DataCite metadata easier to interpret, connect, and reuse across systems, supporting graph-based interoperability while remaining compatible with familiar JSON- and XML-oriented workflows.

In addition to the machine-readable resources themselves, the repository includes human-browsable index pages for the website, generation scripts, and beginner-friendly documentation, and is maintained as a version-controlled GitHub repository to support ongoing review and development.

Through an incubator project with TS4NFDI, DataCite's controlled vocabularies are made available via terminology services, enabling users to search and select schema terms directly within host systems.

For each selected term, resolvable URIs and essential semantic context (label, definition, and concept scheme) are provided to support machine-actionable metadata storage.

Official vocabularies, including relationType, resourceTypeGeneral, and other controlled lists, are also supplied as structured JSON/JSKOS concept schemes to facilitate integration into terminology service infrastructures, tools and downstream PID workflows.
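To make the shape of such a term concrete, the sketch below builds a JSKOS-style concept carrying the URI, label, definition, and concept scheme mentioned above. The namespace `https://example.org/datacite/4.6/` and the function `make_concept` are hypothetical, and the URIs actually published by the project may follow a different pattern.

```python
import json

# Hypothetical base namespace; the published URIs may differ.
NAMESPACE = "https://example.org/datacite/4.6/"


def make_concept(scheme_id, notation, label, definition, lang="en"):
    """Build a JSKOS concept as a plain dictionary.

    In JSKOS, prefLabel maps a language tag to one string, definition maps
    a language tag to a list of strings, and inScheme is a list of objects
    that reference the concept scheme by URI.
    """
    scheme_uri = f"{NAMESPACE}{scheme_id}"
    return {
        "uri": f"{scheme_uri}/{notation}",
        "type": ["http://www.w3.org/2004/02/skos/core#Concept"],
        "notation": [notation],
        "prefLabel": {lang: label},
        "definition": {lang: [definition]},
        "inScheme": [{"uri": scheme_uri}],
    }


concept = make_concept(
    "relationType",
    "IsCitedBy",
    "Is cited by",
    "Indicates that B includes A in a citation.",
)
print(json.dumps(concept, indent=2))
```

A host system that stores the `uri` alongside the label keeps the metadata machine-actionable even if the display label changes later.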

Crosswalks for Semantic Interoperability

The SKOS and JSKOS mappings establish machine-actionable crosswalks that align the DataCite Metadata Schema 4.6 with external vocabularies, supporting semantic interoperability across metadata systems and knowledge graph services.

Developed in collaboration with the NFDI Metadata Task Force, this work follows the recommendations of the NFDI Section Metadata, which identifies DataCite, DCAT, and schema.org as the three generic metadata schemas that should serve as primary alignment points across the NFDI landscape.

Using SKOS and JSKOS, the mappings connect DataCite classes, properties, and controlled vocabulary terms to corresponding concepts in DCAT, schema.org, DCTERMS, and related linked data resources. This enables structured reuse in metadata registries, crosswalk services, terminology services, and graph-based workflows.

As part of the PID4NFDI contribution, mapping artefacts are created and curated as versioned files in the JSKOS and SSSOM formats to support broader interoperability requirements.
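The relationship between the two formats can be sketched as follows: a JSKOS mapping links source and target concepts through `from` and `to` member sets and a SKOS mapping type, while SSSOM expresses the same statement as a table row. The example is a simplified illustration under several assumptions. The DataCite URI is a hypothetical placeholder, the chosen alignment and mapping type are examples rather than the project's curated result, and a complete SSSOM file would additionally carry a metadata header with a prefix map, which is omitted here.

```python
import csv
import io

SKOS = "http://www.w3.org/2004/02/skos/core#"

# A JSKOS mapping as a plain dictionary. The source URI is hypothetical.
mapping = {
    "from": {"memberSet": [{"uri": "https://example.org/datacite/4.6/title"}]},
    "to": {"memberSet": [{"uri": "http://purl.org/dc/terms/title"}]},
    "type": [SKOS + "closeMatch"],
}

SSSOM_COLUMNS = ["subject_id", "predicate_id", "object_id", "mapping_justification"]


def jskos_to_sssom_rows(jskos_mapping):
    """Convert one JSKOS mapping into SSSOM-style rows (one per concept pair)."""
    predicate = jskos_mapping["type"][0].replace(SKOS, "skos:")
    rows = []
    for source in jskos_mapping["from"]["memberSet"]:
        for target in jskos_mapping["to"]["memberSet"]:
            rows.append({
                "subject_id": source["uri"],
                "predicate_id": predicate,
                "object_id": target["uri"],
                # Assumes the mapping was curated by hand.
                "mapping_justification": "semapv:ManualMappingCuration",
            })
    return rows


def rows_to_tsv(rows):
    """Serialize rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=SSSOM_COLUMNS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = jskos_to_sssom_rows(mapping)
tsv = rows_to_tsv(rows)
print(tsv)
```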

Through the TS4NFDI incubator collaboration, these mappings are integrated into Cocoda, where they are stored in JSKOS and can be manually curated to maintain quality and consistency. Cocoda exposes the mappings via a REST API, enabling lookup by source concept and programmatic retrieval by external systems. A pilot integration is underway to load the DataCite alignment sets into Cocoda and test practical retrieval scenarios, including terminology service widgets and metadata transformation workflows.

Early Lifecycle Semantic Integration

This component supports research infrastructures at the earliest stages of the research lifecycle to enable high-quality, standards-aligned metadata capture at the point of creation. In collaboration with RSpace during the TS4NFDI incubator phase, PID4NFDI is introducing machine-actionable semantics directly into the RSpace user interface. The objective is to facilitate the selection of controlled terms, reduce ambiguity in metadata entry, and ensure that RSpace objects can connect reliably to downstream PID infrastructures such as DataCite, RAiD, and related services across the NFDI ecosystem.

The approach builds on components of the Terminology Service Suite (TSS), focusing during the incubator phase on a limited set of high-impact, technically feasible integrations. This includes evaluating whether embedded TSS widgets can return canonical URIs that align with DataCite-compatible workflows and support structured, machine-actionable metadata.

PID4NFDI supports this work in two ways: through consultations that identify targeted insertion points for controlled vocabularies within the RSpace interface, and by providing mapping artefacts in JSON, JSKOS, and SSSOM formats. These artefacts could support cross-schema exports (e.g., to schema.org or DCAT), should RSpace implement export functionality in the future.

October Hackathon: identifying core elements for a machine-actionable data management plan (DMP)

Zotero Library

The PID4NFDI Zotero Library is a publicly accessible, collaboratively maintained reference collection curating literature related to Persistent Identifiers and FAIR Research Data Management. It serves as a shared knowledge base for the PID4NFDI team and the broader community, collecting scholarly articles, reports, guidelines, and other relevant publications from across the PID ecosystem. The library is hosted as an open group library on Zotero and can be browsed directly via the Zotero web interface without requiring an account. It is maintained using Zotero's group library functionality, which enables collaborative editing and curation by multiple contributors.