Getting started with scholidonline

scholidonline provides online utilities for working with scholarly identifiers. It builds on scholid for structural detection and normalization, and adds registry-backed functionality such as existence checks, conversion between identifier types, and metadata retrieval.

This vignette introduces the interface and typical workflows when working with registry-connected identifier data.

Installation

install.packages("scholidonline")

Interface

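scholidonline exposes a small set of user-facing functions, including id_exists(), id_convert(), id_metadata(), scholidonline_types(), and scholidonline_capabilities().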

Supported identifier types

You can inspect which identifier types are supported:

scholidonline::scholidonline_types()
#> [1] "arxiv" "doi"   "orcid" "pmcid" "pmid"

Inspecting capabilities

scholidonline is registry-driven. You can inspect all supported operations, conversions, and providers:

out <- scholidonline::scholidonline_capabilities()
knitr::kable(out)
|type  |operation |target |providers               |default_provider |
|:-----|:---------|:------|:-----------------------|:----------------|
|arxiv |exists    |NA     |auto, arxiv             |arxiv            |
|arxiv |links     |NA     |auto, arxiv             |arxiv            |
|arxiv |meta      |NA     |auto, arxiv             |arxiv            |
|doi   |exists    |NA     |auto, doi.org, crossref |doi.org          |
|doi   |links     |NA     |auto, crossref          |crossref         |
|doi   |meta      |NA     |auto, crossref, doi.org |crossref         |
|doi   |convert   |pmid   |auto, ncbi, epmc        |ncbi             |
|doi   |convert   |pmcid  |auto, ncbi, epmc        |ncbi             |
|orcid |exists    |NA     |auto, orcid             |orcid            |
|orcid |links     |NA     |auto, orcid             |orcid            |
|orcid |meta      |NA     |auto, orcid             |orcid            |
|pmcid |exists    |NA     |auto, ncbi, epmc        |ncbi             |
|pmcid |links     |NA     |auto, ncbi, epmc        |ncbi             |
|pmcid |meta      |NA     |auto, ncbi, epmc        |ncbi             |
|pmcid |convert   |pmid   |auto, ncbi, epmc        |ncbi             |
|pmcid |convert   |doi    |auto, ncbi, epmc        |ncbi             |
|pmid  |exists    |NA     |auto, ncbi, epmc        |ncbi             |
|pmid  |links     |NA     |auto, ncbi, epmc        |ncbi             |
|pmid  |meta      |NA     |auto, ncbi, epmc        |ncbi             |
|pmid  |convert   |doi    |auto, ncbi, epmc        |ncbi             |
|pmid  |convert   |pmcid  |auto, ncbi, epmc        |ncbi             |
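
Since the capabilities output is passed to knitr::kable(), it is presumably an ordinary data frame and can be filtered with base R. The sketch below rebuilds a few rows of the table by hand so that it runs without the package or network access; with scholidonline installed, caps would come from scholidonline::scholidonline_capabilities() instead. It assumes that providers is stored as a comma-separated string, as printed, which may not match the real column structure.

```r
# Hand-copied excerpt of the capabilities table shown above.
# Assumption: the real object is a data frame with these column names and
# `providers` held as a comma-separated character string.
caps <- data.frame(
  type = c("doi", "doi", "pmid", "pmid"),
  operation = c("exists", "convert", "convert", "meta"),
  target = c(NA, "pmid", "doi", NA),
  providers = c(
    "auto, doi.org, crossref", "auto, ncbi, epmc",
    "auto, ncbi, epmc", "auto, ncbi, epmc"
  ),
  default_provider = c("doi.org", "ncbi", "ncbi", "ncbi"),
  stringsAsFactors = FALSE
)

# Which conversion targets are listed for DOIs (in this excerpt)?
doi_conversions <- caps[caps$type == "doi" & caps$operation == "convert", ]
doi_conversions$target

# Which providers are listed for DOI existence checks?
exists_row <- caps[caps$type == "doi" & caps$operation == "exists", ]
doi_exists_providers <- strsplit(exists_row$providers, ",\\s*")[[1]]
doi_exists_providers
```

The same filtering pattern can be used to confirm that a type, operation, and provider combination is supported before calling the corresponding function.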

Existence checks: id_exists()

id_exists() verifies whether identifiers exist in their respective registries.

scholidonline::id_exists(
  x    = "10.1000/182",
  type = "doi"
)

If type = NULL, the type is inferred automatically:

scholidonline::id_exists(
  x = c(
    "10.1000/182",
    "12345678"
  )
)

Return values:

Conversion: id_convert()

Many scholarly identifiers are cross-linked across systems.

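The conversions listed in the capabilities table are DOI to PMID or PMCID, PMID to DOI or PMCID, and PMCID to PMID or DOI. For example: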

scholidonline::id_convert(
  x    = "12345678",
  from = "pmid",
  to   = "doi"
)

If from = NULL, the source type is inferred per element:

scholidonline::id_convert(
  x = c("12345678", "PMC1234567"),
  to = "doi"
)

Unresolvable mappings return NA_character_.
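
Because failed mappings come back as NA_character_, converted values can be stored in a column and unresolved rows flagged with is.na(). The following is a minimal sketch, not part of scholidonline: add_doi_column() and fake_converter() are hypothetical names, the DOI is made up, and the sketch assumes that id_convert() returns a character vector of the same length as its input. The stand-in converter lets the example run offline.

```r
# Hypothetical helper (not part of scholidonline): add a DOI column to a data
# frame of PMIDs and flag rows whose mapping could not be resolved.
# `converter` defaults to scholidonline::id_convert and is assumed to return
# a character vector as long as `x`, with NA for unresolved entries.
add_doi_column <- function(df, converter = scholidonline::id_convert) {
  df$doi <- converter(x = df$pmid, from = "pmid", to = "doi")
  df$resolved <- !is.na(df$doi)
  df
}

# Offline stand-in for the registry call, for demonstration only.
# The identifiers and the DOI below are invented.
fake_converter <- function(x, from, to) {
  lookup <- c("111" = "10.1234/example")
  unname(lookup[x])
}

refs <- data.frame(pmid = c("111", "222"), stringsAsFactors = FALSE)
out <- add_doi_column(refs, converter = fake_converter)
out
```

Dropping the converter argument would route the call to scholidonline::id_convert(), which requires the package and network access.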

Metadata retrieval: id_metadata()

id_metadata() retrieves harmonized metadata from external registries.

out <- scholidonline::id_metadata(
  x    = "10.1038/nature12373",
  type = "doi"
)
knitr::kable(out)

Metadata completeness depends on the registry.

You can restrict returned fields:

out <- scholidonline::id_metadata(
  x = "10.1038/nature12373",
  type = "doi",
  fields = c("title", "year", "doi")
)
knitr::kable(out)

Working with mixed data

A common workflow for messy identifier columns:

  1. Detect identifier types (via scholid)
  2. Normalize identifiers
  3. Check registry existence

Example:

x <- c(
  "https://doi.org/10.1000/182",
  "PMCID: PMC1234567",
  "not an id"
)

types <- scholid::detect_scholid_type(x)

x_norm <- rep(NA_character_, length(x))

for (i in seq_along(x)) {
  if (is.na(types[i])) {
    next
  }

  x_norm[i] <- scholid::normalize_scholid(
    x = x[i],
    type = types[i]
  )
}

types
x_norm

scholidonline::id_exists(x)

Provider selection

Most functions accept a provider argument.

scholidonline::id_exists(
  x        = "10.1000/182",
  type     = "doi",
  provider = "crossref"
)

scholidonline::id_exists(
  x        = "10.1000/182",
  type     = "doi",
  provider = "doi.org"
)

If provider = "auto" (default), a sensible registry is chosen automatically, potentially with fallback behavior.

Available providers depend on the identifier type and operation. Use scholidonline_capabilities() to inspect them.
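
When explicit control over the provider order is wanted, a manual fallback loop is one option. The sketch below is an illustration under assumptions, not the package's own fallback mechanism: exists_with_fallback() and fake_check() are hypothetical names, and the sketch assumes that a failing provider signals an R error, which may not be how scholidonline reports failures. A stand-in checker is used so that the example runs offline.

```r
# Hypothetical helper: try providers in the given order and return the first
# result that does not raise an error.
# `check` defaults to scholidonline::id_exists.
exists_with_fallback <- function(x, type, providers,
                                 check = scholidonline::id_exists) {
  for (p in providers) {
    res <- tryCatch(
      check(x = x, type = type, provider = p),
      error = function(e) NULL
    )
    if (!is.null(res)) {
      return(list(provider = p, result = res))
    }
  }
  # No provider answered: report NA for every input.
  list(provider = NA_character_, result = rep(NA, length(x)))
}

# Offline stand-in, for demonstration only:
# "crossref" fails, any other provider answers TRUE.
fake_check <- function(x, type, provider) {
  if (provider == "crossref") stop("service unavailable")
  rep(TRUE, length(x))
}

res <- exists_with_fallback(
  "10.1000/182", "doi",
  providers = c("crossref", "doi.org"),
  check = fake_check
)
res$provider
```

In most cases provider = "auto" should make such a loop unnecessary; the sketch is mainly useful when the order of providers matters for reproducibility.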

The chosen provider affects:

Scope of scholidonline

scholidonline focuses on identifiers that have stable, open registry APIs, such as arXiv, DOI, ORCID, PMCID, and PMID.

Other identifiers (e.g., ISBN, ISSN) are structurally supported by scholid, but do not always have stable, open registry APIs.