A focused, spec-complete implementation of the Public Suffix List (PSL) for R. pslr bundles a reproducible, pinned PSL snapshot and implements the official prevailing-rule algorithm to answer public-suffix (eTLD) and registrable-domain (eTLD+1) queries.
- Distinguishes the ICANN and PRIVATE rule sections.
- Accepts Unicode, ASCII, and A-label hostnames via punycoder canonicalization; returns ASCII or Unicode output.
- Works fully offline from the bundled snapshot; an explicit, validated psl_refresh() is the only network path.
- Matcher compiled with cpp11; no external system library required.
Installation
Install from GitHub:
# install.packages("pak")
pak::pak("bart-turczynski/pslr")
pslr depends on punycoder, which is installed automatically from CRAN.
Usage
library(pslr)
public_suffix("www.example.co.uk") #> "co.uk"
registrable_domain("www.example.co.uk") #> "example.co.uk"
# ICANN vs PRIVATE sections
public_suffix("user.github.io") #> "github.io"
public_suffix("user.github.io", section = "icann") #> "io"
# Explicit membership vs the implicit default rule
is_public_suffix("madeuptld") #> TRUE (implicit "*")
is_public_suffix("madeuptld", unknown = "na") #> NA (explicit only)
# Split a host, or inspect the prevailing rule
suffix_extract("blog.user.github.io")
public_suffix_rule("a.b.kobe.jp")
See vignette("introduction", package = "pslr") for the full tour: section choice, the unknown-suffix policy, IDN output, terminal dots, refresh and activation, reproducibility, and security notes.
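For readers new to the PSL, the prevailing-rule selection behind these calls can be sketched in a few lines of base R. This is an illustration only, not pslr's implementation (the package's matcher is compiled with cpp11). The rule set is a tiny hand-written sample, and rules, split_labels(), rule_matches(), sketch_public_suffix(), and sketch_registrable_domain() are hypothetical names introduced for this sketch. It ignores the ICANN/PRIVATE split, IDN canonicalization, and terminal dots.

```r
# Tiny hand-written sample of rules (assumption: the real list is far larger).
rules <- c("com", "uk", "co.uk", "jp", "*.kobe.jp", "!city.kobe.jp")

# Split a host or rule into lower-case labels.
split_labels <- function(x) strsplit(tolower(x), ".", fixed = TRUE)[[1]]

# TRUE when the rule's labels match the rightmost labels of the host;
# a "*" label matches any single host label.
rule_matches <- function(rule_labels, host_labels) {
  n <- length(rule_labels)
  if (n > length(host_labels)) return(FALSE)
  tail_labels <- utils::tail(host_labels, n)
  all(rule_labels == "*" | rule_labels == tail_labels)
}

sketch_public_suffix <- function(host, rules) {
  host_labels <- split_labels(host)
  is_exception <- startsWith(rules, "!")
  rule_labels <- lapply(sub("^!", "", rules), split_labels)
  hit <- vapply(rule_labels, rule_matches, logical(1),
                host_labels = host_labels)

  if (!any(hit)) {
    n <- 1L  # no rule matched: fall back to the implicit "*" rule
  } else if (any(hit & is_exception)) {
    # A matching exception rule prevails; its leftmost label is dropped.
    n <- length(rule_labels[[which(hit & is_exception)[1]]]) - 1L
  } else {
    n <- max(lengths(rule_labels[hit]))  # otherwise the longest match wins
  }
  paste(utils::tail(host_labels, n), collapse = ".")
}

# Registrable domain = public suffix plus one more label, if the host has one.
sketch_registrable_domain <- function(host, rules) {
  host_labels <- split_labels(host)
  n <- length(split_labels(sketch_public_suffix(host, rules))) + 1L
  if (n > length(host_labels)) return(NA_character_)
  paste(utils::tail(host_labels, n), collapse = ".")
}

sketch_public_suffix("www.example.co.uk", rules)       #> "co.uk"
sketch_registrable_domain("www.example.co.uk", rules)  #> "example.co.uk"
sketch_public_suffix("a.b.kobe.jp", rules)             #> "b.kobe.jp"
sketch_public_suffix("www.city.kobe.jp", rules)        #> "kobe.jp"
sketch_public_suffix("madeuptld", rules)               #> "madeuptld"
```

The last call shows why an unlisted TLD still counts as a public suffix under the implicit default rule: with no matching rule, the rightmost label alone is returned.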
Reproducibility
A result depends on both which list answered and how hosts were normalized. psl_version() reports the active-list provenance plus the runtime normalization identifiers; record it alongside reproducibility-sensitive output.
Development
Install dependencies plus the dev tooling used by the checks:
Run the same verification CI runs (lint + R CMD check --as-cran):
Rscript -e 'lints <- lintr::lint_package(); if (length(lints)) { print(lints); quit(status = 1) }' && Rscript -e 'rcmdcheck::rcmdcheck(args = "--as-cran", error_on = "warning")'
R CMD check runs the testthat and cucumber specs, so the behaviour specs are verified as part of the check. A non-CRAN performance benchmark and its release gate live in bench/benchmark.R; recorded reference results are in docs/benchmarks.md.
Project layout
- R/: package source (edit roxygen comments here, not man/ or NAMESPACE).
- src/: the cpp11 matcher core.
- man/: generated help pages (devtools::document()).
- tests/testthat/: testthat tests and cucumber feature specs.
- vignettes/: long-form documentation.
- data-raw/: the deterministic snapshot regeneration pipeline.
- docs/: durable project context (PRD.md, architecture.md, benchmarks.md).
Related packages
pslr is part of a small ecosystem of R packages by the same author:
- punycoder: the Punycode and IDNA codec that pslr uses for host canonicalization before PSL matching. Use it directly for raw Unicode ↔ ACE round-trips.
- rurl: full URL parsing, normalization, cleaning, and joining toolkit. Uses pslr as its PSL engine; reach for it when you need more than domain extraction.