Privacy and data
The SDK captures interactions, not values. The boundary is deliberately broad and enforced at the source of capture, not at the ingest endpoint.
What is masked by default
- Inputs are never read. Form fields (<input>, <textarea>, <select>) and any contenteditable region are sensitive; their values do not leave the browser.
- Form submits carry metadata only. Structure (form_id, form_name, action, method, field_names[], field_types[], field_count), never values.
- Every form-entry element is sensitive by tag, not by type. Every <input> (regardless of its type), <textarea>, and <select> is treated as sensitive, so no text or value is read from it. There is no type allowlist: a plain text or number field is masked exactly like a password field. As an extra layer, the $change interaction event skips password, file, and hidden inputs entirely.
- Click fingerprints on sensitive targets are redacted. Tag, role, and a fragile selector survive; text, aria_label, and title do not.
- Container text never leaks child input values. When fingerprinting a non-sensitive container, the visible-text walker skips any sensitive descendant, so a card's innerText cannot include a child input's value.
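As a rough illustration of the tag-based rule and the redacted click fingerprint described above, the sketch below uses plain objects as stand-ins for DOM elements. The names isSensitive and fingerprint, and the descriptor fields they read, are assumptions made for this example; they are not the SDK's internals.

```javascript
// Hypothetical sketch: element-like plain objects stand in for real DOM nodes.
const SENSITIVE_TAGS = new Set(["INPUT", "TEXTAREA", "SELECT"]);

// Sensitive by tag alone: the element's `type` is never consulted.
function isSensitive(el) {
  const tag = String(el.tagName).toUpperCase();
  return SENSITIVE_TAGS.has(tag) || el.isContentEditable === true;
}

// Build a click fingerprint. Structural fields always survive;
// label fields are only attached for non-sensitive targets.
function fingerprint(el) {
  const base = {
    tag: String(el.tagName).toLowerCase(),
    role: el.role ?? null,
    selector: el.selector ?? null,
  };
  if (isSensitive(el)) return base;
  return {
    ...base,
    text: el.text ?? null,
    aria_label: el.ariaLabel ?? null,
    title: el.title ?? null,
  };
}
```

Because the check keys on the tag, a descriptor such as { tagName: "INPUT", type: "text" } is redacted in the same way as a password field would be.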
data-revu-mask
Add the attribute to any element (or any ancestor) to mark its subtree sensitive. The SDK honors it everywhere a sensitive element would be honored:
- Click fingerprints inside the subtree redact text, aria-label, and title.
- Form submits inside the subtree skip field-name capture entirely.
- Container text extraction skips the subtree.
<!-- Mask a PII summary card -->
<aside data-revu-mask>
  <h3>Account balance</h3>
  <p>$1,234.56</p>
</aside>
<!-- Mask one field on a form (or the whole form) -->
<form data-revu-mask>
  <input name="ssn" type="text" />
</form>
The attribute also crosses Shadow DOM boundaries: a data-revu-mask on
a custom element's host applies to every element in its shadow tree.
data-revu-mask still emits the interaction, just with its labels
redacted. When you want no event at all for a region - not even a
redacted one - use the
autocaptureDenySelectors
config option instead; it suppresses the event entirely, including any
file-download / outbound-link / rage events derived from the click.
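The ancestor lookup behind data-revu-mask, including the Shadow DOM crossing, can be pictured with a small walk like the one below. This is a minimal sketch, not the SDK's actual code: isMasked is a hypothetical name, and the walk relies only on the standard parentNode and ShadowRoot.host properties.

```javascript
// Hypothetical helper: returns true if `el` or any ancestor carries
// data-revu-mask, continuing through shadow roots to their host element.
function isMasked(el) {
  let node = el;
  while (node) {
    // Shadow roots and documents have no hasAttribute, so guard the call.
    if (typeof node.hasAttribute === "function" && node.hasAttribute("data-revu-mask")) {
      return true;
    }
    // A shadow root has no parentNode; its `host` continues the walk
    // in the outer tree.
    node = node.parentNode ?? node.host ?? null;
  }
  return false;
}
```

A click handler could consult a check like this before attaching text, aria-label, or title to the fingerprint.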
URLs and query strings
Captured URLs (the $pageview url, the referrer, and $outbound_link /
$file_download targets) routinely carry secrets in their query string or
fragment: a password-reset token, an email address in ?email=, an
OAuth/OIDC implicit-flow #access_token=.... The SDK redacts the
values of sensitive parameters at the source - in both the query and
the fragment - replacing them with [redacted] before the event is built.
The redaction is by parameter name, not wholesale, because the server
derives campaign attribution (UTM, click ids) from the captured URL. So
utm_source, utm_medium, gclid, fbclid, and other attribution and
benign params are preserved, while token, password, secret, auth,
api_key, session, email, and similar credential / PII keys (matched
case-insensitively, including variants delimited by _ or -, such as
access_token and user_email) have their values stripped.
This is redaction at source, not a toggle: there is no option to capture
raw query values. The page identity (screen / path) is the pathname
plus hash and never includes the query string; the hash itself is run
through the same redaction, so a credential-bearing fragment (an OAuth
implicit-flow #access_token=... landing) never lands in screen or
path. Hash-router routes (#/pricing) and anchors (#section) are
preserved unchanged.
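To make the by-name redaction concrete, here is a minimal sketch of how it could work on a URL string. The key list is taken from the examples in the text and is almost certainly shorter than the SDK's real list; the function names (isSensitiveParam, redactPairs, redactUrl) and the segment-matching rule are assumptions for illustration only.

```javascript
// Assumed key list, copied from the examples above.
const SENSITIVE_KEYS = ["token", "password", "secret", "auth", "api_key", "session", "email"];

// A name is sensitive if it equals a key or contains one as a
// `_`- or `-`-delimited segment (case-insensitive).
function isSensitiveParam(name) {
  const normalized = name.toLowerCase().replace(/-/g, "_");
  return SENSITIVE_KEYS.some(
    (key) =>
      normalized === key ||
      normalized.startsWith(key + "_") ||
      normalized.endsWith("_" + key) ||
      normalized.includes("_" + key + "_")
  );
}

// Redact values in an `a=1&b=2` style string; pairs without "=" pass through.
function redactPairs(pairs) {
  return pairs
    .split("&")
    .map((pair) => {
      const eq = pair.indexOf("=");
      if (eq === -1) return pair; // e.g. "/pricing" or "section"
      const rawName = pair.slice(0, eq);
      let name = rawName;
      try {
        name = decodeURIComponent(rawName);
      } catch {
        // Malformed escape: fall back to the raw name.
      }
      return isSensitiveParam(name) ? `${rawName}=[redacted]` : pair;
    })
    .join("&");
}

// Apply the same redaction to the query string and the fragment.
function redactUrl(input) {
  const hashAt = input.indexOf("#");
  const beforeHash = hashAt === -1 ? input : input.slice(0, hashAt);
  const fragment = hashAt === -1 ? null : input.slice(hashAt + 1);
  const queryAt = beforeHash.indexOf("?");
  const base = queryAt === -1 ? beforeHash : beforeHash.slice(0, queryAt);
  const query = queryAt === -1 ? null : beforeHash.slice(queryAt + 1);
  let out = base;
  if (query !== null) out += "?" + redactPairs(query);
  if (fragment !== null) out += "#" + redactPairs(fragment);
  return out;
}
```

Under these assumptions, attribution parameters such as utm_source and gclid pass through untouched, and fragments like #/pricing or #section are returned unchanged because they contain no key=value pairs. The sketch does not handle a query embedded inside a hash route (for example #/reset?token=...).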
What the SDK does not parse client-side
By design, several categories of work live server-side:
- URL query parsing. Campaign attribution (UTM, click ids) is derived server-side from the $pageview URL on the first event of each session. The SDK does not ship a parser for these.
- User agent parsing. The SDK ships the raw navigator.userAgent string; the server parses it into os, browser, and device. UA strings drift; the server can iterate on the parser without a customer redeploy.
- IP-based geo. The SDK never reads or sends client geolocation. The server enriches based on the request's IP, which is also more durable than client APIs and never requires a permission prompt.
This is a hard boundary, not a temporary state. Anything that would require shipping a dictionary, an algorithm, or a model to the browser stays server-side. That is what keeps the bundle in single-digit kilobytes.
Consent
Capture is gated on a per-category consent state the host controls at runtime, so a cookie banner routes its choices through the SDK rather than wrapping every call in a check. The simplest form is the binary master switch:
revu.optOut(); // stop all capture (reject / withdraw consent)
revu.optIn(); // resume capture (accept)
revu.hasOptedOut(); // -> boolean
For per-category control, use the consent API. There are three
categories - analytics, marketing, and functional - each
"granted" or "denied":
revu.consent.set({ analytics: "granted", marketing: "denied" });
revu.consent.get();
// -> { analytics: "granted", marketing: "denied", functional: "granted" }
Only analytics gates capture: while it is denied, every interaction
(autocapture, pageviews, custom capture() calls, identity events) is
suppressed before an event is built, so nothing leaves the browser.
optOut() / optIn() are aliases for denying / granting it.
marketing and functional are declarative: the SDK does not act on
them, it stamps the full state on every event (context.consent) so
the server honors the visitor's choices on the destinations downstream.
The choice is persisted in the same first-party store as identity, so a reload honors it without re-prompting. A binary opt-out persisted by an earlier SDK version is read on the first load after upgrade, so a prior reject keeps being honored.
Changing consent does not clear identity: granting again resumes the
same visitor. That is the right default for a consent toggle (a user who
re-accepts is the same person). Call revu.reset() if you instead want
a clean break to a new anonymous visitor.
For per-element opt-out, use data-revu-mask on the subtree.
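Routing a cookie banner through the consent API might look like the sketch below. The banner payload shape (three booleans) and the applyBannerChoice name are hypothetical; only revu.consent.set and revu.consent.get come from this page. The SDK object is passed in as a parameter so the function can be exercised with a stub.

```javascript
// `choice` is an assumed banner payload:
// { analytics: boolean, marketing: boolean, functional: boolean }
function applyBannerChoice(revu, choice) {
  const toState = (allowed) => (allowed ? "granted" : "denied");
  revu.consent.set({
    analytics: toState(choice.analytics),
    marketing: toState(choice.marketing),
    functional: toState(choice.functional),
  });
  // Return the resulting state so the banner can reflect it in its UI.
  return revu.consent.get();
}
```

A banner's "save preferences" handler would call applyBannerChoice(revu, choice) once; because the choice is persisted, it does not need to be replayed on later page loads.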
Global Privacy Control
Some browsers advertise a Global Privacy Control signal
(navigator.globalPrivacyControl). The SDK always stamps it on events as
context.gpc so the server sees it. With honorGpc: true (off by
default), a GPC signal also defaults the analytics category to denied,
unless the visitor has already made an explicit choice through your
banner - an explicit choice always wins. The default is off because
whether GPC legally requires suppression depends on your jurisdiction
(it is a valid opt-out signal under CCPA/CPRA, but not the consent
mechanism under GDPR), so the decision is left to you.
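The precedence described above (explicit choice first, then GPC when honorGpc is on) can be written as a small pure function. This is a sketch of the decision order only: resolveAnalytics and its parameter names are hypothetical, and it assumes the analytics category otherwise defaults to "granted".

```javascript
// Hypothetical resolution of the analytics category.
// explicitChoice is "granted", "denied", or null when the visitor has not chosen.
function resolveAnalytics({ honorGpc = false, gpcSignal = false, explicitChoice = null } = {}) {
  // An explicit banner choice always wins over the GPC signal.
  if (explicitChoice === "granted" || explicitChoice === "denied") {
    return explicitChoice;
  }
  // With honorGpc enabled, a GPC signal defaults analytics to denied.
  if (honorGpc && gpcSignal) return "denied";
  // Assumed default when nothing else applies.
  return "granted";
}
```

In a browser, gpcSignal would be read from navigator.globalPrivacyControl, which is undefined in browsers that do not implement the signal.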
Dropping locally-buffered events
optOut() stops new capture but leaves events already queued under prior
consent to flush. To also discard any locally-buffered events and stored
ids for a user who withdraws consent, clear the durable queue and
identity stores:
revu.optOut();
// Keep `revu_consent` so the opt-out itself is honored on the next load.
const keys = [
  "revu_event_queue",
  "revu_anonymous_id",
  "revu_user_id",
  "revu_session_id",
  "revu_session_last_seen",
  "revu_attribution_first",
  "revu_attribution_last",
];
// Clear the localStorage entries; ignore errors if storage is unavailable.
try {
  for (const key of keys) localStorage.removeItem(key);
} catch {}
// Expire any cookie copies of the same keys.
for (const key of keys) {
  document.cookie = `${key}=; Path=/; Max-Age=0; SameSite=Lax`;
}
A server-side right-to-be-forgotten helper that also purges already-ingested events is planned; until then, the above fully disables capture and clears local state.