Ping HA subnet routers each probe cycle and mark unresponsive nodes
unhealthy. Reconnecting a node clears its unhealthy state since the
fresh Noise session proves basic connectivity.
Updates #2129
Updates #2902
Track unhealthy nodes in PrimaryRoutes so primary election skips them.
When all nodes for a prefix are unhealthy, keep the first as a degraded
primary rather than dropping the route entirely.
Anti-flap is built in: a recovered node becomes standby, not primary,
because updatePrimaryLocked keeps the current primary when still
available and healthy.
Updates #2129
Updates #2902
- Remove redundant inline button/input styles that duplicate CSS
- Use CSS variables for input (dark mode support)
- Use A(), Ul(), Ol(), P() wrappers from general.go
- Add expandable explanation of what the ping tests
- Fix section spacing rhythm (spaceXL before results, space2XL
before connected nodes)
- Add flex-wrap for mobile responsiveness
Add a reusable <details>/<summary> component to the shared design
system. Styled to match the existing card/box component family
(border, radius, CSS variables for dark mode).
Collapsed by default with a clickable summary line.
Add a lightweight dev tool that starts a headscale server on localhost
with a pre-created user and pre-auth key, ready for connecting real
tailscale nodes via mts.
The tool builds the headscale binary, writes a minimal dev config
(SQLite, public DERP, debug logging), starts the server as a
subprocess, and prints a banner with the server URL, auth key, and
mts usage instructions.
Usage: go run ./cmd/dev
make dev-server
Unit tests for Change (IsEmpty, Merge, Type, PingNode constructor),
ping tracker (register/complete/cancel lifecycle, concurrency, latency),
and end-to-end servertests exercising the full round-trip with real
controlclient.Direct instances.
Updates #2902
Updates #2129
Implement tailcfg.PingRequest support so the control server can verify
whether a connected node is still reachable. This is the foundation for
faster offline detection (currently ~16min due to Go HTTP/2 TCP retransmit
behavior) and future C2N communication.
The server sends a PingRequest via MapResponse with a unique callback
URL. The Tailscale client responds with a HEAD request to that URL,
proving connectivity. Round-trip latency is measured.
Wire PingRequest through the Change → Batcher → MapResponse pipeline,
add a ping tracker on State for correlating requests with responses,
add ResolveNode for looking up nodes by ID/IP/hostname, and expose a
/debug/ping page (elem-go form UI) and /machine/ping-response endpoint.
Updates #2902
Updates #2129
The directory /usr/share/doc/headscale/examples may be used to install
arbitrary example files. This is useful to get a matching configuration
for the release which gets also overwritten automatically.
Replace the bullet list of device details with a two-column table
for cleaner visual hierarchy. Labels are bold and left-aligned,
values right-aligned with subtle row separators. The machine key
value uses an inline code style.
Updates juanfont/headscale#3182
Tighten the SVG viewBox to the actual content bounding box and
remove hardcoded width/height attributes so the browser no longer
adds horizontal padding via preserveAspectRatio. The "h" wordmark
now left-aligns with the page content below it.
Replace the error icon SVG path (which had an off-center X) with
a simple circle + two crossed lines drawn from a centered viewBox.
Both icons now use fill="currentColor" for dark mode adaptation.
Updates juanfont/headscale#3182
Replace hardcoded Go color constants with var(--hs-*) and
var(--md-*) CSS custom properties in externalLink, orDivider,
card, warningBox, downloadButton, and pageFooter. This ensures
all components follow the dark mode theme automatically.
Also switch pageFooter from div to semantic footer element and
simplify externalLink by letting CSS handle link styling.
Updates juanfont/headscale#3182
Bump base font size from 0.8rem to 1rem (16px) to meet mobile
accessibility guidelines and avoid iOS auto-zoom on inputs.
Add CSS custom properties for all theme colors with a
prefers-color-scheme: dark media query so pages adapt to OS dark
mode. Component inline styles reference var(--hs-*) tokens so they
follow the scheme automatically.
Accessibility improvements:
- role="status" + aria-live="polite" on success boxes
- role="alert" + aria-live="assertive" on error boxes
- role="note" on warning boxes
- Visible focus rings via :focus-visible
- Link underlines (don't rely on color alone)
- SVG icons use currentColor for theme adaptation
- prefers-reduced-motion media query
- <main> landmark element wrapping page content
- Button styling with 44px min-height touch target
- List item spacing
Updates juanfont/headscale#3182
Add httpUserError() alongside httpError() for browser-facing error
paths. It renders a styled HTML page using the AuthError template
instead of returning plain text. Technical error details stay in
server logs; the HTML page shows actionable messages derived from
the HTTP status code:
401/403 → "You are not authorized. Please contact your administrator."
410 → "Your session has expired. Please try again."
400-499 → "The request could not be processed. Please try again."
500+ → "Something went wrong. Please try again later."
Convert all httpError calls in oidc.go (OIDC callback, SSH check,
registration confirm) to httpUserError. Machine-facing endpoints
(noise, verify, key, health, debug) are unchanged.
Fixesjuanfont/headscale#3182
Add errorBox() and errorIcon() to the design system, mirroring the
existing successBox()/checkboxIcon() pattern with red error styling.
Extract error color constants from the inline values in statusMessage().
Add AuthError() template that renders a styled HTML error page using
the same HtmlStructure/mdTypesetBody/logo/footer as all other
browser-facing pages.
Updates juanfont/headscale#3182
Those were required to streamline new installs with updates before 0.27.
Since 0.29 will not allow direct upgrades from <0.27 to 0.29 we might as
well remove it.
The integration test generator scanned all files under integration/
with ripgrep, matching func Test* patterns in README.md code examples
(TestMyScenario, TestRouteAdvertisementBasic). Add --type go to limit
the search to Go source files.
Render an interstitial showing device hostname, OS, and machine-key
fingerprint before finalising OIDC registration. The user must POST
to /register/confirm/{auth_id} with a CSRF double-submit cookie.
Removes the TODO at oidc.go:201.
Build the statsviz Server directly and wrap its Index/Ws handlers in
tsweb.Protected instead of calling statsviz.Register on the raw mux
which bypasses AllowDebugAccess.
Add json:"-" to PostgresConfig.Pass, OIDCConfig.ClientSecret, and
CLIConfig.APIKey so they are excluded from json.Marshal output
(e.g. the /debug/config endpoint).
Add a row-level check so concurrent registrations with the same
single-use key cannot both succeed. Skip the call on
re-registration where the key is already marked used (#2830).
Reflection is a streaming RPC and bypasses the unary auth
interceptor on the remote (TCP) gRPC server. Remove it there;
the unix-socket server retains it for local debugging.
getCookieName sliced value[:6] unconditionally; a short state query
parameter caused a panic recovered by chi middleware. Reject states
shorter than cookieNamePrefixLen with 400.
The #2862 restart path returned nodeToRegisterResponse after a
NodeKey-only lookup without verifying MachineKey. Add the same
check handleLogout already performs.
SSHActionHandler now verifies that the Noise session's machine key
matches the dst node before proceeding. The (src, dst) pair is
captured at hold-and-delegate time via a new SSHCheckBinding on
AuthRequest so sshActionFollowUp can verify the follow-up URL
matches. The OIDC non-registration callback requires the
authenticated user to own the src node before approving.
Replace zcache with golang-lru/v2/expirable for both the state auth
cache and the OIDC state cache. Add tuning.register_cache_max_entries
(default 1024) to cap the number of pending registration entries.
Introduce types.RegistrationData to replace caching a full *Node;
only the fields the registration callback path reads are retained.
Remove the dead HSDatabase.regCache field. Drop zgo.at/zcache/v2
from go.mod.
When send() is called on a node with zero active connections
(disconnected but kept for rapid reconnection), it returns nil
(success). handleNodeChange then calls updateSentPeers, recording
peers as delivered when no client received the data.
This corrupts lastSentPeers: future computePeerDiff calculations
produce wrong results because they compare against phantom state.
After reconnection, the node's initial map resets lastSentPeers,
but any changes processed during the disconnect window leave
stale entries that cause asymmetric peer visibility.
Return errNoActiveConnections from send() when there are no
connections. handleNodeChange treats this as a no-op (the change
was generated but not deliverable) and skips updateSentPeers,
keeping lastSentPeers consistent with what clients actually
received.
When a FullUpdate produces zero visible peers (e.g., a restrictive
policy isolates a node), the MapResponse has Peers: [] (empty
non-nil). The Tailscale client only processes Peers as a full
replacement when len(Peers) > 0 (controlclient/map.go:462), so an
empty list is silently ignored and stale peers persist.
This triggers when a FullUpdate() replaces a pending PolicyChange()
in the batcher. The PolicyChange would have used computePeerDiff to
send explicit PeersRemoved, but the FullUpdate goes through
buildFromChange which sets Peers: [] that the client ignores.
When a full update produces zero peers, compute the peer diff
against lastSentPeers and add explicit PeersRemoved entries so the
client correctly clears its stale peer state.
Merging two changes targeted at different nodes is not supported
because the result can only carry one TargetNode. The second
target's content would be silently misrouted.
Add a panic guard that catches this at the Merge call site rather
than allowing silent data loss. In production, Merge is only called
with broadcast changes (TargetNode=0) so the guard acts as
insurance against future misuse.
url.JoinPath resolves path-traversal segments like '..' and '.',
which silently drops the OIDC subject from the identifier. For
example, Iss='https://example.com' with Sub='..' produces
'https://example.com' — the subject is lost entirely. This causes
distinct OIDC users to receive colliding identifiers.
Replace url.JoinPath with simple string concatenation using a slash
separator. This preserves the subject literally regardless of its
content. url.PathEscape does not help because dots are valid URL
path characters and are not escaped.
Owner() on a non-tagged node with nil User returns an invalid
UserView that panics when Name() is called. Add a guard to return
an empty UserView{} when the user is not valid.
TailscaleUserID() calls UserID().Get() without checking Valid()
first, which panics on orphaned nodes (no tags, no UserID). Add a
validity check to return 0 for this invalid state.
Callers should check Owner().Valid() before accessing fields.
The migration at db.go:680 appends validated tags to existing tags
using append(existingTags, validatedTags...) where existingTags
aliases node.Tags. When node.Tags has spare capacity, append writes
into the shared backing array, and the subsequent slices.Sort
corrupts the original.
Clone existingTags before appending to prevent aliasing.
routesChanged aliases newHI.RoutableIPs into a local variable then
sorts it in place, which mutates the caller's Hostinfo data. The
Hostinfo is subsequently stored on the node, so the mutation
propagates but the input contract is violated.
Clone the slice before sorting to avoid mutating the input.
Merge copies the receiver by value, but the slice headers share the
backing array with the original. When append has spare capacity, it
writes through to the original's memory, and uniqueNodeIDs then
sorts that shared data in place.
Replace append with slices.Concat which always allocates a fresh
backing array, preventing mutation of the receiver's slices.
Procedural content moves to cmd/hi/README.md and integration/README.md.
Stale references (poll.go:420, mapper/tail.go, notifier/,
quality-control-enforcer, validateAndNormalizeTags) are corrected or
removed.
The nix dev shell refresh in 758fef9b pulled in protoc-gen-go-grpc
v1.6.1 and newer tailscale.com/cmd/{viewer,cloner}, so rerunning
`make generate` updates the version header comments in the three
affected generated files. No semantic changes.
Refresh flake.lock (nixpkgs 2026-03-08 -> 2026-04-09) and bump the
tool pins that live directly in flake.nix:
* golangci-lint 2.9.0 -> 2.11.4
* protoc-gen-grpc-gateway 2.27.7 -> 2.28.0 (keeps the dev-shell
code-gen tool in sync with the grpc-gateway Go module)
* protobuf-language-server pinned commit bumped to ab4c128
Also replace nodePackages.prettier with the top-level prettier
attribute. nodePackages was removed from nixpkgs in the update and
the dev shell would otherwise fail to evaluate with:
error: nodePackages has been removed because it was unmaintainable
within nixpkgs
`nix flake check --all-systems` and `nix build .#headscale` both
pass, and `golangci-lint 2.11.4` reports no new issues on the tree.
Routine bump of direct Go dependencies. Notable updates:
* tailscale.com v1.94.1 -> v1.96.5 (gvisor bumped in lockstep to
match upstream tailscale go.mod)
* modernc.org/sqlite v1.44.3 -> v1.48.2, modernc.org/libc v1.67.6
-> v1.70.0 (updated together as required by the fragile libc
dependency noted in #2188)
* google.golang.org/grpc v1.78.0 -> v1.80.0
* grpc-ecosystem/grpc-gateway/v2 v2.27.7 -> v2.28.0
* tailscale/hujson, tailscale/squibble, tailscale/tailsql
* golang.org/x/{crypto,net,sync,oauth2,exp,sys,text,time,term,mod,tools}
* rs/zerolog, samber/lo, sasha-s/go-deadlock, coreos/go-oidc/v3,
creachadair/command, go-json-experiment/json, pterm/pterm
Update the nix vendorHash to match the new go.sum. Regenerating capver
against tailscale v1.96.5 produces no diff: v1.96.0 was already
captured in 442fcdbd and the capability version has not changed in
the patch series.
All unit tests and `golangci-lint run --new-from-rev=main` are clean.
Tailscale main now requires go >= 1.26.2, so building the HEAD derper
image against golang:1.26.1-alpine fails with:
go: go.mod requires go >= 1.26.2 (running go 1.26.1; GOTOOLCHAIN=local)
Bump Dockerfile.derper to match the earlier fix for Dockerfile.tailscale-HEAD
in 6390fcee so TestDERPVerifyEndpoint can build the derper container
again. This test is the only consumer of Dockerfile.derper, which is why
the failure was scoped to that single integration job.
DestroyUser called ListPreAuthKeys(tx) which returns ALL pre-auth keys
across all users, then deleted every one of them. This caused deleting
any single user to wipe out pre-auth keys for every other user.
Extract a ListPreAuthKeysByUser function (consistent with the existing
ListNodesByUser pattern) and use it in DestroyUser to scope key deletion
to the user being destroyed.
Add unit test (table-driven in TestDestroyUserErrors) and integration
test to prevent regression.
Fixes#3154
Co-authored-by: Kristoffer Dalby <kristoffer@dalby.cc>
When UpdateNodeFromMapRequest and SetNodeTags race on persistNodeToDB,
the first caller to run updatePolicyManagerNodes detects the tag change
and returns a PolicyChange. The second caller finds no change and falls
back to NodeAdded.
If UpdateNodeFromMapRequest wins the race, it checked
policyChange.IsFull() which is always false for PolicyChange (only sets
IncludePolicy and RequiresRuntimePeerComputation). This caused the
PolicyChange to be dropped, so affected clients never received
PeersRemoved and the stale peer remained in their NetMap indefinitely.
Fix: check !policyChange.IsEmpty() instead, which correctly detects
any non-trivial policy change including PolicyChange().
This fixes the root cause of TestACLTagPropagation/multiple-tags-partial-
removal flaking at ~20% on CI.
Updates #3125
GORM's struct-based Updates() silently skips nil pointer fields.
When SetNodeTags sets node.UserID = nil to transfer ownership to tags,
the in-memory NodeStore is correct but the database retains the old
user_id value. This causes tagged nodes to remain associated with the
original user in the database, preventing user deletion and risking
ON DELETE CASCADE destroying tagged nodes.
Add Select("*") before Omit() on all three node persistence paths
to force GORM to include all fields in the UPDATE statement, including
nil pointers. This is the same pattern already used in db/ip.go for
IPv4/IPv6 nil handling, and is documented GORM behavior:
db.Select("*").Omit("excluded").Updates(struct)
The three affected paths are:
- persistNodeToDB: used by SetNodeTags and MapRequest updates
- applyAuthNodeUpdate: used by re-authentication with --advertise-tags
- HandleNodeFromPreAuthKey: used by PAK re-registration
Fixes#3161
Add four tests that verify the tags-as-identity ownership transition
correctly persists to the database when converting a user-owned node
to a tagged node via SetTags:
- TestSetTags_ClearsUserIDInDatabase: verifies user_id is NULL in DB
- TestSetTags_NodeDisappearsFromUserListing: verifies ListNodes by user
- TestSetTags_NodeStoreAndDBConsistency: verifies in-memory and DB agree
- TestSetTags_UserDeletionDoesNotCascadeToTaggedNode: verifies user
deletion does not cascade-delete tagged nodes
Three of these tests currently fail because GORM's struct-based
Updates() silently skips nil pointer fields, so user_id is never
written as NULL to the database after SetNodeTags clears it in memory.
Updates #3161