Documentation
Fix Agent 404 And UUID Change
修复 Agent 404 错误和用户 UUID 变更
修复 Agent 404 错误和用户 UUID 变更
日期: 2026-02-05
负责人: SRE Team
审核人: DevOps Lead
最后更新: 2026-02-05T15:28:00+08:00
问题描述
1. Agent 通信 404 错误
- 现象: Agent 服务在向
accounts-svc-plus报告状态时收到 404 错误 - 影响范围: 所有 agent 节点无法正常上报心跳和配置同步
- 错误日志:
Feb 05 07:24:23 hk-xhttp.svc.plus agent-svc-plus[107285]: {"time":"2026-02-05T07:24:23.907002669Z","level":"ERROR","msg":"xray config sync failed", "component":"agent-xray-sync","target":"tcp", "err":"list clients: controller returned 404 Not Found: 404 page not found"} POST 404 https://accounts-svc-plus-266500572462.asia-northeast1.run.app/api/agent-server/v1/status GET 404 https://accounts-svc-plus-266500572462.asia-northeast1.run.app/api/agent-server/v1/users
2. 用户 UUID 变更需求
- 用户: tester123@example.com
- 原 UUID:
4b66928e-a81e-4981-bae0-289ddb92439c - 新 UUID:
18d270a9-533d-4b13-b3f1-e7f55540a9b2 - 原因: 业务需求,需要将用户 ID 更改为指定值
3. Agent 节点数据显示问题
- 现象:
/panel/agent页面显示 "Loading control center..." - 影响: 用户无法查看运行节点状态
根本原因分析
Agent 404 错误的根本原因
代码已正确实现:
accounts.svc.plus/cmd/accountsvc/main.go第 1061-1070 行已注册/api/agent-server/v1/*路由- 包括
GET /api/agent-server/v1/users和POST /api/agent-server/v1/status
生产环境未部署最新代码:
- Cloud Run 服务
accounts-svc-plus运行的是旧版本代码 - 旧版本不包含 agent API 路由
- 测试确认:
curl https://accounts-svc-plus-266500572462.asia-northeast1.run.app/api/agent-server/v1/users返回404 page not found
- Cloud Run 服务
Agent 配置正确:
- Agent 配置文件:
/etc/agent/account-agent.yaml - Controller URL:
https://accounts-svc-plus-266500572462.asia-northeast1.run.app - API Token: 正确配置(与
INTERNAL_SERVICE_TOKEN匹配)
- Agent 配置文件:
诊断步骤
1. 检查 Agent 日志
# 在 agent 节点上查看日志
ssh root@hk-xhttp.svc.plus
journalctl -u agent-svc-plus -n 50 --no-pager
# 发现错误
# "err":"list clients: controller returned 404 Not Found: 404 page not found"
2. 检查 Agent 配置
# 查看 agent 配置
ssh root@hk-xhttp.svc.plus "cat /etc/agent/account-agent.yaml"
# 确认 controller URL 和 token 配置正确
# controllerUrl: "https://accounts-svc-plus-266500572462.asia-northeast1.run.app"
# apiToken: "uTvryFvAbz6M5sRtmTaSTQY6otLZ95hneBsWqXu+35I="
3. 测试 API 端点
# 测试 /api/agent-server/v1/users 端点
curl -s -H "Authorization: Bearer uTvryFvAbz6M5sRtmTaSTQY6otLZ95hneBsWqXu+35I=" \
"https://accounts-svc-plus-266500572462.asia-northeast1.run.app/api/agent-server/v1/users"
# 返回: 404 page not found
# 确认生产环境缺少该路由
4. 检查代码实现
# 检查路由注册代码
grep -n "registerAgentAPIRoutes" accounts.svc.plus/cmd/accountsvc/main.go
# 第 852 行: registerAgentAPIRoutes(r, agentRegistry, gormSource, logger)
# 第 1061 行: func registerAgentAPIRoutes(...)
# 确认代码中已正确实现
5. 检查数据库约束(UUID 变更)
# 连接到 PostgreSQL
ssh -i ~/.ssh/id_rsa root@postgresql.svc.plus
# 查看外键约束
docker exec postgresql-svc-plus psql -U postgres -d account -c "
SELECT conname, conrelid::regclass
FROM pg_constraint
WHERE confrelid = 'public.users'::regclass;
"
# 结果显示:
# - identities_user_uuid_fkey
# - sessions_user_uuid_fkey
# - subscriptions_user_uuid_fkey
修复方案
修复 1: 部署最新代码到 Cloud Run ⚠️ 关键修复
问题: 生产环境的 Cloud Run 服务运行的是旧版本代码,缺少 agent API 路由
解决方案: 重新构建和部署 accounts-svc-plus 服务
# 1. 进入项目目录
cd /Users/shenlan/workspaces/cloud-neutral-toolkit/accounts.svc.plus
# 2. 设置 GCP 项目
export GCP_PROJECT=xzerolab-480008
# 3. 构建并推送 Docker 镜像
make cloudrun-build
# 4. 部署到 Cloud Run
make cloudrun-deploy
# 或者使用 gcloud 命令直接部署
gcloud run deploy accounts-svc-plus \
--source . \
--project=xzerolab-480008 \
--region=asia-northeast1 \
--platform=managed \
--allow-unauthenticated
预期结果:
- Cloud Run 服务更新为最新版本
/api/agent-server/v1/users和/api/agent-server/v1/status端点可用- Agent 能够成功同步配置
修复 2: 添加前端 Agent Server 代理路由
文件: console.svc.plus/src/app/api/agent-server/[...segments]/route.ts
export const dynamic = 'force-dynamic'
import type { NextRequest } from 'next/server'
import { createUpstreamProxyHandler } from '@lib/apiProxy'
import { getAccountServiceBaseUrl } from '@server/serviceConfig'
const AGENT_SERVER_PREFIX = '/api/agent-server'
function createHandler() {
const upstreamBaseUrl = getAccountServiceBaseUrl()
return createUpstreamProxyHandler({
upstreamBaseUrl,
upstreamPathPrefix: AGENT_SERVER_PREFIX,
})
}
const handler = createHandler()
export function GET(request: NextRequest) {
return handler(request)
}
export function POST(request: NextRequest) {
return handler(request)
}
export function PUT(request: NextRequest) {
return handler(request)
}
export function PATCH(request: NextRequest) {
return handler(request)
}
export function DELETE(request: NextRequest) {
return handler(request)
}
export function HEAD(request: NextRequest) {
return handler(request)
}
export function OPTIONS(request: NextRequest) {
return handler(request)
}
说明:
- 创建代理路由将前端的
/api/agent-server/*请求转发到accounts-svc-plus - 这个路由主要用于前端调试,agent 服务直接调用 Cloud Run URL
修复 3: 增强 Registry 持久化和日志
文件: accounts.svc.plus/internal/agentserver/registry.go
变更:
- 添加
logger *slog.Logger字段到Registry结构体 - 添加
SetLogger()方法 - 在
RegisterAgent()和ReportStatus()中添加错误日志
关键代码:
// 在 ReportStatus 中添加日志
if err := r.store.UpsertAgent(ctx, dbAgent); err != nil {
r.logger.Error("failed to persist agent status heartbeat", "agent", a.ID, "err", err)
}
// 在 RegisterAgent 中添加日志
if err := r.store.UpsertAgent(ctx, dbAgent); err != nil {
r.logger.Error("failed to persist dynamically registered agent", "agent", id, "err", err)
}
文件: accounts.svc.plus/cmd/accountsvc/main.go
if agentRegistry != nil {
agentRegistry.SetStore(st)
agentRegistry.SetLogger(logger.With("component", "agent-registry"))
// ... 其余代码
}
修复 4: 用户 UUID 变更
连接数据库:
ssh -i ~/.ssh/id_rsa root@postgresql.svc.plus
执行 SQL 事务:
BEGIN;
-- 1. 重命名旧用户(避免唯一约束冲突)
UPDATE users
SET username = username || '_old',
email = email || '_old'
WHERE uuid = '4b66928e-a81e-4981-bae0-289ddb92439c';
-- 2. 创建新用户记录(使用新 UUID)
INSERT INTO users (
uuid, username, password, email, role, level, groups, permissions,
created_at, updated_at, version, origin_node, mfa_totp_secret,
mfa_enabled, mfa_secret_issued_at, mfa_confirmed_at, email_verified_at
)
SELECT
'18d270a9-533d-4b13-b3f1-e7f55540a9b2',
REPLACE(username, '_old', ''),
password,
REPLACE(email, '_old', ''),
role, level, groups, permissions,
created_at, updated_at, version, origin_node, mfa_totp_secret,
mfa_enabled, mfa_secret_issued_at, mfa_confirmed_at, email_verified_at
FROM users
WHERE uuid = '4b66928e-a81e-4981-bae0-289ddb92439c';
-- 3. 更新所有外键引用
UPDATE identities
SET user_uuid = '18d270a9-533d-4b13-b3f1-e7f55540a9b2'
WHERE user_uuid = '4b66928e-a81e-4981-bae0-289ddb92439c';
UPDATE sessions
SET user_uuid = '18d270a9-533d-4b13-b3f1-e7f55540a9b2'
WHERE user_uuid = '4b66928e-a81e-4981-bae0-289ddb92439c';
UPDATE subscriptions
SET user_uuid = '18d270a9-533d-4b13-b3f1-e7f55540a9b2'
WHERE user_uuid = '4b66928e-a81e-4981-bae0-289ddb92439c';
-- 4. 删除旧用户记录
DELETE FROM users
WHERE uuid = '4b66928e-a81e-4981-bae0-289ddb92439c';
COMMIT;
执行命令:
docker exec postgresql-svc-plus psql -U postgres -d account -c "
BEGIN;
UPDATE users SET username = username || '_old', email = email || '_old' WHERE uuid = '4b66928e-a81e-4981-bae0-289ddb92439c';
INSERT INTO users (uuid, username, password, email, role, level, groups, permissions, created_at, updated_at, version, origin_node, mfa_totp_secret, mfa_enabled, mfa_secret_issued_at, mfa_confirmed_at, email_verified_at)
SELECT '18d270a9-533d-4b13-b3f1-e7f55540a9b2', REPLACE(username, '_old', ''), password, REPLACE(email, '_old', ''), role, level, groups, permissions, created_at, updated_at, version, origin_node, mfa_totp_secret, mfa_enabled, mfa_secret_issued_at, mfa_confirmed_at, email_verified_at
FROM users WHERE uuid = '4b66928e-a81e-4981-bae0-289ddb92439c';
UPDATE identities SET user_uuid = '18d270a9-533d-4b13-b3f1-e7f55540a9b2' WHERE user_uuid = '4b66928e-a81e-4981-bae0-289ddb92439c';
UPDATE sessions SET user_uuid = '18d270a9-533d-4b13-b3f1-e7f55540a9b2' WHERE user_uuid = '4b66928e-a81e-4981-bae0-289ddb92439c';
UPDATE subscriptions SET user_uuid = '18d270a9-533d-4b13-b3f1-e7f55540a9b2' WHERE user_uuid = '4b66928e-a81e-4981-bae0-289ddb92439c';
DELETE FROM users WHERE uuid = '4b66928e-a81e-4981-bae0-289ddb92439c';
COMMIT;
"
状态: ✅ 已完成
修复 5: 改进前端错误处理
文件: console.svc.plus/src/modules/extensions/builtin/user-center/routes/agent.tsx
变更:
- 改进
fetcher函数的错误处理 - 添加错误消息显示
async function fetcher(url: string): Promise<VlessNode[]> {
const res = await fetch(url, { credentials: 'include', cache: 'no-store' })
const payload = await res.json().catch(() => null)
if (!res.ok) {
const message =
(payload && typeof payload.message === 'string' && payload.message) ||
(payload && typeof payload.error === 'string' && payload.error) ||
`Request failed (${res.status})`
throw new Error(message)
}
if (Array.isArray(payload)) {
return payload as VlessNode[]
}
if (payload && Array.isArray((payload as { nodes?: unknown }).nodes)) {
return (payload as { nodes: VlessNode[] }).nodes
}
return []
}
// 在 UI 中显示错误
{error && (
<div className="rounded-xl border border-[color:var(--color-danger-border)] bg-[var(--color-danger-muted)]/30 px-4 py-3 text-sm text-[var(--color-danger-foreground)]">
{language === 'zh'
? `节点列表加载失败:${error.message}`
: `Failed to load agent nodes: ${error.message}`}
</div>
)}
状态: ✅ 已完成
验证方法
1. 验证 Cloud Run 部署 ⚠️ 关键验证
# 测试 agent API 端点
curl -s -H "Authorization: Bearer uTvryFvAbz6M5sRtmTaSTQY6otLZ95hneBsWqXu+35I=" \
"https://accounts-svc-plus-266500572462.asia-northeast1.run.app/api/agent-server/v1/users"
# 预期结果: JSON 响应包含用户列表
# {
# "clients": [...],
# "total": N,
# "generated_at": "2026-02-05T07:30:00Z"
# }
# 如果仍返回 404,说明部署未成功
2. 验证 Agent 同步
# 在 agent 节点上查看日志
ssh root@hk-xhttp.svc.plus
journalctl -u agent-svc-plus -f
# 预期看到:
# - "xray config synced successfully"
# - 没有 404 错误
3. 验证 UUID 变更
# 查询新 UUID
docker exec postgresql-svc-plus psql -U postgres -d account -c "
SELECT uuid, username, email
FROM users
WHERE email = 'tester123@example.com';
"
# 预期结果:
# uuid | username | email
# --------------------------------------+-----------+-----------------------
# 18d270a9-533d-4b13-b3f1-e7f55540a9b2 | tester123 | tester123@example.com
4. 验证关联数据
# 检查订阅是否正确关联
docker exec postgresql-svc-plus psql -U postgres -d account -c "
SELECT user_uuid, external_id, status
FROM subscriptions
WHERE user_uuid = '18d270a9-533d-4b13-b3f1-e7f55540a9b2';
"
5. 验证前端显示
# 访问 https://www.svc.plus/panel/agent
# 确认页面能够正常加载
# 如果有 401 错误,检查认证 token 传递
部署步骤
步骤 1: 部署 accounts-svc-plus 到 Cloud Run
# 1. 进入项目目录
cd /Users/shenlan/workspaces/cloud-neutral-toolkit/accounts.svc.plus
# 2. 确认代码已提交
git status
git add .
git commit -m "feat: add agent API routes for /api/agent-server/v1"
git push
# 3. 设置环境变量
export GCP_PROJECT=xzerolab-480008
export GCP_REGION=asia-northeast1
# 4. 构建镜像(如果使用 Makefile)
make cloudrun-build
# 5. 更新 service.yaml 以使用 Secret Manager
# 确保 service.yaml 中 INTERNAL_SERVICE_TOKEN 使用 valueFrom: secretKeyRef 配置
# 6. 部署服务
make cloudrun-deploy
# 或者使用 gcloud 命令
gcloud run deploy accounts-svc-plus \
--source . \
--project=$GCP_PROJECT \
--region=$GCP_REGION \
--platform=managed \
--allow-unauthenticated
# 6. 等待部署完成
# 预期输出: Service [accounts-svc-plus] revision [accounts-svc-plus-xxxxx] has been deployed
步骤 2: 验证部署
# 测试 API 端点
curl -s -H "Authorization: Bearer uTvryFvAbz6M5sRtmTaSTQY6otLZ95hneBsWqXu+35I=" \
"https://accounts-svc-plus-266500572462.asia-northeast1.run.app/api/agent-server/v1/users"
# 应该返回 JSON 而不是 404
步骤 3: 监控 Agent 日志
# 在 agent 节点上监控日志
ssh root@hk-xhttp.svc.plus
journalctl -u agent-svc-plus -f
# 等待下一次同步周期(5分钟)
# 确认没有 404 错误
回滚计划
如果 Cloud Run 部署导致问题
# 1. 查看之前的版本
gcloud run revisions list \
--service=accounts-svc-plus \
--project=xzerolab-480008 \
--region=asia-northeast1
# 2. 回滚到之前的版本
gcloud run services update-traffic accounts-svc-plus \
--to-revisions=PREVIOUS_REVISION=100 \
--project=xzerolab-480008 \
--region=asia-northeast1
如果 UUID 变更导致问题
-- 反向操作(需要提前备份数据)
BEGIN;
-- 重命名当前用户
UPDATE users
SET username = username || '_new',
email = email || '_new'
WHERE uuid = '18d270a9-533d-4b13-b3f1-e7f55540a9b2';
-- 恢复旧 UUID
INSERT INTO users (uuid, username, password, email, ...)
SELECT '4b66928e-a81e-4981-bae0-289ddb92439c',
REPLACE(username, '_new', ''),
...
FROM users
WHERE uuid = '18d270a9-533d-4b13-b3f1-e7f55540a9b2';
-- 更新外键
UPDATE identities SET user_uuid = '4b66928e-a81e-4981-bae0-289ddb92439c'
WHERE user_uuid = '18d270a9-533d-4b13-b3f1-e7f55540a9b2';
UPDATE sessions SET user_uuid = '4b66928e-a81e-4981-bae0-289ddb92439c'
WHERE user_uuid = '18d270a9-533d-4b13-b3f1-e7f55540a9b2';
UPDATE subscriptions SET user_uuid = '4b66928e-a81e-4981-bae0-289ddb92439c'
WHERE user_uuid = '18d270a9-533d-4b13-b3f1-e7f55540a9b2';
-- 删除新记录
DELETE FROM users WHERE uuid = '18d270a9-533d-4b13-b3f1-e7f55540a9b2';
COMMIT;
已知问题
1. /api/agent/nodes 返回 401 错误
- 现象: 前端访问
/api/agent/nodes时收到 401 Unauthorized - 原因: 认证 token 未正确传递到该端点
- 影响: 用户无法查看节点列表
- 状态: 待修复
- 临时方案: 直接访问后端 API 或使用 admin 账户
2. Agent API 路由未部署到生产环境 ⚠️ 阻塞问题
- 现象: Cloud Run 服务返回 404
- 原因: 生产环境运行旧版本代码
- 影响: Agent 无法同步配置
- 状态: 需要立即部署
- 修复: 执行
make cloudrun-deploy
相关文档
附录
Agent 配置示例
文件: /etc/agent/account-agent.yaml
mode: "agent"
log:
level: info
agent:
id: "hk-proxy-server"
controllerUrl: "https://accounts-svc-plus-266500572462.asia-northeast1.run.app"
apiToken: "uTvryFvAbz6M5sRtmTaSTQY6otLZ95hneBsWqXu+35I="
httpTimeout: 15s
statusInterval: 1m
syncInterval: 5m
tls:
insecureSkipVerify: false
xray:
sync:
enabled: true
interval: 5m
targets:
- name: "xhttp"
outputPath: "/usr/local/etc/xray/config.json"
templatePath: "/usr/local/etc/xray/templates/xray.xhttp.template.json"
restartCommand:
- "systemctl"
- "restart"
- "xray.service"
- name: "tcp"
outputPath: "/usr/local/etc/xray/tcp-config.json"
templatePath: "/usr/local/etc/xray/templates/xray.tcp.template.json"
restartCommand:
- "systemctl"
- "restart"
- "xray-tcp.service"
数据库连接信息
# SSH 连接
ssh -i ~/.ssh/id_rsa root@postgresql.svc.plus
# Docker 容器名称
postgresql-svc-plus
# 数据库名称
account
# 用户名
postgres
# 密码
见 .env 文件
相关服务
- accounts-svc-plus: Cloud Run 服务,处理认证和用户管理
- URL:
https://accounts-svc-plus-266500572462.asia-northeast1.run.app - 域名:
https://accounts.svc.plus
- URL:
- console.svc.plus: 前端控制台
- URL:
https://www.svc.plus
- URL:
- agent.svc.plus: Agent 服务节点
- 节点:
hk-xhttp.svc.plus,jp-xhttp.svc.plus,us-xhttp.svc.plus
- 节点:
监控和日志
# 查看 Cloud Run 日志
gcloud run services logs read accounts-svc-plus \
--project=xzerolab-480008 \
--region=asia-northeast1 \
--limit=100
# 查看 Agent 日志
ssh root@hk-xhttp.svc.plus "journalctl -u agent-svc-plus -n 100 --no-pager"
# 查看数据库日志
ssh -i ~/.ssh/id_rsa root@postgresql.svc.plus \
"docker logs postgresql-svc-plus --tail=100"
关键 API 端点
# Agent API 端点(需要 Bearer token)
GET /api/agent-server/v1/users # 获取用户列表
POST /api/agent-server/v1/status # 上报 agent 状态
# 用户 API 端点(需要用户认证)
GET /api/agent/nodes # 获取 agent 节点列表
# 认证端点
GET /api/auth/session # 获取当前会话
POST /api/auth/login # 用户登录
故障排查清单
- 检查 Cloud Run 服务是否运行最新版本
- 验证 agent API 端点返回 200 而不是 404
- 检查 agent 配置文件中的 controller URL 和 token
- 查看 agent 日志确认没有 404 错误
- 验证数据库中的 UUID 已正确更新
- 检查所有外键引用是否指向新 UUID
- 测试前端页面是否能正常加载
- 监控 Cloud Run 日志确认没有新错误
Feedback