
High-performance Rust-based load balancer for SGLang with multiple routing algorithms and prefill-decode disaggregation support
SGLang router is a standalone Rust module that enables data parallelism across SGLang instances, providing high-performance request routing and advanced load balancing. The router supports multiple load balancing algorithms including cache-aware, power of two, random, and round robin, and acts as a specialized load balancer for prefill-decode disaggregated serving architectures.
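To make two of the named policies concrete, here is a minimal Python sketch of round robin and power of two choices. It only illustrates the general techniques; the router itself is written in Rust and its actual implementation may differ. The class names and the in-flight load counter are assumptions made for this example.

```python
import itertools
import random


class RoundRobin:
    """Cycle through the workers in a fixed order."""

    def __init__(self, workers):
        self._cycle = itertools.cycle(list(workers))

    def pick(self):
        return next(self._cycle)


class PowerOfTwo:
    """Sample two distinct workers at random and pick the less loaded one."""

    def __init__(self, workers, rng=None):
        # Hypothetical load metric: number of in-flight requests per worker.
        self.load = {w: 0 for w in workers}
        self._rng = rng or random.Random()

    def pick(self):
        a, b = self._rng.sample(list(self.load), 2)
        chosen = a if self.load[a] <= self.load[b] else b
        self.load[chosen] += 1
        return chosen

    def done(self, worker):
        """Call when a request finishes so the load counter stays accurate."""
        self.load[worker] -= 1


workers = ["http://worker1:8000", "http://worker2:8000", "http://worker3:8000"]

rr = RoundRobin(workers)
rr_picks = [rr.pick() for _ in range(4)]

p2 = PowerOfTwo(workers, rng=random.Random(0))
p2_picks = [p2.pick() for _ in range(300)]
```

Power of two choices needs only two load lookups per request, which is why it is a common middle ground between random selection and a full least-loaded scan.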
Rust and Cargo:
# Install rustup (Rust installer and version manager)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Follow the installation prompts, then reload your shell
source "$HOME/.cargo/env"
# Verify installation
rustc --version
cargo --version
Python with pip installed
# Install build dependencies
pip install setuptools-rust wheel build
# Build the wheel package
python -m build
# Install the generated wheel
pip install dist/*.whl
# One-liner for development (rebuild + install)
python -m build && pip install --force-reinstall dist/*.whl
pip install -e .
⚠️ Warning: Editable installs may suffer performance degradation. Use wheel builds for performance testing.
# Build Rust components
cargo build
# Launch router with worker URLs
python -m sglang_router.launch_router \
--worker-urls http://worker1:8000 http://worker2:8000
# Note that the prefill and decode URLs must be provided in the following format:
# http://<ip>:<port> for decode nodes
# http://<ip>:<port> bootstrap-port for prefill nodes, where bootstrap-port is optional
# Launch router in prefill-decode disaggregated mode
python -m sglang_router.launch_router \
--pd-disaggregation \
--policy cache_aware \
--prefill http://127.0.0.1:30001 9001 \
--prefill http://127.0.0.2:30002 9002 \
--prefill http://127.0.0.3:30003 9003 \
--prefill http://127.0.0.4:30004 9004 \
--decode http://127.0.0.5:30005 \
--decode http://127.0.0.6:30006 \
--decode http://127.0.0.7:30007 \
--host 0.0.0.0 \
--port 8080
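When the prefill and decode lists come from configuration rather than being typed by hand, the command above can be assembled programmatically. The following is a sketch only: `build_pd_command` is a hypothetical helper, not part of sglang_router, and it uses only the flags shown in the command above.

```python
import shlex


def build_pd_command(prefill, decode, policy="cache_aware", host="0.0.0.0", port=8080):
    """Return the argv list for a PD-disaggregated router launch.

    prefill: list of (url, bootstrap_port or None) tuples
    decode:  list of URLs
    """
    argv = [
        "python", "-m", "sglang_router.launch_router",
        "--pd-disaggregation",
        "--policy", policy,
    ]
    for url, bootstrap_port in prefill:
        argv += ["--prefill", url]
        if bootstrap_port is not None:  # the bootstrap port is optional
            argv.append(str(bootstrap_port))
    for url in decode:
        argv += ["--decode", url]
    argv += ["--host", host, "--port", str(port)]
    return argv


argv = build_pd_command(
    prefill=[("http://127.0.0.1:30001", 9001), ("http://127.0.0.2:30002", None)],
    decode=["http://127.0.0.5:30005"],
)
command_line = shlex.join(argv)  # shell-safe string, e.g. for logging
```

The list form can be passed directly to `subprocess.run(argv)`, which avoids shell quoting issues.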
Enable structured logging with optional file output:
from sglang_router import Router
# Console logging (default)
router = Router(worker_urls=["http://worker1:8000", "http://worker2:8000"])
# File logging enabled
router = Router(
    worker_urls=["http://worker1:8000", "http://worker2:8000"],
    log_dir="./logs",  # Daily log files are created here
)
Set the log level with the --log-level flag.
The Prometheus metrics endpoint is available at 127.0.0.1:29000 by default.
# Custom metrics configuration
python -m sglang_router.launch_router \
--worker-urls http://localhost:8080 http://localhost:8081 \
--prometheus-host 0.0.0.0 \
--prometheus-port 9000
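To inspect the metrics from a script, a small parser for the Prometheus text exposition format is often enough. The sketch below makes several assumptions: the `/metrics` path is the usual Prometheus convention but is not confirmed by this document, the metric names in the sample are made up, and the parser is simplified (it ignores optional timestamps).

```python
from urllib.request import urlopen


def parse_metrics(text):
    """Parse Prometheus text exposition lines into {series: float}.

    Simplified: skips comment lines and assumes no trailing timestamps.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, value = line.rsplit(None, 1)  # split at the last whitespace
        samples[series] = float(value)
    return samples


def fetch_metrics(url="http://127.0.0.1:29000/metrics", timeout=5):
    """Scrape the endpoint. The /metrics path is an assumption."""
    with urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))


# Hypothetical sample; these are not real router metric names.
sample = """\
# HELP example_requests_total Hypothetical counter.
# TYPE example_requests_total counter
example_requests_total{worker="http://localhost:8080"} 42
example_active_workers 2
"""
parsed = parse_metrics(sample)
```

With a running router, `fetch_metrics()` would return the same kind of dictionary for whatever series the endpoint actually exposes.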
Track requests across distributed systems with configurable headers:
# Use custom request ID headers
python -m sglang_router.launch_router \
--worker-urls http://localhost:8080 \
--request-id-headers x-trace-id x-request-id
Default headers: x-request-id, x-correlation-id, x-trace-id, request-id
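As an illustration of how a configurable header list can be used for request tracking, the sketch below resolves a request ID from incoming headers. The lookup order and the fallback to a freshly generated ID are assumptions made for this example; the document does not specify how the router behaves when no header is present.

```python
import uuid

# Header names taken from the default list above.
DEFAULT_REQUEST_ID_HEADERS = (
    "x-request-id",
    "x-correlation-id",
    "x-trace-id",
    "request-id",
)


def resolve_request_id(headers, candidates=DEFAULT_REQUEST_ID_HEADERS):
    """Return the first non-empty candidate header value, else a new UUID."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in candidates:
        value = lowered.get(name)
        if value:
            return value
    return uuid.uuid4().hex  # assumed fallback, not confirmed router behavior


from_trace = resolve_request_id({"X-Trace-Id": "abc123"})
custom = resolve_request_id(
    {"x-request-id": "r-1", "x-trace-id": "t-1"},
    candidates=("x-trace-id", "x-request-id"),
)
generated = resolve_request_id({})
```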
Automatic worker discovery and management in Kubernetes environments.
python -m sglang_router.launch_router \
--service-discovery \
--selector app=sglang-worker role=inference \
--service-discovery-namespace default
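The `--selector` value is a list of key=value pairs. Under standard Kubernetes equality-based selector semantics, a pod matches only when every pair is present in its labels. The sketch below shows that matching rule in isolation; the function names are hypothetical and this is not the router's own code.

```python
def parse_selector(tokens):
    """Turn ["app=sglang-worker", "role=inference"] into a dict."""
    selector = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {token!r}")
        selector[key] = value
    return selector


def matches(selector, labels):
    """True when every selector pair is present in the pod labels."""
    return all(labels.get(key) == value for key, value in selector.items())


selector = parse_selector("app=sglang-worker role=inference".split())

pod_a = {"app": "sglang-worker", "role": "inference", "zone": "a"}
pod_b = {"app": "sglang-worker", "role": "training"}

selected = [name for name, labels in [("pod_a", pod_a), ("pod_b", pod_b)]
            if matches(selector, labels)]
```

Extra labels on a pod (such as `zone` here) do not prevent a match; only the pairs named in the selector are checked.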
For disaggregated prefill/decode routing:
python -m sglang_router.launch_router \
--pd-disaggregation \
--policy cache_aware \
--service-discovery \
--prefill-selector app=sglang component=prefill \
--decode-selector app=sglang component=decode \
--service-discovery-namespace sglang-system
# With separate routing policies:
python -m sglang_router.launch_router \
--pd-disaggregation \
--prefill-policy cache_aware \
--decode-policy power_of_two \
--service-discovery \
--prefill-selector app=sglang component=prefill \
--decode-selector app=sglang component=decode \
--service-discovery-namespace sglang-system
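The interaction between `--policy`, `--prefill-policy`, and `--decode-policy` can be summarized as a simple fallback: a role-specific policy wins when given, otherwise the shared policy applies. The helper below is a hypothetical sketch of that rule, using the four policy names listed in this document.

```python
KNOWN_POLICIES = {"cache_aware", "random", "power_of_two", "round_robin"}


def effective_policies(policy, prefill_policy=None, decode_policy=None):
    """Resolve the policy used for each role.

    A role-specific policy overrides the shared one; otherwise the shared
    policy is used for that role.
    """
    resolved = {
        "prefill": prefill_policy or policy,
        "decode": decode_policy or policy,
    }
    for role, name in resolved.items():
        if name not in KNOWN_POLICIES:
            raise ValueError(f"unknown {role} policy: {name!r}")
    return resolved


shared = effective_policies("cache_aware")
split = effective_policies("random", prefill_policy="cache_aware",
                           decode_policy="power_of_two")
partial = effective_policies("round_robin", decode_policy="power_of_two")
```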
Prefill Server Pod:
apiVersion: v1
kind: Pod
metadata:
  name: sglang-prefill-1
  labels:
    app: sglang
    component: prefill
  annotations:
    sglang.ai/bootstrap-port: "9001" # Optional: Bootstrap port
spec:
  containers:
    - name: sglang
      image: lmsys/sglang:latest
      ports:
        - containerPort: 8000 # Main API port
        - containerPort: 9001 # Optional: Bootstrap port
Decode Server Pod:
apiVersion: v1
kind: Pod
metadata:
  name: sglang-decode-1
  labels:
    app: sglang
    component: decode
spec:
  containers:
    - name: sglang
      image: lmsys/sglang:latest
      ports:
        - containerPort: 8000
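The two manifests differ in the `component` label and in the optional `sglang.ai/bootstrap-port` annotation on the prefill pod. The sketch below shows one plausible way a discovered pod could be mapped to a worker entry. It is an illustration under stated assumptions, not the router's actual logic: `pod_to_worker` is a hypothetical helper, the pod is represented as a plain dictionary mirroring the manifest, and the pod IP is passed in separately.

```python
BOOTSTRAP_ANNOTATION = "sglang.ai/bootstrap-port"


def pod_to_worker(pod, pod_ip, port=8000):
    """Map a pod description to (role, url, bootstrap_port).

    port plays the role of the worker URL port (8000 in the manifests above).
    bootstrap_port is None for decode pods and for prefill pods that
    carry no bootstrap annotation.
    """
    meta = pod["metadata"]
    role = meta.get("labels", {}).get("component")
    url = f"http://{pod_ip}:{port}"
    bootstrap_port = None
    if role == "prefill":
        raw = meta.get("annotations", {}).get(BOOTSTRAP_ANNOTATION)
        if raw is not None:
            bootstrap_port = int(raw)  # annotation values are strings
    return role, url, bootstrap_port


prefill_pod = {
    "metadata": {
        "name": "sglang-prefill-1",
        "labels": {"app": "sglang", "component": "prefill"},
        "annotations": {"sglang.ai/bootstrap-port": "9001"},
    }
}
decode_pod = {
    "metadata": {
        "name": "sglang-decode-1",
        "labels": {"app": "sglang", "component": "decode"},
    }
}

prefill_worker = pod_to_worker(prefill_pod, "10.0.0.11")
decode_worker = pod_to_worker(decode_pod, "10.0.0.12")
```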
Namespace-scoped (recommended):
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sglang-router
  namespace: sglang-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: sglang-system
  name: sglang-router
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: sglang-router
  namespace: sglang-system
subjects:
  - kind: ServiceAccount
    name: sglang-router
    namespace: sglang-system
roleRef:
  kind: Role
  name: sglang-router
  apiGroup: rbac.authorization.k8s.io
python -m sglang_router.launch_router \
--pd-disaggregation \
--policy cache_aware \
--service-discovery \
--prefill-selector app=sglang component=prefill environment=production \
--decode-selector app=sglang component=decode environment=production \
--service-discovery-namespace production \
--host 0.0.0.0 \
--port 8080 \
--prometheus-host 0.0.0.0 \
--prometheus-port 9090
--service-discovery: Enable Kubernetes service discovery
--service-discovery-port: Port for worker URLs (default: 8000)
--service-discovery-namespace: Kubernetes namespace to watch
--selector: Label selectors for regular mode (format: key1=value1 key2=value2)
--pd-disaggregation: Enable Prefill-Decode disaggregated mode
--prefill: Initial prefill server (format: URL BOOTSTRAP_PORT)
--decode: Initial decode server URL
--prefill-selector: Label selector for prefill pods
--decode-selector: Label selector for decode pods
--policy: Routing policy (cache_aware, random, power_of_two, round_robin)
--prefill-policy: Separate routing policy for prefill nodes (optional, overrides --policy for prefill)
--decode-policy: Separate routing policy for decode nodes (optional, overrides --policy for decode)
# Build Rust project
cargo build
# Build Python binding (see Installation section above)
Note: When modifying Rust code, you must rebuild the wheel for changes to take effect.
VSCode Rust Analyzer Issues:
Set rust-analyzer.linkedProjects to the absolute path of Cargo.toml:
{
  "rust-analyzer.linkedProjects": ["/workspaces/sglang/sgl-router/Cargo.toml"]
}
The continuous integration pipeline includes comprehensive testing, benchmarking, and publishing:
cibuildwheel for manylinux x86_64 packages
pyproject.toml
/docker/Dockerfile.router
We found that sglang-router demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 2 open source maintainers collaborating on the project.