API Design¶
This document explains clinvoker's API architecture, including the REST API design principles, OpenAI and Anthropic compatibility layers, endpoint routing, middleware integration, and request/response transformation.
API Architecture Overview¶
clinvoker exposes three API surfaces:
flowchart TB
subgraph API_Layers["API Layers"]
CUSTOM["Native API<br/>/api/v1/*"]
OPENAI["OpenAI Compatible<br/>/openai/v1/*"]
ANTH["Anthropic Compatible<br/>/anthropic/v1/*"]
end
subgraph Core["Core Services"]
EXEC[Executor]
SESSION[Session Manager]
BACKEND[Backend Registry]
end
CUSTOM --> EXEC
OPENAI --> EXEC
ANTH --> EXEC
EXEC --> SESSION
EXEC --> BACKEND
REST API Design Principles¶
Resource-Oriented Design¶
The native API follows REST principles with resource-oriented URLs:
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/health | Health check |
| POST | /api/v1/prompt | Submit a prompt |
| GET | /api/v1/sessions | List sessions |
| GET | /api/v1/sessions/{id} | Get session details |
| POST | /api/v1/sessions/{id}/resume | Resume a session |
| DELETE | /api/v1/sessions/{id} | Delete a session |
HTTP Status Codes¶
| Status | Meaning |
|---|---|
| 200 OK | Successful GET/PUT/DELETE |
| 201 Created | Resource created |
| 400 Bad Request | Invalid request body/params |
| 401 Unauthorized | Missing/invalid API key |
| 429 Too Many Requests | Rate limit exceeded |
| 500 Internal Server Error | Server error |
Response Format¶
All responses follow a consistent envelope:
OpenAI Compatibility Layer¶
The OpenAI-compatible API (/openai/v1/*) enables drop-in replacement for OpenAI SDK clients.
Endpoint Mapping¶
| OpenAI Endpoint | clinvoker Handler |
|---|---|
| POST /v1/chat/completions | POST /openai/v1/chat/completions |
| GET /v1/models | GET /openai/v1/models |
Note: GET /v1/models/{model} is not implemented.
Request Transformation¶
// OpenAI request format
{
  "model": "gpt-4",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "stream": false
}

// Transformed to clinvoker internal format
{
  "backend": "claude",
  "prompt": "You are a helpful assistant.\n\nHello!",
  "options": {
    "model": "sonnet"
  }
}
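The prompt in the example above is consistent with joining the message contents with a blank line. The following is a minimal sketch of that step under this assumption; `ChatMessage` and `flattenMessages` are hypothetical names, and the mapping from the OpenAI model name to a backend and backend model is not shown because this document does not specify it.

```go
package main

import (
	"fmt"
	"strings"
)

// ChatMessage mirrors one entry of the OpenAI "messages" array.
type ChatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// flattenMessages joins message contents with blank lines, reproducing the
// "prompt" value from the example. Roles are dropped here; whether clinvoker
// preserves them in some form is an open assumption.
func flattenMessages(msgs []ChatMessage) string {
	parts := make([]string, 0, len(msgs))
	for _, m := range msgs {
		if m.Content == "" {
			continue // skip empty messages so no stray separators appear
		}
		parts = append(parts, m.Content)
	}
	return strings.Join(parts, "\n\n")
}

func main() {
	prompt := flattenMessages([]ChatMessage{
		{Role: "system", Content: "You are a helpful assistant."},
		{Role: "user", Content: "Hello!"},
	})
	fmt.Printf("%q\n", prompt)
}
```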
Response Transformation¶
// clinvoker internal response
{
  "content": "Hello! How can I help you today?",
  "session_id": "sess-abc123",
  "usage": {
    "input_tokens": 25,
    "output_tokens": 10
  }
}

// OpenAI response format
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1705317600,
  "model": "gpt-4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 10,
    "total_tokens": 35
  }
}
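The field mapping between the two shapes above can be sketched as a single function. This is an illustration, not clinvoker's implementation: the type names and `toChatCompletion` are hypothetical, and the response ID, model name, and timestamp are passed in because the document does not say how they are derived.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type internalUsage struct {
	InputTokens  int `json:"input_tokens"`
	OutputTokens int `json:"output_tokens"`
}

type internalResponse struct {
	Content   string        `json:"content"`
	SessionID string        `json:"session_id"`
	Usage     internalUsage `json:"usage"`
}

type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatChoice struct {
	Index        int         `json:"index"`
	Message      chatMessage `json:"message"`
	FinishReason string      `json:"finish_reason"`
}

type chatUsage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

type chatCompletion struct {
	ID      string       `json:"id"`
	Object  string       `json:"object"`
	Created int64        `json:"created"`
	Model   string       `json:"model"`
	Choices []chatChoice `json:"choices"`
	Usage   chatUsage    `json:"usage"`
}

// toChatCompletion wraps the internal content in a single assistant choice
// and renames the usage counters; total_tokens is the sum of the two.
func toChatCompletion(in internalResponse, id, model string, created int64) chatCompletion {
	return chatCompletion{
		ID:      id,
		Object:  "chat.completion",
		Created: created,
		Model:   model, // echo the model name the client asked for
		Choices: []chatChoice{{
			Index:        0,
			Message:      chatMessage{Role: "assistant", Content: in.Content},
			FinishReason: "stop",
		}},
		Usage: chatUsage{
			PromptTokens:     in.Usage.InputTokens,
			CompletionTokens: in.Usage.OutputTokens,
			TotalTokens:      in.Usage.InputTokens + in.Usage.OutputTokens,
		},
	}
}

func main() {
	in := internalResponse{
		Content:   "Hello! How can I help you today?",
		SessionID: "sess-abc123",
		Usage:     internalUsage{InputTokens: 25, OutputTokens: 10},
	}
	data, err := json.MarshalIndent(toChatCompletion(in, "chatcmpl-abc123", "gpt-4", 1705317600), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))
}
```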
Streaming Support¶
OpenAI-compatible streaming uses Server-Sent Events (SSE):
data: {"id":"chatcmpl-123","choices":[{"delta":{"content":"Hello"}}]}

data: {"id":"chatcmpl-123","choices":[{"delta":{"content":"!"}}]}

data: {"id":"chatcmpl-123","choices":[{"delta":{},"finish_reason":"stop"}]}

data: [DONE]
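On the client side, a stream like this can be consumed by reading `data:` lines until the `[DONE]` sentinel and concatenating each `delta.content`. The sketch below shows one way to do that with the standard library; `streamChunk`, `collectStream`, and `collectString` are hypothetical names, and only the fields used in the example are decoded.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// streamChunk decodes only the part of a chunk this sketch needs.
type streamChunk struct {
	Choices []struct {
		Delta struct {
			Content string `json:"content"`
		} `json:"delta"`
	} `json:"choices"`
}

// collectStream reads an SSE body and returns the concatenated delta content.
// It stops at "data: [DONE]" and reports io.ErrUnexpectedEOF if the stream
// ends without that sentinel. Lines longer than bufio.Scanner's default
// buffer (64 KiB) would need a larger buffer.
func collectStream(r io.Reader) (string, error) {
	var out strings.Builder
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		payload, ok := strings.CutPrefix(sc.Text(), "data:")
		if !ok {
			continue // blank event separators, comments, other SSE fields
		}
		payload = strings.TrimSpace(payload)
		if payload == "[DONE]" {
			return out.String(), nil
		}
		var chunk streamChunk
		if err := json.Unmarshal([]byte(payload), &chunk); err != nil {
			return out.String(), fmt.Errorf("decode chunk: %w", err)
		}
		for _, c := range chunk.Choices {
			out.WriteString(c.Delta.Content)
		}
	}
	if err := sc.Err(); err != nil {
		return out.String(), err
	}
	return out.String(), io.ErrUnexpectedEOF
}

// collectString is a convenience wrapper for in-memory payloads.
func collectString(s string) (string, error) {
	return collectStream(strings.NewReader(s))
}

func main() {
	stream := `data: {"id":"chatcmpl-123","choices":[{"delta":{"content":"Hello"}}]}

data: {"id":"chatcmpl-123","choices":[{"delta":{"content":"!"}}]}

data: {"id":"chatcmpl-123","choices":[{"delta":{},"finish_reason":"stop"}]}

data: [DONE]

`
	text, err := collectString(stream)
	fmt.Println(text, err)
}
```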
Anthropic Compatibility Layer¶
The Anthropic-compatible API (/anthropic/v1/*) enables drop-in replacement for Anthropic SDK clients.
Endpoint Mapping¶
| Anthropic Endpoint | clinvoker Handler |
|---|---|
| POST /v1/messages | POST /anthropic/v1/messages |
Note: GET /v1/models is not implemented.
Request Transformation¶
// Anthropic request format
{
  "model": "claude-3-sonnet-20240229",
  "max_tokens": 1024,
  "messages": [
    {"role": "user", "content": "Hello, Claude!"}
  ]
}

// Transformed to clinvoker internal format
{
  "backend": "claude",
  "prompt": "Hello, Claude!",
  "options": {
    "model": "sonnet"
  }
}
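In the Anthropic Messages format, a message's `content` may be either a plain string (as in the example above) or an array of content blocks, so a transformation has to accept both. The sketch below shows one possible way to reduce such a request to a single prompt string. All names are hypothetical, only `text` blocks are handled, the optional top-level `system` field is assumed to be a plain string, and joining parts with a blank line is an assumption rather than clinvoker's documented behavior.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

type anthropicMessage struct {
	Role    string          `json:"role"`
	Content json.RawMessage `json:"content"` // string or array of blocks
}

type anthropicRequest struct {
	Model     string             `json:"model"`
	MaxTokens int                `json:"max_tokens"`
	System    string             `json:"system,omitempty"`
	Messages  []anthropicMessage `json:"messages"`
}

// textOf extracts the text from a content value in either accepted form.
func textOf(raw json.RawMessage) (string, error) {
	var s string
	if err := json.Unmarshal(raw, &s); err == nil {
		return s, nil
	}
	var blocks []struct {
		Type string `json:"type"`
		Text string `json:"text"`
	}
	if err := json.Unmarshal(raw, &blocks); err != nil {
		return "", fmt.Errorf("content is neither a string nor a block array: %w", err)
	}
	var parts []string
	for _, b := range blocks {
		if b.Type == "text" {
			parts = append(parts, b.Text)
		}
	}
	return strings.Join(parts, "\n"), nil
}

// promptFromJSON decodes a request body and flattens it into one prompt.
func promptFromJSON(body string) (string, error) {
	var req anthropicRequest
	if err := json.Unmarshal([]byte(body), &req); err != nil {
		return "", fmt.Errorf("decode request: %w", err)
	}
	var parts []string
	if req.System != "" {
		parts = append(parts, req.System)
	}
	for _, m := range req.Messages {
		text, err := textOf(m.Content)
		if err != nil {
			return "", err
		}
		parts = append(parts, text)
	}
	return strings.Join(parts, "\n\n"), nil
}

func main() {
	prompt, err := promptFromJSON(`{
		"model": "claude-3-sonnet-20240229",
		"max_tokens": 1024,
		"messages": [{"role": "user", "content": "Hello, Claude!"}]
	}`)
	fmt.Printf("%q %v\n", prompt, err)
}
```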
Response Transformation¶
// Anthropic response format
{
  "id": "msg_01XgY...",
  "type": "message",
  "role": "assistant",
  "content": [
    {"type": "text", "text": "Hello! How can I help?"}
  ],
  "model": "claude-3-sonnet-20240229",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 15,
    "output_tokens": 10
  }
}
Endpoint Routing Architecture¶
Route Registration¶
Routes are registered in internal/server/routes.go:
func (s *Server) RegisterRoutes() {
	// Register custom RESTful API handlers
	customHandlers := handlers.NewCustomHandlersWithHealthInfo(s.executor, healthInfo)
	customHandlers.Register(s.api)

	// Register OpenAI-compatible API handlers
	openaiHandlers := handlers.NewOpenAIHandlers(service.NewStatelessRunner(s.logger), s.logger)
	openaiHandlers.Register(s.api)

	// Register Anthropic-compatible API handlers
	anthropicHandlers := handlers.NewAnthropicHandlers(service.NewStatelessRunner(s.logger), s.logger)
	anthropicHandlers.Register(s.api)
}
Huma Integration¶
clinvoker uses Huma for OpenAPI generation and request/response validation:
huma.Register(s.api, huma.Operation{
	OperationID: "create-chat-completion",
	Method:      http.MethodPost,
	Path:        "/openai/v1/chat/completions",
	Summary:     "Create chat completion",
	Description: "Creates a completion for the chat message",
	Tags:        []string{"OpenAI"},
}, func(ctx context.Context, input *ChatCompletionRequest) (*ChatCompletionResponse, error) {
	// Handler implementation
})
Middleware Integration¶
Middleware Stack¶
The middleware stack is configured in internal/server/server.go:58-131:
flowchart LR
REQID[RequestID]
REALIP[RealIP]
RECOVER[Recoverer]
LOGGER[RequestLogger]
SIZE[RequestSize]
RATE[RateLimiter]
AUTH[APIKeyAuth]
TIMEOUT[Timeout]
CORS[CORS]
REQID --> REALIP
REALIP --> RECOVER
RECOVER --> LOGGER
LOGGER --> SIZE
SIZE --> RATE
RATE --> AUTH
AUTH --> TIMEOUT
TIMEOUT --> CORS
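The left-to-right order in the diagram is the order in which a request passes through the middleware. A minimal sketch of how such an ordered stack can be composed from standard `func(http.Handler) http.Handler` middleware follows. `chain`, `tag`, and `traceOrder` are hypothetical helpers written for this illustration, and the middleware here only record their names rather than doing real work.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

type middleware func(http.Handler) http.Handler

// chain wraps h so that the first middleware listed is the outermost one
// and therefore sees the request first.
func chain(h http.Handler, mws ...middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// tag returns a placeholder middleware that records its name on entry.
func tag(name string, trace *[]string) middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*trace = append(*trace, name)
			next.ServeHTTP(w, r)
		})
	}
}

// traceOrder sends one request through placeholder middleware with the
// given names and returns the order in which they ran.
func traceOrder(names ...string) string {
	var trace []string
	mws := make([]middleware, 0, len(names))
	for _, n := range names {
		mws = append(mws, tag(n, &trace))
	}
	final := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		trace = append(trace, "handler")
	})
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	chain(final, mws...).ServeHTTP(httptest.NewRecorder(), req)
	return strings.Join(trace, " > ")
}

func main() {
	fmt.Println(traceOrder("RequestID", "RealIP", "Recoverer", "RequestLogger",
		"RequestSize", "RateLimiter", "APIKeyAuth", "Timeout", "CORS"))
}
```

One consequence of the documented order is that the rate limiter runs before authentication, so unauthenticated requests also count against the limit.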
Request ID Middleware¶
Assigns a unique request ID for tracing:
func RequestID(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		requestID := generateRequestID()
		ctx := context.WithValue(r.Context(), requestIDKey, requestID)
		w.Header().Set("X-Request-ID", requestID)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}
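The body of `generateRequestID` is not shown in this document. One common approach, sketched here as an assumption rather than clinvoker's actual implementation, is to hex-encode random bytes from `crypto/rand`.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// generateRequestID returns 16 random bytes encoded as 32 hex characters.
// This is a hypothetical implementation; the real one may use a different
// format such as a UUID or a counter-based ID.
func generateRequestID() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		// Without a working entropy source there is no safe fallback.
		panic(fmt.Sprintf("read random bytes: %v", err))
	}
	return hex.EncodeToString(b)
}

func main() {
	fmt.Println(generateRequestID())
}
```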
Rate Limiting Middleware¶
Implements token bucket rate limiting:
type RateLimiter struct {
	rps     float64
	burst   int
	clients map[string]*clientLimiter
	mu      sync.RWMutex
}

func (rl *RateLimiter) Middleware() func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			clientID := getClientID(r)
			if !rl.allow(clientID) {
				http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}
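The `allow` method and the `clientLimiter` type are referenced above but not shown. The sketch below is one possible per-client token bucket, written under stated assumptions: the `clientLimiter` fields, the `NewRateLimiter` constructor, the injectable `now` clock, and the `fakeClock` helper are all hypothetical, and a plain `sync.Mutex` is used because every call mutates the bucket. A production implementation might instead build on `golang.org/x/time/rate`.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// clientLimiter holds one client's bucket state (hypothetical layout).
type clientLimiter struct {
	tokens float64   // tokens currently available
	last   time.Time // time of the previous refill
}

type RateLimiter struct {
	rps     float64 // refill rate in tokens per second
	burst   int     // bucket capacity
	clients map[string]*clientLimiter
	mu      sync.Mutex
	now     func() time.Time // injectable clock, an assumption for testability
}

func NewRateLimiter(rps float64, burst int) *RateLimiter {
	return &RateLimiter{rps: rps, burst: burst, clients: make(map[string]*clientLimiter), now: time.Now}
}

// allow refills the client's bucket for the elapsed time, then spends one
// token if available. Note the map grows without bound in this sketch; a
// real limiter would evict idle clients.
func (rl *RateLimiter) allow(clientID string) bool {
	rl.mu.Lock()
	defer rl.mu.Unlock()

	now := rl.now()
	cl, ok := rl.clients[clientID]
	if !ok {
		cl = &clientLimiter{tokens: float64(rl.burst), last: now}
		rl.clients[clientID] = cl
	}
	cl.tokens += now.Sub(cl.last).Seconds() * rl.rps
	if limit := float64(rl.burst); cl.tokens > limit {
		cl.tokens = limit // never exceed the bucket capacity
	}
	cl.last = now

	if cl.tokens < 1 {
		return false
	}
	cl.tokens--
	return true
}

// fakeClock is a manually advanced clock for deterministic examples.
type fakeClock struct{ t time.Time }

func (c *fakeClock) Now() time.Time { return c.t }
func (c *fakeClock) Advance(seconds float64) {
	c.t = c.t.Add(time.Duration(seconds * float64(time.Second)))
}

func main() {
	clock := &fakeClock{}
	rl := NewRateLimiter(1, 2)
	rl.now = clock.Now
	fmt.Println(rl.allow("a"), rl.allow("a"), rl.allow("a")) // burst of 2, then denied
	clock.Advance(1)
	fmt.Println(rl.allow("a")) // one token refilled after one second
}
```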
Authentication Middleware¶
Validates API keys from multiple sources:
func APIKeyAuth(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		apiKey := extractAPIKey(r)
		if apiKey == "" || !isValidAPIKey(apiKey) {
			http.Error(w, "Unauthorized", http.StatusUnauthorized)
			return
		}
		ctx := context.WithValue(r.Context(), apiKeyKey, apiKey)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

func extractAPIKey(r *http.Request) string {
	// Check Authorization header
	if auth := r.Header.Get("Authorization"); auth != "" {
		if strings.HasPrefix(auth, "Bearer ") {
			return strings.TrimPrefix(auth, "Bearer ")
		}
	}
	// Check X-Api-Key header
	if key := r.Header.Get("X-Api-Key"); key != "" {
		return key
	}
	return ""
}
Authentication Design¶
API Key Sources¶
API keys can be provided via:
- HTTP Header: Authorization: Bearer <key> or X-Api-Key: <key>
- Environment Variable: CLINVK_API_KEY
- gopass: Secure password store integration
- Config File: ~/.clinvk/config.yaml
Key Validation¶
func (s *Server) validateAPIKey(key string) bool {
	// Check against configured keys
	for _, validKey := range s.config.APIKeys {
		if subtle.ConstantTimeCompare([]byte(key), []byte(validKey)) == 1 {
			return true
		}
	}
	return false
}
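Note: subtle.ConstantTimeCompare takes time that depends only on the length of its inputs, not on their contents, which mitigates timing attacks on the key comparison. It returns immediately when the lengths differ, so the length of a configured key can still be inferred.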
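Because `subtle.ConstantTimeCompare` returns 0 immediately when its inputs differ in length, a common hardening is to hash both sides first so that the compared slices always have the same size. The sketch below shows that variant as an illustration only; `validateKey` is a hypothetical standalone function, not clinvoker's code, and it also avoids returning early so the number of comparisons does not depend on which key matched.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// validateKey reports whether key equals any entry of validKeys.
// Both sides are hashed with SHA-256 so ConstantTimeCompare always sees
// 32-byte inputs, and every configured key is checked on every call.
func validateKey(key string, validKeys []string) bool {
	got := sha256.Sum256([]byte(key))
	match := 0
	for _, k := range validKeys {
		want := sha256.Sum256([]byte(k))
		match |= subtle.ConstantTimeCompare(got[:], want[:])
	}
	return match == 1
}

func main() {
	keys := []string{"key-one", "key-two"}
	fmt.Println(validateKey("key-two", keys), validateKey("key-tw", keys))
}
```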
Error Handling Strategy¶
Error Response Format¶
All errors follow a consistent format:
{
  "error": {
    "code": "invalid_request",
    "message": "The request body is invalid",
    "details": {
      "field": "model",
      "issue": "required"
    }
  }
}
Error Types¶
| Code | HTTP Status | Description |
|---|---|---|
| invalid_request | 400 | Request validation failed |
| authentication_error | 401 | Invalid or missing API key |
| rate_limit_exceeded | 429 | Too many requests |
| backend_unavailable | 503 | Backend not available |
| internal_error | 500 | Internal server error |
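The error codes in the table and the JSON error format can be tied together with a small helper. The following is a sketch under assumptions: the type names, `statusFor`, and `render` are hypothetical, and `respondWithError` is written to match the call shape `respondWithError(w, status, code, message)` used elsewhere in this document without claiming to be the real function.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type apiError struct {
	Code    string         `json:"code"`
	Message string         `json:"message"`
	Details map[string]any `json:"details,omitempty"`
}

type errorEnvelope struct {
	Error apiError `json:"error"`
}

// statusByCode mirrors the error types table.
var statusByCode = map[string]int{
	"invalid_request":      http.StatusBadRequest,
	"authentication_error": http.StatusUnauthorized,
	"rate_limit_exceeded":  http.StatusTooManyRequests,
	"backend_unavailable":  http.StatusServiceUnavailable,
	"internal_error":       http.StatusInternalServerError,
}

// statusFor returns the HTTP status for a code, defaulting to 500 for
// codes that are not in the table.
func statusFor(code string) int {
	if status, ok := statusByCode[code]; ok {
		return status
	}
	return http.StatusInternalServerError
}

// respondWithError writes the error envelope with the given status.
func respondWithError(w http.ResponseWriter, status int, code, message string) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	// The status line is already sent, so an encoding failure can only be ignored or logged.
	_ = json.NewEncoder(w).Encode(errorEnvelope{Error: apiError{Code: code, Message: message}})
}

// render produces the status and body for a code using an in-memory recorder.
func render(code, message string) (int, string) {
	rec := httptest.NewRecorder()
	respondWithError(rec, statusFor(code), code, message)
	return rec.Code, rec.Body.String()
}

func main() {
	status, body := render("backend_unavailable", "Backend not available")
	fmt.Print(status, " ", body)
}
```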
Error Handling Middleware¶
func ErrorHandler(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			if rec := recover(); rec != nil {
				log.Printf("Panic: %v\n%s", rec, debug.Stack())
				respondWithError(w, http.StatusInternalServerError, "internal_error", "Internal server error")
			}
		}()
		next.ServeHTTP(w, r)
	})
}
Versioning Approach¶
URL Versioning¶
API versions are included in the URL path:
- /api/v1/* - Native API v1
- /openai/v1/* - OpenAI-compatible v1
- /anthropic/v1/* - Anthropic-compatible v1
Version Negotiation¶
Future versions may support header-based negotiation:
Deprecation Strategy¶
- Announcement: 6 months notice before deprecation
- Sunset Header: Include Sunset header in responses
- Grace Period: Support old version for 3 months after new version release
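The Sunset header (defined in RFC 8594) carries an HTTP-date after which the resource is expected to become unavailable. A minimal middleware sketch for attaching it to every response of a deprecated version follows. `withSunset` and `sunsetHeaderFor` are hypothetical names, and the date used is purely illustrative, not an announced clinvoker deprecation.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// withSunset adds a Sunset header to every response served by next.
// The value is formatted once as an HTTP-date in GMT.
func withSunset(sunset time.Time, next http.Handler) http.Handler {
	value := sunset.UTC().Format(http.TimeFormat)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Sunset", value)
		next.ServeHTTP(w, r)
	})
}

// sunsetHeaderFor runs one in-memory request through withSunset and
// returns the header value the client would see.
func sunsetHeaderFor(year int, month time.Month, day int) string {
	sunset := time.Date(year, month, day, 0, 0, 0, 0, time.UTC)
	h := withSunset(sunset, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/api/v1/health", nil))
	return rec.Header().Get("Sunset")
}

func main() {
	fmt.Println(sunsetHeaderFor(2026, time.January, 31))
}
```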
Request/Response Transformation¶
Unified Options Mapping¶
flowchart TB
subgraph Input["Input Request"]
OPENAI_REQ[OpenAI Format]
ANTH_REQ[Anthropic Format]
NATIVE_REQ[Native Format]
end
subgraph Transform["Transformation Layer"]
MAP[Options Mapper]
end
subgraph Internal["Internal Format"]
UNIFIED[UnifiedOptions]
end
OPENAI_REQ --> MAP
ANTH_REQ --> MAP
NATIVE_REQ --> MAP
MAP --> UNIFIED
Streaming Transformation¶
For streaming responses, data is transformed chunk by chunk:
func (h *OpenAIHandler) streamResponse(ctx context.Context, input *ChatCompletionRequest, w http.ResponseWriter) {
	// Check for flushing support before any SSE headers are set.
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "Streaming not supported", http.StatusInternalServerError)
		return
	}

	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")
	w.Header().Set("Connection", "keep-alive")

	for chunk := range h.executor.Stream(ctx, input) {
		openaiChunk := transformToOpenAI(chunk)
		data, err := json.Marshal(openaiChunk)
		if err != nil {
			// Headers are already sent, so the stream can only be aborted.
			return
		}
		fmt.Fprintf(w, "data: %s\n\n", data)
		flusher.Flush()
	}

	// Signal the end of the stream to OpenAI-compatible clients.
	fmt.Fprint(w, "data: [DONE]\n\n")
	flusher.Flush()
}
Related Documentation¶
- Architecture Overview - High-level system architecture
- Backend System - Backend abstraction layer
- Session System - Session persistence mechanisms
- Reference: REST API - Complete API reference