82 results for “topic:ai-ops”
AI-powered SRE platform for automated incident investigation
HoloInsight is a cloud-native observability platform with a special focus on real-time log analysis and AI integration.
GMSSH: Desktop-Grade AI-Driven Operations Terminal High Performance · Non-Intrusive · AI-Powered;GMSSH 桌面级 AI 运维终端.高性能·AI 智驱
autonomous systems engineering cli agent for any cloud environment: AWS, GCP, Cloudflare, etc
🛡️Decision infrastructure for AI agents. Intercept actions, enforce guard policies, require approvals, and produce audit-ready decision trails.
🏰 Real-time dashboard for monitoring and managing Clawdbot AI agents
InsightSolver: Colab notebooks for exploring and solving operational issues using deep learning, machine learning, and related models.
An autonomous SRE agent that monitors cloud logs across multiple platforms, leveraging AI models from various providers to detect anomalies, perform root cause analysis, and automate remediation by creating GitHub Pull Requests.
Lightweight server management & intelligent ops platform with Docker one-click deployment, batch script execution, web terminal, and AI-powered operations.
Founder-first meta-layer to generate production-ready OpenClaw multi-agent setups from one prompt: agents, skills, workflows, memory, and install docs.
AI that ships your code. Deploy to any cloud with plain English.
ARF is an agentic reliability intelligence platform that separates decision intelligence (OSS) from governed execution (Enterprise), enabling autonomous operations with deterministic safety guarantees.
SDK to track cost-per-outcome for AI workflows
🐝 引巢 · ZyHive | AI 团队操作系统 — 为 AI 成员注入灵魂,可视化管理多智能体团队。Self-hosted AI Team OS · Go + Vue 3 · One-click deploy
Advanced, end-to-end, enterprise-grade agentic AI pipeline that automates competitor ad intelligence, performs multimodal creative strategy extraction, enables brand-safe adaptation, and generates AI video ads using LLM reasoning, multimodal analysis, and deterministic workflow orchestration with full auditability.
MCP Kill-switch for AI agents. Validates infrastructure operations before execution. Fail-closed. Evidence-backed.
Open source code for AIOpsServing
Hypothesis-driven AI agent for incident investigation. AWS, K8s, PagerDuty.
n8n workflows for Kubernetes monitoring, diagnostics, and automated remediation
AI-powered open-source monitoring platform with auto-remediation. 6 built-in runbooks, MCP integration (global first), DeepSeek root cause analysis. 5-minute Docker setup.
Deploy an AI-powered chatbot for your repo in under 2 minutes.
Real-world patterns for shipping AI agents to production. Learn versioning, cost optimization, multi-tenancy, guardrails, and observability through runnable TypeScript examples.
tpu-doc is a zero-dependency diagnostic binary for Google Cloud TPU environments that instantly validates hardware health, discovers software stack configurations, and provides AI-powered log analysis to eliminate expensive debugging downtime.
Deterministic first. Fetching, filtering, and routing are plain code. AI is only used for judgment calls: classification, drafting, summarization.
🤖 Build and deploy scalable Multi-AI Agent systems with LangGraph and Groq LLMs to enhance intelligence across enterprise applications.
🚀 Enhance Google Cloud operations with the Gemini SRE Agent, automating log monitoring and incident response for smarter site reliability.
Automated Runbook Engine for Incident Response, CLI, command line documentation, and AI-assisted Operations
The AI Operations Control Plane — monitor, govern, and automate your AI agents
The hundred-eyed watcher for your LLM providers. Monitor uptime, TTFT, TPS, and latency across OpenAI, Anthropic, Azure, Bedrock, Ollama, LM Studio, and 100+ providers through a single dashboard. Benchmark, compare, and get alerts — all self-hosted.
Production-ready MLOps platform for monitoring and evaluating LLM response quality with automated alerts and real-time analytics