27 results for “topic:llm-comparison”
This project collects GPU benchmarks from various cloud providers and compares them to fixed per-token costs. Use the tool to select GPUs for LLM workloads and to find cost-effective AI models. Covers LLM provider price comparison, GPU-benchmark-to-price-per-token calculation, and a GPU benchmark table.
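ParallelChat is an open-source desktop app for parallel conversations with multiple AIs. It lets you use mainstream large models such as ChatGPT, Kimi, Qwen, DeepSeek, GLM, Doubao, Yuanbao, and Grok at the same time in a single interface, with no API key required and completely free.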
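All-in-one learning hub platform based on LLM comparison research | Proposes a platform that helps students choose and use the model best suited to their purpose, based on an accuracy analysis of GPT-4o, Gemini 2.0 Flash, and Claude 4.5 Sonnet.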
This is an open-source project that lets you compare two LLMs head-to-head on a given prompt. This section covers the project's backend, which allows LLM APIs to be incorporated and used in the front end.
Complete guide and pricing comparison for using alternative AI models with Claude Code - including DeepSeek, Qwen, Kimi K2, MiniMax, and GLM 4.6
OpenRouter model information
MindTrial: Evaluate and compare AI language models (LLMs) on text-based tasks with optional file/image attachments and tool use. Supports multiple providers (OpenAI, Google, Anthropic, DeepSeek, Mistral AI, xAI, Alibaba, Moonshot AI, OpenRouter), custom tasks in YAML, and HTML/CSV reports.
Comparison of small open-source LLMs (8B parameters or fewer)
Open-source multimodal AI comparison platform to send one prompt and compare responses from multiple AI models side-by-side in real time.
A fight club for LLMs. 🤫 Live demo → https://chatwar.ai
Specification testing for structured LLM responses.
Comprehensive benchmark of OpenRouter free-tier LLMs for practical applications. Evaluates models for coding, Thai language, and general use.
Privacy-first local AI model comparison platform with blind evaluation, per-model hyperparameters, and multi-configuration testing. Compare 2-6 models side-by-side through Ollama with zero cloud dependencies.
A full-stack web application for comparing and analyzing the performance of large language models (LLMs). Features include side-by-side prompt evaluation, performance metrics visualization, and an analytics dashboard. Built with React, Tailwind CSS, Node.js, and MongoDB.
Compare Claude and OpenAI responses side-by-side with a built-in evaluation framework. Upload PDFs, generate summaries, ask questions using RAG, and track metrics.
Claude Code skill that pits Claude, ChatGPT, and Gemini against each other, then lets them cross-judge each other blind
🧠 Benchmark Haiku 4.5 and MiniMax M2.1 on agentic tasks, revealing strengths in design thinking and operational skills for multi-turn workflows.
Systematic benchmark comparing Claude Haiku 4.5 vs MiniMax M2.1 on agentic coding tasks. Includes full audit trails, LLM-as-judge evaluation, and path divergence analysis.
This project extends the dataset analysis by grouping results by stadium. Pivot-style summaries and visualizations were created for goals, possessions, and chances, alongside 10 new LLM prompts and Python scripts for deeper stadium-level insights.
This project analyzes the first 10 rows of the Premier League 2022–23 dataset without grouping. Descriptive statistics and targeted visualizations were created, and insights were compared with responses from a large language model (LLM).
LLM-Compare-FastAPI is an open-source tool for comparing AI language models such as DeepSeek, OpenAI GPT, Google Gemini, and more, using FastAPI and Streamlit.
AI Prompt Management & Multi-Model Comparison Platform: categories/tags, version history, import/export; supports OpenAI, Claude, Gemini, and custom APIs
Agent-generated capability comparison: Claude Code vs OpenAI Codex vs Gemini (Antigravity). Each AI agent self-reported and cross-verified their own capabilities.
OCI Generative AI model comparison reference — 30+ models across 5 providers
No description provided.
Compare AI models `Claude Opus`, `Gemini`, and `GPT Codex` for web app generation, offering insights into their performance and app development process.
🤖 Explore cost-effective AI models with our guide on Claude Code compatibility, featuring pricing, setup instructions, and examples for multiple providers.