"topic:gpt-4-vision" — Search

77 results for “topic:gpt-4-vision”

Resources, examples & tutorials for multimodal AI, RAG and agents using vector search and LLMs

Jupyter Notebook · 930 · 166 · Updated 3 days ago

agents ai deep-learning embeddings fine-tuning gpt gpt-4-vision lancedb langchain llama-index llms machine-learning multimodal multimodal-ai openai rag vector-database

Skythinker616/gpt-assistant-android

[New: PDF and Office file parsing and upload] An all-purpose GPT assistant for Android, activated via the volume keys for voice interaction, with support for internet access, photo capture, templates, and parsing of PDF and Office documents.

Java · 876 · 123 · Updated 1 day ago

android assistant chatgpt free-gpt gpt gpt-4-vision markdown

TypingMind/typingmind

The most advanced Web UI for AI chat

HTML · 851 · 351 · Updated 3 hours ago

chatgpt chatgpt-ui claude claude2 gemini gemini-pro gpt-4 gpt-4-turbo gpt-4-vision typingmind webui

SkalskiP/sports

Cool experiments at the intersection of Computer Vision and Sports ⚽🏃

Jupyter Notebook · 543 · 40 · Updated 1 week ago

computer-vision deep-learning deep-neural-networks gpt-4 gpt-4-vision object-detection prompt-engineering pytorch sports-analytics tutorial yolov5 yolov7

tbckr/sgpt

SGPT is a command-line tool that provides a convenient way to interact with OpenAI models, enabling users to run queries, generate shell commands and produce code directly from the terminal.

Go · 420 · 34 · Updated 4 hours ago

anthropic anthropic-claude bash cli gemini gemini-api gemini-pro go gpt-3 gpt-4 gpt-4-vision gpt-4-vision-preview gpt-4o o1-mini o1-preview openai openrouter openrouter-api shell

WisconsinAIVision/ViP-LLaVA

[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts

Python · 336 · 21 · Updated 1 month ago

chatbot clip cvpr2024 foundation-models gpt-4 gpt-4-vision llama llama2 llava multi-modal vision-language visual-prompting

developersdigest/ai-devices

AI Device Template Featuring Whisper, TTS, Groq, Llama3, OpenAI and more

TypeScript · 295 · 41 · Updated 1 month ago

function-calling gpt-4-vision groq langchain langsmith llama3 llava llm openai serper tts whisper

vdutts7/gpt4V-scraper

AI agent that can SEE 👁️, control, navigate, & do stuff for you on your browser.

JavaScript · 294 · 28 · Updated 1 week ago

ai-agents browser-automation gpt-4-vision puppeteer web-scraping

davidmigloz/pixels2flutter

Convert a screenshot to a working Flutter app.

Dart · 220 · 43 · Updated 2 months ago

flutter gpt-4-vision llms openai

ktutak1337/Stellar-Chat

A versatile multi-modal chat application that enables users to develop custom agents, create images, leverage visual recognition, and engage in voice interactions. It integrates seamlessly with local LLMs and commercial models like OpenAI, Gemini, Perplexity, and Claude, and lets users converse with uploaded documents and websites.

C# · 141 · 10 · Updated 1 month ago

agents ai blazor chat chatpgt claude-3 csharp dall-e dotnet generative-ai gpt gpt-4-vision llm llma2 ollama ollama-api openai rag stable-diffusion text-to-image

nateraw/openai-vision-api-for-videos

Extract information, summarize, ask questions, and search videos using OpenAI's Vision API 🚀🎦

Jupyter Notebook · 62 · 9 · Updated 3 months ago

chatgpt colab-notebook gpt-4 gpt-4-vision machine-learning openai python

Anil-matcha/GPT-4-Vision-Chatbot

GPT-4 Vision Chatbot examples

Jupyter Notebook · 58 · 15 · Updated 5 months ago

gpt-4 gpt-4-turbo gpt-4-vision gpt-4-vision-preview

signebedi/gptty

ChatGPT wrapper in your TTY

Python · 53 · 7 · Updated 5 months ago

chatbot chatgpt chatgpt-api chatroom click gpt-3 gpt-35-turbo gpt-4 gpt-4-turbo gpt-4-vision openai openai-api package python query shell tty

GianfrancoCorrea/gpt-4-vision-chat

GPT 4 Turbo Vision with Chainlit

Python · 32 · 6 · Updated 1 year ago

chainlit gpt-4 gpt-4-turbo gpt-4-vision

Badim41/network_tools

API | GPT-5, GLM-4.5, VEO-3, Kling, gpt-4o, Claude 4 opus, command a, Recraft v3, Dalle-3, Stable Diffusion, Flux, Kandinsky, Suno V4.5, Hailuo, TTS

Jupyter Notebook · 25 · 6 · Updated 2 weeks ago

api chatgpt-4o chatgpt-api claude-ai dalle-3 flux gpt-4-api gpt-4-vision gpt-4o gpt-5 hailuoai kandinsky python recraftv3 riffusion stable-diffusion suno-ai suno-ai-api tts-api veo3

supershaneski/chatgpt-with-image-sample

This sample project integrates OpenAI's GPT-4 Vision, with advanced image recognition capabilities, and DALL·E 3, the state-of-the-art image generation model, with the Chat completions API. This powerful combination allows for simultaneous image creation and analysis.

JavaScript · 24 · 11 · Updated 1 year ago

chatbot chatgpt chatgpt-image dall-e-3 function-calling gpt-4-vision gpt-4-vision-preview image-analysis nextjs openai openai-api openai-chatgpt reactjs

neka-nat/mylangrobot

Language instructions to mycobot using GPT-4V

Python · 24 · 0 · Updated 9 months ago

chatgpt gpt-4-vision gpt-4-vision-preview gpt4v mycobot segment-anything whisper

jeremy-collins/gpt4v-screenshot-analyzer

This tool offers an interactive way to analyze and understand your screenshots using OpenAI's GPT-4 Vision API. Capture any part of your screen and engage in a dialogue with ChatGPT to uncover detailed insights, ask follow-up questions, and explore visual data in a user-friendly format.

Python · 23 · 6 · Updated 1 month ago

ai chatbot chatgpt computer-vision gpt-4 gpt-4-vision screenshot

philfung/awesome-computer-use

Curated resources about automated GUI computer-use via LLMs. Highly opinionated, focus is on quality vs quantity.

23 · 3 · Updated 1 month ago

anthropic anthropic-claude computer-use computer-vision gpt-4-vision gui-agents llm rpa rpa-robotic-process-automation tool-use vision

LazaUK/AOAI-GPT4Vision-Streamlit-SDKv1

Using an Azure OpenAI deployment of GPT-4 Turbo with Vision to analyse out-of-stock situations in a fictitious retail shop.

Python · 20 · 7 · Updated 2 months ago

ai azure gpt gpt-4-vision openai out-of-stock streamlit

waseemhnyc/object-detection-openai

Object detection using the OpenAI vision model

Python · 18 · 3 · Updated 1 year ago

ai gpt-4-vision gpt-4-vision-preview openai python

mickymultani/GPT-4-Vision-Architecture-Scanner

A web-based tool that utilizes GPT-4's vision capabilities to analyze and describe system architecture diagrams, providing instant insights and detailed breakdowns in an interactive chat interface.

JavaScript · 16 · 2 · Updated 11 months ago

architecture-visualization computer-vision flask flask-api flask-application gpt-4 gpt-4-turbo gpt-4-vision gpt-4-vision-preview gpt-vision llm llms openai openai-chatgpt openapi

mapluisch/GPT-4-Vision-for-HoloLens

Capture images with HoloLens and receive descriptive responses from OpenAI's GPT-4V(ision).

ShaderLab · 16 · 2 · Updated 4 months ago

gpt-4 gpt-4-vision gpt-4-vision-preview gpt4vision hololens hololens-applications hololens2 openai openai-api unity3d

Elehiggle/ChatGPTMattermostChatbot

An AI-powered Mattermost ChatGPT chatbot that utilizes the OpenAI API to provide helpful, contextual responses to user messages, extract text from links, and describe or generate images. With Docker support!

Python · 16 · 8 · Updated 2 months ago

ai chatbot chatgpt chatgpt-api dalle3 docker functioncalling gpt-4-turbo gpt-4-vision gpt-4o gpt4o imagegeneration mattermost mattermost-bot openai

reidbarber/gen-ui

Use text or image prompts to generate components and apps built with React.

TypeScript · 14 · 5 · Updated 4 months ago

assistants-api codesandbox gpt-4 gpt-4-vision openai react sandpack

scalable-dynamics/gpt-spa

A customizable GPT in a single page, using OpenAI models text-embedding-ada-002, tts-1, whisper-1, dall-e-3, and gpt-4-vision-preview

JavaScript · 14 · 3 · Updated 1 year ago

dalle-e gpt-4 gpt-4-vision openai

komzweb/nextjs-gpt4v

A simple chat app with vision using Next.js, Vercel AI SDK, and GPT-4V.