This looks pretty cool: https://www.reddit.com/r/arduino/comments/1czcoe3/vision_questioning_test_with_gpt4o_in_esp32cam/
Posted by user 0015dev:
"The performance of GPT-4o released by OpenAI is excellent. Additionally, you can now ask questions about the vision through the API. I tested encoding the captured JPEG image to BASE64 and sending a message directly using ESP32-CAM"
ChatGPT Client For Arduino library: https://github.com/0015/ChatGPT_Client_For_Arduino (snapshot here: ChatGPT Client for Arduino)
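
For anyone curious what "encoding the captured JPEG image to BASE64 and sending a message" might look like, here is a rough sketch in plain C++. It is not the API of the library linked above. The function names (base64Encode, jsonEscape, buildVisionRequest) and the max_tokens default are made up for illustration, and the JSON shape is my understanding of the OpenAI Chat Completions format for image input, so check it against the official docs before relying on it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Standard Base64 (RFC 4648 alphabet) with '=' padding.
std::string base64Encode(const uint8_t* data, size_t len) {
    static const char kAlphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve(4 * ((len + 2) / 3));
    for (size_t i = 0; i < len; i += 3) {
        // Pack up to three input bytes into one 24-bit group.
        uint32_t group = static_cast<uint32_t>(data[i]) << 16;
        if (i + 1 < len) group |= static_cast<uint32_t>(data[i + 1]) << 8;
        if (i + 2 < len) group |= data[i + 2];
        // Emit four 6-bit symbols, padding where input bytes were missing.
        out += kAlphabet[(group >> 18) & 0x3F];
        out += kAlphabet[(group >> 12) & 0x3F];
        out += (i + 1 < len) ? kAlphabet[(group >> 6) & 0x3F] : '=';
        out += (i + 2 < len) ? kAlphabet[group & 0x3F] : '=';
    }
    assert(out.size() == 4 * ((len + 2) / 3));
    return out;
}

// Minimal JSON string escaping (hypothetical helper). It covers quotes,
// backslashes and common whitespace escapes only, not every control character.
std::string jsonEscape(const std::string& s) {
    std::string out;
    for (char c : s) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:   out += c;
        }
    }
    return out;
}

// Builds a request body with one text part and one image part, where the
// image is passed as a data URL. Assumed shape, not taken from the post.
std::string buildVisionRequest(const std::string& question,
                               const uint8_t* jpeg, size_t jpegLen,
                               int maxTokens = 300) {
    std::string body = "{\"model\":\"gpt-4o\",\"max_tokens\":";
    body += std::to_string(maxTokens);
    body += ",\"messages\":[{\"role\":\"user\",\"content\":[";
    body += "{\"type\":\"text\",\"text\":\"" + jsonEscape(question) + "\"},";
    body += "{\"type\":\"image_url\",\"image_url\":{\"url\":\"data:image/jpeg;base64,";
    body += base64Encode(jpeg, jpegLen);
    body += "\"}}]}]}";
    return body;
}
```

On an ESP32-CAM the JPEG bytes would presumably come from the camera frame buffer (the buf and len fields of the frame returned by esp_camera_fb_get()), and the resulting string would be POSTed over HTTPS to https://api.openai.com/v1/chat/completions with an "Authorization: Bearer <API key>" header. Base64 inflates the image by about a third, so a small frame size is probably a good idea given the limited RAM on the board.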