Blog

11. December 2023

New Feature: Added Vision and Voice Capabilities

I added the voice 🎤 and vision 👀 integration to the ChatGPT Toolkit. You can now speak 🗣 and add screenshots to your chat 📷

chatgpt-toolkit new-feature

Here is the new look of the chat window of the ChatGPT Toolkit:

New Chat Window. You can see the 🗣 and 📸 emojis in the bottom.

Voice Input (Whisper)

Voice is using Whisper via the API. You can just type cmd + . and the recording will start. By hitting enter your recording will be transcribed and send to ChatGPT. Here is an example in combination with the createEvent function:

ChatGPT Voice Input

So far there is no Voice output implemented, yet. Maybe this will come later.

Vision Input 👀

Vision allows you to take a screenshot and feed it to ChatGPT. You can access it via cmd + /. Keep in mind, that image inputs are only available with the GPT-4-Vision-Preview model. So you can't switch between the models in a chat with an image.

ChatGPT Vision Input

Currently it's possible to only use screenshots, but I will add the option to select images from files, too.

Go and play around! If something breaks, let me know ✌

Blog

New Feature: Added Vision and Voice Capabilities

Voice Input (Whisper)

Vision Input 👀

Read more..

Text Replacements on Mac & iOS

Added Anthropic Support & Minor Changes

Blog

New Feature: Added Vision and Voice Capabilities

Voice Input (Whisper)

Vision Input 👀

Read more..

Text Replacements on Mac & iOS

Added Anthropic Support & Minor Changes

Keyboard-First Tools and Tips 🚀