Álvaro Ramírez
chatgpt-shell goes multi-model
Over the last few months, I've been chipping away at implementing chatgpt-shell's most requested and biggest feature: multi-model support. Today, I get to unveil the first two implementations: Anthropic's Claude and Google's Gemini.
Changing course
In the past, I envisioned a different path for multi-model support. By isolating shell logic into a new package (shell-maker), folks could use it as a building block to create new shells (adding support for their favourite LLM).
While each shell-maker-based shell currently shares a basic common experience, I did not foresee how the minor differences would affect the general Emacs user experience. Learning the quirks of each new shell felt like unnecessary friction in developing muscle memory. I also became dependent on chatgpt-shell features, which I often missed when using other shells.
Along with slightly different shell experiences, we currently require multiple package installations (and setups). Depending on which camp you're in (batteries included vs fine-grained control), this may or may not be a downside.
With every new chatgpt-shell feature I showcased, I was often asked if it could be used with other LLM providers. I typically answered with "I've been meaning to work on this…" or "I heard you can do multi-model chatgpt-shell using a bridge like liteLLM". Neither of these were great answers, and they resulted in me just postponing the chunky work.
Eventually, I bit the bullet, changed course, and got to work on multi-model support. Given my initial plan to spin up multiple shells via shell-maker, chatgpt-shell's implementation didn't exactly lend itself to supporting multiple clients. Long story short, chatgpt-shell multi-model support required quite a bit of work. This is where I divert to ask you to help make this project sustainable by sponsoring the work.
Make this project sustainable
Maintaining, experimenting, implementing feature requests, and supporting open-source packages takes work. Today, chatgpt-shell has over 20.5K downloads on MELPA and many untracked others elsewhere. If you're one of the happy users, consider sponsoring the project. If you see potential, help fuel development by sponsoring too.
Perhaps you enjoy some of the content I write about? Find my Emacs posts/tips useful?
- Blog (xenodium.com) (Web)
- Blog (lmno.lol/alvaro) (Web)
Or perhaps you want a blogging platform that skips the yucky side effects of the modern web?
Maybe you enjoy one of my other projects?
- Plain Org (org mode / iOS)
- Flat Habits (org mode / iOS)
- Scratch (org mode / iOS)
- macosrec (macOS)
- Fresh Eyes (macOS)
- dwim-shell-command (Emacs)
- company-org-block (Emacs)
- org-block-capf (Emacs)
- ob-swiftui (Emacs)
- chatgpt-shell (Emacs)
- ready-player (Emacs)
- sqlite-mode-extras (Emacs)
- ob-chatgpt-shell (Emacs)
- dall-e-shell (Emacs)
- ob-dall-e-shell (Emacs)
- shell-maker (Emacs)
So, umm… I'll just leave my GitHub sponsor page here.
chatgpt-shell, more than a shell
With chatgpt-shell being a comint shell, you can bring your favourite Emacs flows along.
As I used chatgpt-shell myself, I kept experimenting with different integrations and improvements. Read on for some of my favourites…
A shell hybrid
chatgpt-shell includes a compose buffer experience. This is my favourite and most frequently used mechanism to interact with LLMs.
For example, select a region and invoke M-x chatgpt-shell-prompt-compose (C-c C-e is my preferred binding), and an editable buffer automatically copies the region and enables crafting a more thorough query. When ready, submit with the familiar C-c C-c binding. The buffer automatically becomes read-only and enables single-character bindings.
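As a sketch, wiring up the compose command in your init file might look like the following (assuming chatgpt-shell is already installed; C-c C-e is just the binding mentioned above, so pick whatever suits you):

```elisp
;; Bind the compose buffer command globally.
;; C-c C-e is one choice of binding; any free key works.
(global-set-key (kbd "C-c C-e") #'chatgpt-shell-prompt-compose)
```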
Navigation: n/p (or TAB/shift-TAB)
Navigate through source blocks (including previous submissions in history). Source blocks are automatically selected.
Reply: r
Reply with follow-up requests using the r binding.
Give me more: m
Want to ask for more of the same data? Press m to request more of it. This is handy to follow up on any kind of list (suggestions, candidates, results, etc.).
Request entire snippets: e
LLM being lazy and returning partial code? Press e to request the entire snippet.
Quick quit: q
I'm a big fan of quickly disposing of Emacs buffers with the q binding. chatgpt-shell compose buffers are no exception.
Confirm inline mods (via diffs)
Request inline modifications, with explicit confirmation before accepting.
Execute snippets (a la org babel)
Both the shell and the compose buffers enable users to execute source blocks via C-c C-c, leveraging org babel.
Vision experiments
I've been experimenting with image queries (currently ChatGPT only; please sponsor to help bring support for others).
Below is a handy integration to extract Japanese vocabulary. There's also a generic image descriptor available via M-x chatgpt-shell-describe-image
that works on any Emacs image (via dired, image buffer, point on image, or selecting a desktop region).
Supporting new models
Your favourite model not yet supported? File a feature request. You also know how to fuel the project. Want to contribute new models? Reach out.
Local models
While the two new implementations rely on cloud APIs, local services are now possible. I've yet to use a local LLM, but please reach out so we can make these happen too. Want to contribute?
Should chatgpt-shell rename?
With chatgpt-shell going multi-model, it's not unreasonable to ask if this package should be renamed. Maybe it should. But that's additional work we can likely postpone for the time being (and avoid pushing users to migrate). For now, I'd prefer to focus on polishing the multi-model experience and ironing out any issues. For that, I'll need your help.
Take Gemini and Claude for a spin
Multi-model support required chunky structural changes. While I've been using it myself, I'll need wider usage to uncover issues. Please take it for a spin and file bugs or give feedback. Or if you just want to ping me, I'd love to hear about your experience (Mastodon / Twitter / Reddit / Email).
- Be sure to update to chatgpt-shell v2.0.1 and shell-maker v0.68.1 as a minimum.
- Set chatgpt-shell-anthropic-key or chatgpt-shell-google-key.
- Swap models with the existing M-x chatgpt-shell-swap-model-version or set a default with (setq chatgpt-shell-model-version "claude-3-5-sonnet-20240620") or (setq chatgpt-shell-model-version "gemini-1.5-pro-latest").
- Everything else should just work 🤞😅
Happy Emacsing!