🐠Butterfish Shell

A shell with AI superpowers

What is this thing?

Butterfish is for people who work from the command line, it adds AI prompting to your shell (bash, zsh) with OpenAI (or compatible APIs). Think Github Copilot for shell.

Here’s how it works: use your shell as normal, start a command with a capital letter to prompt the AI. The AI sees the shell history, so you can ask contextual questions like “Why did that command fail?”.

This is a magical UX pattern – you get high-context AI help exactly when you want it, NO COPY/PASTING.

What can you do with Butterfish Shell?

Once you run butterfish shell you can do the following things from the command line:

“Give me a command to do x”
“Why did that command fail?”
“!Run make in this directory, debug problems” (this acts as an agent)
Autocomplete shell commands (if the AI ‘verbally’ suggested a command it will appear)
“Give me a pasta recipe” (this is a ChatGPT interface so it’s not just for shell stuff!)

Feedback and external contribution is very welcome! Butterfish is open source under the MIT license. We hope that you find it useful!

Installation & Authentication

Butterfish works on MacOS and Linux. You can install via Homebrew on MacOS:

brew install bakks/bakks/butterfish
butterfish shell
Is this thing working? # Type this literally into the CLI

You can also install with go install:

go install github.com/bakks/butterfish/cmd/butterfish@latest
$(go env GOPATH)/bin/butterfish shell
Is this thing working? # Type this literally into the CLI

The first invocation will prompt you to paste in an OpenAI API secret key. You can get an OpenAI key at https://platform.openai.com/account/api-keys.

The key will be written to ~/.config/butterfish/butterfish.env, which looks like:

OPENAI_TOKEN=sk-foobar

It may also be useful to alias the butterfish command to something shorter. If you add the following line to your ~/.zshrc or ~/.bashrc file then you can run it with only bf.

alias bf="butterfish"

Features

Integrates well bash and zsh on MacOS and Linux

When you run butterfish shell, it starts a new instance of your shell, which is probably bash or zsh. zsh is the default on MacOS. If you’ve customized your shell it shouldn’t interfere with that, it intercepts shell io to do useful things!

Start a prompt with a Capital Letter

Within Butterfish Shell you can send a ChatGPT prompt by just starting a command with a capital letter, for example:

$ Summarize the file I just printed

Butterfish Shell is intercepting this and then sending the prompt to ChatGPT.

Manages your shell and prompting history

One of the reasons ChatGPT is so useful is you can carry on a conversation. If the last answer wasn’t good, you can ask to tweak it.

Butterfish Shell gives you the same capability, but also injects your shell history into the chat. Example:

$ find *.go
zsh: no matches found: *.go
$ Why didnt that work?
It looks like....   # this is the ChatGPT output
$ Ok, give me a command that will work.
find . -name "*.go"

So when you talk with ChatGPT, the past questions/answers and the shell output itself is included in the context.

GPT autosuggest

This is like Github Copilot, but in your terminal shell. Butterfish Shell will autosuggest commands which you can apply with Tab.

Like prompting, autosuggest context includes your recent history, so if ChatGPT suggested a command to you, it will likely autosuggest that!

Customizable prompts

Most layers on top of ChatGPT add some language around what you type in to help guide the model to give you the right thing. Often you have no idea what’s being added.

A goal here is to give you control over that language. To that end, the prompt wrappers are visible and editable: they’re kept in ~/.config/butterfish/prompts.yaml.

Select your own model

Butterfish shell defaults to the gpt-4-turbo model, but you can configure it for other models like:

$ butterfish shell -m gpt-3.5-turbo

Goal mode

Butterfish Shell has a feature called Goal Mode, which allows an agent to execute commands on its own to reach a goal. It will give you commands which you execute, and then the results are passed back to ChatGPT. Start a command with ! to engage this mode. Start a command with !! to let it execute commands without confirmation.

!Run pip install in this directory and debug any problems

Goal Mode is pretty hit or miss. Good luck!

Transparent Prompts

Many AI-enabled products obscure the prompt (instructional text) sent to the AI model, Butterfish makes it transparent and configurable.

To see the raw AI requests / responses you can run Butterfish in verbose mode (butterfish shell -v) and watch the log file (/var/tmp/butterfish.log on MacOS). For more verbosity, use -vv.

To configure the prompts you can edit ~/.config/butterfish/prompts.yaml.

The verbose output of Butterfish Shell showing raw AI prompts

Other Model Providers

Butterfish uses OpenAI models by default, but you can instead point it to any server with a OpenAI compatible API with the --base-url (-u) flag. For example, a local model:

butterfish shell -u "http://localhost:5000/v1"

Other AI Tools

Shell Mode is the primary focus of Butterfish but it also includes more specific command line utilities for prompting, generating commands, summarizing text, and managing embeddings of local files.

Goal Mode

If you’re in Shell Mode you can start an agent to accomplish a goal by triggering Goal Mode. Start a command with !, as in !Fix that bug. Goal Mode will populate a command in your shell, which you can execute with Enter, or you can edit the command, or give feedback to the agent by doing a shell prompt (by starting a command with a capital letter). Goal Mode will exit if it decides the goal is met or impossible, or you can manually exit with Ctrl-C.

You can trigger Unsafe Goal Mode by starting a command with !!, which will execute commands without confirmation, and is thus potentially dangerous.

Butterfish Goal Mode trying multiple strategies to accomplish a goal.

Goal Mode Examples

How well does this work? Mileage will vary. Your success rate will be higher with simpler goals and more guidance about how to accomplish them.

The advantages of this feature are that the agent can see your shell history and so it has context of what you’re doing manually and can take over. If a command fails the agent will tweak it and try again.

Some disadvantages are that the agent is biased towards specific versions of commands and may have to experiment to get it right, for example the flags for grep on MacOS are different than on most Linux implementations. The agent isn’t very effective at manipulating large text files like code files, so you will want to be conscious of the context it needs to be successful.

Here are some goals that work well:

!Recursively list the golang files in this directory
!Find the hidden files in this directory and ask me if I want to delete them. This will generally print some things and then wait for user input (provided by prompting starting with a capital letter).
!Show me what process is using the most memory

Here are some goals that work sometimes:

!Run make in this dir, debug problems
!Install python dependencies for this project
!Create a list of the top 3 hacker news headlines, including a link. Use the pup command to parse them out of HTML

Neovim Plugin

butterfish.nvim is a Neovim plugin that enables fluent LLM prompting from within Neovim, for example to rewrite a block of code with specific instructions.

How does Butterfish Shell Work?

Architecture

When you run butterfish shell in your terminal it starts an instance of your shell (e.g. /bin/zsh), then intercepts the shell’s input and output. This is why we call it a “shell wrapper”.
Keyboard input and shell output are buffered in Butterfish’s in-memory history.
Most input is forwarded directly to the shell, but when you start a prompt with a Capital Letter, that input is kept in Butterfish itself.
When you submit a prompt Butterfish will call the OpenAI ChatGPT API, and stream the results back to the terminal (not to the shell).
Autosuggest works like prompting: commands are suggested based on the shell history and what you’re typing, and Butterfish will intercept Tab to apply the suggestion.
What about when you run a child process like vim or ssh? Butterfish watches for child processes and avoids interfering when you run an interactive process.

API Requests

The OpenAI ChatGPT API expects you to submit a “history” of the conversation up to the current prompt. There are 3 kinds of messages in a history:

System Message: This is a set of instructions that tells the AI how to behave. For Butterfish Shell this is something like You an an assistant that helps the user with a Unix shell. You can customize this to whatever you want by editing ~/.config/butterfish/prompts.yaml.
User Messages: These are messages that represent the user’s input to the history. In Butterfish Shell this includes the commands typed into the shell, the shell’s output (i.e. the command output), and also the user’s prompts (e.g. asking a question).
Assistant Messages: These are the AI’s past output. If you asked a question and the AI gave you a previous response, this will appear here. That’s how the assistant knows what to do when you give a prompt like “I don’t like that answer, give me another”.

When Butterfish Shell sends a request to ChatGPT it will use its in-memory buffer to construct a history for the ChatGPT API.

The API has an important constraint: you can only send it so much data at once! This is the number of “tokens” for a model, GPT-3.5 allows an input/output of up to 4096 tokens at once. A token is a GPT-specific way of splitting text, I think of a token as roughly 1 syllable in a word.

Butterfish Shell will fit as much history into an API request as it can. It roughly follows these rules:

Reserve 512 tokens for the answer. The input and output must fit within the model’s token window.
Add items to the API request history until the rest of the tokens are consumed. This includes previous shell output, shell input, and human prompts.
Only use up to 512 tokens for a single history line item. For example, if you printed a huge file to the shell, this will be truncated to 512 tokens.

So remember that the history isn’t infinite, it will include as much recent history as possible, old stuff will eventually be outside of the window sent in a request, and very long command outputs will be truncated.

Butterfish Shell <> System Shell interface

In general Butterfish tries to not interfere with your normal shell operation. An exception to this rule is that Butterfish will edit your shell prompt by default. We do this because:

We add the 🐠 emoji to your prompt as a signal that you’re using Butterfish.
We add the previous command’s status code to the prompt so that Butterfish knows if it succeeded or not. A 0 means a command was successful, non-0 means it failed.
We add the following special characters to the prompt (\033Q, \033R) so that Butterfish can identify a prompt output. These are privately reserved ANSI terminal escape codes that are pretty uncommon, and so generally won’t cause any issues with your workflow.

Command Reference

butterfish shell –help

> butterfish shell --help
Usage: butterfish shell [flags]

Start the Butterfish shell wrapper. This wraps your existing shell, giving you
access to LLM prompting by starting your command with a capital letter. LLM calls
include prior shell context. This is great for keeping a chat-like terminal open,
sending written prompts, debugging commands, and iterating on past actions.

Use:
  - Type a normal command, like 'ls -l' and press enter to execute it
  - Start a command with a capital letter to send it to GPT, like 'How do I
    recursively find local .py files?'
  - Autosuggest will print command completions, press tab to fill them in
  - GPT will be able to see your shell history, so you can ask contextual
    questions like 'why didnt my last command work?'
  - Start a command with ! to enter Goal Mode, in which GPT will act as an Agent
    attempting to accomplish your goal by executing commands, for example '!Run
    make in this directory and debug any problems'.
  - Start a command with !! to enter Unsafe Goal Mode, in which GPT will execute
    commands without confirmation. USE WITH CAUTION.

Here are special Butterfish commands:
  - Help : Give hints about usage.
  - Status : Show the current Butterfish configuration.
  - History : Print out the history that would be sent in a GPT prompt.

If you do not have OpenAI free credits then you will need a subscription and
you will need to pay for OpenAI API use. Autosuggest will probably be the most
expensive feature. You can reduce spend by disabling shell autosuggest (-A) or
increasing the autosuggest timeout (e.g. -t 2000).

Flags:
  -h, --help                       Show context-sensitive help.
  -v, --verbose                    Verbose mode, prints full LLM prompts
                                   (sometimes to log file). Use multiple times
                                   for more verbosity, e.g. -vv.
  -L, --log                        Write verbose content to a log file rather
                                   than stdout, usually /var/tmp/butterfish.log
  -V, --version                    Print version information and exit.
  -u, --base-url="https://api.openai.com/v1"
                                   Base URL for OpenAI-compatible API. Enables
                                   local models with a compatible interface.

  -b, --bin=STRING                 Shell to use (e.g. /bin/zsh), defaults to
                                   $SHELL.
  -m, --model="gpt-4-turbo"        Model for when the user manually enters a
                                   prompt.
  -z, --token-timeout=10000        Timeout before first prompt token is
                                   received and between individual tokens.
                                   In milliseconds.
  -A, --autosuggest-disabled       Disable autosuggest.
  -a, --autosuggest-model="gpt-3.5-turbo-instruct"
                                   Model for autosuggest
  -t, --autosuggest-timeout=500    Delay after typing before autosuggest (lower
                                   values trigger more calls and are more
                                   expensive). In milliseconds.
  -T, --newline-autosuggest-timeout=3500
                                   Timeout for autosuggest on a fresh line, i.e.
                                   before a command has started. Negative values
                                   disable. In milliseconds.
  -p, --no-command-prompt          Don't change command prompt (shell PS1
                                   variable). If not set, an emoji will be added
                                   to the prompt as a reminder you're in Shell
                                   Mode.
  -l, --light-color                Light color mode, appropriate for a terminal
                                   with a white(ish) background
  -H, --max-history-block-tokens=1024
                                   Maximum number of tokens of each block of
                                   history. For example, if a command has a very
                                   long output, it will be truncated to this
                                   length when sending the shell's history.
  -R, --max-response-tokens=2048
                                   Maximum number of tokens in a response when
                                   prompting.

butterfish –help

> butterfish --help
Usage: butterfish <command>

Do useful things with LLMs from the command line, with a bent towards software
engineering.

Butterfish is a command line tool for working with LLMs. It has two modes: CLI
command mode, used to prompt LLMs, summarize files, and manage embeddings, and
Shell mode: Wraps your local shell to provide easy prompting and autocomplete.

Butterfish stores an OpenAI auth token at ~/.config/butterfish/butterfish.env
and the prompt wrappers it uses at ~/.config/butterfish/prompts.yaml. Butterfish
logs to the system temp dir, usually to /var/tmp/butterfish.log.

To print the full prompts and responses from the OpenAI API, use the --verbose
flag. Support can be found at https://github.com/bakks/butterfish.

If you do not have OpenAI free credits then you will need a subscription and you
will need to pay for OpenAI API use. If you're using Shell Mode, autosuggest
will probably be the most expensive part. You can reduce spend by disabling
shell autosuggest (-A) or increasing the autosuggest timeout (e.g. -t 2000).
See "butterfish shell --help".

v0.1.7 darwin amd64 (commit 1c378f3) (built 2023-08-22T19:24:51Z) MIT License -
Copyright (c) 2023 Peter Bakkum

Flags:
  -h, --help       Show context-sensitive help.
  -v, --verbose    Verbose mode, prints full LLM prompts (sometimes to log
                   file). Use multiple times for more verbosity, e.g. -vv.
  -V, --version    Print version information and exit.

Commands:
  shell
    Start the Butterfish shell wrapper. This wraps your existing shell, giving
    you access to LLM prompting by starting your command with a capital letter.
    LLM calls include prior shell context. This is great for keeping a chat-like
    terminal open, sending written prompts, debugging commands, and iterating on
    past actions.

    Use:
      - Type a normal command, like 'ls -l' and press enter to execute it
      - Start a command with a capital letter to send it to GPT, like 'How do I
        recursively find local .py files?'
      - Autosuggest will print command completions, press tab to fill them in
      - GPT will be able to see your shell history, so you can ask contextual
        questions like 'why didnt my last command work?'
      - Start a command with ! to enter Goal Mode, in which GPT will act as
        an Agent attempting to accomplish your goal by executing commands,
        for example '!Run make in this directory and debug any problems'.
      - Start a command with !! to enter Unsafe Goal Mode, in which GPT will
        execute commands without confirmation. USE WITH CAUTION.

    Here are special Butterfish commands:
      - Help : Give hints about usage.
      - Status : Show the current Butterfish configuration.
      - History : Print out the history that would be sent in a GPT prompt.

    If you do not have OpenAI free credits then you will need a subscription and
    you will need to pay for OpenAI API use. Autosuggest will probably be the
    most expensive feature. You can reduce spend by disabling shell autosuggest
    (-A) or increasing the autosuggest timeout (e.g. -t 2000).

  plugin
    Run a ChatGPT Plugin client that allows remote command execution on the
    local machine.

  prompt [<prompt> ...]
    Run an LLM prompt without wrapping, stream results back. This is a
    straight-through call to the LLM from the command line with a given prompt.
    This accepts piped input, if there is both piped input and a prompt then
    they will be concatenated together (prompt first). It is recommended that
    you wrap the prompt with quotes. The default GPT model is gpt-3.5-turbo.

  summarize [<files> ...]
    Semantically summarize a list of files (or piped input). We read in the
    file, if it is short then we hand it directly to the LLM and ask for a
    summary. If it is longer then we break it into chunks and ask for a list of
    facts from each chunk (max 8 chunks), then concatenate facts and ask GPT for
    an overall summary.

  gencmd <prompt> ...
    Generate a shell command from a prompt, i.e. pass in what you want, a shell
    command will be generated. Accepts piped input. You can use the -f command
    to execute it sight-unseen.

  exec [<command> ...]
    Execute a command and try to debug problems. The command can either passed
    in or in the command register (if you have run gencmd in Console Mode).

  index [<paths> ...]
    Recursively index the current directory using embeddings. This will
    read each file, split it into chunks, embed the chunks, and write a
    .butterfish_index file to each directory caching the embeddings. If you
    re-run this it will skip over previously embedded files unless you force a
    re-index. This implements an exponential backoff if you hit OpenAI API rate
    limits.

  clearindex [<paths> ...]
    Clear paths from the index, both from the in-memory index (if in Console
    Mode) and to delete .butterfish_index files. Defaults to loading from the
    current directory but allows you to pass in paths to load.

  loadindex [<paths> ...]
    Load paths into the index. This is specifically for Console Mode when you
    want to load a set of cached indexes into memory. Defaults to loading from
    the current directory but allows you to pass in paths to load.

  showindex [<paths> ...]
    Show which files are present in the loaded index. You can pass in a path but
    it defaults to the current directory.

  indexsearch <query>
    Search embedding index and return relevant file snippets. This uses the
    embedding API to embed the search string, then does a brute-force cosine
    similarity against every indexed chunk of text, returning those chunks and
    their scores.

  indexquestion <question>
    Ask a question using the embeddings index. This fetches text snippets from
    the index and passes them to the LLM to generate an answer, thus you need to
    run the index command first.

Run "butterfish <command> --help" for more information on a command.

About

Butterfish is an open source project written in Golang by Peter Bakkum under the MIT Open Source License.

Project goals:

Make the user faster and more effective using LLMs. This should feel fluent, ergonomic, natural.
Be unobtrusive, don’t break current shell workflows.
Don’t require another window or using mouse.
Be transparent about the exact prompts sent to OpenAI, make these customizable.

This is an experimental tool, I’m eager for feedback. Submit issues, pull requests, etc.

Above all, I hope you find this tool useful!

This page is based on the Command Line Interface Guidelines Hugo template under a Creative Commons Attribution Share Alike 4.0 International license.