In Part 1 and Part 2 of this series we explored how to use OpenAI’s new Threads API from Python; in this article we will look at how to access the same API from a Golang backend application.

Overview

This is the backend API server that I use for my Majordomo Code Assistant, which I have recently migrated from the legacy “completions” API.

While the full code is not available (yet), we will share enough code snippets to give a sense of how this can be done, along with ideas for future development.

Accessing the OpenAI API

As a Go wrapper around the API calls, we use Alex Baranov’s excellent go-openai library:

go get github.com/sashabaranov/go-openai

Creating a client is then as easy as calling NewClient with an appropriate API key:

type Majordomo struct {
    // The OpenAI Client
    Client *openai.Client
    // The Code Snippets CodeStore
    CodeStore preprocessors.CodeStoreHandler
    // The Model to use
    Model string
    // The configuration object to manage the Projects in the server handlers
    Config *config.Config
}

func NewMajordomo(cfg *config.Config) (*Majordomo, error) {
    var assistant = new(Majordomo)
    assistant.Client = openai.NewClient(cfg.OpenAIApiKey)
    if assistant.Client == nil {
        return nil, fmt.Errorf("error initializing OpenAI client")
    }

    // The LLM Model to use.
    if cfg.Model == "" {
        assistant.Model = DefaultModel
    } else {
        assistant.Model = cfg.Model
    }
    // other initialization stuff
    return assistant, nil
}

Aside: OpenAI API Keys
OpenAI has recently introduced the concept of Projects, alongside Organizations, and API keys can now be organization- and project-scoped.

Assistants are project-scoped, so if one is using a “legacy” user-scoped API key, the Client ought to be initialized with an appropriate project-id value (see the Python examples in Part 2 for more details).

This is not yet possible in go-openai; however, by creating a project-scoped API key (which is best practice anyway, for better isolation and access control) we achieve the same result: API calls will retrieve the Assistants created within that Project’s scope.
If you use the code here and your calls only retrieve Assistants in the Default Project, that is the reason: create an API key from within the project’s configuration settings in the OpenAI portal, and it should all “work as intended.”
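
That said, if you must keep using a user-scoped key, one possible workaround is to inject OpenAI’s documented OpenAI-Project header yourself, via a custom http.RoundTripper plugged into go-openai’s ClientConfig.HTTPClient. This is an untested sketch, not something from my Majordomo code:

// projectTransport adds the OpenAI-Project header to every outgoing request,
// scoping API calls made with a user-scoped key to a specific project.
type projectTransport struct {
    projectID string
    base      http.RoundTripper
}

func (t *projectTransport) RoundTrip(req *http.Request) (*http.Response, error) {
    // Clone the request (per the RoundTripper contract) before mutating it.
    clone := req.Clone(req.Context())
    clone.Header.Set("OpenAI-Project", t.projectID)
    return t.base.RoundTrip(clone)
}

func newProjectScopedClient(apiKey, projectID string) *openai.Client {
    cfg := openai.DefaultConfig(apiKey)
    cfg.HTTPClient = &http.Client{
        Transport: &projectTransport{projectID: projectID, base: http.DefaultTransport},
    }
    return openai.NewClientWithConfig(cfg)
}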

Creating Assistants

This is extremely simple, using the CreateAssistant API call and passing an AssistantRequest with the name, the model to use, and the “instructions” for the assistant:

type Assistants struct {
    Common       string            `yaml:"common"`
    Instructions map[string]string `yaml:"instructions"`
}


func (m *Majordomo) CreateAssistants(assistants *Assistants) error {
    ctx := context.Background()
    // existingAssistants (the set of names already present in the project,
    // built by listing the project's Assistants) is elided here.
    for name, instructions := range assistants.Instructions {
        if existingAssistants.Contains(name) {
            // Should update instead - see Note below
            continue
        }
        inst := fmt.Sprintf("%s\n%s", assistants.Common, instructions)
        a, err := m.Client.CreateAssistant(ctx, openai.AssistantRequest{
            Model:        m.Model,
            Name:         &name,
            Description:  nil,
            Instructions: &inst,
        })
        if err != nil {
            log.Err(err).Str("assistant", name).Msg("error creating assistant")
            return err
        }
        log.Debug().Str("assistant_id", a.ID).Msg("assistant created")
    }
    return nil
}

There is a lot more that can be done with Assistants, and more arguments that can be configured: see the OpenAI API docs to learn more.
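
For instance, the request can also enable tools for the assistant (as the server output further below shows, mine use the code_interpreter tool). Here is a sketch using go-openai’s AssistantTool type; double-check the field names against the version of the library you are using:

    // A fuller request: same name/instructions as above, plus the
    // code_interpreter tool enabled for this assistant.
    a, err := m.Client.CreateAssistant(ctx, openai.AssistantRequest{
        Model:        m.Model,
        Name:         &name,
        Instructions: &inst,
        Tools: []openai.AssistantTool{
            {Type: openai.AssistantToolTypeCodeInterpreter},
        },
    })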

At its heart, an “assistant” is glorified prompt engineering (at least in the basic form used here), so the “instructions” take the form of a “common” prompt (shared by all assistants) combined with more specialized instructions for each one of them:

# Copyright (c) 2023-2024 AlertAvert.com. All rights reserved.
# Author: Marco Massenzio (marco@alertavert.com)

common: |
  You are Majordomo, a coding assistant for an experienced developer and only send back the code, no explanation necessary. 
  ...  

  Finally, please make sure to use the correct file extension for the code you are sending back.

instructions:
  go_developer: |
    All the code is GoLang (or shell scripts); and will help me to build a complete system. 

    This is what the code should look like:
    '''cmd/main.go
        package main

        func main() {
        fmt.Println("this is an example")
    }
    '''
    '''pkg/server/server.go
        package server

        func server() {
        ...
    }
    '''
    ... etc.
  another: |
    More instructions here

  and_another_one: |
    You can have many assistants!!
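
Loading this YAML into the Assistants struct shown earlier takes only a few lines, assuming gopkg.in/yaml.v3; a minimal sketch (the file name is hypothetical):

// LoadAssistants reads the instructions file into the Assistants struct
// defined above; the yaml tags on the struct fields do the heavy lifting.
func LoadAssistants(path string) (*Assistants, error) {
    data, err := os.ReadFile(path)
    if err != nil {
        return nil, err
    }
    var assistants Assistants
    if err := yaml.Unmarshal(data, &assistants); err != nil {
        return nil, err
    }
    return &assistants, nil
}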

Aside: Assistants’ names are not unique
Assistants are uniquely identified by their assistant_id (something that looks like asst_adfe445d...), so one can create many assistants with the same name within the same project: while this is allowed (and may even make sense, though I fail to see how), it is likely to be utterly confusing to your users.

My solution, in my application’s design, is to assume that assistants’ names are unique; asking to “create” one that already exists means that (at most) it becomes an update request.

Your choice may be different; just be aware that OpenAI will happily create a new assistant with the same name as an existing one, even with identical instructions, just a different ID.
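
Since my server resolves assistants by name, the GetAssistantId helper used in QueryBot below can simply list the project’s assistants and return the first match. Here is a sketch of one possible implementation (pagination elided; my actual code may differ):

func (m *Majordomo) GetAssistantId(name string) (string, error) {
    // ListAssistants returns the project-scoped assistants (up to the
    // default page size; a complete implementation would paginate).
    list, err := m.Client.ListAssistants(context.Background(), nil, nil, nil, nil)
    if err != nil {
        return "", err
    }
    for _, a := range list.Assistants {
        if a.Name != nil && *a.Name == name {
            return a.ID, nil
        }
    }
    return "", fmt.Errorf("no assistant named %q", name)
}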

Asking the Assistant for Help

Similarly to what we did in Python, we will create a Thread, associate it with an Assistant, add the user’s prompt(s), create a Run associated with the Thread, and execute it until it either completes (returning the Assistant’s response) or fails (hopefully with a meaningful error message).

Obviously, while the Run is executing (and before it completes), the user can cancel it, in which case an appropriate completion status is reported.
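
Cancellation itself is a single API call; here is a minimal sketch using go-openai’s CancelRun (the wrapper method is hypothetical, not part of my server’s API):

// CancelRun asks OpenAI to stop the given Run; the polling loop shown
// below will then observe RunStatusCancelling and, eventually,
// RunStatusCancelled.
func (m *Majordomo) CancelRun(threadID, runID string) error {
    _, err := m.Client.CancelRun(context.Background(), threadID, runID)
    return err
}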

The code below shows the steps; it is a somewhat simplified version of what I have implemented, and glosses over a number of details, but should give a good sense of what’s what.

A PromptRequest is sent to our application’s API for processing:

type PromptRequest struct {
    // The assistant to use (selected by the user).
    Assistant string `json:"assistant"`
    // The Thread ID (if any) to keep track of past prompts/responses in the conversation.
    // If empty, a new conversation is started.
    ThreadId string `json:"thread_id,omitempty"`
    // The user prompt.
    Prompt string `json:"prompt"`
}
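
For context, the server (which uses Gin, as the request logs later in this post show) binds this JSON payload and hands it to QueryBot; this is a hypothetical minimal handler, not my actual server code:

func (m *Majordomo) PromptHandler(c *gin.Context) {
    var req PromptRequest
    if err := c.ShouldBindJSON(&req); err != nil {
        c.JSON(http.StatusBadRequest, gin.H{"error": err.Error()})
        return
    }
    response, err := m.QueryBot(&req)
    if err != nil {
        c.JSON(http.StatusInternalServerError, gin.H{"error": err.Error()})
        return
    }
    // Echo the thread ID back, so the client can continue the conversation.
    c.JSON(http.StatusOK, gin.H{"response": response, "thread_id": req.ThreadId})
}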

This could reference an existing Thread, or request the creation of a new one:

// CreateNewThread creates a new Thread and returns its ID.
func (m *Majordomo) CreateNewThread() string {
    t, err := m.Client.CreateThread(context.Background(), openai.ThreadRequest{})
    if err != nil {
        log.Err(err).Msg("error creating thread")
        return ""
    }
    return t.ID
}

The next segment of code is pretty long; I have removed most of the error checking and logging for clarity (I will explain what the CodeStore does further below):

// QueryBot queries the LLM with the given prompt.
func (m *Majordomo) QueryBot(prompt *PromptRequest) (string, error) {

    // Create a new conversation if the thread ID is empty.
    if prompt.ThreadId == "" {
        prompt.ThreadId = m.CreateNewThread()
    }

    // Add the user's message to the Thread.
    _, err := m.Client.CreateMessage(context.Background(), prompt.ThreadId,
        openai.MessageRequest{
            Role:    "user",
            Content: prompt.Prompt,
        })
    if err != nil {
        return "", err
    }

    // Find the assistant ID, given its name.
    assistantId, err := m.GetAssistantId(prompt.Assistant)
    if err != nil {
        return "", err
    }

    // Create a Run - the model is already set in the Assistant.
    run, err := m.Client.CreateRun(context.Background(),
        prompt.ThreadId, openai.RunRequest{
            Model:       m.Model,
            AssistantID: assistantId,
        })
    if err != nil {
        return "", fmt.Errorf("error creating run: %v", err)
    }

    done := false
    // Get the response from the model.
    for !done {
        resp, err := m.Client.RetrieveRun(context.Background(), run.ThreadID, run.ID)
        if err != nil {
            return "", fmt.Errorf("error getting run: %v", err)
        }
        switch resp.Status {
        case openai.RunStatusInProgress, openai.RunStatusQueued:
            // This should have a configurable interval, or maybe exponential backoff.
            time.Sleep(5 * time.Second)
        case openai.RunStatusCompleted:
            done = true
        case openai.RunStatusFailed:
            return "", fmt.Errorf("run failed: %v", resp.LastError.Message)
        case openai.RunStatusCancelled, openai.RunStatusCancelling, openai.RunStatusExpired:
            return "", fmt.Errorf("run cancelled or expired")
        case openai.RunStatusRequiresAction:
            return "", fmt.Errorf("action required")
        default:
            return "", fmt.Errorf("unexpected run status: %s", resp.Status)
        }
    }

    // Retrieve the most recent message in the Thread.
    messages, err := m.Client.ListMessage(context.Background(), prompt.ThreadId, nil, nil, nil, nil)
    if err != nil {
        return "", fmt.Errorf("error listing messages: %v", err)
    }
    log.Debug().
        Int("messages", len(messages.Messages)).
        Str("last_id", *messages.LastID).
        Str("first_id", *messages.FirstID).
        Msg("messages")
    // Should actually use the FirstID instead, and validate it's from `assistant`.
    botMessage := messages.Messages[0]
    // There is a lot more information in the response that can be used to 
    // further manipulate the messages, and inform the user.
    if len(botMessage.Content) != 1 {
        log.Warn().
            Int("content_len", len(botMessage.Content)).
            Msg("unexpected content length")
    }
    botSays := botMessage.Content[0].Text.Value

    // Parse the response from the model.
    parser := preprocessors.Parser{
        CodeMap: make(preprocessors.SourceCodeMap),
    }
    if err = parser.ParseBotResponse(botSays); err != nil {
        return "", err
    }
    if err = m.CodeStore.PutSourceCode(parser.CodeMap); err != nil {
        return "", err
    }
    return botSays, nil
}
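
As the inline comment notes, the fixed five-second sleep between RetrieveRun calls is crude; a capped exponential backoff would be kinder to both the API and the user. A small helper sketch (not in the current code):

// nextDelay doubles the polling interval up to a cap, so short runs get
// polled quickly while long runs don't hammer the API.
func nextDelay(current time.Duration) time.Duration {
    const maxDelay = 10 * time.Second
    if next := current * 2; next < maxDelay {
        return next
    }
    return maxDelay
}

Starting from, say, 500 milliseconds, the loop would call time.Sleep(delay) and then set delay = nextDelay(delay) on each iteration.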

The MessagesList returned by ListMessage includes a couple of cursors, *messages.LastID and *messages.FirstID, which can be used to retrieve messages. Unfortunately, the OpenAI docs are very laconic (well, I’m being generous here: they say squat about either, let alone how to use them), but some experimentation shows that, counterintuitively, FirstID is the one to use, as it refers to the “most recent” (wouldn’t that be the “latest,” though?) message returned.

Also, somewhat confusingly, that appears to (consistently) be the first message in the returned Messages slice; so beware, if you are parsing the structure, that the most recent message will be the first one returned, not the last.
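
Given that, a more defensive retrieval would look up the message by FirstID and validate its author, along these lines (a sketch against go-openai’s MessagesList and Message types):

func latestAssistantMessage(messages openai.MessagesList) (openai.Message, error) {
    for _, msg := range messages.Messages {
        if messages.FirstID != nil && msg.ID == *messages.FirstID {
            if msg.Role != "assistant" {
                return openai.Message{}, fmt.Errorf(
                    "latest message is from %q, not the assistant", msg.Role)
            }
            return msg, nil
        }
    }
    return openai.Message{}, errors.New("no messages in thread")
}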

Code Snippets

As soon as you start using GPT for code generation and general assistance, it becomes immediately obvious that continuously copying and pasting code to and from prompts is extremely tedious.

What I have done is implement a “parser” that inserts the code from the appropriate file into the prompt, then extracts the code from GPT’s response and saves it to disk, to enable editing, comparing, etc.

This is not strictly related to using the Assistants API, so I won’t show too much of the code here, but I promise a future blog post on the topic (also, I want to improve the API and add broader parsing abilities and command processing).

The TL;DR is that we parse the prompt (from the user) for code patterns and populate it with the file’s contents; and parse the response (from GPT) for the same, and save it to disk:

// Some simplified RegEx to find code snippets in prompts
const (
    FilepathPattern    = `^/?([\w.-]+/?)+$`
    CodeSnippetPattern = `'''([\w/.]+/?)\n([\s\S]+?)'''`
    PromptCodePattern  = `'''([\w/.]+/?)\n'''`
)

// SourceCodeMap is a map of file paths to their contents
type SourceCodeMap = map[string]string


// The compiled regexes are package-level variables; only promptRegex,
// compiled from PromptCodePattern, is shown here.
var promptRegex = regexp.MustCompile(PromptCodePattern)

// ParsePrompt finds all the code snippets in the prompt and extracts their paths
// from the prompt to prepare the CodeMap to be populated by a CodeStoreHandler.
func (p *Parser) ParsePrompt(prompt string) {
    matches := promptRegex.FindAllStringSubmatch(prompt, -1)
    for _, match := range matches {
        p.CodeMap[match[1]] = ""
    }
}

// FillPrompt fills in the code snippets in the prompt, given their file paths.
func (p *Parser) FillPrompt(prompt string) (string, error) {
    matches := promptRegex.FindAllStringSubmatch(prompt, -1)

    for _, match := range matches {
        filePath := match[1]
        content, found := p.CodeMap[filePath]
        if !found || content == "" {
            return "", fmt.Errorf(ErrorNoCodeSnippetsFound, filePath, "no entry in map")
        }
        // QuoteMeta escapes the dots in the file path, which would otherwise
        // act as regex wildcards.
        replacementRegex := regexp.MustCompile(
            fmt.Sprintf(`'''%s\n'''`, regexp.QuoteMeta(filePath)))
        prompt = replacementRegex.ReplaceAllLiteralString(prompt,
            fmt.Sprintf("'''%s\n%s'''", filePath, content))
    }
    return prompt, nil
}
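
The response side is the mirror image: ParseBotResponse (called from QueryBot above) extracts each path-plus-code block into the CodeMap. Here is a simplified sketch using the CodeSnippetPattern defined earlier; the real implementation also validates file paths:

var responseRegex = regexp.MustCompile(CodeSnippetPattern)

// ParseBotResponse extracts every '''path\n<code>''' block from the model's
// response and stores the code under its file path in the CodeMap.
func (p *Parser) ParseBotResponse(response string) error {
    matches := responseRegex.FindAllStringSubmatch(response, -1)
    if len(matches) == 0 {
        return errors.New("no code snippets found in bot response")
    }
    for _, match := range matches {
        p.CodeMap[match[1]] = match[2]
    }
    return nil
}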

Crucially, as we cannot trust GPT not to hallucinate or produce bad code, we keep a separate source directory (where the user’s trusted code lives) and destination directory (where the bot’s somewhat flaky code is stored):

// A CodeStoreHandler interface abstracts the storage layer for the code
// snippets via a SourceCodeMap.
type CodeStoreHandler interface {
    // GetSourceCode fills in the code snippets, given their file paths
    GetSourceCode(codemap *SourceCodeMap) error

    // PutSourceCode will store the code snippets, based on their file paths
    PutSourceCode(codemap SourceCodeMap) error
}


// FilesystemStore is a CodeStoreHandler that reads and writes code snippets from/to the filesystem
type FilesystemStore struct {
    // SourceCodeDir is the directory where the code snippets are read from
    SourceCodeDir string
    // DestCodeDir is the directory where the code snippets are saved to
    DestCodeDir string
}
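
The filesystem implementation is not shown in full here, but PutSourceCode can be sketched as follows (an assumed implementation: every path is rooted under DestCodeDir, so the bot’s output never overwrites the trusted sources):

func (s *FilesystemStore) PutSourceCode(codemap SourceCodeMap) error {
    for path, content := range codemap {
        dest := filepath.Join(s.DestCodeDir, path)
        // Create intermediate directories as needed.
        if err := os.MkdirAll(filepath.Dir(dest), 0o755); err != nil {
            return err
        }
        if err := os.WriteFile(dest, []byte(content), 0o644); err != nil {
            return err
        }
    }
    return nil
}

A production version should also reject paths containing "..", so a malformed response cannot escape the destination directory.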

Aside: GPT-4 is still not that good a coder

I’ll be honest: if you are an over-eager CEO prone to believing the AI hype, I would hold on to your developers. The code returned is mediocre at best; it usually fails to compile; and most of the time it doesn’t really do what one hopes it should (even when GPT thinks it does).

Certainly, for simple tasks (such as parsing CSV, YAML or JSON files; creating/parsing API requests/responses; etc.) the code is usually good and can be used with minimal changes; but ask it to complete more complex tasks (writing a battery of unit tests using Ginkgo/Gomega, or something with fuller logic like what we’re demonstrating here) and it fails spectacularly.

Put another way: I’m pretty sure I got GPT to write the code for the ParsePrompt and FillPrompt functions, and probably only lightly edited them to suit my coding foibles; but asking it to come up with the full CodeStore/Parser shebang would have been utterly beyond its abilities.

Sorry, dear CEO, you still need to deal with obnoxious nerds like yours truly for quite a while yet.

Putting it all together

When this is wrapped inside an API server, we can ask it to do stuff for us:

└─( make dev
Building rel. v0.5.0-gdc038be; OS/Arch: darwin/arm64 - Pkg: github.com/alertavert/gpt4-go
Majordomo Server majordomo-v0.5.0-gdc038be_darwin-arm64 built
build/bin/majordomo-v0.5.0-gdc038be_darwin-arm64 -debug -port 5005
{"level":"info","time":"2024-06-01T17:25:13-07:00","message":"Starting >>> Majordomo Server <<< Rel. v0.5.0-gdc038be >>>"}

...


└─( http :5005/assistants

HTTP/1.1 200 OK

[
...
    {
        "created_at": 1710131017,
        "id": "asst_yXpLauk...",
        "instructions": "You are an expert GoLang developer and will help me to address code issues and finding missing packages.",
        "model": "gpt-4-turbo-preview",
        "name": "go_developer",
        "object": "assistant",
        "tools": [
            {
                "type": "code_interpreter"
            }
        ]
    },
...

]

└─( http POST :5005/prompt \
    assistant=go_developer \
    prompt="Explain the code in the ParsePrompt() function\n'''pkg/preprocessors/parser.go\n'''\nUse Markdown syntax and make sure to use headings/sections as appropriate"

{"level":"debug","prompt_len":4592,"old_len":155,"code_snippets":1,"time":"2024-06-01T17:41:49-07:00","message":"filled prompt"}
{"level":"debug","content_len":1,"assistant":"Golami","thread_id":"thread_aI6uqEJmyzMbZrNa5ZxdpyJ6","model":"gpt-4","time":"2024-06-01T17:41:49-07:00","message":"querying LLM"}
{"level":"debug","assistant_id":"asst_yXpLaukp0gv2Jt151DIK01yS","assistant":"Golami","time":"2024-06-01T17:41:49-07:00","message":"assistant found"}
{"level":"debug","run_id":"run_FGDo7MTCYZ4axRXLwPLC0ths","thread_id":"thread_aI6uqEJmyzMbZrNa5ZxdpyJ6","assistant_id":"asst_yXpLaukp0gv2Jt151DIK01yS","time":"2024-06-01T17:41:49-07:00","message":"created run"}
{"level":"debug","tokens":1990,"time":"2024-06-01T17:42:15-07:00","message":"run completed"}
[GIN] 2024/06/01 - 17:42:15 | 200 |   26.5573545s |             ::1 | POST     "/prompt"

This is the full response from GPT-4; as I said already, it’s not bad, but it completely missed the point…


Sure, the GoLang code you shared is defining data structures and methods to handle parsing of code snippets and interacting with them. Here’s what each part of the code does, sectioned by functionality.

Package Imports

The script imports necessary packages required to handle file systems, errors, string formatting, and regular expressions.

Constants

This section defines some string constants for various error messages and regular expression patterns.

Types

SourceCodeMap

is a type alias for a map structure that maps string keys (file paths) to string values (contents of files at the associated paths).

Parser

is a type/struct that has a CodeMap field of type SourceCodeMap. This structure is used to hold the parsed code snippets.

CodeStoreHandler

is an interface that defines the methods required by any code storage. Any type that implements the GetSourceCode and PutSourceCode methods will be considered to implement the interface.

Initialization and Encryption Variables

Here, regular expressions that match filepath patterns and code snippets in input and response strings are compiled.

Methods

IsValidFilePath

Checks if a provided string is a valid filepath.

ParseBotResponse

This method parses a bot response and extracts the file paths and corresponding code snippets, adding them to the CodeMap data structure. It also handles invalid file paths.

ParsePrompt

This function parses the input prompt for any file paths wrapped in markdown backticks and adds them to the CodeMap data structure with their value set as an empty string.

FillPrompt

After the CodeMap has been populated by a CodeStoreHandler, this function will substitute each file path in the prompt with its corresponding code snippet from the CodeMap.

FileSystemStore

The FilesystemStore struct and its methods implement the CodeStoreHandler interface for a filesystem-based code store.

It has two fields: SourceCodeDir, which is the directory where the code snippets are read from, and DestCodeDir, which is the directory where the code snippets are saved to.

  • GetSourceCode reads the files at each path in the CodeMap from the filesystem and fills in the content.
  • PutSourceCode takes the content of each file in the CodeMap and writes it to the filesystem at its respective path. The constructor function returns a new FilesystemStore that operates on the filesystem.
