AI Tools in Traditional Tabletop Wargaming

So how do we do this?


This will be a high-level description of how my QA Bot works.

Under the covers, ChatGPT and the APIs we are going to use for our QA bot are converting the text we use into tokens, which are representations of the building blocks of our sentences and language. Think of it as translating our language into a format that the AI can understand. Tokens can be words, punctuation marks, or other meaningful units in the text. On average, a token might be roughly equivalent to 3/4ths of a word, although the actual length can vary. The various OpenAI APIs have a limited capacity for the number of tokens they can process in a single request. This limit applies to the sum of input tokens (your prompt) and the output tokens (the AI-generated response).
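If you are curious, you can watch tokenization happen with OpenAI's tiktoken library. Here is a quick sketch; the sample sentence is my own invention:

```python
# pip install tiktoken
import tiktoken

# tiktoken knows which tokenizer each OpenAI model uses.
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

text = "Roll 2D6 and add your Weapon Skill to determine whether the attack hits."
tokens = enc.encode(text)

print(f"{len(text.split())} words -> {len(tokens)} tokens")
print(enc.decode(tokens[:5]))  # the first few tokens, turned back into text
```

Counting tokens this way before you send a prompt is a handy sanity check on how much of the budget it will consume.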

The model I am using, gpt-3.5-turbo, has a limit of 4,096 tokens. At roughly three-quarters of a word per token, that gives us 3,000-ish words, which might be 10 pages of text. That’s great if you are working with OnePageRules, but not so great if you are using a 100+ page rulebook. We solve this with something called embeddings.

With embeddings, we convert the text in the PDF into numerical representations that capture the essence of the words and their relationships. To answer our questions, we first search for relevant parts of the text using embeddings, and then use the gpt-3.5-turbo API to generate a response based on the text we found. This is not like using the “Control+F” function to find a word in a PDF; we are not just matching characters. Instead, this approach allows us to search for content that is more closely related to the meaning behind our question, providing a deeper understanding of the text.
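To make that concrete, here is a minimal sketch using the openai Python package (the pre-1.0 interface) and the text-embedding-ada-002 embedding model. The rule snippets and the question are invented for illustration:

```python
# pip install openai numpy
import numpy as np
import openai

openai.api_key = "sk-..."  # your OpenAI API key

def embed(text: str) -> np.ndarray:
    """Convert a piece of text into its embedding vector."""
    response = openai.Embedding.create(
        model="text-embedding-ada-002", input=text
    )
    return np.array(response["data"][0]["embedding"])

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Measure how closely two embeddings point in the same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

question = embed("How far can my cavalry move?")
rule_a = embed("Mounted units have a movement allowance of 12 inches.")
rule_b = embed("Each player places six objective markers before deployment.")

# The movement rule scores noticeably higher, even though the question
# never says "mounted" or "allowance". That is what Control+F cannot do.
print(cosine_similarity(question, rule_a))
print(cosine_similarity(question, rule_b))
```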

So here is the process. There are two phases: first we prepare the PDF for use by our QA Bot, and then we run the bot itself.

Preparing the PDF

  • Divide our PDF into chunks small enough for the API to work with.
  • Convert the chunks to embeddings.
  • Store the embeddings, along with their source text, in a vector store (a code sketch follows the next paragraph).

A vector store is a specialized type of database designed for storing and managing vectors. The numerical representations of our text, known as embeddings, are multi-dimensional vectors. You may be familiar with vectors in the form of X, Y, Z coordinates, which represent points in 3-dimensional space. In the case of text embeddings, the vectors have many more dimensions, allowing them to capture more complex relationships and patterns in the data. This makes vector stores particularly useful for efficiently handling and searching through large collections of text embeddings.
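Here is a rough sketch of the preparation phase. I am using pypdf for text extraction and FAISS as the vector store purely for illustration (plenty of other libraries would do), reusing the embed() helper from the earlier snippet, and splitting at fixed character counts, which is the most naive chunking possible:

```python
# pip install pypdf faiss-cpu numpy
import faiss
import numpy as np
from pypdf import PdfReader

# 1. Pull the raw text out of the rulebook PDF.
reader = PdfReader("rulebook.pdf")
full_text = "\n".join(page.extract_text() or "" for page in reader.pages)

# 2. Divide the text into chunks small enough for the API.
#    (A real chunker would respect sentence or section boundaries.)
chunk_size = 1000  # characters, very roughly 250 tokens
chunks = [full_text[i:i + chunk_size]
          for i in range(0, len(full_text), chunk_size)]

# 3. Convert each chunk to an embedding with embed() from the earlier sketch.
vectors = np.array([embed(chunk) for chunk in chunks], dtype="float32")

# 4. Store the vectors in a FAISS index. Keep the `chunks` list around
#    so a matching vector can be traced back to its original text.
index = faiss.IndexFlatL2(vectors.shape[1])
index.add(vectors)
faiss.write_index(index, "rulebook.index")
```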

Our QA app performs the following steps (sketched in code after the list):

  • Loads the text data from the vector store, creating a searchable database of embeddings.
  • Searches the vector store to find chunks of text that are relevant to our question.
  • Constructs a prompt that includes our question and the relevant text chunks, and sends it to the gpt-3.5-turbo API.
  • Receives the AI-generated answer from the API and displays it to the user.
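And a matching sketch of the question-answering side, again using the pre-1.0 openai package and reusing embed() and the chunks list from the preparation sketch. The system message and the example question are just placeholders:

```python
def answer_question(question: str, k: int = 3) -> str:
    # 1. Load the saved index, our searchable database of embeddings.
    #    (Assumes the `chunks` list from the preparation step is still in
    #    memory; a real app would persist it alongside the index.)
    index = faiss.read_index("rulebook.index")

    # 2. Embed the question and find the k most relevant chunks.
    query = np.array([embed(question)], dtype="float32")
    _, ids = index.search(query, k)
    context = "\n\n".join(chunks[i] for i in ids[0])

    # 3. Construct a prompt from the question plus the retrieved text
    #    and send it to the gpt-3.5-turbo API.
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system",
             "content": "Answer using only the rulebook excerpts provided."},
            {"role": "user",
             "content": f"Excerpts:\n{context}\n\nQuestion: {question}"},
        ],
    )

    # 4. Return the AI-generated answer for display.
    return response["choices"][0]["message"]["content"]

print(answer_question("How far can my cavalry move?"))
```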

Clear as mud?  It is surprisingly easy to implement if you have a little coding experience, and of course, ChatGPT could even help you write the code, but that is a tale for a different time.
