add readme

2023-11-05 00:56:02 -06:00
parent e0430c62bd
commit 0ff90a313f

readme.md Normal file

@@ -0,0 +1,67 @@
# CTXGPT
Originally based on the idea of MemGPT, this project aims to extend the limited context window of an LLM interaction with additional context retrieved from persistent storage. It is written in Go.
### How it works
This project manages context through several levels of storage.
The regular conversation between the user and the AI is stored in a full-text searchable database. Because the context you can send to any given LLM is limited by a token count, the prompt only contains the most recent messages. The AI has a few functions it can call to remember things beyond this limited context:
* It can search the conversation history for a given string. This lets it recall things that were said in the past.
* It can archive specific information as a concept=>definition relationship.
  * I think it would be useful to let the system store sub-concepts as well. For example, if I tell the system about a specific person, it should be able to store that person's name and other information about them. This would allow the system to answer questions like "What is the name of the person who works at the coffee shop?"
* It can search the archived information for a given string, which loads the matching information into its context window.
* It can search the web or the document database for specific information.
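
The rule that only the most recent messages are sent to the LLM could be sketched as below. This is a minimal illustration, not the project's implementation: the `Message` type (reduced to the two fields needed here), `estimateTokens`, and `RecentWithinBudget` are hypothetical names, and the four-bytes-per-token estimate is an assumption standing in for a real tokenizer.

```go
package main

import "fmt"

// Message is one conversation turn. Only the fields needed for this
// sketch are included; the type itself is hypothetical.
type Message struct {
	Source  string // e.g. "user", "system", "assistant"
	Content string
}

// estimateTokens is a crude stand-in for a real tokenizer. It assumes
// roughly four bytes per token and rounds up.
func estimateTokens(s string) int {
	return (len(s) + 3) / 4
}

// RecentWithinBudget returns the longest run of most recent messages,
// in their original order, whose estimated token total fits in budget.
func RecentWithinBudget(history []Message, budget int) []Message {
	used := 0
	start := len(history)
	// Walk backwards from the newest message until the budget is spent.
	for i := len(history) - 1; i >= 0; i-- {
		cost := estimateTokens(history[i].Content)
		if used+cost > budget {
			break
		}
		used += cost
		start = i
	}
	return history[start:]
}

func main() {
	history := []Message{
		{"user", "My sister's name is Dana."},
		{"assistant", "Nice, I'll remember that."},
		{"user", "What should I get her for her birthday?"},
	}
	for _, m := range RecentWithinBudget(history, 20) {
		fmt.Printf("%s: %s\n", m.Source, m.Content)
	}
}
```

Anything that falls outside the budget is not lost: it stays in the conversation database, where the search function can bring it back into the context window.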
### Document Database
The document database is a collection of documents stored in a full-text searchable database. The AI can search this database for specific information and load it into its context window.
The table contains the following fields:
* source - The source of the document (e.g. file path, wikipedia, or a specific website)
* checksum - The checksum of the document (used to determine if the document has changed)
* title - The title of the document
* content - The content of the document (parsed from the source into plain text)
* tags - A list of tags that can be used to search the document
* summary - A summary of the document (rolled up via GPT calls to compress the document into a few sentences)
* created_at - The date the document was created
* updated_at - The date the document was last updated
* needs_update - A flag that indicates if the document needs to be updated (e.g. if the source has changed)
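
As a rough sketch, the fields above could map onto a Go struct, with `checksum` driving the `needs_update` flag. The Go types, the choice of SHA-256, and the names `Checksum` and `MarkIfChanged` are assumptions made for illustration; the README does not specify them.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"time"
)

// Document mirrors the fields listed above. The Go types are assumed.
type Document struct {
	Source      string
	Checksum    string
	Title       string
	Content     string
	Tags        []string
	Summary     string
	CreatedAt   time.Time
	UpdatedAt   time.Time
	NeedsUpdate bool
}

// Checksum returns the hex-encoded SHA-256 digest of content.
// SHA-256 is an assumed choice; any stable hash would work.
func Checksum(content string) string {
	sum := sha256.Sum256([]byte(content))
	return hex.EncodeToString(sum[:])
}

// MarkIfChanged compares the checksum of freshly fetched content with the
// stored one. If they differ it sets NeedsUpdate and reports true.
func MarkIfChanged(doc *Document, fetched string) bool {
	if Checksum(fetched) != doc.Checksum {
		doc.NeedsUpdate = true
		return true
	}
	return false
}

func main() {
	doc := Document{
		Source:   "notes/coffee.md", // hypothetical path
		Content:  "Alice works at the coffee shop.",
		Checksum: Checksum("Alice works at the coffee shop."),
	}
	changed := MarkIfChanged(&doc, "Alice and Bob work at the coffee shop.")
	fmt.Println(changed, doc.NeedsUpdate) // prints "true true"
}
```

Comparing checksums avoids re-parsing and re-summarizing a document whose source has not changed.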
### Conversation Database
The conversation database is a collection of messages stored in a full-text searchable database. The AI can search this database for specific information and load it into its context window.
The table contains the following fields:
* source - The source of the message (e.g. user, system, LLM agent, etc.)
* content - The content of the message (parsed from the source into plain text)
* tags - A list of tags that can be used to search the message
* created_at - The date the message was created
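
The conversation search function could look roughly like the sketch below. It uses a case-insensitive substring match over an in-memory slice as a stand-in for a real full-text index, which the README does not name; `Message` and `SearchMessages` are hypothetical names and the Go types are assumed.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// Message mirrors the fields listed above. The Go types are assumed.
type Message struct {
	Source    string
	Content   string
	Tags      []string
	CreatedAt time.Time
}

// SearchMessages returns every message whose content contains query,
// ignoring case and surrounding whitespace in the query. An empty query
// returns nil. A real full-text index would replace this linear scan.
func SearchMessages(history []Message, query string) []Message {
	q := strings.ToLower(strings.TrimSpace(query))
	if q == "" {
		return nil
	}
	var hits []Message
	for _, m := range history {
		if strings.Contains(strings.ToLower(m.Content), q) {
			hits = append(hits, m)
		}
	}
	return hits
}

func main() {
	history := []Message{
		{Source: "user", Content: "Alice works at the coffee shop.", CreatedAt: time.Now()},
		{Source: "assistant", Content: "Got it.", CreatedAt: time.Now()},
	}
	for _, m := range SearchMessages(history, "coffee") {
		fmt.Printf("%s: %s\n", m.Source, m.Content)
	}
}
```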
### Concept Database
The concept database is a collection of concepts stored in a full-text searchable database. The AI can search this database for specific information and load it into its context window.
The table contains the following fields:
* concept - The concept that is being defined
* definition - The definition of the concept
* tags - A list of tags that can be used to search the concept
* created_at - The date the concept was created
* updated_at - The date the concept was last updated
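
A minimal in-memory sketch of the archive and search operations on concepts follows. `ConceptStore`, `Archive`, and `Search` are hypothetical names, the Go types are assumed, and the map stands in for the real database. Archiving an existing concept overwrites its definition and bumps `updated_at` while keeping `created_at`.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// Concept mirrors the fields listed above. The Go types are assumed.
type Concept struct {
	Concept    string
	Definition string
	Tags       []string
	CreatedAt  time.Time
	UpdatedAt  time.Time
}

// ConceptStore is an in-memory stand-in for the concept table.
type ConceptStore struct {
	byName map[string]*Concept
}

func NewConceptStore() *ConceptStore {
	return &ConceptStore{byName: make(map[string]*Concept)}
}

// Archive inserts a new concept or overwrites an existing one.
// CreatedAt is preserved on overwrite; UpdatedAt is always refreshed.
func (s *ConceptStore) Archive(name, definition string, tags []string) {
	now := time.Now()
	if c, ok := s.byName[name]; ok {
		c.Definition = definition
		c.Tags = tags
		c.UpdatedAt = now
		return
	}
	s.byName[name] = &Concept{
		Concept:    name,
		Definition: definition,
		Tags:       tags,
		CreatedAt:  now,
		UpdatedAt:  now,
	}
}

// Search returns concepts whose name or definition contains query,
// ignoring case. Result order is unspecified because it follows map
// iteration order.
func (s *ConceptStore) Search(query string) []Concept {
	q := strings.ToLower(query)
	var hits []Concept
	for _, c := range s.byName {
		if strings.Contains(strings.ToLower(c.Concept), q) ||
			strings.Contains(strings.ToLower(c.Definition), q) {
			hits = append(hits, *c)
		}
	}
	return hits
}

func main() {
	store := NewConceptStore()
	store.Archive("coffee shop barista", "Alice", []string{"person"})
	for _, c := range store.Search("barista") {
		fmt.Printf("%s => %s\n", c.Concept, c.Definition)
	}
}
```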
### TODO
* [x] Set up a basic database
* [ ] Create initial prompts for the LLM
* [ ] Create functions for the LLM to use to interact with the database
* [ ] Create conversational input/output handling for user interactions
* [ ] Create a way to ingest new documents and data into the database
* [ ] Manage rate limiting and errors from the LLM