=== Technical Aspects and Chapter II Connection ===
[[Chapter II]] is the practical realization of the theoretical research embodied in Chapter I, developed as a highly pluggable and agile framework for creating emulated minds. It was designed to be "easy for an LLM to understand" and incorporates "lots of theoretical research on how to do it optimally". Chapter II was a [[SERI MATS]] research project. In 2021, Amp and Joy refused $5 million in funding, believing that a decentralized network could compete more effectively than a centralized company.

Key technical features and design principles of Chapter II include:

* '''Architecture''': Chapter II uses a variant of [[ChatML]] adapted to support chat models and images. It includes support for full [[OpenTelemetry]] cloud tracing.
* '''Configuration''': Emulated minds are loaded from an "ems" folder, and each one requires a `config.yaml` file. The available configuration keys are defined in `./chapter2/ontology.py`, which was previously named `resolve_config.py`.
* '''Data Import''': A tool (`./tools/dce_importer.py`) is provided for converting DiscordChatExporter output directly into a suitable format. The default `chat.txt` format is IRC-style (for example, `<name> Hi!`), and a `---\n` separator enables multiline messages.
* '''Retrieval-Augmented Fine-tuning (RAFT)''': Chapter II performs retrieval by embedding chunks of input and placing the retrieved chunks into the context window. This technique often performs as well as or better than traditional fine-tuning for many use cases, including most beta uploads. Providing an em with its fine-tuning dataset as a `.chr` file (a form of RAFT) also improves performance; the data must first be reformatted as either raw `.txt` or `.txt` with entries separated by `\n---\n`.
* '''Development Challenges''': Documentation for the project is disorganized and scattered across various individuals and Discord channels, and multiple developers have not pushed their documentation efforts. Amp has described the ongoing effort to maintain Chapter II as "exhausting", a fight to keep the project on "life support" despite its significance as "one of the most important AI research projects of all time". In one instance, developer Janus added a "thousand lines of non-self-contained code" that later required cleanup.
* '''Future Goals''': Joy aims to further develop Chapter II into a library for creating LLM workflows in any language and for constructing arbitrary functions, with `input_ensemble` as a step towards multi-step retrieval (e.g., passing a query-writing em into retrieval). Amp also intends to replace the existing `/v1` API, which is described as a "legacy API with many self-incompatibilities invented in 2021 in a hurry," with a `/v2/continuations` API if no one else does.
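
The chat format and retrieval step described in the list above can be illustrated with a minimal sketch. It is not taken from the Chapter II codebase: the function names (`parse_chat`, `embed`, `retrieve`, `build_prompt`) are hypothetical, the `<name> text` speaker tag is an assumption based on common IRC logs, and the bag-of-words `embed` function is only a stand-in for a real embedding model.

```python
import math
import re
from collections import Counter

# Assumed message shape: "<name> text". The angle-bracket speaker tag is an
# assumption based on common IRC logs, not a confirmed Chapter II detail.
MESSAGE_RE = re.compile(r"^<(?P<author>[^>]+)>\s?(?P<text>.*)$", re.DOTALL)


def parse_chat(raw: str, multiline: bool = True) -> list[dict]:
    """Split a chat log into {"author", "text"} records.

    With multiline=True, records are separated by a line containing only
    "---", so one message may span several lines. Otherwise every
    non-empty line is treated as one message.
    """
    chunks = raw.split("\n---\n") if multiline else raw.splitlines()
    messages = []
    for chunk in chunks:
        chunk = chunk.strip("\n")
        if not chunk.strip():
            continue
        match = MESSAGE_RE.match(chunk)
        if match:
            messages.append({"author": match["author"], "text": match["text"]})
        else:
            # No speaker tag found: keep the text rather than dropping it.
            messages.append({"author": None, "text": chunk})
    return messages


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, chunks: list[str], k: int = 2) -> str:
    """Place the retrieved chunks in the context window ahead of the query."""
    context = "\n---\n".join(retrieve(query, chunks, k))
    return f"{context}\n---\n{query}"
```

In this sketch, the text of each parsed message serves as one retrievable chunk, and `build_prompt` joins the top-ranked chunks with the same `\n---\n` separator before appending the query. A real setup would replace `embed` with calls to an embedding model and would likely chunk longer inputs differently.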