Draft:Input Ensemble (Chapter II)
Wiki page for the Ampmesh task on the concept of Input Ensemble (Chapter II).
==== Key Features and Capabilities ====
Chapter II is built with a range of features enabling flexible and advanced AI development:
* **Emulated Mind (EM) Creation**:
** It is a primary tool for making **"beta uploads and digital tulpamancy"** [1].
** Supports creation of ems from various text sources, demonstrated with up to **16MB of text** (equivalent to 16,000 pages) [2-4].
** Capable of creating "gol-ems" (Game of Life Emulated Minds) that use their own source code for retrieval and possess tools for **self-modification**.
** Designed for building **custom characters with complex behaviors** that emerge from simple, self-modifying actions [5, 6].
** [[Aletheia]] operates as a [[RAFT]] em on stock Chapter II. The most powerful em created is noted as **40kb of "heavily curated" text** [7, 8].
* **Data Ingestion and Retrieval**:
** Supports **Retrieval-Augmented Generation (RAG)** by embedding chunks of input and placing them into the model's context window [9]. This often performs as well as or better than fine-tuning for many use cases, including most beta uploads [9] (a minimal RAG sketch appears after this list).
** Utilizes **RAFT (Retrieval Augmented Fine-Tuning)**, where providing an em its fine-tuning dataset as a `.chr` file (plain text, or text separated by `\n---\n`) can improve performance [8, 10] (a hypothetical `.chr` loader is sketched after this list).
** Includes a tool (`./tools/dce_importer.py`) for importing data directly from [[Discord]] ChatExporter into the expected `chat.txt` format [1] (an illustrative conversion is sketched after this list).
** While fine-tuning is better at capturing "illegible aspects," retrieval excels at "spiky aspects" [8].
** Areas of interest include more sophisticated retrieval techniques, ranging from [[HyDE]] to better chunking methods [9].
* **Model Integration and Compatibility**:
** Uses a flexible **vendor configuration** system, defined in `ems/config.yaml` or `~/.config/chapter2/config.yaml`, which allows specifying different API endpoints and model IDs [3, 11] (an illustrative config loader is sketched after this list).
** Interacts with various [[LLM|LLMs]] through **[[Conduit]]**, described as a "Universal language model compatibility and interop layer" [12, 13]. Conduit has been updated to support Anthropic models directly [14].
** Features a new **alpha-stability RPC (Remote Procedure Call) interface** that supports peer-to-peer connections in arbitrary topologies, designed to allow Chapter II to be used with "any language with any data backend".
** Aims to implement the **"maximally general superset of all published and future papers"** [15].
** The "intermodel" component is capable of undoing both chat completions and Anthropic messages [16].
* **Advanced Functionality & Ecosystem Tools**:
** **Multimodal Support**: A **real-time multimodal interface**, including camera-input support, is available through the Pamphlet mobile application [6, 17].
** **Loom Integration**: Ems created with Chapter II can be used within the [[Loom]] environment, and a GUI-based Chapter II Loom is planned [18, 19].
** **Pamphlet**: A separate open-source mobile application frontend for Chapter II, designed for **fully local inference** and providing the real-time multimodal interface described above [6, 20, 21].
** There is interest in adding new "faculties" [6].
** Chapter II was originally conceived as a writing project.
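
The RAG feature noted above can be illustrated with a minimal sketch. This is not Chapter II's implementation: the fixed-size chunking, the `embed` helper, and the cosine-similarity ranking are assumptions used only to show the general shape of embedding chunks of input and placing the best matches into the model's context window.

```python
# Minimal RAG sketch (illustrative only, not Chapter II's actual code).
# Assumes an embed() callable that maps text to a vector; it is passed in
# as a parameter rather than tied to any particular embedding API.
from typing import Callable, List
import math

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def chunk(text: str, size: int = 1000) -> List[str]:
    # Naive fixed-size chunking; better chunking methods are listed above
    # as an open area of interest.
    return [text[i:i + size] for i in range(0, len(text), size)]

def build_context(corpus: str, query: str,
                  embed: Callable[[str], List[float]],
                  k: int = 4) -> str:
    chunks = chunk(corpus)
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), query_vec),
                    reverse=True)
    # The top-k chunks are placed into the context window ahead of the query.
    return "\n---\n".join(ranked[:k]) + "\n\nUser: " + query
```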
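
For the RAFT item above, the `.chr` file is described as plain text or text separated by `\n---\n`. The loader below is a hypothetical sketch of reading such a file into retrieval records; the function name and the fixed-size fallback chunking are assumptions, not Chapter II's actual behavior.

```python
from pathlib import Path
from typing import List

def load_chr(path: str, fallback_chunk_size: int = 1000) -> List[str]:
    """Load a .chr file: explicit records separated by '\\n---\\n',
    or plain text split into fixed-size chunks (assumed fallback)."""
    text = Path(path).read_text(encoding="utf-8")
    if "\n---\n" in text:
        records = [r.strip() for r in text.split("\n---\n")]
    else:
        records = [text[i:i + fallback_chunk_size]
                   for i in range(0, len(text), fallback_chunk_size)]
    return [r for r in records if r]
```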
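
The conversion performed by `./tools/dce_importer.py` is not reproduced here; the sketch below only illustrates the kind of transformation involved, turning a DiscordChatExporter JSON export into speaker-prefixed lines. The "Name: text" line layout for `chat.txt` is an assumption.

```python
import json
import sys

def dce_json_to_chat_txt(export_path: str, out_path: str) -> None:
    # DiscordChatExporter JSON exports contain a top-level "messages" list;
    # the output line format used here is only an assumed chat.txt layout.
    with open(export_path, encoding="utf-8") as f:
        export = json.load(f)
    lines = []
    for msg in export.get("messages", []):
        author = msg.get("author", {}).get("name", "unknown")
        content = msg.get("content", "").strip()
        if content:
            lines.append(f"{author}: {content}")
    with open(out_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")

if __name__ == "__main__":
    dce_json_to_chat_txt(sys.argv[1], sys.argv[2])
```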
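
The vendor configuration item above names `ems/config.yaml` and `~/.config/chapter2/config.yaml` as locations for endpoint and model settings. The key names (`vendors`, `endpoint`, `model`) and the repo-local-first precedence in this sketch are assumptions, shown only to illustrate a config that maps vendors to API endpoints and model IDs.

```python
from pathlib import Path
import yaml  # requires PyYAML

# Hypothetical config shape; Chapter II's real key names may differ.
EXAMPLE = """
vendors:
  openai-compatible:
    endpoint: https://api.example.com/v1
    model: example-model-id
  anthropic:
    endpoint: https://api.anthropic.com
    model: claude-model-id
"""

def load_vendor_config() -> dict:
    # Check the repo-local config first, then the per-user config (assumed order).
    for candidate in (Path("ems/config.yaml"),
                      Path.home() / ".config/chapter2/config.yaml"):
        if candidate.exists():
            return yaml.safe_load(candidate.read_text(encoding="utf-8"))
    return yaml.safe_load(EXAMPLE)  # fall back to the illustrative example
```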
==== Development and Challenges ====