Draft:Input Ensemble (Chapter II)

Wiki page for the Ampmesh task on the concept of Input Ensemble (Chapter II).


==== Key Features and Capabilities ====
Chapter II is built with a range of features enabling flexible and advanced AI development:

*  **Emulated Mind (EM) Creation**:
**  It is a primary tool for making **"beta uploads and digital tulpamancy"** [1].
**  Supports creation of ems from various text sources, demonstrated with up to **16MB of text** (equivalent to 16,000 pages) [2-4]; Amp also has an ampix em built on the same principle.
**  Capable of creating "gol-ems" (Game of Life Emulated Minds) that use their own source code for retrieval and possess tools for **self-modification**.
**  Designed for building **custom characters with complex behaviors** that emerge from simple, self-modifying actions [5, 6].
**  [[Aletheia]] operates as a [[RAFT]] em on stock Chapter II. The most powerful em created is noted as **40kb of "heavily curated" text** [7, 8].
 
*  **Data Ingestion and Retrieval**:
**  Supports **Retrieval-Augmented Generation (RAG)** by embedding chunks of input and placing them into the model's context window [9]. This often performs as well as or better than fine-tuning for many use cases, including most beta uploads [9].
**  Utilizes **RAFT (Retrieval Augmented Fine-Tuning)**: providing an em its fine-tuning dataset as a `.chr` file (a plain text file, or text separated by `\n---\n`) can improve performance [8, 10].
**  While fine-tuning is better at capturing "illegible aspects", retrieval excels at "spiky aspects" [8].
**  Areas of interest include more sophisticated retrieval techniques, ranging from [[HyDE]] to better chunking methods [9].
**  Includes a tool (`./tools/dce_importer.py`) for importing data directly from [[Discord]] ChatExporter into the suitable `chat.txt` format [1].
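A `.chr` file, as described above, is either plain text (one chunk) or text separated by the delimiter `\n---\n` (one chunk per section). A minimal sketch of a loader for that layout (the function name and code are illustrative, not Chapter II's actual implementation):

```python
def load_chr_chunks(text: str) -> list[str]:
    """Split the contents of a .chr file into retrieval chunks.

    Per the format described above: plain text yields a single chunk,
    while sections separated by "\n---\n" yield one chunk each.
    """
    chunks = [c.strip() for c in text.split("\n---\n")]
    # Drop empty sections, e.g. from a trailing delimiter.
    return [c for c in chunks if c]

raw = "First memory.\n---\nSecond memory.\n---\n"
print(load_chr_chunks(raw))  # ['First memory.', 'Second memory.']
```

A file with no delimiter comes back as a single chunk, so the same loader covers both layouts.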
*  **Model Integration and Compatibility**:
**  Uses a flexible **vendor configuration** system, defined in `ems/config.yaml` or `~/.config/chapter2/config.yaml`, which allows specifying different API endpoints and model IDs [3, 11].
**  Interacts with various [[LLM|LLMs]] through **[[Conduit]]**, described as a "Universal language model compatibility and interop layer" [12, 13]. Conduit has been updated to support Anthropic models directly [14].
**  Features a new alpha-stability **RPC (Remote Procedure Call) interface** that supports peer-to-peer connections in arbitrary topologies, designed to allow Chapter II to be used with "any language with any data backend".
**  Aims to implement the **"maximally general superset of all published and future papers"** [15].
**  The "intermodel" component is capable of undoing both chat completions and Anthropic messages [16].

*  **Advanced Functionality & Ecosystem Tools**:
**  **Multimodal Support**: Uses a variant of [[ChatML]] adapted to support both chat models and images, and features a **real-time multimodal interface** in the Pamphlet mobile application, including support for camera input [6, 17].
**  **Loom Integration**: Ems created with Chapter II can be used within the [[Loom]] environment, and a GUI-based Chapter II Loom is planned [18, 19]; multi-party Loom interactions are technically supported.
**  **Pamphlet**: A separate open-source mobile application frontend for Chapter II, designed for **fully local inference** and featuring a **real-time multimodal interface** that can capture camera input [6, 20, 21].
**  **Observability**: Supports full [[OpenTelemetry]] cloud tracing.
**  There is interest in adding new "faculties" [6].
**  Chapter II was originally conceived as a writing project.
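The RAG flow described under Data Ingestion and Retrieval (embed chunks of input, then place the best matches into the model's context window) can be sketched as follows. This is a toy illustration: the bag-of-words "embedding" stands in for a real embedding model, and none of the function names below come from Chapter II itself.

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would use a neural
    # embedding model instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_context(query: str, chunks: list[str], k: int = 2) -> str:
    # Rank chunks by similarity to the query and place the top k into
    # the prompt context, most relevant first.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return "\n---\n".join(ranked[:k])

chunks = ["the cat sat on the mat", "rust borrow checker", "cats like mats"]
print(build_context("cat mat", chunks, k=1))  # the cat sat on the mat
```

The retrieved chunks would then be prepended to the model's context window ahead of the conversation, which is what lets retrieval match or beat fine-tuning for many beta-upload use cases.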


==== Development and Challenges ====