Draft:Input Ensemble (Chapter II)

The Input Ensemble is a planned feature of the Chapter II framework, intended to extend it to general LLM workflows and the creation of AI-powered functions.

Chapter II Overview

Chapter II (often abbreviated as ch2) is a foundational and highly versatile open-source artificial intelligence framework within the Ampmesh ecosystem. It was primarily developed by Joy (as a SERI MATS research project) and Amp, building upon Amp's earlier research on Chapter I.

Core Philosophy and Purpose

Chapter II is designed as the **world's most pluggable and agile framework for creating ems**, aiming for ease of deployment anywhere. Its development stems from a vision to **reimagine an AI stack** that is less influenced by "slop-filled dystopian capitalist hyper growth". A central thesis of Chapter II is that **"the only limit to making an em—both in technical internal functioning and authorial intent—should be the author's imagination"**. The project deliberately eschewed $5 million in funding in 2021, believing in the power of a decentralized network working on a minimalist open-source framework.

Key Features and Capabilities

**Emulated Mind (EM) Creation**: Chapter II is primarily a tool for making **beta uploads and digital tulpamancy**. It can create ems from various text sources and has been demonstrated with up to 16 MB of text, roughly 16,000 pages. Amp also has an "ampix em on the same principle".
**Data Ingestion and Retrieval**:

   *   It supports **Retrieval-Augmented Generation (RAG)** by embedding chunks of input and placing them into the model's context window. This often performs as well as or better than fine-tuning for many use cases, including most beta uploads.
   *   **RAFT (Retrieval Augmented Fine-Tuning)** is also used: giving an em its own fine-tuning dataset as a `.chr` file (either a plain text file or text chunks separated by `\n---\n`) can improve performance. Aletheia runs as a RAFT em on stock Chapter II.
   *   It includes a tool (`./tools/dce_importer.py`) for converting Discord ChatExporter exports directly into the required `chat.txt` format.
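
The retrieval flow described above can be illustrated with a minimal sketch. This is not Chapter II's actual implementation: the function names are hypothetical, and a simple word-count vector stands in for a real embedding model. The only detail taken from the text is the `\n---\n` separator used in `.chr` files.

```python
import math
import re
from collections import Counter

CHR_SEPARATOR = "\n---\n"  # chunk separator for .chr files, as named in the text


def split_chr(text: str) -> list[str]:
    """Split .chr-style text into chunks, dropping empty ones."""
    return [chunk.strip() for chunk in text.split(CHR_SEPARATOR) if chunk.strip()]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model (assumption): lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query: str, chunks: list[str], k: int = 2) -> str:
    """Place the retrieved chunks into the context ahead of the query."""
    context = "\n\n".join(retrieve(query, chunks, k))
    return f"{context}\n\n{query}"


sample = "The cat sat on the mat.\n---\nLooms branch text into trees.\n---\nCats like warm mats."
chunks = split_chr(sample)
print(retrieve("where does the cat sit", chunks, k=1))
```

In a real deployment the `embed` function would call an embedding model, and the ranking would typically run against a precomputed index rather than re-embedding every chunk per query.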

**Model Integration**:

   *   Chapter II uses a flexible **vendor configuration** system, defined in `ems/config.yaml` or `~/.config/chapter2/config.yaml`. This allows specifying different API endpoints and model IDs.
   *   It interacts with various LLMs through **Conduit**, described as a "Universal language model compatibility and interop layer". Conduit has been updated to support Anthropic models directly.
   *   A new **alpha-stability RPC (Remote Procedure Call) interface** supports peer-to-peer connections in arbitrary topologies. This interface is designed to allow Chapter II to be used with "any language with any data backend".
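
As a rough illustration of the two configuration locations named above, the following sketch picks the first file that exists. The precedence order (project-local file before the per-user file) and the function name `find_config` are assumptions for illustration; the source does not state how Chapter II resolves the two paths, and the file's schema is not shown here.

```python
from pathlib import Path
from typing import Iterable, Optional

# The two locations named in the text; the order is an assumption.
CANDIDATES = (
    Path("ems/config.yaml"),
    Path("~/.config/chapter2/config.yaml"),
)


def find_config(candidates: Iterable[Path] = CANDIDATES) -> Optional[Path]:
    """Return the first candidate path that exists as a file, or None."""
    for candidate in candidates:
        path = candidate.expanduser()  # resolve a leading "~" to the home directory
        if path.is_file():
            return path
    return None
```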

**Advanced Functionality**:

   *   **Self-Modification**: Chapter II is capable of creating "gol-ems" (Game of Life Emulated Minds) that use their own source code in retrieval and possess tools for self-modification.
   *   **Multimodal Support**: It utilizes a variant of ChatML adapted to support both chat models and images.
   *   **Loom Integration**: Ems created with Chapter II can be used within the Loom environment, and a GUI-based Chapter II Loom is planned. It technically supports multi-party Loom interactions.
   *   **Pamphlet**: A separate project, Pamphlet, is an open-source mobile application frontend for Chapter II, featuring a real-time multimodal interface that can capture camera input.
   *   **Observability**: Chapter II supports full OpenTelemetry cloud tracing.

Development and Challenges

Despite its capabilities and design, Chapter II has faced challenges with awareness and documentation. Many users, including prominent figures within Ampmesh, have been largely unaware of its full potential and view it primarily as "the software that powers Act I". Its creators see this as a "disrespect", given that Act I was merely a "15 line code change to Chapter II".

The documentation has been noted as needing improvement, with multiple individuals writing their own docs but not contributing them back to the main project. There have also been issues with code contributions that were not self-contained.

The Input Ensemble

The `input_ensemble` is envisioned as the **next significant step in Chapter II's evolution**. Its primary goal is to transform Chapter II into a more generalized library for creating **arbitrary LLM-powered functions and workflows in any programming language**.

Specifically, the `input_ensemble` aims to allow for:

**Chaining of Ensembles**: The ability to pass a "query writing em into retrieval".
**Multi-Step Retrieval**: The capacity to "put a retrieval ensemble as input to another to get multi-step retrieval".
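
Because the `input_ensemble` is still planned, no real API exists to show. The two goals above can nonetheless be sketched as plain function composition. Everything in this sketch is hypothetical: an "ensemble" is modeled as any function from a text input to a list of text results, a string cleanup stands in for a query-writing em, and keyword overlap stands in for retrieval.

```python
from typing import Callable

# Hypothetical model: an ensemble maps a text input to a list of text results.
Ensemble = Callable[[str], list[str]]


def keyword_retrieval(corpus: list[str]) -> Ensemble:
    """Toy retrieval ensemble: return corpus entries sharing a word with the query."""
    def run(query: str) -> list[str]:
        words = set(query.lower().split())
        return [doc for doc in corpus if words & set(doc.lower().split())]
    return run


def with_query_writer(writer: Callable[[str], str], retrieval: Ensemble) -> Ensemble:
    """Chaining: a query-writing step rewrites the input before retrieval sees it."""
    return lambda text: retrieval(writer(text))


def multi_step(first: Ensemble, second: Ensemble) -> Ensemble:
    """Multi-step retrieval: each result of `first` becomes a query for `second`."""
    def run(text: str) -> list[str]:
        results: list[str] = []
        for hit in first(text):
            for doc in second(hit):
                if doc not in results:  # keep order, drop duplicates
                    results.append(doc)
        return results
    return run


notes = ["loom branches text", "conduit routes models"]
details = ["conduit supports anthropic models directly", "loom has a planned gui"]


def writer(text: str) -> str:
    """Stand-in for a query-writing em: strip the question mark."""
    return text.replace("?", "")


pipeline = multi_step(
    with_query_writer(writer, keyword_retrieval(notes)),
    keyword_retrieval(details),
)
print(pipeline("conduit?"))  # ['conduit supports anthropic models directly']
```

Without the query-writing step, the raw input `"conduit?"` matches nothing in `notes`, which is the kind of gap a query-writing em placed ahead of retrieval is meant to close.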

This feature is part of a broader interest in adding new "faculties" and deploying Chapter II to create custom characters with complex behaviors that emerge from simple, self-modifying actions.