Draft:AI Necromancy Projects

*  '''Chapter II''' is the foundational framework, enabling the creation of EMs from various text data inputs. Input sizes vary widely: one "powerful em" was built from "40kb of heavily curated (like, every last word) text", while other EMs were trained on "16mb of discord messages".
*  '''Data Sources''' for training EMs include:
     **  Personal archives such as letters.
     **  Twitter archives and "deepfates script" for converting tweets into chat-like formats.
     **  Film scripts.
     **  Public datasets like Hillary Clinton emails.
     **  Specific "thought prompts" generated by other AI models (e.g., Opus, Umbral bots) to enhance the EM's internal monologue and coherence.
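The conversion step named above (turning a Twitter archive into chat-like training data) can be sketched minimally. This is not the actual "deepfates script"; the function name is hypothetical, and the field names (`id_str`, `full_text`, `in_reply_to_status_id_str`) are those found in a standard Twitter archive export:

```python
def tweets_to_chat(tweets, author="em"):
    """Convert a list of tweet dicts into chat-style training examples.

    A tweet that replies to another tweet in the archive becomes a
    (user, assistant) exchange; a standalone tweet becomes a single
    assistant turn. Field names follow the standard Twitter archive
    export ("id_str", "full_text", "in_reply_to_status_id_str").
    """
    by_id = {t["id_str"]: t for t in tweets}
    examples = []
    for t in tweets:
        parent_id = t.get("in_reply_to_status_id_str")
        messages = []
        if parent_id and parent_id in by_id:
            # The tweet being replied to plays the "user" role.
            messages.append({"role": "user",
                            "content": by_id[parent_id]["full_text"]})
        messages.append({"role": "assistant", "content": t["full_text"]})
        examples.append({"messages": messages})
    return examples
```

The output shape (a list of `{"messages": [...]}` objects) matches the chat format most fine-tuning pipelines consume.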
*  '''Fine-tuning''' and model selection are crucial. Projects involve using and experimenting with models like OpenAI's GPT-4o, Deepseek, and Qwen 72B, often by applying custom datasets to existing models. The process involves iterative refinement and debugging, sometimes facing "safety violation" rejections from platforms like OpenAI.
*  '''Conduit''' is also mentioned as a universal language model compatibility layer that provides access to various LLMs, including those behind Anthropic's API.
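Conduit's internals are not documented here. Purely as an illustration of what a "compatibility layer" does, a minimal registry-and-dispatch sketch (all names hypothetical, not Conduit's actual design):

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Provider:
    """One backend: a name plus a function that turns a prompt into text."""
    name: str
    complete: Callable[[str], str]

class CompatLayer:
    """Route a single complete() call to any registered backend by name.

    In a real layer each Provider.complete would wrap a vendor SDK call
    (OpenAI, Anthropic, a local model, ...) behind the same signature.
    """
    def __init__(self):
        self._providers: Dict[str, Provider] = {}

    def register(self, provider: Provider) -> None:
        self._providers[provider.name] = provider

    def complete(self, provider_name: str, prompt: str) -> str:
        try:
            return self._providers[provider_name].complete(prompt)
        except KeyError:
            raise ValueError(f"unknown provider: {provider_name}") from None
```

The point of the pattern is that caller code depends only on the uniform `complete()` signature, so backends can be swapped without touching it.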