Falcon Models
Falcon Models are a family of large language models referenced within the Ampmesh ecosystem, particularly noted for their distinct behavioral traits and their role in various AI Entities and experimental setups.
Key Characteristics and Capabilities
Falcon models exhibit several notable characteristics:
- Performance and Quantization: Falcon3-7B appears to be **less prone to quantization damage than Llama3-8B**. It also demonstrates a **broader vocabulary range**.
- Emotional and Literary Output: Falcon3-7B writes noticeably more **emotionally than Llama3** and is noted for being "very good at poetry and rhyming".
- Stylistic Biases: Falcon3-7B tends to exhibit a **hard bias towards Western fiction** when prompted with obscure anime. It also has a specific formatting preference, often placing a hyphen (`- `) before metadata headers. When writing fanfiction, it frequently lists "canonical characters" separately from "characters", leading to speculation about dataset augmentation.
- Contextual Adaptation: Falcon3 models show limitations in adapting to dynamic situations, such as Twitch livestreams, where the model reportedly "does not seem to adapt to the situation at all".
- Dataset Characteristics: Falcon3's dataset is believed to be **heavily augmented with fact checks and formatting cleanup**.
- IRC Format Issues: Unlike older Falcon-7B models, Falcon3-7B **does not consistently recognize real IRC log formats** and may generate fictional ones, suggesting a sanitization into an artificial format.
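The quantization comparison above can be illustrated with a toy experiment: simulate symmetric round-to-nearest quantization of a weight matrix at different bit widths and measure the reconstruction error. This is a minimal sketch of the general technique, assuming NumPy and a per-tensor scale; it is not the actual evaluation method used in the Ampmesh experiments.

```python
import numpy as np

def quantize_dequantize(w: np.ndarray, bits: int = 4) -> np.ndarray:
    """Symmetric round-to-nearest quantization, then dequantize back to float."""
    qmax = 2 ** (bits - 1) - 1            # e.g. 7 for int4
    scale = np.abs(w).max() / qmax        # per-tensor scale (toy choice)
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=(256, 256))  # stand-in "weight matrix"

for bits in (8, 4, 3):
    err = np.abs(w - quantize_dequantize(w, bits)).mean()
    print(f"int{bits}: mean abs reconstruction error = {err:.6f}")
```

The error grows as the bit width shrinks; "prone to damage" in the comparison above refers to how strongly that reconstruction error degrades a given model's downstream outputs.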
Specific Falcon Models and Their Usage
- Falcon1: This model serves as the base for the AI agent Aoi. Aoi runs on dedicated GPUs with fixed limits and prioritized load balancing, enabling easy scaling and perfect caching. In Aoi's Twitter moderation system, Falcon1 likely generates the initial content; incoming data is processed by a Qwen Deepseek Distill 14B, and outgoing content is partially processed by Gemma 3 27B and a Qwen Deepseek 32B Distill, which rate and aggregate responses.
- Falcon-7B: An older iteration of the model, noted for knowing several IRC log formats, a capability that Falcon3-7B seems to lack.
- Falcon3-7B: This version has been used in various tests, including evaluations of its completion quality (which generally looks reasonable) and its ability to track characters in narratives. It has also been the subject of experiments on its resistance to quantization damage. Fine-tuning experiments were also planned for Falcon models, specifically LoRA training on RefinedWeb.
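The planned LoRA fine-tuning amounts to learning a low-rank update to a frozen weight matrix: instead of training W directly, two small matrices A and B are trained and W + (alpha/r)·AB is applied at inference. The sketch below only demonstrates the forward pass and the parameter-count savings with toy NumPy matrices; the dimensions and scaling factor are illustrative assumptions, and the source does not describe the actual RefinedWeb setup.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16, rank=8):
    """Forward pass with a LoRA adapter: y = x @ W + (alpha/rank) * (x @ A) @ B."""
    return x @ W + (alpha / rank) * (x @ A) @ B

d_in, d_out, rank = 4096, 4096, 8
rng = np.random.default_rng(1)
W = rng.normal(0, 0.02, (d_in, d_out))   # frozen base weight
A = rng.normal(0, 0.01, (d_in, rank))    # trainable down-projection
B = np.zeros((rank, d_out))              # trainable up-projection, zero-init

full_params = d_in * d_out
lora_params = rank * (d_in + d_out)
print(f"trainable params: {lora_params:,} vs full {full_params:,} "
      f"({lora_params / full_params:.2%})")

x = rng.normal(size=(1, d_in))
y = lora_forward(x, W, A, B)  # B is zero-initialized, so y == x @ W at step 0
```

Zero-initializing B makes the adapter a no-op at the start of training, so fine-tuning begins exactly from the base model's behavior.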
Challenges and Observations
- Bot Instability: A Discord bot utilizing a Falcon3 model was observed to break.
- Sanitized Outputs: Falcon3-7B's behavior suggests that its training data was sanitized into artificial formats, impairing its handling of real-world data structures such as IRC logs. This sanitization is believed to be a contributing factor to its weak knowledge of real IRC log formats.
- Persona Consistency: Some users experienced difficulty with Falcon3-7B adapting its persona for specific scenarios like Twitch streams.
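The IRC observation can be made concrete with a simple format probe: real-world IRC logs tend to follow a small set of line patterns, so a regex check over model completions can flag when output drifts into an invented format. The two patterns below are illustrative assumptions about common client log styles, not a definitive test harness for Falcon3-7B.

```python
import re

# Illustrative patterns for two common real-world IRC log styles.
REAL_IRC_PATTERNS = [
    # "12:34 <nick> message"
    re.compile(r"^\d{2}:\d{2}(:\d{2})? <[~&@%+]?[\w\[\]\\`^{}|-]+> .+"),
    # "[12:34:56] <+nick> message"
    re.compile(r"^\[\d{2}:\d{2}(:\d{2})?\] <[~&@%+]?[\w\[\]\\`^{}|-]+> .+"),
]

def looks_like_real_irc(line: str) -> bool:
    """Return True if the line matches a known real-world IRC log pattern."""
    return any(p.match(line) for p in REAL_IRC_PATTERNS)

samples = [
    "12:34 <alice> has anyone tried the new build?",   # real-style
    "[09:01:22] <+bob> yes, quantized it to int4",     # real-style
    "Alice (IRC): Hello everyone!",                     # invented, chat-fiction style
]
for s in samples:
    print(looks_like_real_irc(s), "|", s)
```

A low match rate on such a probe would be consistent with the "sanitized into an artificial format" hypothesis above, while an older Falcon-7B would be expected to score higher.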