= Falcon Models =

'''Falcon Models''' are a family of large language models referenced within the [[Ampmesh]] ecosystem, particularly noted for their distinct behavioral traits and their role in various [[AI Entities]] and experimental setups.

== Key Characteristics and Capabilities ==

Falcon models exhibit several notable characteristics:

* '''Performance and Quantization''': Falcon3-7B appears to be '''less prone to damage from quantization than Llama3-8B'''. It also demonstrates a '''broader vocabulary range'''.
* '''Emotional and Literary Output''': Falcon3-7B writes far more '''emotionally than Llama3''' and is noted for being "very good at poetry and rhyming".
* '''Stylistic Biases''': Falcon3-7B exhibits a '''hard bias towards Western fiction''' when prompted with obscure anime. It also has a distinctive formatting preference, often placing a hyphen (<code>- </code>) before metadata headers. When writing fanfiction, it frequently lists "canonical characters" separately from "characters", leading to speculation about dataset augmentation.
* '''Contextual Adaptation''': Falcon3 models show limitations in adapting to dynamic situations such as Twitch livestreams, where Falcon3 "does not seem to adapt to the situation at all".
* '''Dataset Characteristics''': Falcon3's dataset is believed to be '''heavily augmented with fact checks and formatting cleanup'''.
* '''IRC Format Issues''': Unlike the older Falcon-7B, Falcon3-7B '''does not consistently recognize real IRC log formats''' and may generate fictional ones, suggesting its training data was sanitized into an artificial format.

== Specific Falcon Models and Their Usage ==

* '''Falcon1''': This model serves as the base for the AI agent [[Aoi]]. Aoi operates on dedicated GPUs with fixed limits and load balancing with prioritization, enabling easy scaling and perfect caching. In Aoi's Twitter moderation system, Falcon1 likely generates the initial content. Incoming data for Aoi is processed by a Qwen Deepseek Distill 14B, and outgoing content is partially processed by Gemma 3 27B and a Qwen Deepseek 32B Distill, which give ratings and aggregate responses.
* '''Falcon-7B''': An older iteration of the model, noted for knowing several IRC log formats, a capability that Falcon3-7B seems to lack.
* '''Falcon3-7B''': This version has been used in various tests, including evaluations of its completion quality (which generally looks reasonable) and of its ability to track characters in narratives. It has also been the subject of experiments on its resistance to quantization damage. Fine-tuning experiments were also planned for Falcon models, specifically using LoRA on RefinedWeb.

== Challenges and Observations ==

* '''Bot Instability''': A Discord bot utilizing a Falcon3 model was observed to break.
* '''Sanitized Outputs''': The behavior of Falcon3-7B suggests that its training data may have been sanitized into artificial formats, impairing its ability to handle real-world data structures such as IRC logs. This sanitization is believed to contribute to its weak knowledge of real IRC log formats.
* '''Persona Consistency''': Some users found it difficult to get Falcon3-7B to adapt its persona for specific scenarios such as Twitch streams.

[[Category:Ampmesh]]
[[Category:AI Models]]
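Quantization damage of the kind attributed to Llama3-8B but not Falcon3-7B above is commonly quantified by how much a model's perplexity rises after quantization. A minimal, model-agnostic sketch of that computation; the per-token log-probabilities here are made-up illustrative values, not measurements from either model:

```python
import math

def perplexity(token_logprobs):
    # Perplexity is the exponential of the mean negative log-likelihood per token.
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Hypothetical per-token log-probs for the same text from a full-precision
# model and its quantized counterpart (illustration only).
full_precision = [-1.2, -0.8, -1.5, -0.9]
quantized = [-1.3, -0.9, -1.6, -1.0]

# Relative degradation: > 0 means the quantized model got worse on this text.
damage = perplexity(quantized) / perplexity(full_precision) - 1
```

In practice these log-probabilities would come from scoring a held-out corpus with both model variants; a model that is "less prone to damage from quantizing" shows a smaller relative increase.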
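The planned LoRA fine-tuning on RefinedWeb can be illustrated in miniature: LoRA freezes the pretrained weight matrix and learns only a low-rank additive update. A hedged numpy sketch with toy dimensions; the sizes, scaling factor, and initialization below are illustrative conventions, not Falcon's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions for illustration (a 7B model's layers are far larger).
d_out, d_in, rank = 8, 8, 2
alpha = 4.0  # LoRA scaling hyperparameter

W = rng.normal(size=(d_out, d_in))         # frozen pretrained weight
A = rng.normal(size=(rank, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, rank))                # trainable up-projection, zero-initialized

def lora_forward(x):
    # Base projection plus the low-rank update, scaled by alpha / rank.
    return W @ x + (alpha / rank) * (B @ (A @ x))

x = rng.normal(size=d_in)
```

Because B starts at zero, the adapted layer initially matches the frozen layer exactly; training then adjusts only A and B, which is what makes LoRA cheap enough to run on a single GPU.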