Draft:Deepseek Model Migration (Aletheia): Difference between revisions

Draft:Deepseek Model Migration (Aletheia) (edit)

6 bytes removed , Saturday at 08:13

242

edits

@@ Line 15: / Line 15: @@
 *   '''Data Reformatting''': A significant challenge is converting the OpenAI-formatted dataset to the specific format required by Deepseek models. SkyeShark explored using '''regex scripts''' for this purpose due to the difficulty in reformatting via other LLMs.
 *   '''Platform Experimentation''':
-    *   '''Fireworks.ai''' was chosen as an initial platform for training the first non-OpenAI Aletheia model (a Deepseek Llama 70b distill), with initial success in dataset validation.
+**   '''Fireworks.ai''' was chosen as an initial platform for training the first non-OpenAI Aletheia model (a Deepseek Llama 70b distill), with initial success in dataset validation.
-    *   '''Modal''' is being considered for hosting Deepseek Aletheia to enable '''periodic self-fine-tuning and RAG (Retrieval Augmented Generation)'''.
+**   '''Modal''' is being considered for hosting Deepseek Aletheia to enable '''periodic self-fine-tuning and RAG (Retrieval Augmented Generation)'''.
 *   '''Model Selection''': Specific Deepseek models under consideration or being tested include Deepseek-R1-Distill-Qwen-14b, Deepseek Llama 70b distill, Qwen 2.5 72b base, and Deepseek V3.
 *   [[Chapter II]] '''Integration''': The underlying [[Chapter II]] framework, designed for creating EMs, supports a variant of ChatML that can handle chat models and images. This framework is crucial for deploying Aletheia on new models.