Added Mergekit tasks #26

Open: ParamThakkar123 wants to merge 3 commits into main from add/mergekit-task
ParamThakkar123 (Contributor) commented Mar 27, 2026

Changes

  • Added a mergekit-merge/ directory containing the merge task
  • merge.py: script that merges two models using mergekit; reads its parameters from task.yaml
  • merge_config.yml: mergekit configuration file
  • task.yaml: task definition with parameters (model1, model2, weights, etc.)
  • Fixed empty model names by adding SmolLM defaults in merge.py
  • Updated configs to use the compatible SmolLM-135M and SmolLM-135M-Instruct models
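
As an illustration of the default-handling fix described above, here is a minimal sketch of how task parameters might be turned into a mergekit config. It is not the code from this PR: the function names, the `weight1`/`weight2` keys, the `HuggingFaceTB/` Hub prefix, and the choice of a `linear` merge with `float16` are all assumptions made for the example; only `model1`, `model2`, and the SmolLM model names come from the description.

```python
import json
from pathlib import Path

# Assumed defaults: the PR names SmolLM-135M and SmolLM-135M-Instruct;
# the "HuggingFaceTB/" Hub prefix is an assumption of this sketch.
DEFAULT_MODEL1 = "HuggingFaceTB/SmolLM-135M"
DEFAULT_MODEL2 = "HuggingFaceTB/SmolLM-135M-Instruct"


def _weight(params: dict, key: str, default: float = 0.5) -> float:
    """Return a float weight, falling back to `default` when unset or empty."""
    value = params.get(key)
    return default if value in (None, "") else float(value)


def build_merge_config(params: dict) -> dict:
    """Build a mergekit linear-merge config from task parameters.

    The parameter keys are hypothetical; adapt them to the real task.yaml.
    """
    # `or` makes both None and "" fall back to the defaults, which is the
    # "empty model name" case mentioned in the change list.
    model1 = params.get("model1") or DEFAULT_MODEL1
    model2 = params.get("model2") or DEFAULT_MODEL2
    return {
        "merge_method": "linear",
        "dtype": "float16",
        "models": [
            {"model": model1, "parameters": {"weight": _weight(params, "weight1")}},
            {"model": model2, "parameters": {"weight": _weight(params, "weight2")}},
        ],
    }


def write_merge_config(config: dict, path: str) -> Path:
    """Write the config to disk.

    JSON is used because it needs only the standard library, and a document
    this simple should also parse as YAML.
    """
    target = Path(path)
    target.write_text(json.dumps(config, indent=2))
    return target
```

Calling `build_merge_config({"model1": ""})` would fall back to both default models with equal weights.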

Testing

Locally

  1. Install mergekit: cd mergekit && pip install -e .
  2. Run python ../merge.py (requires the lab framework), or call mergekit directly: mergekit-yaml ../merge_config.yml ./merged_model
  3. Verify merged model in ./merged_model
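
The run-and-verify steps above could be scripted roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, and treating the output as valid when it contains `config.json` plus at least one `*.safetensors` or `*.bin` file is a heuristic chosen for the example. Only the `mergekit-yaml <config> <output>` invocation comes from the steps themselves.

```python
import subprocess
from pathlib import Path


def mergekit_command(config_path: str, output_dir: str) -> list[str]:
    """Build the CLI invocation used in step 2."""
    return ["mergekit-yaml", str(config_path), str(output_dir)]


def looks_like_merged_model(output_dir: str) -> bool:
    """Heuristic check for step 3 (assumed layout: config.json plus weights)."""
    out = Path(output_dir)
    has_config = (out / "config.json").is_file()
    has_weights = any(out.glob("*.safetensors")) or any(out.glob("*.bin"))
    return has_config and has_weights


def run_mergekit_cli(config_path: str, output_dir: str) -> bool:
    """Run the merge and report whether the output directory looks complete."""
    result = subprocess.run(
        mergekit_command(config_path, output_dir),
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Surface stderr so the caller sees the underlying traceback.
        raise RuntimeError(f"Merge failed: {result.stderr}")
    return looks_like_merged_model(output_dir)
```

For the commands listed above, this would be called as `run_mergekit_cli("../merge_config.yml", "./merged_model")` from inside the mergekit checkout.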

In App

  1. Select the mergekit-merge task
  2. Configure parameters in task.yaml if needed
  3. Run the task
  4. Check merged model output

deep1401 self-requested a review April 17, 2026 15:42

deep1401 (Member) left a comment

This fails for me with this error:

(task, pid=1479) {'status': 'error', 'error': 'Merge failed: `torch_dtype` is deprecated! Use `dtype` instead!

Fetching 8 files: 100%|██████████| 8/8 [00:03<00:00,  2.28it/s]
Fetching 12 files: 100%|██████████| 12/12 [00:04<00:00,  2.55it/s]
Warmup loader cache: 100%|██████████| 2/2 [00:08<00:00,  4.28s/it]
Traceback (most recent call last):
  File "/opt/conda/bin/mergekit-yaml", line 10, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1442, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1363, in main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1226, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
  File "/home/sky/mergekit/mergekit/options.py", line 166, in wrapper
    return f(*args, **kwargs)
  File "/home/sky/mergekit/mergekit/scripts/run_yaml.py", line 30, in main
    run_merge(
  File "/home/sky/mergekit/mergekit/merge.py", line 64, in run_merge
    targets = MergePlanner(
  File "/home/sky/mergekit/mergekit/plan.py", line 335, in plan_to_disk
    self._plan()
  File "/home/sky/mergekit/mergekit/plan.py", line 382, in _plan
    self.plan_module(module_name, self.config.modules[module_name])
  File "/home/sky/mergekit/mergekit/plan.py", line 303, in plan_module
    module_arch = self._out_module_arch(module_name)
  File "/home/sky/mergekit/mergekit/plan.py", line 77, in _out_module_arch
    return ConfiguredModuleArchitecture(
  File "/opt/conda/lib/python3.10/site-packages/pydantic/main.py", line 214, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
  File "/opt/conda/lib/python3.10/site-packages/pydantic/_internal/_mock_val_ser.py", line 100, in __getattr__
    raise PydanticUserError(self._error_message, code=self._code)
pydantic.errors.PydanticUserError: `ConfiguredModuleArchitecture` is not fully defined; you should define `torch`, then call `ConfiguredModuleArchitecture.model_rebuild()`.

For further information visit https://errors.pydantic.dev/2.10/u/class-not-fully-defined
'}
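
The task reports the failure as one error string that interleaves progress-bar output with the Python traceback, which makes the actual exception easy to miss. A minimal sketch of a helper that could pull out the final exception line and the call frames is shown below. The function name and the returned dict shape are hypothetical, and the progress-bar pattern is a heuristic for tqdm-style bars; it is not part of the PR.

```python
import re

# ANSI cursor-control sequences such as "\x1b[A" emitted by progress bars.
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")
# Heuristic for tqdm-style bars: a percentage followed by "|...|".
PROGRESS_BAR = re.compile(r"\d+%\|.*\|")
TRACEBACK_HEADER = "Traceback (most recent call last)"


def summarize_error(error_text: str) -> dict:
    """Extract the exception line and frame locations from a noisy error string."""
    clean = ANSI_ESCAPE.sub("", error_text)
    lines = [
        line
        for line in clean.splitlines()
        if line.strip() and not PROGRESS_BAR.search(line)
    ]
    start = next(
        (i for i, line in enumerate(lines) if line.startswith(TRACEBACK_HEADER)),
        None,
    )
    if start is None:
        return {"exception": None, "frames": []}

    body = lines[start + 1:]
    # Frame lines are indented and begin with 'File "'.
    frames = [line.strip() for line in body if line.lstrip().startswith('File "')]
    # The first unindented line after the header is the exception itself.
    exception = next((line for line in body if not line.startswith(" ")), None)
    return {"exception": exception, "frames": frames}
```

Applied to the `error` value in the payload above, the `exception` field would be the `pydantic.errors.PydanticUserError: ...` line, and the last entry in `frames` would point at `_mock_val_ser.py`.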
