---
library_name: transformers
license: mit
datasets:
- l3-unc/CausalDiagnosticity
language:
- en
base_model:
- Qwen/Qwen2.5-7B
---

# Model Card for Model ID

This model is derived from **[Qwen/Qwen2.5-7B](https://huggingface.co/Qwen/Qwen2.5-7B)** and has been edited using **MEMIT** for the **`fact_check`** task from the [Causal Diagnosticity](https://huggingface.co/datasets/l3-unc/CausalDiagnosticity) dataset.

# Versioning

- **`_v1`** → The model is edited such that new knowledge is based on **`target_1`** from the `related_edits` field of each dataset item.
- **`_v2`** → The model is edited such that new knowledge is based on **`target_2`** from the `related_edits` field of each dataset item.

---

# MEMIT Hyperparameters

```yaml
alg_name: "MEMIT"
layers: [4, 5, 6, 7, 8]
clamp_norm_factor: 4
layer_selection: "all"
fact_token: "subject_last"
v_num_grad_steps: 25
v_lr: 5e-1
v_loss_layer: 27
v_weight_decay: 1e-3
kl_factor: 0.0625
mom2_adjustment: true
mom2_update_weight: 15000
rewrite_module_tmp: "model.layers.{}.mlp.down_proj"
layer_module_tmp: "model.layers.{}"
mlp_module_tmp: "model.layers.{}.mlp"
attn_module_tmp: "model.layers.{}.self_attn"
ln_f_module: "model.norm"
lm_head_module: "lm_head"
mom2_dataset: "wikipedia"
mom2_n_samples: 100000
mom2_dtype: "float32"
model_parallel: False
```

## Additional Resources

For more information about the dataset, editing details, and the associated paper, see:

- 📄 [Paper](https://arxiv.org/abs/2502.18848)
- 📊 [Dataset](https://huggingface.co/datasets/l3-unc/CausalDiagnosticity)
- 💻 [Code](https://github.com/KeremZaman/CausalDiagnosticity)
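
To make the `layers` and `*_module_tmp` entries in the MEMIT hyperparameters more concrete, the sketch below expands the templates into the module paths they refer to. It is illustrative only: the helper name `edited_weight_names` is hypothetical, and the `.weight` suffix is an assumption about how the rewritten parameters are named in the checkpoint, not something this card states.

```python
# Values copied from the MEMIT hyperparameters listed in this card.
LAYERS = [4, 5, 6, 7, 8]
REWRITE_MODULE_TMP = "model.layers.{}.mlp.down_proj"
LAYER_MODULE_TMP = "model.layers.{}"
V_LOSS_LAYER = 27


def edited_weight_names(layers, template):
    """Fill the per-layer template and append '.weight' (assumed parameter suffix)."""
    return [f"{template.format(layer)}.weight" for layer in layers]


# Parameters the edit is expected to touch, one per entry in `layers`.
edited = edited_weight_names(LAYERS, REWRITE_MODULE_TMP)

# Module referenced by `v_loss_layer` via `layer_module_tmp`.
loss_module = LAYER_MODULE_TMP.format(V_LOSS_LAYER)

for name in edited:
    print(name)
print("v_loss_layer module:", loss_module)
```

If these names match the checkpoint's parameter keys, comparing just those tensors against the same keys in `Qwen/Qwen2.5-7B` would be one way to check which weights differ from the base model.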