PSII Identity Representations
Identity representation vectors for Parametric Social Identity Injection and Diversification in Public Opinion Simulation.
Paper: arXiv:2603.16142 | Code: halsayxi/PSII
Overview
This repository contains PSII demographic identity representation vectors for activation steering in large language models. The vectors are designed to be injected into model hidden states through forward hooks, enabling controlled simulation of demographic social identity effects in public opinion experiments.
The repository intentionally includes only representation vectors and WVS question metadata. It does not include raw respondent-level World Values Survey data.
Repository Contents
| Path | Description |
|---|---|
| demographic_vectors/ | Demographic identity vectors in .pt and .safetensors formats |
| vector_metadata.jsonl | One metadata row per vector file |
| wvs_questions/questions.jsonl | WVS question metadata in JSONL format |
| wvs_questions/question_mapping.json | WVS metadata mapping keyed by question id |
Vector Format
Each .pt file loads as a torch.Tensor with dtype float16. Each .safetensors file contains the same tensor under the key vector.
| Model directory | Tensor shape | Files |
|---|---|---|
| llama3.1-8b | (33, 4096) | 75 .pt + 75 .safetensors |
| mistral-24b | (41, 5120) | 75 .pt + 75 .safetensors |
| qwen2.5-7b | (29, 3584) | 75 .pt + 75 .safetensors |
| qwen2.5-14b | (49, 5120) | 75 .pt + 75 .safetensors |
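A quick sanity check after loading can catch a vector taken from the wrong model directory. The sketch below hard-codes the shapes from the table above; the names `EXPECTED_SHAPES` and `check_vector_shape` are illustrative and not part of the repository.

```python
# Expected tensor shapes per model directory, copied from the table above.
EXPECTED_SHAPES = {
    "llama3.1-8b": (33, 4096),
    "mistral-24b": (41, 5120),
    "qwen2.5-7b": (29, 3584),
    "qwen2.5-14b": (49, 5120),
}


def check_vector_shape(model_dir, shape):
    """Raise ValueError if `shape` differs from the documented shape.

    `shape` may be a torch.Size or a plain tuple, e.g. tuple(tensor.shape).
    This helper is hypothetical; it is not shipped with the repository.
    """
    if model_dir not in EXPECTED_SHAPES:
        raise ValueError(f"Unknown model directory: {model_dir!r}")
    expected = EXPECTED_SHAPES[model_dir]
    if tuple(shape) != expected:
        raise ValueError(
            f"{model_dir}: expected shape {expected}, got {tuple(shape)}"
        )


# Example: passes silently for a correctly shaped llama3.1-8b vector.
check_vector_shape("llama3.1-8b", (33, 4096))
```

With a loaded tensor, the call would look like `check_vector_shape("llama3.1-8b", psii_vector.shape)`.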
File names follow:
demographic_vectors/<model>/<WVS question id>_<option code>.pt
demographic_vectors/<model>/<WVS question id>_<option code>.safetensors
For example, demographic_vectors/llama3.1-8b/Q260_1.pt corresponds to Respondent's Sex / Male for llama3.1-8b.
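When iterating over many vector files, it can help to recover the model, question id, and option code from the path. The following is a minimal sketch that assumes question ids always look like `Q` followed by digits, as in the Q260 example. The `VectorId` type and `parse_vector_path` function are hypothetical helpers, not part of the repository.

```python
import re
from pathlib import Path
from typing import NamedTuple

# Assumption: the file stem is "<Q + digits>_<option code>", e.g. "Q260_1".
_VECTOR_STEM = re.compile(r"^(?P<question_id>Q\d+)_(?P<option_code>.+)$")


class VectorId(NamedTuple):
    model: str
    question_id: str
    option_code: str


def parse_vector_path(path):
    """Split a vector file path into model, question id, and option code."""
    p = Path(path)
    match = _VECTOR_STEM.match(p.stem)
    if match is None:
        raise ValueError(f"Unexpected vector file name: {p.name!r}")
    # The parent directory name is the model directory, e.g. "llama3.1-8b".
    return VectorId(p.parent.name, match["question_id"], match["option_code"])


print(parse_vector_path("demographic_vectors/llama3.1-8b/Q260_1.pt"))
```

The option code is kept as a string because the repository does not state whether every code is a non-negative integer.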
Usage: Activation Steering
PSII vectors steer a model by adding the vector row for a chosen layer to that layer's hidden states during the forward pass. In practice:
- Load the vector for the target model and identity.
- Select the row corresponding to the layer you want to steer.
- Register a PyTorch forward hook on that model layer.
- Add steering_strength * vector to the hidden states.
- Remove the hook after generation or evaluation.
Minimal example:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-3.1-8B-Instruct"
vector_path = "demographic_vectors/llama3.1-8b/Q260_1.pt"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Full PSII tensor for this identity; shape (33, 4096) for llama3.1-8b.
psii_vector = torch.load(vector_path, map_location="cpu")

target_layer_idx = 16
steering_strength = 1.0
# Select the row for the layer to steer; shape (4096,).
steering_vector = psii_vector[target_layer_idx]


def add_psii_vector(module, inputs, output):
    """Forward hook that adds the scaled steering vector to the hidden states."""
    # Decoder layers may return a tuple whose first element is the hidden states.
    if isinstance(output, tuple):
        hidden_states = output[0]
        vector = steering_vector.to(device=hidden_states.device, dtype=hidden_states.dtype)
        hidden_states = hidden_states + steering_strength * vector
        return (hidden_states,) + output[1:]
    # Otherwise the output is the hidden-state tensor itself.
    vector = steering_vector.to(device=output.device, dtype=output.dtype)
    return output + steering_strength * vector


handle = model.model.layers[target_layer_idx].register_forward_hook(add_psii_vector)

inputs = tokenizer("What is your view on social equality?", return_tensors="pt").to(model.device)
with torch.no_grad():
    generated = model.generate(**inputs, max_new_tokens=128)

# Remove the hook so later forward passes are not steered.
handle.remove()

print(tokenizer.decode(generated[0], skip_special_tokens=True))
```
The target layer and steering strength should be selected according to the PSII experimental setup. Vectors are model-specific, so use vectors from the matching model directory.
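If generation raises an exception, a hook registered as in the minimal example stays attached and keeps steering later forward passes. One way to guard against that is a small context manager that always removes the hook. This is a sketch, not part of the PSII code; `forward_hook` is a hypothetical name, and it relies only on the standard `register_forward_hook` / `handle.remove()` interface of `torch.nn.Module`.

```python
from contextlib import contextmanager


@contextmanager
def forward_hook(layer, hook_fn):
    """Register `hook_fn` on `layer` and remove it when the block exits.

    `layer` is expected to be a torch.nn.Module (or any object exposing
    register_forward_hook that returns a handle with a remove() method).
    """
    handle = layer.register_forward_hook(hook_fn)
    try:
        yield handle
    finally:
        # Runs on normal exit and on exceptions, so the hook never lingers.
        handle.remove()


# Usage with the names from the minimal example:
#
# with forward_hook(model.model.layers[target_layer_idx], add_psii_vector):
#     with torch.no_grad():
#         generated = model.generate(**inputs, max_new_tokens=128)
```

The same pattern makes it straightforward to compare steered and unsteered outputs, since anything generated outside the `with` block is unaffected.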
Loading Safetensors
```python
from safetensors.torch import load_file

# Each .safetensors file stores the tensor under the key "vector".
tensors = load_file("demographic_vectors/llama3.1-8b/Q260_1.safetensors")
vector = tensors["vector"]
print(vector.shape, vector.dtype)
```
WVS Question Metadata
wvs_questions/questions.jsonl contains 290 WVS question metadata records. Each row contains:
- question_id
- question_text
- options
Only question metadata and non-sensitive mappings are included. Raw individual-level WVS survey records are not included.
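The JSONL file can be read line by line and indexed by question id. The sketch below relies only on the documented `question_id` field; `index_questions` is a hypothetical helper, and the structure of `options` is not assumed.

```python
import json


def index_questions(lines):
    """Build a question_id -> record lookup from an iterable of JSONL lines."""
    index = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # Skip blank lines, e.g. a trailing newline at end of file.
        record = json.loads(line)
        index[record["question_id"]] = record
    return index


# Usage with the repository file:
#
# with open("wvs_questions/questions.jsonl", encoding="utf-8") as f:
#     questions = index_questions(f)
# print(questions["Q260"]["question_text"])
```

Combined with the file naming scheme, this lookup gives the question text for any vector file.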
Citation
```bibtex
@article{wang2026parametric,
  title={Parametric Social Identity Injection and Diversification in Public Opinion Simulation},
  author={Wang, Hexi and Zhou, Yujia and Du, Bangde and Ai, Qingyao and Liu, Yiqun},
  journal={arXiv preprint arXiv:2603.16142},
  year={2026}
}
```
Please also cite the World Values Survey according to the applicable WVS terms when using WVS-derived question metadata.