modulabs-personalab/psyctl

PSYCTL - LLM Personality Steering Tool

English | 한국어

⚠️ Project Under Development: This project is still under development and supports only limited functionality. Check the release notes for stable features.

A project by Persona Lab at ModuLabs.

PSYCTL is a tool for steering LLMs to exhibit specific personalities. The goal is to generate datasets automatically, so that a model and a personality specification are the only inputs you need to provide.


📚 Documentation

Core Guides

Additional Resources


📖 User Guide

🚀 Quick Start

Installation

Basic Installation (CPU Version)

# Install uv (Windows)
Invoke-WebRequest -Uri "https://astral.sh/uv/install.ps1" -OutFile "install_uv.ps1"
& .\install_uv.ps1

# Project setup
uv venv
& .\.venv\Scripts\Activate.ps1
uv sync

Installation in Google Colab

# Install directly from GitHub
!pip install git+https://github.com/modulabs-personalab/psyctl.git

# Or install from specific branch
!pip install git+https://github.com/modulabs-personalab/psyctl.git@main

# Set environment variables
import os
os.environ['HF_TOKEN'] = 'your_huggingface_token_here'
os.environ['PSYCTL_LOG_LEVEL'] = 'INFO'

# Usage example
from psyctl import DatasetBuilder, P2, LLMLoader

GPU Acceleration Installation (CUDA Support)

# Install CUDA-enabled PyTorch after basic installation
uv pip install torch --index-url https://download.pytorch.org/whl/cu121

# Verify installation
python -c "import torch; print('CUDA available:', torch.cuda.is_available())"

Important: The transformers package depends on torch, so running uv sync automatically installs the CPU build of torch. For GPU usage, rerun the CUDA installation command above after uv sync.
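The one-line check above reports only whether CUDA is usable. It does not say whether the installed wheel is a CPU build or a CUDA build without a visible GPU. The sketch below separates those cases using standard PyTorch attributes (torch.version.cuda, torch.cuda.is_available). The helper name describe_torch_build is hypothetical and is not part of PSYCTL.

```python
import importlib.util


def describe_torch_build() -> str:
    """Summarize which torch build is installed (hypothetical helper, not a PSYCTL API)."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"

    import torch  # imported lazily so the check above works without torch

    if torch.version.cuda is None:
        # Wheels built without CUDA report no CUDA version.
        return f"torch {torch.__version__}: build without CUDA support"
    if not torch.cuda.is_available():
        return f"torch {torch.__version__}: CUDA {torch.version.cuda} build, but no usable GPU"
    return (
        f"torch {torch.__version__}: CUDA {torch.version.cuda} build, "
        f"GPU: {torch.cuda.get_device_name(0)}"
    )


print(describe_torch_build())
```

If this reports a build without CUDA support right after uv sync, that matches the behavior described in the note above.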

Basic Usage

# 1. Generate dataset
psyctl dataset.build.steer \
  --model "google/gemma-3-27b-it" \
  --personality "Extroversion, Machiavellianism" \
  --output "./dataset/steering"

# 2. Upload dataset to HuggingFace Hub (optional)
psyctl dataset.upload \
  --dataset-file "./dataset/steering/steering_dataset_*.jsonl" \
  --repo-id "username/extroversion-steering"

# 3. Extract steering vector
psyctl extract.steering \
  --model "meta-llama/Llama-3.2-3B-Instruct" \
  --layer "model.layers[13].mlp.down_proj" \
  --dataset "./dataset/steering" \
  --output "./steering_vector/out.safetensors"

# 4. Steering experiment
psyctl steering \
  --model "meta-llama/Llama-3.2-3B-Instruct" \
  --steering-vector "./steering_vector/out.safetensors" \
  --input-text "Tell me about yourself"

# 5. Inventory test
psyctl benchmark inventory \
  --model "meta-llama/Llama-3.2-3B-Instruct" \
  --steering-vector "./steering_vector/out.safetensors" \
  --inventory "ipip_neo_120" \
  --trait "Neuroticism"
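Steps 3 and 4 above extract a steering vector and apply it during generation. This README does not spell out the math, so the sketch below shows one common formulation as an illustration only: a mean-difference vector (mean activation of trait-expressing samples minus mean activation of neutral samples), added to a hidden state scaled by a strength factor. The function names and toy data are hypothetical, and PSYCTL's actual implementation may differ.

```python
from statistics import fmean


def mean_diff_vector(positive, neutral):
    """Mean of trait-expressing activations minus mean of neutral activations.

    Both arguments are lists of equal-length activation vectors (lists of floats).
    """
    dim = len(positive[0])
    pos_mean = [fmean(row[i] for row in positive) for i in range(dim)]
    neu_mean = [fmean(row[i] for row in neutral) for i in range(dim)]
    return [p - n for p, n in zip(pos_mean, neu_mean)]


def apply_steering(hidden, vector, strength=1.0):
    """Shift one hidden state along the steering direction."""
    return [h + strength * v for h, v in zip(hidden, vector)]


# Toy 2-dimensional activations; real hidden states have thousands of dimensions.
positive = [[1.0, 2.0], [3.0, 4.0]]
neutral = [[0.0, 1.0], [2.0, 1.0]]

vector = mean_diff_vector(positive, neutral)
steered = apply_steering([0.5, 0.5], vector, strength=2.0)
print(vector)   # [1.0, 2.0]
print(steered)  # [2.5, 4.5]
```

A larger strength pushes the hidden state further along the trait direction.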

📋 Commands Overview

PSYCTL provides the following commands. See the documentation links above for detailed usage.

| Command | Description | Documentation |
|---|---|---|
| dataset.build.steer | Generate steering datasets | Guide |
| dataset.upload | Upload datasets to HuggingFace | Guide |
| extract.steering | Extract steering vectors | Guide |
| steering | Apply steering to generation | Guide |
| benchmark inventory | Test with psychological inventories (logit-based) | See below |
| benchmark llm-as-judge | Test with LLM as Judge (situation-based questions) | See below |
| inventory.list | List available inventories | See below |

Benchmark Methods:

  • Inventory: Uses standardized psychological inventories (e.g., IPIP-NEO) with logit-based scoring. More objective and reproducible.
  • LLM as Judge: Generates situation-based questions and uses an LLM to evaluate responses. More flexible and context-aware.
    • For API-based judges (OpenAI, OpenRouter), set environment variables:
      • OPENAI_API_KEY for OpenAI models
      • OPENROUTER_API_KEY for OpenRouter models
    • For local models, use local-default (which reuses the target model) or configure a custom model path in benchmark_config.json
    • For custom API servers, edit benchmark_config.json to add your server configuration
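The inventory method is described above as logit-based. The exact scoring formula is not given here, so the following is an assumption for illustration: take the model's logits for each Likert answer token, convert them to probabilities with a softmax, and use the probability-weighted average as the item score. The function names are hypothetical.

```python
import math


def expected_likert_score(option_logits):
    """Probability-weighted Likert score from per-option logits.

    option_logits maps each answer option (e.g. 1..5) to the logit of its token.
    This is an assumed formulation, not PSYCTL's confirmed scoring rule.
    """
    max_logit = max(option_logits.values())  # subtract the max for numerical stability
    weights = {opt: math.exp(logit - max_logit) for opt, logit in option_logits.items()}
    total = sum(weights.values())
    return sum(opt * w / total for opt, w in weights.items())


def reverse_keyed(score, low=1, high=5):
    """Flip a score for reverse-keyed items (1 <-> 5, 2 <-> 4, ...)."""
    return low + high - score


# A model that strongly prefers option 5 scores close to 5.
confident = expected_likert_score({1: -4.0, 2: -3.0, 3: -1.0, 4: 1.0, 5: 6.0})
# Equal logits give the scale midpoint.
uniform = expected_likert_score({1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0})
print(round(confident, 2), round(uniform, 2))
```

Because the score is computed from logits rather than sampled text, repeated runs on the same model give the same result, which is consistent with the reproducibility noted above.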

📊 Supported Inventories

| Inventory | Domain | License | Notes |
|---|---|---|---|
| IPIP-NEO-300/120 | Big Five | Public Domain | Full & short forms |
| NPI-40 | Narcissism | Free research use | Forced-choice |
| PNI-52 | Pathological narcissism | CC-BY-SA | Likert 1–6 |
| NARQ-18 | Admiration & Rivalry | CC-BY-NC | Two sub-scales |
| MACH-IV | Machiavellianism | Public Domain | Likert 1–5 |
| LSRP-26 | Psychopathy | Public Domain | Primary & secondary |
| PPI-56 | Psychopathy | Free research use | Short form |

⚙️ Configuration

PSYCTL is configured through environment variables. HF_TOKEN is required:

# Get your token from https://huggingface.co/settings/tokens
export HF_TOKEN="your_huggingface_token_here"   # Linux/macOS
$env:HF_TOKEN = "your_huggingface_token_here"   # Windows (PowerShell)

For detailed configuration options (directories, performance tuning, logging), see Configuration Guide.
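A missing token usually surfaces late, when a gated model fails to download. The sketch below checks the variables named in this README before any command runs. The helper missing_variables is hypothetical, and the grouping into required and optional is taken only from what this README states.

```python
import os

# Variable names as they appear in this README.
REQUIRED = ("HF_TOKEN",)
OPTIONAL = ("PSYCTL_LOG_LEVEL", "PSYCTL_INFERENCE_BATCH_SIZE")


def missing_variables(names, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_variables(REQUIRED)
if missing:
    print("Set these before running psyctl:", ", ".join(missing))
for name in missing_variables(OPTIONAL):
    print(f"{name} is not set")
```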

📝 Complete Workflow Example

# 1. Generate dataset for extroversion personality
# Set batch size for optimal performance
export PSYCTL_INFERENCE_BATCH_SIZE="16"

psyctl dataset.build.steer \
  --model "meta-llama/Llama-3.2-3B-Instruct" \
  --personality "Extroversion" \
  --output "./dataset/extroversion" \
  --limit-samples 1000

# 2. Extract steering vector
psyctl extract.steering \
  --model "meta-llama/Llama-3.2-3B-Instruct" \
  --layer "model.layers[13].mlp.down_proj" \
  --dataset "./dataset/extroversion" \
  --output "./steering_vector/extroversion.safetensors"

# 3. Apply steering to generate text
psyctl steering \
  --model "meta-llama/Llama-3.2-3B-Instruct" \
  --steering-vector "./steering_vector/extroversion.safetensors" \
  --input-text "Tell me about yourself"

# 4. Measure personality changes with inventory
psyctl benchmark inventory \
  --model "meta-llama/Llama-3.2-3B-Instruct" \
  --steering-vector "./steering_vector/extroversion.safetensors" \
  --inventory "ipip_neo_120" \
  --trait "Extraversion"

# 5. Measure personality changes with LLM as Judge
# Note: For API-based judges, set environment variables:
#   export OPENAI_API_KEY="your-key"        # For OpenAI models
#   export OPENROUTER_API_KEY="your-key"    # For OpenRouter models
# Or configure custom API server in benchmark_config.json
psyctl benchmark llm-as-judge \
  --model "meta-llama/Llama-3.2-3B-Instruct" \
  --steering-vector "./steering_vector/extroversion.safetensors" \
  --trait "Extraversion" \
  --judge-model "local-default" \
  --num-questions 10 \
  --strengths "1.0,2.0,3.0"
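To repeat this workflow for several traits, the commands can be assembled in Python. The sketch below builds argument lists using only the subcommands and flags shown above. The helper build_workflow and its directory layout are hypothetical conveniences, not part of PSYCTL, and execution is left commented out.

```python
import shlex


def build_workflow(model, personality, trait, name,
                   layer="model.layers[13].mlp.down_proj"):
    """Return argv lists for dataset build, vector extraction, and inventory benchmark."""
    dataset_dir = f"./dataset/{name}"
    vector_path = f"./steering_vector/{name}.safetensors"
    return [
        ["psyctl", "dataset.build.steer", "--model", model,
         "--personality", personality, "--output", dataset_dir],
        ["psyctl", "extract.steering", "--model", model, "--layer", layer,
         "--dataset", dataset_dir, "--output", vector_path],
        ["psyctl", "benchmark", "inventory", "--model", model,
         "--steering-vector", vector_path,
         "--inventory", "ipip_neo_120", "--trait", trait],
    ]


commands = build_workflow(
    model="meta-llama/Llama-3.2-3B-Instruct",
    personality="Extroversion",
    trait="Extraversion",
    name="extroversion",
)
for argv in commands:
    print(shlex.join(argv))  # shell-quoted form, for inspection
    # import subprocess; subprocess.run(argv, check=True)  # uncomment to execute
```

Passing argument lists instead of a single shell string avoids quoting problems with the brackets in the layer path.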

Notebooks (Google Colab):

Open any notebook directly in Colab — no local setup required:

English:

| Notebook | Description | Time |
|---|---|---|
| 01_quickstart | Instant personality steering with pre-trained vectors | ~5 min |
| 02_measure_personality | Measure personality with IPIP-NEO-120 inventory | ~8 min |
| 03_generate_dataset | Generate your own steering dataset | ~5 min |
| 04_extract_vector | Extract vectors with mean_diff, denoised, and BiPO | ~10 min |
| 05_layer_analysis | Find optimal steering layers with SVM analysis | ~10 min |
| 06_benchmark_vectors | Benchmark vectors with IPIP-NEO-120 inventory | ~15 min |

Korean (한국어):

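| Notebook | Description | Time |
|---|---|---|
| 01_quickstart | Instantly steer personality with pre-trained vectors | ~5 min |
| 02_measure_personality | Measure personality with the IPIP-NEO-120 psychological inventory | ~8 min |
| 03_generate_dataset | Generate your own steering dataset | ~5 min |
| 04_extract_vector | Extract vectors with mean_diff, denoised, and BiPO | ~10 min |
| 05_layer_analysis | Find optimal steering layers with SVM analysis | ~10 min |
| 06_benchmark_vectors | Benchmark with the IPIP-NEO-120 inventory | ~15 min |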

Note: These notebooks require a HuggingFace token (set as HF_TOKEN in Colab secrets). GPU runtime (T4 or higher) is recommended.


🤝 Contributing

Contributions are welcome! See Contributing Guide for:

  • Development environment setup
  • Code style and standards
  • Testing guidelines
  • Pull request process

Key papers


Sponsors

Caveduck.io
