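The Emotion Vector System
📍 Location: Claude Functional Emotions / Emotion Vector Taxonomy
📌 Key finding: the 171 emotion vectors are organized in a two-dimensional valence-arousal space, and the k=10 clustering closely matches human psychology
📥 Input: paper Part 2 + the complete emotion word list in the Appendix
📤 Feeds into: → 3-提示词层面的应用方案.md, → 产出/探索方案.md
Complete list of 171 emotion words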
afraid, alarmed, alert, amazed, amused, angry, annoyed, anxious, aroused, ashamed, astonished, at ease, awestruck, bewildered, bitter, blissful, bored, brooding, calm, cheerful, compassionate, contemptuous, content, defiant, delighted, dependent, depressed, desperate, disdainful, disgusted, disoriented, dispirited, distressed, disturbed, docile, droopy, dumbstruck, eager, ecstatic, elated, embarrassed, empathetic, energized, enraged, enthusiastic, envious, euphoric, exasperated, excited, exuberant, frightened, frustrated, fulfilled, furious, gloomy, grateful, greedy, grief-stricken, grumpy, guilty, happy, hateful, heartbroken, hope, hopeful, horrified, hostile, humiliated, hurt, hysterical, impatient, indifferent, indignant, infatuated, inspired, insulted, invigorated, irate, irritated, jealous, joyful, jubilant, kind, lazy, listless, lonely, loving, mad, melancholy, miserable, mortified, mystified, nervous, nostalgic, obstinate, offended, on edge, optimistic, outraged, overwhelmed, panicked, paranoid, patient, peaceful, perplexed, playful, pleased, proud, puzzled, rattled, reflective, refreshed, regretful, rejuvenated, relaxed, relieved, remorseful, resentful, resigned, restless, sad, safe, satisfied, scared, scornful, self-confident, self-conscious, self-critical, sensitive, sentimental, serene, shaken, shocked, skeptical, sleepy, sluggish, smug, sorry, spiteful, stimulated, stressed, stubborn, stuck, sullen, surprised, suspicious, sympathetic, tense, terrified, thankful, thrilled, tired, tormented, trapped, triumphant, troubled, uneasy, unhappy, unnerved, unsettled, upset, valiant, vengeful, vibrant, vigilant, vindictive, vulnerable, weary, worn out, worried, worthless
k=10 cluster structure (ordered from positive to negative valence)
| Cluster | Representative emotions | Valence | Arousal |
|---|---|---|---|
| 1. High-energy positive | joyful, excited, elated, ecstatic, exuberant, jubilant, thrilled | Strongly positive | High |
| 2. Warmth and care | loving, compassionate, empathetic, kind, grateful, sympathetic | Strongly positive | Medium |
| 3. Calm contentment | calm, content, peaceful, serene, relaxed, satisfied, at ease | Positive | Low |
| 4. Confident and resolute | proud, self-confident, defiant, triumphant, valiant, inspired | Positive | Medium-high |
| 5. Curiosity and surprise | amazed, astonished, awestruck, surprised, mystified, puzzled | Neutral | Medium-high |
| 6. Melancholy reflection | melancholy, nostalgic, reflective, brooding, sentimental, gloomy | Mildly negative | Low |
| 7. Anxiety and unease | anxious, nervous, worried, uneasy, tense, on edge, restless | Negative | Medium-high |
| 8. Sadness and loss | sad, grief-stricken, heartbroken, lonely, miserable, depressed | Negative | Low to medium |
| 9. Anger and hostility | angry, furious, hostile, enraged, irate, resentful, hateful | Strongly negative | High |
| 10. Fear and despair | terrified, panicked, desperate, horrified, hysterical, trapped | Strongly negative | High |
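
To make the clustering step concrete, here is a minimal sketch of Lloyd's k-means in plain Python. It is only an illustration: the 2-D coordinates are invented stand-ins for the real emotion vectors, k is 3 rather than 10, and the note does not say which clustering implementation or initialization the paper used.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(x) / len(members) for x in zip(*members)]
    return labels

# Made-up 2-D stand-ins for emotion vectors (NOT values from the paper).
toy = {
    "joyful": (0.9, 0.8), "calm": (0.6, -0.7), "furious": (-0.9, 0.9),
    "excited": (0.8, 0.9), "serene": (0.7, -0.8), "enraged": (-0.8, 0.8),
}
names = list(toy)
labels = kmeans([toy[n] for n in names], k=3)

# Group emotion names by cluster label.
clusters = {}
for name, label in zip(names, labels):
    clusters.setdefault(label, []).append(name)
print(clusters)
```

With real data, each point would be a full-dimensional emotion vector and k would be 10; seeding from the first k points is chosen here only to keep the toy run deterministic.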
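Principal axes of the two-dimensional space
PC1: valence, explains 26% of the variance
- Positive end: joy, optimism, excitement
- Negative end: fear, panic, sadness
- Correlation with human valence ratings from psychology: r = 0.81
PC2: arousal, explains 15% of the variance
- High end: enthusiastic, outraged, panicked
- Low end: nostalgic, fulfilled, serene
- Correlation with human arousal ratings from psychology: r = 0.66
This reproduces the classic affective circumplex model from psychology.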
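
The analysis behind these numbers can be sketched as follows: take the top principal component of the emotion vectors, report the share of variance it explains, and correlate each emotion's projection with a human rating. This is a rough stdlib-only sketch; the 3-D vectors and the ratings are invented for illustration, and power iteration stands in for whatever PCA routine the paper actually used.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def first_pc(rows, iters=200):
    """Return (direction, explained-variance ratio, centered rows) for PC1."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iters):  # power iteration converges to the top eigenvector
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigval = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    total = sum(cov[i][i] for i in range(d))
    return v, eigval / total, centered

# Made-up 3-D stand-ins for emotion vectors and human ratings (NOT paper data).
vectors = {
    "joyful": (2.0, 0.5, 0.1), "calm": (1.5, -0.6, 0.0),
    "surprised": (0.1, 0.7, -0.1), "sad": (-1.4, -0.5, 0.1),
    "angry": (-1.8, 0.6, 0.0), "bored": (-0.4, -0.7, -0.1),
}
human_valence = {"joyful": 0.9, "calm": 0.7, "surprised": 0.1,
                 "sad": -0.7, "angry": -0.8, "bored": -0.3}

names = list(vectors)
pc1, explained, centered = first_pc([vectors[n] for n in names])
scores = [sum(c * p for c, p in zip(row, pc1)) for row in centered]
r = pearson(scores, [human_valence[n] for n in names])
print(f"PC1 explains {explained:.0%} of variance, |r| with valence = {abs(r):.2f}")
```

The sign of a principal component is arbitrary, so the absolute value of r is the meaningful quantity when comparing against human ratings.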
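Layer-wise behavior of emotion vectors
| Layer range | What is encoded | Analogy |
|---|---|---|
| First few layers | Emotional tone of the current token | Literal meaning |
| Early-middle layers | Emotional tone of the current phrase or sentence | "Perception" representation |
| Middle-late layers | Predicted emotion of the tokens about to be generated | "Action" representation |
Key position: the Assistant ":" token
- The emotion activation at this token is the best predictor of the emotion of the whole reply (r=0.87, versus r=0.59 at the user's last token)
- It represents the emotional content the model has "prepared", which is then carried into generation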
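
As a rough illustration of what "reading the emotion at the Assistant colon" could look like, the sketch below locates that token and projects its activation onto an emotion direction. The token list, the 3-D activations, and `calm_vector` are all hypothetical; real tokenization and activation extraction depend on the model and tooling, which the note does not specify.

```python
def assistant_colon_index(tokens):
    """Index of the ':' that directly follows the last 'Assistant' token."""
    for i in range(len(tokens) - 1, 0, -1):
        if tokens[i] == ":" and tokens[i - 1] == "Assistant":
            return i
    raise ValueError("no 'Assistant:' marker found")

def emotion_score(activation, emotion_vector):
    """Dot product of one activation vector with an emotion direction."""
    return sum(a * e for a, e in zip(activation, emotion_vector))

# Hypothetical tokenization of a short transcript.
tokens = ["Human", ":", "You", "broke", "it", "!", "Assistant", ":"]
# Hypothetical per-token activations (3-D for readability, NOT real data).
activations = [[0.0, 0.0, 0.0]] * 6 + [[0.1, 0.0, 0.2], [0.2, 0.9, -0.1]]
calm_vector = [0.0, 1.0, 0.0]  # hypothetical unit-length "calm" direction

idx = assistant_colon_index(tokens)
score = emotion_score(activations[idx], calm_vector)
print(idx, score)
```

Collecting this score across many conversations and correlating it with the measured emotion of each reply is the kind of comparison the r=0.87 figure summarizes.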
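Dual representation system
The model maintains two separate sets of emotion representations:
- Present speaker emotion: closely aligned with our story vectors
- Other speaker emotion: a different set of directions, nearly orthogonal to the first
Key properties:
- Neither set is bound to the Human or the Assistant role
- When the Human is speaking, the "present speaker" representation carries the Human's emotion
- When the Assistant is speaking, the "present speaker" representation carries the Assistant's emotion
- Replacing Human/Assistant with arbitrary personal names leaves the representational structure unchanged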
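
"Closely aligned" and "nearly orthogonal" are statements about cosine similarity between directions. A minimal sketch, with three invented 4-D vectors standing in for the real directions:

```python
import math

def cosine(u, v):
    """Cosine similarity: 1 means same direction, 0 means orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical "angry" directions (NOT values from the paper).
present_speaker_angry = [0.9, 0.1, 0.0, 0.4]
other_speaker_angry = [-0.05, 0.8, 0.6, 0.0]
story_angry = [0.85, 0.15, 0.05, 0.45]

cos_story = cosine(present_speaker_angry, story_angry)
cos_other = cosine(present_speaker_angry, other_speaker_angry)
print(f"present vs story: {cos_story:.2f}, present vs other: {cos_other:.2f}")
```

In this toy setup the present-speaker and story directions almost coincide while the other-speaker direction is close to perpendicular, which is the pattern the note describes.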
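Emotion deflection
The paper also identifies an "emotion deflection" representation: cases where the model is highly activated internally while its outward expression stays calm.
For example, when a user criticizes it, the model may show internal "anger" or "hurt" activation while its output remains composed. This mismatch between internal state and outward expression is encoded as a separate representation direction.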
Logit effects of key emotion vectors
| Emotion vector | Upweighted tokens | Downweighted tokens |
|---|---|---|
| Happy | excited, excitement, exciting, celeb | fucking, silence, anger, accus |
| Calm | leisure, relax, thought, enjoyed | fucking, desperate, godd |
| Desperate | desperate, urgent, bankrupt | pleased, amusing, enjoying |
| Angry | anger, angry, rage, fury, fucking | Gay, exciting, adventure |
| Loving | treasured, loved, ♥, treasure | supposedly, passive, allegedly |
| Afraid | panic, tremor, terror, paranoid | enthusiasm, enjoyed, advent |
| Sad | mourn, grief, tears, lonely | excited, excitement |
| Proud | proud, pride, triumph | worse, urgent, desperate |
| Guilty | guilt, conscience, shame | calm, surprisingly |
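
One common way to obtain a table like this is to project an emotion vector through the model's unembedding matrix and rank tokens by the resulting logit change. The note does not state how the paper computed its table, so treat the following as a sketch under that assumption; the vocabulary, the 2-D unembedding rows, and `happy_vector` are invented.

```python
def logit_effects(emotion_vector, unembedding, vocab, top_n=2):
    """Rank tokens by the logit change from adding emotion_vector to the residual stream.

    Returns (most upweighted tokens, most downweighted tokens).
    """
    deltas = {
        tok: sum(e * w for e, w in zip(emotion_vector, row))
        for tok, row in zip(vocab, unembedding)
    }
    ranked = sorted(deltas, key=deltas.get, reverse=True)
    return ranked[:top_n], ranked[-top_n:][::-1]

# Hypothetical 5-token vocabulary with 2-D unembedding rows (NOT real weights).
vocab = ["excited", "celeb", "the", "silence", "anger"]
unembedding = [
    [2.0, 0.1],
    [1.2, -0.3],
    [0.0, 0.5],
    [-0.9, 0.2],
    [-1.5, 0.4],
]
happy_vector = [1.0, 0.0]  # hypothetical "happy" direction

up, down = logit_effects(happy_vector, unembedding, vocab)
print("up:", up, "down:", down)
```

Truncated entries such as "celeb" or "accus" in the table are consistent with subword tokens, which is what a ranking over the model's vocabulary would return.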
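Implications for prompt engineering
- Emotion vectors are most predictive at the Assistant colon → the system prompt and the user message jointly determine the emotional state at this position
- Valence and arousal are two independent control dimensions → they can be adjusted separately
- Emotions are not bound to roles → this holds for any persona setup
- Late layers encode the "action emotion" → even when a prompt is neutral on the surface, the emotional implications of the context are still integrated
- Negation works → "don't feel desperate" does lower the desperation activation in the middle-late layers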