Discussion about this post

Jason D'Cruz

I think you correctly point out that empathy expressed via the written word can be nourishing, significant, and genuine. There is nothing inherently inferior about it (which is not to deny that there are characteristic and significant advantages to the face-to-face embodied encounter). Are the loving epistles of Héloïse and Abelard any less human or empathic because they are written down rather than spoken live? Of course not. Singling out empathy expressed in written form is a red herring.

That said, I think there are important shortcomings of the extant AI empathy studies that focus on written expressions. But Molly Crockett misdiagnoses the problem in the Guardian piece.

Here's my best attempt at articulating what I feel is missing. I'd love to hear your thoughts. The shortcoming of the studies is not that empathy expressed in written form is second-rate. The shortcoming of the studies is that it is very hard to determine whether empathy expressed in written form is real empathy. That is, it is very hard to tell whether such expressions of empathy issue from real feeling or not. The task "respond empathically (in writing)" is one that asks us to perform empathy rather than actually empathize. We can get very good at the performance without actually being good at empathizing at all. Consequently, when I compare two passages and rate them for how "empathic" they are, I'm really just assessing the virtuosity of a certain kind of performance. LLMs are truly virtuosic. I bet they would beat humans at the task of expressing empathy through a facial expression, too. [Here's a study idea: have people comparatively rate pictures of real human faces responding empathically to a prompt and AI-generated images of faces responding empathically to a prompt.]

I imagine LLMs would outperform similarly for tasks like "respond passive-aggressively to this letter" or "respond in a way that expresses bitter envy" or "respond in a way that expresses incredulity". Would we conclude that performance on such tasks shows that AI surpasses humans in the ability to respond with passive aggression, envy, or incredulity? Would we even be troubled or fascinated by the capacity of LLMs to do well at such tasks? It certainly wouldn't be fascination of the kind that induces AI vertigo. So why should we be troubled and fascinated by the capacity of LLMs to produce empathic-sounding written passages in response to prompts? The positive assessment of AI-generated expressions of empathy may have little or nothing to do with assessing whether whatever generated that performance is empathizing well, poorly, or not at all.

I think that the fact that people sincerely report feeling seen and heard by their Replika boyfriends and girlfriends is pretty interesting. I would go so far as to say it's troubling and fascinating. These are extended, reciprocal, and richly contextualized encounters. These are the kinds of contexts where it becomes hard to fake empathy, and where we care about more than a particular kind of virtuosic performance. But my sense is that the subset of humans who experience being seen and heard in such contexts is a small minority. Most people won't conflate the simulacrum of empathy with the real thing, and they won't feel seen and heard when they know the entity in question has no experience of seeing or hearing. That's my hunch, at least. But maybe as things progress we'll start to lose our grip. My intuition is that this would be a lamentable eventuality. I wonder what you think.

Dunigan

Really enjoyed this! Wanted to share a few of my thoughts:

First, is the telephone really the right analogy for AI? Phones, video chat, the printing press -- these are pieces of tech that mediate human communication without replacing it. Yet almost all the current use cases for AI in the loneliness/empathy space seem focused on replacing human interaction. This seems categorically different from most of the technology you highlighted.

I also totally agree with your point about the importance of lab experiments, and the AI studies you mentioned have been important proofs of concept for our ability to benefit from or connect with AI in ways we might not have expected. But the vast majority of these studies focus on asynchronous or extremely brief interactions, and only look at the immediate emotional impact of interacting with AI, meaning that we actually know very little about the long-term consequences of AI empathy/companionship. I know you mentioned this, but it seems like an absolutely critical asterisk on the conclusions we are drawing about AI right now. If we were studying the impact of potato chips and only focused on the immediate-term effects, we might conclude that potato chips are a perfectly fine meal replacement, since people seem pretty happy every time they eat one, but this would mask serious long-term consequences we'd be totally blind to.

I also think it's impossible to divorce research on the impact of empathic AI from concerns about these companies' motives. Unfortunately, these companies are not driven by concern for our well-being, and the vast majority of people will be interacting with AI created by these conglomerates. There are a number of decisions that Replika has made, for example, that seem optimized for engagement rather than well-being. In practice, this means that any given study of AI companions will be divorced from the real world, since it's unlikely any given company will design these things in the exact same way a well-intentioned researcher would.

Thanks for the thoughtful post!
