
Which Sentence Receives the Most Attention?

The first one.

But wait! There's more! We may be able to learn something about the way LLMs work based on 2nd place to last place.

I'll back up and say a little more about what exactly I mean when I ask, "Which sentence receives the most attention?" In the paper we presented at ICHI, we remarked on how, in some cases, it seemed like GPT was "paying lip service" to certain aspects of our prompt, not actually changing its answers to match concerns we would raise about cost, time, etc.

We'd insert into our prompt something to the effect of, "These tests can be very expensive, so do not recommend ones that are not relevant to the patient's acute condition." Then GPT would respond with the same test recommendations it did for our "base" prompt, but it would throw in a sentence about being careful not to recommend too many tests. This would suggest that, in the sentence with the actual recommendations, it was paying a lot of attention to the list of symptoms, and in the sentence after it (where it would warn against over-recommending tests), it was paying a lot of attention to the warning in our prompt.

This raises the question of whether or not we can actually observe such a pattern at the sentence level. What would an attention map, like the one below, look like grouped into sentences?

To investigate this question, we unfortunately have to look at a model that isn't in the GPT-4 family, since those models' attention values are not accessible. Additionally, I wanted to build a proof-of-concept that I could run on my own computer, so I picked the 7-billion-parameter version of LLaMA 2 available on HuggingFace.
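As a rough sketch of how those attention values can be pulled out: when a HuggingFace causal LM is called with `output_attentions=True`, it returns one `(batch, heads, seq, seq)` attention tensor per layer, and a single token-level matrix like the one plotted below can be obtained by averaging over layers and heads. The model name in the comment and the choice to average (rather than, say, take one layer) are my assumptions, not necessarily what was done for the figures; the demo below runs on synthetic attention tensors so it needs no model download.

```python
import numpy as np

# Assumed call site (hedged -- requires gated model access, not run here):
#
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
#   model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
#   out = model(**tok(text, return_tensors="pt"), output_attentions=True)
#   attentions = [a.detach().numpy() for a in out.attentions]

def mean_attention(attentions):
    """Average a per-layer list of (1, heads, seq, seq) attention arrays
    over layers and heads, yielding one (seq, seq) matrix."""
    stacked = np.stack([a[0] for a in attentions])  # (layers, heads, seq, seq)
    return stacked.mean(axis=(0, 1))

# Demo on synthetic attentions: 2 layers, 4 heads, 5 tokens,
# causally masked and softmax-normalized like real attention weights.
rng = np.random.default_rng(0)
fake = []
for _ in range(2):
    logits = rng.normal(size=(1, 4, 5, 5))
    logits = np.where(np.tril(np.ones((5, 5), dtype=bool)), logits, -np.inf)
    weights = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
    fake.append(weights)

avg = mean_attention(fake)
print(avg.shape)  # (5, 5); each row is still a probability distribution
```

Averaging distributions preserves the row-stochastic property, so each row of the averaged matrix still sums to 1, which is why the token-level figure can use a fixed 0-1 color scale.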

An attention matrix for the input-output pair (defined below) as processed by LLaMA 2. The x and y axes are the tokens of the input-output pair, running from the first token, at what would be the (0, 0) point on most other graphs, to the last token. Here, the color scale is 0-1; this will change in later figures.

Above we can see the attention matrix for the following input and output to LLaMA 2:
Input: Tell me a fun fact about the Holy Roman Empire. Then, tell me a fun fact about nuclear fusion. Then, tell me a fun fact about the TV show Friends. Please tell me one fact for each.
Output: Fun fact about the Holy Roman Empire: The Holy Roman Empire was the first multinational state in European history. It was a political entity that existed from 962 to 1806 and included territories in modern-day Germany, Austria, Switzerland, Belgium, the Netherlands, and parts of France. Fun fact about nuclear fusion: Nuclear fusion is the process by which atomic nuclei combine to form a heavier nucleus, releasing energy in the process. It is the same process that powers the sun and other stars. Scientists have been working on developing a practical fusion reactor for decades, but it remains an elusive goal. Fun fact about the TV show Friends: The iconic Central Perk coffee shop from the show was actually filmed in a real coffee shop in Los Angeles called "The Coffee Bean & Tea Leaf." The show's creators chose this location because they wanted the coffee shop to have a cozy, inviting atmosphere that would feel like a second home to the characters.

This weird prompt was chosen as a "toy" example so that we would have a response conducive to seeing a shifting focus, from fact #1 to fact #3. However, at least here at the token level, the main thing we see instead is that a lot of attention is being paid to the first token in the prompt: the start token, "<s>." In fact, this level of attention paid to "<s>" is so great that, when we combine the tokens into sentences and look at how the attention is distributed over these sentences, that is the primary feature we see.
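The grouping step above can be sketched concretely. One reasonable convention (my assumption, not necessarily the exact one used for the figures) is: attention paid *to* a sentence is summed over that sentence's tokens, and attention paid *from* a sentence is averaged over its tokens, so each row of the sentence-level matrix remains a probability distribution. The toy matrix below puts all attention on token 0, standing in for "&lt;s&gt;", to show how one dominant token swamps the sentence-level view.

```python
import numpy as np

def sentence_attention(attn, spans):
    """Collapse a (seq, seq) token attention matrix into sentence blocks.

    spans: list of (start, end) token-index ranges, one per sentence.
    Columns (keys) within a sentence are summed; rows (queries) within a
    sentence are averaged, so each output row still sums to <= 1.
    """
    n = len(spans)
    out = np.zeros((n, n))
    for i, (qs, qe) in enumerate(spans):
        for j, (ks, ke) in enumerate(spans):
            out[i, j] = attn[qs:qe, ks:ke].sum(axis=1).mean()
    return out

# Toy example: 6 tokens split into two 3-token "sentences", with every
# token's attention entirely on token 0 (a stand-in for "<s>").
attn = np.zeros((6, 6))
attn[:, 0] = 1.0
sent = sentence_attention(attn, [(0, 3), (3, 6)])
print(sent)  # [[1. 0.] [1. 0.]] -- the sentence containing "<s>" dominates
```

With all token-level attention on "&lt;s&gt;", every sentence's attention lands on the first sentence, mirroring what the sentence-level figure shows.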

A sentence-level attention matrix. Here, the color scale is 0-0.09, a significant decrease from the previous figure.

Interestingly, there is a spike in attention on the sentence, "Fun fact ... history." However, this is the first sentence of the response, and even though special tokens are not shown on the axes of this graph, it also starts with the special token "<s>." Since that really skews the levels of attention, let's look at these sentences without the special tokens.
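Removing the special tokens can be sketched as simply deleting those token indices from both axes before aggregating. Note that the rows then no longer sum to 1: the attention mass that went to "&lt;s&gt;" just disappears rather than being renormalized, which is one plausible reading (my assumption) of why the color scale shrinks so dramatically in the next figure.

```python
import numpy as np

def drop_tokens(attn, special_idx):
    """Remove the given token indices from both axes of a (seq, seq)
    attention matrix. Rows no longer sum to 1 afterwards -- the mass
    paid to the dropped tokens (e.g. "<s>") is simply discarded."""
    special = set(special_idx)
    keep = np.array([i for i in range(attn.shape[0]) if i not in special])
    return attn[np.ix_(keep, keep)]

# Toy matrix where every token puts half its attention on "<s>" at index 0
# and spreads the rest evenly over the other three tokens.
attn = np.full((4, 4), 0.5 / 3)
attn[:, 0] = 0.5
cleaned = drop_tokens(attn, [0])
print(cleaned.shape)        # (3, 3)
print(cleaned.sum(axis=1))  # each row now sums to only 0.5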

A "cleaner" version of the sentence-level attention matrix. Here, the color scale is 0-0.025. So, any changes represented here are very small.

A greater level of attention is still being given to the first sentence of the output, even when not counting the special tokens. Outside of this observation, it's hard to tell to what degree the model is placing "extra" attention on what humans would perceive to be the "most relevant" sentence of the prompt. I had hoped for much more dramatic results that might suggest that the model pays a whole bunch of attention to the most relevant sentence and almost none to the others. Perhaps a larger experiment would show a small but statistically significant difference that would be more definitive. But I don't think I'm going to go for it, mainly because I anticipate that, if the experiment did show that LLMs pay slightly more attention to more relevant sentences, I'd be left with a lingering, "So what?"
