
SearchReSearch

Answer: How well do LLMs answer SRS questions?

Dan Russell • May 3, 2023
Republished with permission from SearchReSearch

Remember this?

P/C Dall-E. Prompt: [ happy robots answering questions rendered in a ukiyo-e style on a sweeping landscape, cheerful ]


Our Challenge was this:

1. I'd like you to report on YOUR experiences in trying to get ChatGPT or Bard (or whichever LLM you'd like to use) to answer your curious questions. What was the question you were trying to answer? How well did it turn out?

Hope you had a chance to read my comments from the previous week.

On April 21 I wrote about why LLMs are all cybernetic mansplaining--and I mean that in the most negative way possible. If mansplaining is a kind of condescending explanation of something the man has only incomplete knowledge about (delivered with the mistaken assumption that he knows more about it than the person he's talking to does), then that's what's going on here, cybernetically.

On April 23 I wrote another post about how LLMs seem to know things, but when you question them closely, they don't actually know much at all.

Fred/Krossbow made the excellent point that it's not clear that Bard is learning. After asking a question, then asking a follow-up and getting a changed response, Fred noted: "Bard corrected the response. What I now wonder: will Bard keep that correction if I ask later today? Will Bard give the same response to someone else?"

It's unclear. I'm sure this kind of memory (and gradual learning) will become part of the LLMs. But at the moment, it's not happening.

And that's a big part of the problem with LLMs: We just don't know what they're doing, why, or how.

As several people have pointed out, that's true of humans as well. I have no idea what you (my dear reader) are capable of doing, whether you're learning or not... but I have decades of experience dealing with other humans of your make and model, and I have a pretty good idea of what a human's performance characteristics are. I don't have anything similar for an LLM. Even if I spent a lot of time developing one, it might well change tomorrow when a new model is pushed out to the servers. Which LLM are you talking to now?

P/C Dall-E. Prompt: [ twenty robots, all slightly different from each other, trying to answer questions in a hyperrealistic style 3d rendering ]

What happens when the fundamental LLM question-answering system changes moment by moment?

Of course, that's what happens with Google's index. It's varying all the time as well, and it's why you sometimes get different answers to the same query from day to day--the underlying data has changed.

And perhaps we'll get used to the constant evolution of our tools. It's an interesting perspective to have.

mateojose1 wonders whether, if LLMs are complemented by deep knowledge components (e.g., grafting on Wolfram Alpha to handle the heavy math chores), we'll THEN get citations.

I think that's part of the goal. I've been playing around with the Scite.ai LLM for the scholarly literature (think of it as ChatGPT trained on the contents of Google Scholar). It's been working really well for me when I ask it questions that are "reasonably scholarly"--that is, questions that published papers might actually address. I've been impressed with the quality of the answers, along with the lack of hallucination AND the presence of accurate citations.

This LLM (Scite.ai) is so interesting that I'll devote an entire post to it soon. (Note that I'm not getting any funding from them to talk about their service. I've just been impressed.)
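
For the technically curious, here's a minimal sketch (in Python) of the general "graft a knowledge source onto the LLM" pattern that mateojose1 is describing: retrieve relevant passages first, then force the model to answer only from those passages and to cite them. To be clear, the toy corpus, the keyword-overlap retrieval, and the prompt format below are hypothetical stand-ins for illustration--this is not how Scite.ai (or a Wolfram Alpha integration) actually works.

    # A toy "retrieve, then answer with citations" sketch. Everything here
    # (the corpus, the keyword-overlap scoring, the prompt format) is a
    # hypothetical stand-in, not a description of any real system's internals.

    TOY_CORPUS = {
        "paperA": "Paper A reports that retrieval-backed answers tend to include checkable citations.",
        "paperB": "Paper B finds that unconstrained LLM answers often include fabricated references.",
        "paperC": "Paper C describes strategies people use to verify answers found online.",
    }

    def retrieve(question, corpus, k=2):
        """Rank passages by naive keyword overlap with the question."""
        q_words = set(question.lower().split())
        scored = sorted(
            ((len(q_words & set(text.lower().split())), key, text)
             for key, text in corpus.items()),
            reverse=True,
        )
        return [(key, text) for score, key, text in scored[:k] if score > 0]

    def build_cited_prompt(question, passages):
        """Build a prompt that restricts the model to the retrieved passages."""
        sources = "\n".join("[%s] %s" % (key, text) for key, text in passages)
        return ("Answer using ONLY the sources below, citing them by bracketed key.\n"
                "Sources:\n" + sources + "\n\nQuestion: " + question + "\nAnswer:")

    question = "Do retrieval-backed answers include checkable citations?"
    print(build_cited_prompt(question, retrieve(question, TOY_CORPUS)))
    # The finished prompt would then go to whichever LLM you happen to be using.

The key point of the pattern is that the citations come from the retrieval step rather than from the model's memory, which is why they're so much easier to verify.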

As usual, remmij has a plethora of interesting links for us to consider. You have to love remmij's "robots throwing an LLM into space" Dall-E images. Wonderful. (Worth a click.)

But I also really agree with the link that points to Beren Millidge's blog post about how LLMs "confabulate not hallucinate."

This is a great point--the term "hallucination" really means experiencing an apparent sensory perception of something that isn't actually present. "Confabulation," by contrast, happens when someone is not able to explain or answer a question correctly, but does so anyway. The confabulator (that's a real word, BTW) literally doesn't know whether what they're saying is true or not, but goes ahead regardless. That's much more like what's going on with LLMs.


Thanks to everyone for their thoughts. It's been fun to read them the past week. Sorry about the delay. I was at a conference in Hamburg, Germany. As usual, I thought I would have the time to post my reply, but instead I was completely absorbed in what was happening. As you can imagine, we all spent a lot of time chatting about LLMs and how humans would understand them and grow to use them.

The consensus was that we're just at the beginning of the LLM arms race--all of the things we worry about (truth, credibility, accuracy, etc.) are being challenged in new and slightly askew ways.

I feel like one of the essential messages of SearchResearch has always been that we need to understand what our tools are and how they operate. The ChatGPTs and LLMs of the world are clearly new tools with great possibilities--and we still need to understand them and their limits.

We'll do our best, here in the little SRS shop on the prairie.

Keep searching, my friends.



Comments

This post was republished. Comments can be viewed and shared via the original site.

About the Author

Dan Russell

I study the way people search and research. I guess that makes me an anthropologist of search. I am FIA's Future-ist in Residence.
