
SearchReSearch

Answer: How can we use LLMs to search better?

Dan Russell • July 19, 2023
Republished with permission from SearchReSearch

Magic is, by definition...


Precision targeting for SearchResearch. P/C by Mikhail Nilov (Pexels link)


... something that you don't understand. That's why magicians wow us with their performances.

I love magic, but I really don't want magical interfaces to magical systems.


Still, as we saw in our post about Using LLMs to find Amazing Words..., with a little ingenuity, we can do remarkable things. (Definition of LLM)

In that post, I illustrated how to use LLMs to find words ending in -core that describe an aesthetic style. The clever thing the LLMs did in that Challenge was to find related words with a similar aesthetic meaning that do NOT have the -core ending. (Example: "dark academia")

Our Challenge this week came in two parts:

1. Can you find a way to use LLMs (ChatGPT, Bard, Claude, etc.) to answer research questions that would otherwise be difficult to answer? (As with the Using LLMs to find Amazing Words... example.) If you find such a research task, be sure to let us know what the task is, the LLM you used, and what you did to make it work.

As we've seen before, LLMs are currently not great at providing accurate answers. So we're trying to figure out how to use them in productive ways.

In last week's commentary post (on Friday), I showed a way to search for keywords and phrases to use for regular Google search. I've used that method a few times since then, and it's always worked out well.

Short summary: Don't ask your LLM for specific answers to questions, and REALLY don't ask for citations. At the moment, LLMs are all too happy to make up fake citations.

But you SHOULD ask for other terms and phrases to search for, in addition to the ones you've already been using. The sample prompt pattern I used was:

[what are the N most common subtopics related to TOPIC?]

To use this, just replace N with a number (typically 5 - 20), and replace TOPIC with a short text description of the topic you want to search for.

This is a great way to figure out how to expand your range of ideas.
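For instance, here's a made-up example of that pattern in action (the topic is just an illustration, not one from the Challenge):

[what are the 10 most common subtopics related to urban beekeeping?]

The reply is usually a short list of subtopic names, each one a candidate term for a follow-up regular Google search.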
2. Here's an example of this kind of difficult-to-answer "regular search" task: I wanted to make a list of all the SRS Challenges and Answers (the C&A list) since the beginning of this year. I used an LLM to help me figure out the process. Can you figure out what I did? (I'll tell you now that I learned a bunch in doing this, and it only took me about 10 minutes from start to finish. I count that as a major win.)

The big hint from last week was this:

I broke this task down into 3 steps:

1. get the list of C&As from the blog into a text file
2. extract out the Challenges and Answers (getting rid of anything extra)
3. then reverse the order of the C&A list

Let me unpack this.

1. To get a list of ALL the blog posts, I opened the most recent blog post and scrolled to the bottom. (There are other ways to do this.) It looks like this:



Then, I just opened all of the twisty triangles for each of the entries for this year to see each of the blog posts. That listing looks like this:



Then, I just selected all of that text, hit COPY, and then opened a text editor and pasted the text (unformatted) into a .TXT document. (I used BBEdit, which is a robust and useful editor, but you can use whatever editor you'd like.)

In this pic, I've highlighted a bunch of lines that are neither Challenges nor Answers. We don't want those lines in our final C&A list.



Yes, I could manually delete each line, but what if I wanted to do 1,000 lines? For that process, I asked Bard:


There were a couple more answers below, but I realized that grep was exactly what I wanted. It's a command-line tool that finds lines matching a pattern and extracts them. Perfect!

Except that there's a small problem--the code snippet shown here doesn't quite work. It says that I should do:

grep -E "Answer|SearchResearch" bbedit.txt

After playing around for a while, I figured out that the correct expression should be:

grep 'Answer\|Challenge' bbedit.txt

The double quotes don't work here; you need to use single quotes. And then you need the \ character to tell grep that the | character means OR (in classical Boolean logic).
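(A side note, and an alternative rather than what Bard suggested: grep's -E flag turns on extended regular expressions, where the | alternation works without the backslash:

grep -E 'Answer|Challenge' bbedit.txt

Either form should pull out the same set of lines.)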

Then, once you run that, you've got the list of C&As. Excellent. But they're in the wrong order! Dang!

Now I want to reverse the order of the lines in the file. That's a classic programming problem, but I don't want to fool around--I just want to flip the order so that the last line becomes the first line, etc.

Back to Bard:



This is great! I didn't know about the tac command, so I've learned something new.

But when I try to do:
tac bbedit.txt

my macOS Terminal application says that the command isn't found. (A useful thing to know: the command-line tools that ship with macOS are a BSD-style set, not a full GNU/Linux distribution--a number of GNU commands, like tac, are missing.)
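Aside: if you really want tac on a Mac, one route--assuming you use Homebrew--is to install the GNU versions of these tools:

brew install coreutils

They arrive with a g prefix, so the command would be gtac bbedit.txt. But there turns out to be a simpler way.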

Time to turn to regular Google and search for:

[ tac in MacOS ]

which points me to a StackExchange page with an even better answer, one that doesn't require tac:


So the right next step was to do:

tail -r bbedit.txt

which then gave me the file listing of SRS posts BACKWARDS--that is, only the Challenges and Answers from Jan 1, 2023 - July 5, 2023. As you can see, each Challenge is followed by its correct Answer:

SearchResearch Challenge (1/4/23): How can I find latest updates on topics of interest?

Answer: How can I find latest updates on topics of interest?

SearchResearch Challenge (1/18/23): Musicians travels--how did they get

Answer: Musicians travels--how did they get from A to B?

SearchResearch Challenge (2/8/23): What do you call this thing?

Answer: What do you call this thing?

SearchResearch Challenge (2/22/23): World's largest waterfall?

Answer: World's largest waterfall?

SearchResearch Challenge (3/8/23): What do these everyday symbols mean?

Answer: What do these everyday symbols mean?

SearchResearch Challenge (3/22/23): What do you call the sediment that

Answer: What do you call the sediment that blocks a river flowing to

SearchResearch Challenge (4/5/23): What's this architecture all about?

Answer: What's this architecture all about?

SearchResearch Challenge (4/19/23): How well do LLMs answer SRS

Answer: How well do LLMs answer SRS questions?

SearchResearch Challenge (5/31/23): Did they really burn ancient Roman

Answer: Did they really burn Roman statues?

SearchResearch Challenge (6/14/23): How to find the best AI-powered

Answer: How to find the best AI-powered search engine of the moment?

SearchResearch Challenge (6/28/23): How can you find a free audio book?

Answer: How can you find a free audio book?



SearchResearch Lessons

There are lessons here, and surely more to come as we learn more about working with LLMs.

1. Ask your LLM to help brainstorm search terms you might not have thought about. If you think of the LLM as a reasonably accurate brainstorming partner, you might find that it generates some good Google queries you wouldn't have come up with on your own.

2. Ask your LLM about ways to transform your data. I've found that it often will suggest things that I once knew, but forgot about (e.g., using the Linux command tail to reverse the order of lines in a file). Sometime soon I'll write about other ways I've used an LLM to help me clean and restructure data files. In truth, this is my primary use case for LLMs these days--as a research assistant to fix up and analyze data.

3. Be aware that the details of what your LLM tells you might need a little tweaking. In the above example, I had to tweak that grep expression to use single quotes rather than double quotes. Often what the LLM tells you is in the right ballpark, but not precisely correct. (Think of it as a slightly unreliable narrator!) A sketch that puts the tweaked grep and the line reversal together into one pipeline is just below.
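Here's a minimal sketch of that combined step--assuming the pasted blog listing is still in bbedit.txt and that we're on macOS, so it uses tail -r rather than tac; the output file name is just one I made up for this sketch:

grep 'Answer\|Challenge' bbedit.txt | tail -r > c-and-a.txt   # c-and-a.txt is a made-up name

The grep keeps only the Challenge and Answer lines, tail -r flips their order so the oldest post comes first, and the result lands in the new file.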




Keep Searching!


This post was republished. Comments can be viewed and shared via the original site.

About the Author

Dan Russell

I study the way people search and research. I guess that makes me an anthropologist of search. I am FIA's Future-ist in Residence.
