Large Language Models can Accurately Predict Searcher Preferences
Published Sep 19, 2023 · Paul Thomas, Seth Spielman, Nick Craswell
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
152 Citations · 23 Influential Citations
Abstract
Much of the evaluation and tuning of a search system relies on relevance labels---annotations that say whether a document is useful for a given search and searcher. Ideally these come from real searchers, but it is hard to collect this data at scale, so typical experiments rely on third-party labellers who may or may not produce accurate annotations. Label quality is managed with ongoing auditing, training, and monitoring. We discuss an alternative approach. We take careful feedback from real searchers and use this to select a large language model (LLM), and prompt, that agrees with this feedback; the LLM can then produce labels at scale. Our experiments show LLMs are as accurate as human labellers and as useful for finding the best systems and hardest queries. LLM performance varies with prompt features, but also varies unpredictably with simple paraphrases. This unpredictability reinforces the need for high-quality "gold" labels.
TL;DR: Large language models (LLMs) can accurately predict searcher preferences and help find the best systems and hardest queries, but performance varies with prompt features and simple paraphrases.
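The abstract's core procedure, selecting the LLM and prompt whose labels best agree with "gold" feedback from real searchers, can be sketched in a few lines. Below is a minimal illustration, not the paper's actual pipeline: the gold labels, prompt names, and candidate label lists are all hypothetical, and in practice each candidate's labels would come from running an LLM with that prompt over the same query-document pairs. Agreement here is measured with Cohen's kappa, one common chance-corrected choice.

```python
from collections import Counter

def cohens_kappa(a, b):
    """Chance-corrected agreement between two label sequences."""
    assert len(a) == len(b)
    n = len(a)
    # Observed agreement: fraction of items labelled identically.
    po = sum(x == y for x, y in zip(a, b)) / n
    # Expected agreement by chance, from each sequence's label frequencies.
    ca, cb = Counter(a), Counter(b)
    pe = sum(ca[k] * cb[k] for k in set(ca) | set(cb)) / (n * n)
    return (po - pe) / (1 - pe) if pe < 1 else 1.0

# Hypothetical "gold" relevance labels from real searchers (0/1/2 scale).
gold = [2, 0, 1, 2, 0, 1, 1, 0]

# Hypothetical outputs of two candidate prompt variants on the same items.
candidates = {
    "prompt_a": [2, 0, 1, 2, 0, 1, 0, 0],
    "prompt_b": [1, 0, 2, 2, 1, 1, 0, 0],
}

# Pick the prompt whose labels best agree with searcher feedback;
# that prompt is then used to label at scale.
best = max(candidates, key=lambda p: cohens_kappa(gold, candidates[p]))
print(best)  # → prompt_a
```

Once the best-agreeing prompt is selected against the gold set, it can be applied to the full document collection, which is what makes the approach scale beyond what human labellers can cover.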