BLIP: Bolt-on, Verifiable Provenance for LLM-Powered Data Processing
Large Language Models (LLMs) are powerful tools for processing data. However, LLMs are also complex black boxes, returning answers to queries over data without any indication of where the answer came from or whether it is trustworthy. We introduce the notion of provenance for data processing with LLMs. While existing heuristics (such as embedding similarity or directly asking an LLM) can provide hints about where the answer was derived from, they offer no guarantee that the answer can actually be derived from the identified provenance, and are indeed often incorrect. Instead, we propose the notion of verifiable provenance, wherein we identify a subset of the input text that reproduces the same (or an equivalent) answer as the complete text, along with the notion of minimality, requiring that the verifiable provenance be as small as possible. A naive approach to finding such a provenance would check all possible subsets of the source data with the LLM, which is prohibitively expensive. We present BLIP, a bolt-on framework for efficiently inferring a small verifiable provenance for any LLM-powered data processing task, with any LLM. As part of BLIP, we introduce eight strategies, each guaranteed to find a minimal verifiable provenance, as well as an adaptive strategy that combines their strengths to further reduce cost. We further extend BLIP to produce multiple minimal verifiable provenances. Experiments on five datasets show that the provenance generated by BLIP is always guaranteed to reproduce the answer, achieving over 30% higher accuracy than the best-performing baseline at a comparable provenance size. Moreover, BLIP incurs a low cost, comparable to that of the original query on the original data.
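To make the notion of a minimal verifiable provenance concrete, the sketch below greedily shrinks the input to a subset that still reproduces the full-input answer, stopping when no single remaining chunk can be dropped. This is an illustrative baseline only, not BLIP's actual strategies; the function names are hypothetical, and `answer_fn` stands in for a (assumed deterministic) LLM query.

```python
def minimal_provenance(chunks, answer_fn):
    """Greedily shrink `chunks` to a subset that yields the same
    answer as the full input.  The result is minimal in the sense
    that dropping any single remaining chunk changes the answer.
    `answer_fn` is a stand-in for a deterministic LLM call."""
    target = answer_fn(chunks)
    kept = list(chunks)
    changed = True
    while changed:
        changed = False
        for i in range(len(kept)):
            candidate = kept[:i] + kept[i + 1:]
            if answer_fn(candidate) == target:
                kept = candidate  # chunk i was not needed; drop it
                changed = True
                break
    return kept

# Toy example: the "answer" is whether the text mentions Paris.
chunks = [
    "The Eiffel Tower is a landmark.",
    "The Louvre is a museum.",
    "Paris is in France.",
]
answer_fn = lambda cs: "Paris" in " ".join(cs)
print(minimal_provenance(chunks, answer_fn))
```

Note that this greedy loop can issue O(n^2) calls to `answer_fn`, which illustrates why per-query cost matters: the abstract's claim is precisely that BLIP's strategies and adaptive combination find a minimal verifiable provenance at far lower cost than naive subset checking.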