Abstract
The disclosed concepts relate to leveraging a language model to identify data health issues in a data set. One example method involves accessing a data set. The example method also involves using an automated evaluation planning agent to input, to a generative language model, a prompt to generate a data evaluation plan for the data set, the prompt including context describing the data set. The example method also involves receiving the data evaluation plan generated by the generative language model and identifying one or more data health issues in the data set by performing the data evaluation plan using an automated evaluation plan execution agent.
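The flow described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation. Every name here (`describe_data_set`, `plan_evaluation`, `execute_plan`, `stub_model`), the JSON plan format, and the two check types are assumptions made for illustration. The generative language model is replaced by a stub function that returns a fixed plan, so no real model API is implied.

```python
import json
from typing import Callable

# Hypothetical stand-in for a generative language model: prompt text in, text out.
LanguageModel = Callable[[str], str]


def describe_data_set(rows: list[dict]) -> dict:
    """Build the context describing the data set (assumed shape)."""
    columns = sorted({key for row in rows for key in row})
    return {"row_count": len(rows), "columns": columns, "sample": rows[:2]}


def plan_evaluation(rows: list[dict], model: LanguageModel) -> list[dict]:
    """Planning agent: prompt the model and parse the returned plan."""
    prompt = (
        "Generate a data evaluation plan for the data set as a JSON list of "
        'steps, each {"check": ..., "column": ...}.\n'
        "Context: " + json.dumps(describe_data_set(rows))
    )
    return json.loads(model(prompt))


def execute_plan(rows: list[dict], plan: list[dict]) -> list[str]:
    """Execution agent: run each planned check and collect data health issues."""
    issues = []
    for step in plan:
        column = step["column"]
        values = [row.get(column) for row in rows]
        if step["check"] == "missing_values":
            missing = sum(v is None or v == "" for v in values)
            if missing:
                issues.append(f"{column}: {missing} missing value(s)")
        elif step["check"] == "duplicates":
            present = [v for v in values if v is not None and v != ""]
            duplicates = len(present) - len(set(present))
            if duplicates:
                issues.append(f"{column}: {duplicates} duplicate value(s)")
    return issues


def stub_model(prompt: str) -> str:
    """Assumed model behavior: always returns the same two-step plan."""
    return json.dumps([
        {"check": "missing_values", "column": "email"},
        {"check": "duplicates", "column": "id"},
    ])


rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 2, "email": "b@example.com"},
]
plan = plan_evaluation(rows, stub_model)
issues = execute_plan(rows, plan)
print(issues)
```

Keeping the planner and the executor as separate functions mirrors the two agents named in the abstract: the plan is plain data, so it could be inspected or logged before any check is run.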
Full Text
What is claimed is:
Timeline
Filed: 02/20/2026
Published: 06/25/2026
Granted: Not Available
IPC Codes (1)
G06F 16/215: Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors