Pneuma: LLM-Based Data Discovery System for Tabular Data
Pneuma is an LLM-powered data discovery system for tabular data. Given a natural language query,
Pneuma searches an indexed collection and retrieves the most relevant tables for the question. It performs this search by leveraging both content (columns and rows) and context (metadata) to match tables with questions.
Getting Started
If you would like to try Pneuma without installation, you can use our Colab notebook. For local installation, you may use an OpenAI API token or a local GPU with at least 20 GB of VRAM (to load and prompt both the LLM and embedding model).