Query Processing
TWIX: Automatically Reconstructing Structured Data from Templatized Documents
Yiming Lin, Mawil Hasan, Rohan Kosalge, Alvin Cheung, Aditya G. Parameswaran
Towards Accurate and Efficient Document Analytics with Large Language Models
Yiming Lin, Madelon Hulsebos, Ruiying Ma, Shreya Shankar, Sepanta Zeighami, Aditya G. Parameswaran, Eugene Wu
PLAQUE: Automated Predicate Learning at Query Time
Yiming Lin, Sharad Mehrotra
ZIP: Lazy Imputation during Query Processing
Yiming Lin, Sharad Mehrotra
EnrichDB
EnrichDB is a system designed to support just-in-time data enrichment during query processing. It is motivated by applications that consume potentially large volumes of raw data that must first be interpreted using expensive machine learning or signal processing functions before being queried or analyzed. Executing such enrichment at ingestion time (to support real-time analytics) is difficult to scale, especially when datasets are very large or data arrives at high velocity. EnrichDB addresses this challenge by supporting enrichment at all phases of data processing, including interleaving enrichment with query processing. It exploits query context to steer enrichment so that query results can be computed progressively. EnrichDB is implemented as a layer on top of PostgreSQL, though it can easily be layered on other databases.
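To make the idea of query-time enrichment concrete, here is a minimal Python sketch. It is not EnrichDB's implementation or API: the names `LazyEnricher`, `expensive_sentiment`, and `query`, and the sample rows, are all hypothetical. It only illustrates the general pattern of using query context (a cheap predicate) to decide which rows pay for an expensive enrichment function, and of yielding results progressively.

```python
from typing import Callable, Dict, Iterable, Iterator

Row = Dict[str, object]


def expensive_sentiment(text: str) -> str:
    """Hypothetical stand-in for a costly ML or signal-processing function."""
    return "positive" if "good" in text.lower() else "negative"


class LazyEnricher:
    """Runs the enrichment function on demand and caches the result per row id."""

    def __init__(self, fn: Callable[[str], str]) -> None:
        self.fn = fn
        self.calls = 0  # number of times the expensive function actually ran
        self.cache: Dict[object, str] = {}

    def get(self, row: Row) -> str:
        key = row["id"]
        if key not in self.cache:
            self.calls += 1
            self.cache[key] = self.fn(str(row["text"]))
        return self.cache[key]


def query(
    rows: Iterable[Row],
    cheap_pred: Callable[[Row], bool],
    enricher: LazyEnricher,
    wanted: str,
) -> Iterator[Row]:
    """Evaluate the cheap predicate first; enrich only the rows that survive it."""
    for row in rows:
        if not cheap_pred(row):
            continue  # this row is never enriched for this query
        if enricher.get(row) == wanted:
            yield row  # results are emitted as soon as they are known


# Illustrative data only.
ROWS = [
    {"id": 1, "city": "Irvine", "text": "Good coffee here"},
    {"id": 2, "city": "Berkeley", "text": "Bad traffic"},
    {"id": 3, "city": "Irvine", "text": "Not great"},
    {"id": 4, "city": "Irvine", "text": "good weather"},
]

enricher = LazyEnricher(expensive_sentiment)
results = list(query(ROWS, lambda r: r["city"] == "Irvine", enricher, "positive"))
```

In this sketch the row with `id` 2 fails the cheap `city` predicate, so the expensive function is never run on it. Enriching everything at ingestion time would have paid that cost for every row regardless of which queries are later asked.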