USENIX ATC '24 and OSDI '24
Thursday July 11, 2024 2:40pm - 3:00pm PDT
Rui Wang, YScope; Devin Gibson, YScope and University of Toronto; Kirk Rodrigues, YScope; Yu Luo, YScope, Uber, and University of Toronto; Yun Zhang, Kaibo Wang, Yupeng Fu, and Ting Chen, Uber; Ding Yuan, YScope and University of Toronto

Internet-scale services can produce large volumes of logs. Such logs increasingly appear in semi-structured formats such as JSON. At Uber, the amount of semi-structured log data can exceed 10 PB/day. It is prohibitively expensive to store and analyze them. As a result, logs are only kept searchable for a few days.

This paper proposes μSlope, a system that losslessly compresses semi-structured log data and allows search without full decompression. It concisely represents the schema structures and stores this representation only once per dataset instead of interspersing it with each record. It further "structurizes" the semi-structured data by grouping the records with the same schema structure into the same table, so that each table is well structured. Our evaluation shows that μSlope achieves a 21.9:1 to 186.8:1 compression ratio, which is at least a few times higher than that of any existing semi-structured data management system (SSDMS). The compression ratio is even 2.34x that of Zstandard, and the search speed is 5.77x that of other SSDMSes.
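The "structurizing" idea in the abstract can be illustrated with a minimal Python sketch. This is not μSlope's implementation; the helper names (`flatten`, `structurize`, `rebuild`) and the choice to define a schema as the ordered list of key paths and value types are assumptions made only for illustration. Each distinct schema is stored once, and records sharing a schema are kept as rows of bare values in the same table.

```python
import json
from collections import defaultdict


def flatten(obj, prefix=()):
    """Yield (key_path, value) for every leaf of a nested JSON object."""
    for key, value in obj.items():
        path = prefix + (key,)
        if isinstance(value, dict) and value:
            yield from flatten(value, path)
        else:
            # Scalars, lists, and empty objects are treated as leaf values.
            yield path, value


def structurize(lines):
    """Group JSON-object log lines into one table per schema structure.

    Returns (schemas, tables): schemas maps a schema id to its list of
    (key_path, type_name) columns, stored once; tables maps the same id
    to rows that hold only the values.
    """
    schema_ids = {}
    tables = defaultdict(list)
    for line in lines:
        leaves = list(flatten(json.loads(line)))
        schema = tuple((path, type(value).__name__) for path, value in leaves)
        sid = schema_ids.setdefault(schema, len(schema_ids))
        tables[sid].append([value for _, value in leaves])
    schemas = {sid: schema for schema, sid in schema_ids.items()}
    return schemas, dict(tables)


def rebuild(schema, row):
    """Reconstruct one record from its schema and its row of values."""
    record = {}
    for (path, _type_name), value in zip(schema, row):
        node = record
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = value
    return record
```

Because keys no longer repeat in every record, each table resembles a well-structured relation whose columns hold values of a single type, which is the kind of layout that general-purpose compressors and column scans tend to handle well. The sketch does not record the original interleaving of records across tables, so a real lossless system would need to keep that information as well.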

https://www.usenix.org/conference/osdi24/presentation/wang-rui
Grand Ballroom ABGH
