Phil: "There's lots of fancy file for…"

There are lots of fancy file formats to choose from when building a #datalake, but we still went with gzipped JSON. Why? Because we prioritize moving data into purpose-built systems rather than querying it directly. This basic shift in approach has made a world of difference! Here's a thing I wrote about that:

https://opendatascience.com/choosing-a-data-lake-format-what-to-actually-look-for/

Open Data Science - Your News Source for AI, Machine Learning & more

Choosing a Data Lake Format: What to Actually Look For

Recently we’ve seen lots of posts about a variety of different file formats for data lakes. There’s Delta Lake, Hudi, Iceberg, and QBeast, to name a few. It can be tough to keep track of all these data lake formats — let alone figure out why (or if!) we really...

Jul 25, 2023, 03:55 PM · Web
