There are lots of fancy file formats to choose from when building a #datalake, but we still went with gzipped JSON. Why? Because we prioritize moving data into purpose-built systems rather than querying it directly. This basic shift in approach has made a world of difference! Here's a thing I wrote about that: