streamable chunked parquet using arrow
Details
Parquet files are streamed to disk in chunks, each holding the number of rows
given by the nlines parameter in the initial call to ark. For each tablename, a
folder is created and the chunks are placed in it with names of the form
part-000000.parquet. Before writing a new chunk, the software inspects the folder
and increments the file name accordingly. This layout is intentional: it lets users
open the folder with arrow::open_dataset later, when coming back to review or
analyze these data.
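
The naming scheme can be illustrated with a small sketch. This is not the package's own implementation: next_chunk_path is a hypothetical helper, and the 10-row chunk size and temporary folder are assumptions chosen for the example. It finds the highest existing part-NNNNNN.parquet index in a folder and returns the path for the next chunk. The second half writes a few chunks and reads the folder back as a single dataset, and only runs if the arrow package is installed.

```r
# Hypothetical helper (not part of the package): return the path of the
# next chunk file in `dir`, following the part-000000.parquet pattern.
next_chunk_path <- function(dir) {
  dir.create(dir, recursive = TRUE, showWarnings = FALSE)
  pattern <- "^part-([0-9]{6})\\.parquet$"
  existing <- list.files(dir, pattern = pattern)
  # Extract the numeric part of each existing chunk name.
  idx <- as.integer(sub(pattern, "\\1", existing))
  next_idx <- if (length(idx) == 0) 0L else max(idx) + 1L
  file.path(dir, sprintf("part-%06d.parquet", next_idx))
}

if (requireNamespace("arrow", quietly = TRUE)) {
  # One folder per table; here a fresh temporary folder for mtcars.
  out_dir <- tempfile("mtcars_")

  # Split the table into chunks of at most 10 rows (assumed chunk size).
  chunks <- split(mtcars, ceiling(seq_len(nrow(mtcars)) / 10))
  for (chunk in chunks) {
    arrow::write_parquet(chunk, next_chunk_path(out_dir))
  }

  # The folder of chunks can be opened as one dataset.
  ds <- arrow::open_dataset(out_dir)
  print(list.files(out_dir))
  print(nrow(ds))
}
```

Taking the maximum existing index, rather than counting files, keeps the numbering increasing even if an earlier chunk has been removed from the folder.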