Row group size: Larger row groups allow for larger column chunks, which makes larger sequential reads possible. Larger row groups also require more buffering in the write path (or a two-pass write). We recommend large row groups (512MB - 1GB). Because an entire row group might need to be read, it should fit completely within one HDFS block; otherwise a single row group read can span several blocks. The HDFS block size should therefore be at least as large as the row group. For example, a read-optimized setup would use 1GB row groups, a 1GB HDFS block size, and 1 HDFS block per HDFS file.
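The sizing rule above can be checked with a little arithmetic. The following is a minimal sketch, not part of any Parquet API: the helper names (`rows_per_row_group`, `fits_in_one_block`, `blocks_spanned`) and the average row size are hypothetical, and `blocks_spanned` assumes the row group starts on a block boundary. The two keys in `example_settings` are the property names commonly used for the row group size in parquet-mr (`parquet.block.size`) and for the HDFS block size (`dfs.blocksize`); verify them against the versions you run.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def rows_per_row_group(target_row_group_bytes: int, avg_row_bytes: int) -> int:
    """Estimate how many rows fill one row group.

    avg_row_bytes is a hypothetical, measured average size of one row.
    """
    if avg_row_bytes <= 0:
        raise ValueError("avg_row_bytes must be positive")
    return max(1, target_row_group_bytes // avg_row_bytes)


def fits_in_one_block(row_group_bytes: int, hdfs_block_bytes: int) -> bool:
    """True if a row group is no larger than one HDFS block."""
    return row_group_bytes <= hdfs_block_bytes


def blocks_spanned(row_group_bytes: int, hdfs_block_bytes: int) -> int:
    """Blocks touched by one row group, assuming it starts on a block boundary."""
    return -(-row_group_bytes // hdfs_block_bytes)  # ceiling division


# Read-optimized example from the text: 1GB row groups and 1GB HDFS blocks.
# Key names are assumed to match parquet-mr and HDFS configuration.
example_settings = {
    "parquet.block.size": GIB,  # row group size in bytes
    "dfs.blocksize": GIB,       # HDFS block size in bytes
}

if __name__ == "__main__":
    rg = example_settings["parquet.block.size"]
    blk = example_settings["dfs.blocksize"]
    print("fits in one block:", fits_in_one_block(rg, blk))
    print("blocks spanned with 128 MiB blocks:", blocks_spanned(rg, 128 * MIB))
    print("rows per row group at 1 KiB/row:", rows_per_row_group(rg, 1024))
```

With a 128 MiB HDFS block size, a 1GB row group would span eight blocks, which is why the block size is raised to match the row group size.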