Is there any library that can be used from a PHP app to write custom data files in ORC or Parquet format for Presto queries?
If not, what is the best practice in this case? Hopefully one that doesn't involve setting up a MapReduce cluster.
Thanks
– Nir
Answer
The https://github.com/apache/parquet-cpp project provides a C++ implementation that writes Parquet files without using MapReduce or the JVM. Bindings already exist for Python (https://arrow.apache.org/docs/python/parquet.html), Ruby / GLib (https://github.com/red-data-tools/parquet-glib) and NodeJS (https://github.com/skale-me/node-parquet), but there are none yet for PHP. With those bindings as a reference, writing PHP bindings should be fairly straightforward.
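Until PHP bindings exist, one possible workaround is to let the PHP app export plain rows and convert them with the Python bindings mentioned above. The following is only a minimal sketch under some assumptions: the PHP side writes one JSON object per line, the helper names `rows_to_columns` and `write_parquet` are made up for illustration, and `pyarrow` is installed on the machine that does the conversion.

```python
import json


def rows_to_columns(json_lines):
    """Pivot an iterable of JSON-encoded rows into a dict of column lists."""
    # Skip blank lines; each remaining line is assumed to be one JSON object.
    rows = [json.loads(line) for line in json_lines if line.strip()]
    # Collect column names in first-seen order.
    names = []
    for row in rows:
        for key in row:
            if key not in names:
                names.append(key)
    # Rows that lack a column get None, which Parquet stores as a null.
    return {name: [row.get(name) for row in rows] for name in names}


def write_parquet(json_lines, out_path):
    """Write the rows to a Parquet file through the pyarrow bindings."""
    # Imported lazily so the pivot helper works without pyarrow installed.
    import pyarrow as pa
    import pyarrow.parquet as pq

    table = pa.Table.from_pydict(rows_to_columns(json_lines))
    pq.write_table(table, out_path)
```

Calling `write_parquet(open("events.jsonl"), "events.parquet")` (both file names are hypothetical) would produce a file that can then be placed wherever the Presto catalog expects its data. Column types are inferred by pyarrow here; for a stable table definition you would probably want to pass an explicit schema instead.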