How to create ORC or Parquet files from PHP code?

Is there any library that can write custom data files from a PHP app in ORC or Parquet format, so that they can be queried with Presto?

If not, what is the best practice in this case? Hopefully one that doesn't involve setting up a MapReduce cluster.

Thanks
– Nir

Answer

The https://github.com/apache/parquet-cpp project provides a C++ implementation that writes Parquet files without any use of MapReduce or the JVM. Bindings already exist for Python (https://arrow.apache.org/docs/python/parquet.html), Ruby / GLib (https://github.com/red-data-tools/parquet-glib) and NodeJS (https://github.com/skale-me/node-parquet), but there are none yet for PHP. With those bindings as a model, you should be able to write PHP bindings without much difficulty.
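
Until PHP bindings exist, one possible workaround is to have the PHP app dump its records in a simple intermediate format and convert them with one of the existing bindings. The following is only a minimal sketch using the Python bindings (pyarrow). It assumes the PHP side writes one `json_encode()`d object per line; the helper names `rows_to_columns`, `read_json_lines` and `write_parquet` are made up for illustration and are not part of any library.

```python
import json
from typing import Any


def rows_to_columns(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Pivot row-oriented records into the column-oriented layout Parquet uses."""
    names: list[str] = []
    for row in rows:
        for key in row:
            if key not in names:
                names.append(key)
    # Keys missing from a row become None so every column has the same length.
    return {name: [row.get(name) for row in rows] for name in names}


def read_json_lines(path: str) -> list[dict[str, Any]]:
    """Read one JSON object per line (assumed output format of the PHP app)."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def write_parquet(rows: list[dict[str, Any]], out_path: str) -> None:
    """Write the records to a Parquet file; column types are inferred by pyarrow."""
    # Imported here so the pure-Python helpers above work without pyarrow.
    import pyarrow as pa
    import pyarrow.parquet as pq

    table = pa.Table.from_pydict(rows_to_columns(rows))
    pq.write_table(table, out_path)
```

With a hypothetical dump file, the conversion would be `write_parquet(read_json_lines("events.jsonl"), "events.parquet")`. Type inference can fail if a column mixes incompatible types across rows, so for real data it is probably safer to pass an explicit schema.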

User contributions licensed under: CC BY-SA