This is the last of three parts. Last time, we queried a CSV file stored in Hadoop HDFS from Drill. Now we will change the query slightly so that it creates Parquet files: simply put a "create table as" line in front of the existing query.

Go to the console window and check what has happened in the Hadoop filesystem. The folder "airports_parquet" was created, because we used that name as the table name in the "create table as" statement. Inside the folder are the Parquet files, one per continent.

Now that the Parquet files exist, we can query the airport data from them in the same way as any other Drill table.

That's it. We have created Parquet files from the CSV file. Parquet, a columnar format, generally offers better query performance than CSV and can easily be created from Drill.
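The steps above can be sketched in Drill SQL roughly as follows. This is a minimal sketch, not the exact statements from this series: the storage plugin name `hdfs`, the writable workspace `data`, the CSV path, and the column names `name`, `iso_country` and `continent` are all assumptions for illustration, and it presumes that Drill is configured to read the CSV header row as column names. The `PARTITION BY` clause is one way to end up with one file per continent.

```sql
-- Assumption: `hdfs` is a configured storage plugin and `data` is a writable workspace.
-- Parquet is Drill's default output format; setting it explicitly makes the intent clear.
ALTER SESSION SET `store.format` = 'parquet';

-- The added first line turns the existing SELECT into a "create table as" statement.
-- Hypothetical path and column names; adjust them to your own CSV file.
CREATE TABLE hdfs.data.`airports_parquet` PARTITION BY (continent) AS
SELECT name, iso_country, continent
FROM hdfs.data.`airports.csv`;

-- Query the new Parquet table; the folder name acts as the table name.
SELECT name, iso_country
FROM hdfs.data.`airports_parquet`
WHERE continent = 'EU'
LIMIT 10;
```

Filtering on the partition column, as in the last query, lets Drill skip the files of the other continents, which is where much of the speed-up over CSV would be expected to come from.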
Author: Uwe Geercken
September 2020