I have recently started querying large sets of CSV data stored on HDFS using Hive and Impala. As I expected, Impala gives me better response times than Hive for the queries I have run so far.
I am wondering whether there are types of queries or use cases that still need Hive and for which Impala is not a good fit.
How does Impala provide faster query response compared to Hive for the same data on HDFS?