ImpalaToGo
ImpalaToGo is a fork of Cloudera Impala, separated from Hadoop. It is optimized to work with S3 storage by caching data locally.
Why Choose ImpalaToGo
- It is Impala without Hadoop. You can take advantage of its fast query engine without managing the whole Hadoop stack.
- It is optimized to work with S3. ImpalaToGo transparently caches data on local drives.
- To our knowledge, it is the only open source MPP database written in C++.
- It gives you almost the same capabilities as Hive over S3, but is much faster.
What have we added to Cloudera Impala?
We have developed a caching layer that uses local drives to cache data read from remote storage. Its advantages are easiest to see when working with S3.
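To illustrate the general idea of caching remote data on local drives, here is a minimal read-through cache sketch in Python. It is not ImpalaToGo's actual implementation (which is written in C++). The class `ReadThroughCache`, the `fetch_remote` callback, and the `s3://example-bucket/...` path are all hypothetical names chosen for this example, and the remote store is simulated by a plain function.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


class ReadThroughCache:
    """Illustrative read-through cache: a remote object is copied to a local
    directory on first access and served from local disk afterwards."""

    def __init__(self, cache_dir: Path, fetch_remote: Callable[[str], bytes]):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        # Hypothetical stand-in for a remote read (for example, an S3 GET).
        self.fetch_remote = fetch_remote
        self.hits = 0
        self.misses = 0

    def _local_path(self, remote_path: str) -> Path:
        # Hash the remote path so any key maps to a safe local file name.
        digest = hashlib.sha256(remote_path.encode("utf-8")).hexdigest()
        return self.cache_dir / digest

    def read(self, remote_path: str) -> bytes:
        local = self._local_path(remote_path)
        if local.exists():
            # Cache hit: no remote round trip is needed.
            self.hits += 1
            return local.read_bytes()
        # Cache miss: fetch remotely, then store a local copy.
        self.misses += 1
        data = self.fetch_remote(remote_path)
        # Write to a temporary file and rename, so a partially written
        # file is never mistaken for a complete cache entry.
        tmp = local.with_name(local.name + ".tmp")
        tmp.write_bytes(data)
        tmp.replace(local)
        return data


def demo() -> tuple:
    remote_calls = []

    def fake_s3_get(path: str) -> bytes:
        # Simulated remote storage; records every remote access.
        remote_calls.append(path)
        return b"col_a,col_b\n1,2\n"

    with tempfile.TemporaryDirectory() as tmp:
        cache = ReadThroughCache(Path(tmp), fake_s3_get)
        first = cache.read("s3://example-bucket/table/part-0.csv")
        second = cache.read("s3://example-bucket/table/part-0.csv")
        return first == second, cache.hits, cache.misses, len(remote_calls)


if __name__ == "__main__":
    print(demo())
```

In this sketch the second read of the same path is served from the local directory, so the simulated remote store is contacted only once. A real caching layer would also need eviction and invalidation policies, which are left out here.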
JSON Support
Yes, we have it! The instructions are here.
How To Try
It's easy. Just follow the instructions in the Quick Start guide.
Have problems?
We have a Forum where we'll be glad to answer any question.
If you find a bug or want to suggest an improvement, don't hesitate to create an issue.
Presentations
ImpalaToGo Use Case
ImpalaToGo design explained
ImpalaToGo Introduction