ImpalaToGo

A fork of Cloudera Impala, separated from Hadoop.

ImpalaToGo is a fork of Cloudera Impala, separated from Hadoop. It is optimized to work with S3 storage by caching data locally.

Why Choose ImpalaToGo


  1. It is Impala without Hadoop. You can take advantage of its fast query engine without managing the whole Hadoop stack.
  2. It is optimized to work with S3. ImpalaToGo transparently caches data on local drives.
  3. It is the only open source MPP database written in C++.
  4. It gives you almost the same capabilities as Hive over S3, but is much faster.

What have we added to Cloudera Impala?

We have developed a caching layer that keeps copies of remote-storage data on local drives, so repeated reads are served locally instead of going back to the remote store. Its advantages are easiest to see when working with S3.
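To make the idea concrete, here is a minimal read-through cache sketch in Python. It is an illustration only, not ImpalaToGo's actual implementation (which is written in C++): the `ReadThroughCache` class, the `fake_remote_fetch` function, and the `s3://example-bucket/...` path are all hypothetical names invented for this example, and the remote fetch is a stand-in for a real S3 download.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


class ReadThroughCache:
    """Serve remote objects from a local directory, fetching each one at most once."""

    def __init__(self, cache_dir: str, fetch_remote: Callable[[str], bytes]):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        # Hypothetical stand-in for a remote (e.g. S3) download function.
        self.fetch_remote = fetch_remote
        self.remote_reads = 0  # counts how often the remote store was hit

    def _local_path(self, remote_path: str) -> Path:
        # Hash the remote path so any object name maps to a safe local file name.
        digest = hashlib.sha256(remote_path.encode("utf-8")).hexdigest()
        return self.cache_dir / digest

    def read(self, remote_path: str) -> bytes:
        local = self._local_path(remote_path)
        if local.exists():
            # Cache hit: serve the bytes from the local drive.
            return local.read_bytes()
        # Cache miss: fetch from remote storage, then keep a local copy.
        data = self.fetch_remote(remote_path)
        self.remote_reads += 1
        tmp = local.with_name(local.name + ".tmp")
        tmp.write_bytes(data)
        tmp.replace(local)  # rename so readers never see a half-written file
        return data


def fake_remote_fetch(remote_path: str) -> bytes:
    # Assumption for the demo: pretend this is a slow remote download.
    return ("contents of " + remote_path).encode("utf-8")


def demo() -> int:
    """Read the same object twice and return the number of remote fetches."""
    with tempfile.TemporaryDirectory() as cache_dir:
        cache = ReadThroughCache(cache_dir, fake_remote_fetch)
        first = cache.read("s3://example-bucket/table/part-0")
        second = cache.read("s3://example-bucket/table/part-0")
        assert first == second
        return cache.remote_reads
```

In this sketch the second read of the same object never touches the remote store, which is the effect the caching layer aims for. A real cache would also need eviction and invalidation, which are left out here.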

JSON Support

Yes, we have it! The instructions are here

How To Try

It's easy. Just follow the instructions in the Quick Start guide.

Have problems?

We have a Forum where we'll be glad to answer any questions.
If you have found a bug or want to suggest an improvement, don't hesitate to Create an Issue.

Presentations

ImpalaToGo Use Case
ImpalaToGo design explained
ImpalaToGo Introduction

License

Apache License

2014 - 2015