Welcome to WitDB!

What is WitDB?

WitDB is a distributed SQL query engine designed to query large data sets distributed over one or more heterogeneous data sources.

If you work with terabytes or petabytes of data, you are likely using tools that interact with Hadoop and HDFS. WitDB was designed as an alternative to tools that query HDFS using pipelines of MapReduce jobs, such as Hive or Pig, but it is not limited to accessing HDFS. WitDB can be, and has been, extended to operate over other kinds of data sources, including traditional relational databases and systems such as Cassandra.

WitDB was designed to handle data warehousing and analytics: analyzing data, aggregating large amounts of data, and producing reports. These workloads are often classified as Online Analytical Processing (OLAP).

Getting Started

Use WitDB to analyze, aggregate ...


Contributing

Check out our CONTRIBUTING guide ...


Core Features

Speed

A highly parallel, distributed query engine built from the ground up for efficient, low-latency analytics.

Scale

The largest organizations in the world use Trino to query exabyte-scale data lakes and massive data warehouses alike.

Simplicity

WitDB is an ANSI SQL-compliant query engine that works with BI tools such as R, Tableau, Power BI, Superset ...

Versatility

Supports diverse use cases: ad-hoc analytics at interactive speeds, massive multi-hour batch queries, and high-volume apps ...

In-place Analysis

You can natively query data in Hadoop, S3, Cassandra, MySQL, and many others, without the need for complex, slow, and error-prone ...

Query Federation

Access data from multiple systems within a single query. For example, join historic log data stored in S3 object storage with customer data ...
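
The idea of joining data from separate systems in one query can be sketched with a small stand-in. The example below does not use WitDB or its API: it assumes Python's built-in sqlite3 module and treats two independent in-memory SQLite databases as the "log" source and the "customer" source. The table names, column names, and rows are hypothetical and exist only for illustration.

```python
import sqlite3

# Stand-in for query federation, NOT WitDB's API: two separate SQLite
# databases play the role of two independent data sources.
conn = sqlite3.connect(":memory:")                  # source 1: customer data
conn.execute("ATTACH DATABASE ':memory:' AS logs")  # source 2: log data

# Hypothetical schemas, one table per source.
conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE logs.events (customer_id INTEGER, action TEXT)")

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO main.customers VALUES (?, ?)",
    [(1, "alice"), (2, "bob")],
)
conn.executemany(
    "INSERT INTO logs.events VALUES (?, ?)",
    [(1, "login"), (1, "purchase"), (2, "login")],
)


def events_per_customer(connection):
    """Join both sources in a single SQL statement and count events."""
    query = """
        SELECT c.name, COUNT(*) AS event_count
        FROM main.customers AS c
        JOIN logs.events AS e ON e.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
    """
    return connection.execute(query).fetchall()


result = events_per_customer(conn)
print(result)
```

The point of the sketch is the shape of the query: each table is addressed by a source-qualified name, and the join is expressed once in SQL instead of exporting data from one system and loading it into the other.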

