JAlcocerTech E-books

SQL Ecosystem & Tools

Trino (formerly Presto)

Trino is an open-source, distributed SQL query engine for running interactive queries across multiple data sources.

Key Features:

  • Query data where it lives (S3, HDFS, Kafka, MongoDB, etc.)
  • Federated queries across diverse sources
  • Parallel processing for speed
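  • User-Defined Functions (UDFs) in Java

Basic exploration commands (catalog, schema, and table_name are placeholders; if a name needs quoting, quote each part separately, as in "catalog"."schema"."table_name", never the whole path as one string):

SHOW CATALOGS;
SHOW TABLES FROM catalog.schema;
SELECT * FROM catalog.schema.table_name;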
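
The federated queries listed among the key features can be sketched as a single statement that joins tables from two different catalogs. This is a minimal illustration, not a tested deployment: the catalog names (`mongodb`, `mariadb`), the schemas (`shop`, `crm`), and the tables and columns are all hypothetical and depend on how the connectors are configured.

```sql
-- Hypothetical catalogs, schemas, tables, and columns: adjust to your setup.
-- Orders live in MongoDB, customer records live in MariaDB.
SELECT
    c.name,
    count(*) AS order_count
FROM mongodb.shop.orders AS o
JOIN mariadb.crm.customers AS c
    ON o.customer_id = c.id
GROUP BY c.name
ORDER BY order_count DESC
LIMIT 10;
```

Trino reads from each source through its connector and performs the join itself, so neither dataset has to be copied into the other system first.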

Connectors: Kafka, MariaDB, Google Sheets, MongoDB, Druid, Prometheus, plus HDFS, S3, and GCS (reached through file-based connectors such as Hive)

Clients: Redash, Superset, Metabase, Grafana, Python, R

DuckDB

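DuckDB is an embedded (in-process) OLAP database optimized for analytical queries.

It runs inside the host process with no separate server to manage, which makes it a good fit for local data analysis.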
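
As a rough sketch of what local analysis looks like, the statements below build a small table and aggregate it. The table name `sales` and its columns are made up for illustration; with real data, the same aggregation could read from a file via `read_csv_auto('file.csv')` or `read_parquet('file.parquet')` in the FROM clause, where the file names are again placeholders.

```sql
-- Hypothetical sample data; replace with your own table or file.
CREATE TABLE sales AS
SELECT *
FROM (VALUES
    ('books', 12.5),
    ('books', 7.5),
    ('games', 40.0)
) AS t(category, amount);

-- Typical analytical query: one aggregated row per category.
SELECT
    category,
    count(*)    AS n_sales,
    avg(amount) AS avg_amount
FROM sales
GROUP BY category
ORDER BY category;
```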

UIs for DuckDB:

  • Huey: Web UI for DuckDB
  • Hue: SQL Assistant for databases & data warehouses

To try Hue locally with Docker (the web UI is served on port 8888):

docker run -d -p 8888:8888 gethue/hue:latest

Useful Tools for Database Exploration

  • ChartDB: A visual database diagram editor (ERDs from a single query).
  • DuckDB: An in-process OLAP database (the “SQLite for Analytics”).
  • DBeaver / Beekeeper Studio: Universal database managers with excellent GUIs.
  • Hue: An open-source SQL assistant for Big Data clusters (Hadoop/Hive).