Querying

Why query Marple DB?

Querying Marple DB directly allows you to execute advanced use cases:

  • Train machine learning models

  • Calculate complex aggregates

  • Generate standardised reports

  • ... and much more

Running these types of analysis directly on Marple DB can be 10-100x faster than traditional scripting, where the extract-load-transform (ELT) approach often makes them unfeasible in terms of both run time and computational demand.

By querying Marple DB, you get more value from data that is already collected, cleaned, standardised and optimised for time series operations.

Hot storage (Postgres)

Request your database credentials from our support team.

Structure

The hot storage is organised into three schemas:

  1. _mdb contains internal bookkeeping. Touching this might cause malfunctions

  2. mdb_data holds the actual time series data of heated signals. It should also not be used directly for most use cases

  3. public contains three tables per datapool (called default in most cases):

    • _dataset with file and metadata info

    • _signal with signal info and metadata

    • _signal_enum
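
The metadata tables above can be combined with ordinary SQL joins. The sketch below assumes the tables are prefixed with the datapool name (e.g. default_dataset) and have name, id and dataset_id columns; check your actual schema, as these names are illustrative:

```sql
-- Sketch: list every signal together with the dataset it belongs to,
-- for datapool 'default'. Table and column names are assumptions.
SELECT d.name AS dataset, s.name AS signal
FROM default_dataset d
JOIN default_signal s ON s.dataset_id = d.id
ORDER BY d.name, s.name;
```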

Querying

You can use any Postgres-compatible tool for sending queries.

For reading time series data, a premade function makes retrieval much easier:

-- getting data from datapool 'default'
SELECT * 
FROM mdb_default_data(
    'file.csv', -- or alternatively, use dataset_id
    'speed', -- or alternatively, use signal_id
    0, -- start timestamp (nanoseconds)
    279112 -- end timestamp (nanoseconds)
)
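
Because the function returns a regular row set, it can be used like any table in a larger query. The sketch below assumes the function returns (time, value) columns with time in nanoseconds; the actual column names may differ:

```sql
-- Sketch: average signal value per second, assuming columns (time, value)
-- where time is a nanosecond timestamp. Column names are assumptions.
SELECT time / 1000000000 AS second,
       AVG(value) AS avg_value
FROM mdb_default_data('file.csv', 'speed', 0, 279112)
GROUP BY 1
ORDER BY 1;
```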

Cold storage (Parquet)

Request your blob storage credentials from our support team.

Structure

The cold storage follows a directory structure of Datapool > Dataset > Signal > Parquet file.

It contains no metadata other than file names and signal names. Each Parquet file contains the raw time series data for one signal.

Querying

Querying can be done by downloading individual Parquet files, or by running queries with DuckDB (or similar tools).
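
With DuckDB, Parquet files can be queried in place using read_parquet. The path, bucket name and column names below are illustrative; adjust them to your datapool layout and credentials:

```sql
-- Sketch: read one signal's Parquet files directly from blob storage with DuckDB.
-- Requires DuckDB's httpfs extension and configured storage credentials.
-- Path and column names (time, value) are assumptions.
SELECT time, value
FROM read_parquet('s3://my-bucket/default/my_dataset/speed/*.parquet')
WHERE time BETWEEN 0 AND 279112;
```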
