Resampling

Many features of Marple Insight combine multiple signals: functions, scatter plots, tables, filters, and more. When these signals have different time bases, it becomes hard to know which datapoints from signal A correspond to which datapoints from signal B.

Some tools don't allow you to combine signals with different time bases. Marple Insight solves this by resampling the signals.

Settings component to specify how to resample

When multiple signals must be combined, you can choose how resampling is applied:

  • off: don't resample; only use the original timestamps.

  • auto: resample using a dynamic frequency that depends on the underlying signals and the database connection type.

    • Marple DB, PostgreSQL and Timescale: use the maximum frequency of the selected signals and upsample the signals with a lower frequency.

    • Other database connections: use the minimum frequency of the selected signals.

  • on: resample using a selected frequency.

Resampling off

When using off, all signals are joined on their exact timestamp. If all signals have datapoints with exactly matching timestamps (up to the nanosecond), the resulting signal will also have a datapoint at these timestamps. Datapoints with a timestamp that does not occur in all original signals are ignored in the resulting signal.

Resampling off with signals that have timestamps that match exactly
Resampling off with signals that have timestamps that slightly differ
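
The exact-timestamp join described above behaves like an inner join on the timestamp. The following is a minimal sketch of that idea, not Marple's actual implementation: the function name `join_exact` and the representation of a signal as a dict of nanosecond timestamps are assumptions made for illustration.

```python
def join_exact(*signals):
    """Join signals on exactly matching timestamps (an inner join).

    Each signal is a dict mapping an integer nanosecond timestamp to a value.
    This data layout is an assumption for illustration only.
    """
    # Only timestamps that occur in every signal survive.
    common = set(signals[0])
    for sig in signals[1:]:
        common &= set(sig)
    return {t: tuple(sig[t] for sig in signals) for t in sorted(common)}


# The second sample of signal_b is offset by a single nanosecond.
signal_a = {0: 1.0, 1_000_000: 2.0, 2_000_000: 3.0}
signal_b = {0: 10, 1_000_001: 20, 2_000_000: 30}

joined = join_exact(signal_a, signal_b)
print(joined)  # {0: (1.0, 10), 2000000: (3.0, 30)}
```

The middle samples differ by only 1 ns, yet neither appears in the result, which is why off is only a safe choice when the timestamps match exactly.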

Benefits:

  • Exact timestamps are preserved

  • Faster than on or auto

Resampling auto

auto is essentially the same as on but with a dynamic selection of the frequency. See Resampling on for more details.

Which frequency is used depends on whether the database connection implements upsampling.

  • Marple DB, PostgreSQL and Timescale support upsampling. The maximum of the underlying frequencies is used.

  • Other database connections don't support upsampling (yet). The minimum of the underlying frequencies is used.
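
The selection rule above can be summarised in a few lines. This is only a sketch of the rule as stated in the text: `auto_frequency` and `UPSAMPLING_CONNECTIONS` are hypothetical names, and how Marple determines the frequency of each underlying signal is not covered here.

```python
# Connection types that support upsampling, as listed in the text.
UPSAMPLING_CONNECTIONS = {"Marple DB", "PostgreSQL", "Timescale"}


def auto_frequency(connection, signal_frequencies_hz):
    """Pick the resampling frequency that auto would use (hypothetical helper)."""
    if connection in UPSAMPLING_CONNECTIONS:
        # Slower signals are upsampled, so the fastest signal sets the rate.
        return max(signal_frequencies_hz)
    # No upsampling available: fall back to the slowest signal.
    return min(signal_frequencies_hz)


print(auto_frequency("PostgreSQL", [10, 100]))        # 100
print(auto_frequency("Some other database", [10, 100]))  # 10
```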

Resampling on

When using on (or auto), the input signals are sampled at the selected frequency to build the output signal.

To do this, the time range is divided into buckets (bins) of width 1/frequency seconds. Within each bucket, the first datapoint is selected and is assigned a new timestamp: the start time of that bucket. This ensures all signals have matching timestamps, so we know which datapoints to combine in the resulting signal.

Because each selected datapoint takes the start time of its bucket, timestamps can shift slightly (by at most one period, that is 1/frequency seconds).
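
The bucketing step can be sketched as follows. This is an illustration under stated assumptions rather than the actual algorithm: `resample_first` is a hypothetical name, timestamps are integer nanoseconds, and buckets are aligned to an `origin_ns` of 0, whereas the real alignment of the buckets is not specified here.

```python
def resample_first(points, frequency_hz, origin_ns=0):
    """Keep the first datapoint of each bucket, stamped with the bucket start.

    points: list of (timestamp_ns, value) tuples, sorted by timestamp.
    origin_ns: assumed alignment of the first bucket (hypothetical parameter).
    """
    period_ns = round(1e9 / frequency_hz)  # bucket width: 1/frequency seconds
    resampled = {}
    for t_ns, value in points:
        bucket_start = origin_ns + ((t_ns - origin_ns) // period_ns) * period_ns
        # setdefault keeps the first value seen for this bucket.
        resampled.setdefault(bucket_start, value)
    return sorted(resampled.items())


points = [(20_000_000, "a"), (70_000_000, "b"), (130_000_000, "c"), (310_000_000, "d")]
print(resample_first(points, frequency_hz=10))
# [(0, 'a'), (100000000, 'c'), (300000000, 'd')]
```

At 10 Hz the buckets are 100 ms wide. Datapoint "b" shares a bucket with "a" and is dropped, "c" moves from 130 ms to 100 ms, and the bucket starting at 200 ms stays empty because no raw datapoint falls inside it.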

The exact algorithm works slightly differently depending on your connection.

Upsampling implemented

For the most commonly used database connection types (Marple DB, PostgreSQL and Timescale), the algorithm first upsamples the signals to a common time axis with the specified frequency by forward filling missing data. The upsampled signals on the common time axis are then combined into the resulting signal as explained above. When using auto, the highest frequency is selected.

Using forward fill ensures no data is invented and integer signals remain integers.
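
Forward filling onto a common time axis can be sketched as below. This is a simplified illustration, not the database-side implementation: `forward_fill` is a hypothetical name, timestamps are plain integers (milliseconds in the example), and returning `None` for grid points before the first sample is an assumption.

```python
from bisect import bisect_right


def forward_fill(points, axis):
    """Carry the last known value forward onto each timestamp of a common axis.

    points: list of (timestamp, value) tuples, sorted by timestamp.
    axis: sorted list of timestamps of the common time axis.
    """
    times = [t for t, _ in points]
    filled = []
    for t in axis:
        i = bisect_right(times, t)  # number of samples at or before t
        # Assumption: no value is available before the first sample.
        filled.append((t, points[i - 1][1] if i else None))
    return filled


slow_signal = [(0, 1), (200, 2)]   # integer signal sampled every 200 ms
common_axis = [0, 100, 200, 300]   # common axis with a 100 ms period

print(forward_fill(slow_signal, common_axis))
# [(0, 1), (100, 1), (200, 2), (300, 2)]
```

Linear interpolation would have produced 1.5 at 100 ms, a value that never occurred in the data. Forward fill repeats 1 instead, so the signal keeps its original integer values.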

The figures below show what happens depending on the selected frequency:

Upsampling at frequency of slowest signal
Upsampling at intermediate frequency
Upsampling at frequency of fastest signal
Upsampling at frequency higher than fastest signal

Upsampling not implemented

For some less commonly used database connections, upsampling is not yet implemented, and the resampling described above uses the raw signal data. For these connections, auto uses the lowest frequency. Setting a higher frequency will not result in more data.

Resampling at frequency of slowest signal
Resampling at frequency of fastest signal

Selecting the right frequency

In most cases it's best to let the auto setting determine the optimal frequency. In the following cases it makes sense to deviate from this setting:

  • All your signals share the exact same timestamps (e.g. the data comes from CSV files or from the same group within an MDF file) -> it's safe to use off

  • You have a very complex plot with many functions and/or filters that takes a while to load, and you're interested in a quick insight rather than the exact data -> decrease the suggested frequency; you might lose some data, but the plot will load faster
