Aggregate
General overview
Groups multiple data records into a single record. Grouping is based on timestamps. Different aggregation rules and functions are used for different data parameters, and the rules can be customized by the user. The result of the aggregation is written to a new output dataset. A typical use of the tool is reducing, for example, 5-minute data to an hourly dataset. Another is harmonizing the time step of different datasets before comparing them.
# | Component | Description |
---|---|---|
1 | Dataset | Input dataset to aggregate. |
2 | Table view | Opens table viewer. |
3 | Aggregate to | Output time step. |
4 | Edit rules | Opens a separate window for modification of aggregation rules. See the 'Edit aggregation rules' section. |
5 | Required number of original records | Minimum share of original records (in %) required to summarize an output interval. If this condition is not met (e.g., there are fewer than 30 1-minute records in an hourly output interval while a 50% minimum is required), the output interval is left empty, i.e. filled with a NaN value. |
6 | Create empty records... | Empty records created during the aggregation process will either be removed or kept as part of the output dataset. |
7 | Aggregate period from, to | Restricts the aggregation to a given period. By default, the time range matches that of the input dataset. |
8 | Name of the output dataset | Output dataset name. |
 | Database | Database for storing the output dataset. |
9 | Aggregate | Runs the aggregation. |
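The 'Required number of original records' check can be illustrated with a minimal sketch. It is not Analyst's implementation: the function name `aggregate_hourly_mean` and its parameters are hypothetical, the example data is made up, and only an hourly MEAN rule is shown. Hours with no records at all do not appear in the result.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta


def aggregate_hourly_mean(records, expected_per_hour, min_percent):
    """Hourly mean of (timestamp, value) pairs with a minimum-coverage check.

    expected_per_hour: records a complete hour would contain (60 for 1-minute data).
    min_percent: minimum share of valid records required, in percent.
    """
    buckets = defaultdict(list)
    for ts, value in records:
        if math.isnan(value):
            continue  # missing values do not count towards coverage
        hour_start = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour_start].append(value)

    result = {}
    for hour_start in sorted(buckets):
        values = buckets[hour_start]
        coverage = 100.0 * len(values) / expected_per_hour
        if coverage >= min_percent:
            result[hour_start] = sum(values) / len(values)
        else:
            result[hour_start] = math.nan  # too few records: empty interval
    return result


# Hypothetical 1-minute data: 40 records in the 06:00 hour, 20 in the 07:00 hour.
six = datetime(2024, 1, 1, 6, 0)
seven = datetime(2024, 1, 1, 7, 0)
records = [(six + timedelta(minutes=i), 1.0) for i in range(40)]
records += [(seven + timedelta(minutes=i), 2.0) for i in range(20)]
hourly = aggregate_hourly_mean(records, expected_per_hour=60, min_percent=50)
```

With a 50% threshold, the 06:00 hour (40 of 60 records) is summarized, while the 07:00 hour (20 of 60 records) becomes NaN.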
Edit aggregation rules
Dialog window for editing the aggregation rule of each individual column. An aggregation rule can be a simple summarization function (SUM, MAX, MIN, MEAN, MEDIAN, MODE, FIRST, LAST) or a more specific method (MEAN_WEIGHTED, ANGULAR) applied by the aggregation procedure. Separate rules can be set for aggregating into hourly and N-minute intervals and for aggregating into daily, monthly or yearly intervals.
Selecting the 'None' rule in the combo boxes (hourly and lower, daily and higher) results in empty aggregated intervals filled with NaN values.
# | Component | Description |
---|---|---|
1 | Hourly and lower | Aggregation rule for hourly and sub-hourly aggregation (5, 10, 15, 20, 30 minutes). |
2 | Daily and higher | Aggregation rule for higher aggregation (daily, monthly, yearly). |
3 | Filter by flag values | 'Use flags' enables filtering; by default, filtering is not active. Only visible for flagged data columns. 'Flag filter' is where you enter the comma-separated flag values to be aggregated (records with other flag values are filtered out). 'Existing flags' shows the flag values present in the given column. |
4 | Set default rules | Resets the rules to the most typical setup. |
5 | Time zone adjustment... | Before summarizing into days, months or years, records are shifted to the local time zone, so that each output interval collects all data belonging to the same local day. |
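The effect of the time zone adjustment on daily aggregation can be sketched as follows. This is an illustration only, assuming timezone-aware UTC timestamps, a fixed UTC offset and a SUM rule; the function name `daily_sum_local` is hypothetical and does not come from Analyst.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def daily_sum_local(records, utc_offset_hours):
    """Sum (timestamp, value) pairs per local calendar day.

    Timestamps must be timezone-aware; they are shifted to the given
    fixed offset before the day is determined.
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    totals = defaultdict(float)
    for ts, value in records:
        local_day = ts.astimezone(local_tz).date()
        totals[local_day] += value
    return dict(totals)


# Hypothetical records stored in UTC, aggregated for a site at UTC+2.
utc_records = [
    (datetime(2024, 1, 1, 21, 0, tzinfo=timezone.utc), 5.0),  # 23:00 local, 1 Jan
    (datetime(2024, 1, 1, 23, 0, tzinfo=timezone.utc), 7.0),  # 01:00 local, 2 Jan
]
daily = daily_sum_local(utc_records, utc_offset_hours=2)
```

Without the shift, both records would be counted in 1 January; with it, the second record is assigned to 2 January, the local day it belongs to.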
Average aggregation for Albedo
The method of albedo (ALB) aggregation depends on whether Global horizontal irradiation (GHI) is present in the dataset. If GHI is available:
Analyst calculates RHI (Reflected horizontal irradiation) from GHI and albedo (by the formula RHI = GHI × ALB).
The mean is then calculated from the computed RHI values.
ALB is recalculated from the aggregated RHI (by the formula ALB = RHI / GHI).
If GHI is not available:
A simple mean aggregation of ALB is used.
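The two branches of the albedo aggregation can be sketched as below. This is a minimal illustration and not Analyst's code: the function name `aggregate_albedo` is hypothetical, and it assumes that the GHI used in the final division is the mean GHI over the same output interval.

```python
import math


def aggregate_albedo(alb, ghi=None):
    """Aggregate albedo values for one output interval.

    alb: albedo values of the original records.
    ghi: matching GHI values, or None if GHI is not in the dataset.
    """
    if ghi is None:
        return sum(alb) / len(alb)  # simple mean of ALB

    # RHI = GHI * ALB for every original record
    rhi = [g * a for g, a in zip(ghi, alb)]
    mean_rhi = sum(rhi) / len(rhi)
    mean_ghi = sum(ghi) / len(ghi)  # assumption: GHI is aggregated by mean too
    if mean_ghi == 0:
        return math.nan  # albedo is undefined without incoming irradiation
    return mean_rhi / mean_ghi  # ALB = RHI / GHI


# Hypothetical interval with two records.
alb_with_ghi = aggregate_albedo([0.2, 0.4], ghi=[100.0, 300.0])
alb_without_ghi = aggregate_albedo([0.2, 0.4])
```

In this example the GHI-based result is about 0.35, while the simple mean is about 0.30: records with higher irradiation contribute more to the aggregated albedo.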
Angular aggregation
For data parameters that are circular quantities measured in degrees, e.g., wind direction or sun azimuth, the arithmetic mean is not appropriate. The arithmetic mean of two wind direction values measured in geographical azimuth, 1° (northern wind) and 359° (northern wind), is 180° (southern wind), which is misleading. The angular method converts all angles from polar to Cartesian coordinates and computes the arithmetic mean of these values. The result is then converted back to an angle.
When aggregating wind direction values (the Analyst parameter code is WD), Analyst always uses wind speed data (if the column exists) as weights when averaging wind directions. This approach favors wind directions with higher wind speeds.
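The angular method, including optional weighting by wind speed, can be sketched as follows. This is an illustrative implementation of the standard circular mean, not Analyst's code; the function name `angular_mean` and the choice to return NaN when the directions cancel out are assumptions.

```python
import math


def angular_mean(angles_deg, weights=None):
    """Circular mean of angles in degrees, optionally weighted.

    Each angle is converted to a unit vector (scaled by its weight),
    the vectors are summed, and the direction of the sum is returned.
    """
    if weights is None:
        weights = [1.0] * len(angles_deg)
    x = sum(w * math.cos(math.radians(a)) for a, w in zip(angles_deg, weights))
    y = sum(w * math.sin(math.radians(a)) for a, w in zip(angles_deg, weights))
    if math.hypot(x, y) < 1e-12:
        return math.nan  # directions cancel out: no meaningful mean
    return math.degrees(math.atan2(y, x)) % 360.0


# The example from the text: 1 and 359 degrees average to north, not south.
north = angular_mean([1.0, 359.0])

# Hypothetical wind speeds used as weights: the stronger northern wind
# pulls the mean direction away from the unweighted 45 degrees.
weighted = angular_mean([0.0, 90.0], weights=[3.0, 1.0])
```

Because of floating-point rounding, the first result may come out as a value extremely close to 0° or to 360°; both denote the same direction.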
Average weighted (Mean weighted) aggregation
The standard MEAN aggregation function works only with records that lie within the edges of the desired output interval. A record positioned exactly on the right edge of an interval falls into the next interval (see the picture 'Standard MEAN aggregation rule'). However, a typical measured irradiation value represents a time interval rather than an instant in time.
For example, a record at 07:00 reads 227.7 W/m². This value represents the average of all readings between 06:52:30 and 07:07:30. If you use the standard mean rule with data like this, the aggregation result will not represent the mean irradiation between 06:00 and 07:00. Instead, it will represent the mean irradiation between 05:52:30 and 06:52:30.
MEAN_WEIGHTED aggregation takes the situation around the interval borders into account. At the beginning of the process, the weighted MEAN aggregation function densifies the original records (see the picture 'MEAN_WEIGHTED aggregation rule'). The rule creates ten new records of the same value from each original record, so that each value is weighted by the number of its sub-records that fall into the output interval. Then, the arithmetic mean is calculated.
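The densification idea can be sketched as follows. This is an illustration under stated assumptions rather than Analyst's implementation: each record is assumed to represent a window centered on its timestamp (as in the 07:00 example above), the ten sub-records are placed evenly inside that window, and the function name `mean_weighted_hourly` is hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def mean_weighted_hourly(records, step, parts=10):
    """Hourly mean after splitting each record into `parts` sub-records.

    records: (timestamp, value) pairs with a regular time step.
    step: time step of the input data as a timedelta.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, value in records:
        window_start = ts - step / 2  # record represents [ts - step/2, ts + step/2]
        for k in range(parts):
            # place each sub-record at the center of its slice of the window
            sub_ts = window_start + (k + 0.5) * (step / parts)
            hour_start = sub_ts.replace(minute=0, second=0, microsecond=0)
            sums[hour_start] += value
            counts[hour_start] += 1
    return {h: sums[h] / counts[h] for h in sorted(sums)}


# Hypothetical 15-minute records from 06:00 to 07:00.
base = datetime(2024, 1, 1, 6, 0)
values = [100.0, 200.0, 200.0, 200.0, 300.0]
records_15min = [(base + i * timedelta(minutes=15), v) for i, v in enumerate(values)]
weighted_hourly = mean_weighted_hourly(records_15min, step=timedelta(minutes=15))
```

For the 06:00 to 07:00 interval, half of the sub-records of the 06:00 and 07:00 records are included, giving a mean of 200. A standard mean over the records at 06:00, 06:15, 06:30 and 06:45 would give 175, because it ignores the half of the 07:00 record that belongs to this hour and counts the 06:00 record in full.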