Edit data

Calculator

The Analyst Calculator is a powerful tool for data editing and calculation. It includes tools for creating and deleting columns, calculation with common functions, and data selection:

  • Create new columns.
  • Calculate and write values into columns. Choose a column for calculation, insert an expression, and click the "Calculate" button. The existing values in the column will be overwritten with the new values.
  • Apply calculations to selected records only. Select a record in the table view, open the context menu (right-click) and choose "Add to selection". Write the expression into the selection box and press the "Apply selection" button.
  • Explore columns. Click a column in the list and press the "Describe column" button.
  • Check your results before saving. Choose the table view and explore the table records.
  • Save changes. Click the "Save" button.


You can drag a column from the column list and drop it into one of the expression boxes; the column reference will be correctly transformed into the expression form. The same behavior is implemented in the unique values list. Predefined functions can be dragged and dropped into the calculation expression box only.


#

Component

Description

1

List of columns

List of columns available for expression construction. To insert a column into an expression, drag the list item and drop it into one of the expression boxes, or just double-click the column name and it will be added to the expression, in the correct form, at the cursor position.

2

New

Opens the dialog for creating a new column in the dataset.

Delete

Removes the selected column from the dataset.

3

Expression box for calculation

Insert the expression for calculation. The calculation is applied either to selected records only, or to all records. Below the expression box are buttons with operands.

4

Expression buttons

Click the buttons to insert symbols into the expression box.

Clear

Clears the expression field.

5

Functions

List of predefined functions that can be used in the expression boxes.

6

Table view

The table view shows selected records, highlighted in yellow. The table view includes a context menu with functions for manually selecting and unselecting records.

7

Clear selection

Clears the record selection.

8

Calculate

Launches the calculation.

Useful calculation expressions

Below are some examples of useful calculation expressions for datasets.

Missing solar irradiance components

  • GHI - Global horizontal irradiance: col["DNI"] * sin(radians(sun_elevation)) + col["DIF"]
  • DNI - Direct normal irradiance: (col["GHI"] - col["DIF"]) / sin(radians(sun_elevation))
  • DIF - Diffuse horizontal irradiance: col["GHI"] - col["DNI"] * sin(radians(sun_elevation))
  • Kt - Clearness index: col["GHI"] / ETR
  • Ktb - Direct (beam) clearness index: col["DNI"] / ETR
  • D2G - Diffuse to global ratio: col["DIF"] / col["GHI"]

Unit conversions

  • Temperature, Celsius to Fahrenheit: col["TEMP"] * 1.8 + 32
  • Temperature, Fahrenheit to Celsius: (col["TEMP"] - 32) * (5/9)
  • Irradiation, kWh/m2 to kJ/m2: col["DNI"] * 3600
  • Irradiation, kJ/m2 to kWh/m2: col["DNI"] / 3600

Datetime selection

  • From a specific year: col["datetime"] > "2018"
  • From a specific month: col["datetime"] > "2017-10"
  • From a specific date: col["datetime"] > "2017-10-30"
  • From a specific datetime: col["datetime"] > "2017-09-30 10:35:00"
  • Exact date*: col["datetime"] == "2017-09-30"
  • Excludes a date*: col["datetime"] != "2017-09-30"
  • Combination using logical operators: (col["datetime"] > "2017-09-30 10:35:00") And (col["datetime"] < "2017-09-30 12:15:00")
  • Combination using logical operators: (col["datetime"] > "2017-09-30") And (col["datetime"] < "2017-10-01")
  • Combination using logical operators: (col["datetime"] > "10:00") And (col["datetime"] < "15:00")

Column selection

  • Combination using logical operators: (( col["GHI"] + col["DNI"] ) > 1200) And (col["TEMP"] < 30.2)
  • Combination using logical operators: (( col["GHI"] + col["DNI"] ) > 1200) And (Not( col["TEMP"] > 30.2))

Mixed selection

  • Combination using logical operators: (Not((col["datetime"] > "09:00") And (col["datetime"] < "15:00"))) Or (col["GHI"] > 800 ) And (col["DNI"] < 500)

* The operations == and != support only date values, not datetime or time values.
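The expressions above follow a Python-like syntax. As a minimal sketch of how such a row-wise calculation evaluates, the snippet below mimics the `col` and `sun_elevation` names from the Calculator; the sample values are invented:

```python
import math

# Invented sample record; col and sun_elevation mimic the Calculator's
# expression syntax.
col = {"DNI": 800.0, "DIF": 120.0}
sun_elevation = 35.0  # solar elevation angle in degrees

# GHI from DNI and DIF (the first formula in the list above)
ghi = col["DNI"] * math.sin(math.radians(sun_elevation)) + col["DIF"]

# Recover DIF from GHI and DNI to check the formulas are mutually consistent
dif = ghi - col["DNI"] * math.sin(math.radians(sun_elevation))
print(round(ghi, 1), round(dif, 1))  # dif comes back as 120.0
```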

Copy a column

Example: Copy the column DNI from our dataset.

  1. Open the Calculator.
  2. Press the "New" button under the column list.
  3. Fill in "Name" (e.g., new_DNI) and parameter type (DNI), and press "OK". To fill in other optional information, use the Metadata editor.
  4. Choose the new column in the combo box (e.g., col["new_DNI"]).
  5. Fill in the expression box (e.g., col["DNI"]).
  6. Press the "Validate" button.
  7. If the validation step reported no errors, press the "Calculate" button.
  8. To save the results permanently, press the "Save" button.
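The copy recipe above can be sketched with a plain dict of lists standing in for a dataset (new_DNI is the hypothetical new column from the example):

```python
# Invented stand-in dataset with one column.
dataset = {"DNI": [812.4, 805.1, 790.0]}

# Step 5's expression col["DNI"] simply copies the values into the new column.
dataset["new_DNI"] = list(dataset["DNI"])
print(dataset["new_DNI"])
```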


Select records marked with a specific flag

Example: Select only those GHI values whose flag value is 2.

  1. Open the Calculator.
  2. Double-click the column (e.g., flg_GHI).
  3. Type "==" or click the "==" button.
  4. Type the value 2. The final expression will be: col["flg_GHI"] == 2
  5. Press the "Validate" button.
  6. If the validation step reported no errors, press the "Apply" button.
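The effect of the selection expression col["flg_GHI"] == 2 can be sketched in plain Python (flag and GHI values are invented):

```python
# Invented records with a GHI value and its quality flag.
records = [
    {"GHI": 512.0, "flg_GHI": 2},
    {"GHI": 498.5, "flg_GHI": 1},
    {"GHI": 530.2, "flg_GHI": 2},
]

# Equivalent of the selection expression col["flg_GHI"] == 2
selected = [r for r in records if r["flg_GHI"] == 2]
print(len(selected))  # 2
```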


Calculate parameters

Example: Diffuse horizontal irradiance (DIF) data was missing from the originally imported file; calculate it from GHI and DNI.

  1. Open the Calculator.
  2. Press the "New" button under the column list.
  3. Fill in "Name" (DIF) and parameter type (DIF), and press "OK". To fill in other optional information, use the Metadata editor.
  4. In the expression combo box, choose the column for calculation (col["DIF"]).
  5. Type the calculation formula into the expression box: col["GHI"] - col["DNI"] * sin(radians(sun_elevation))
  6. Press "Calculate" to launch the calculation.
  7. Press "Save" to save the result into your dataset.

Calculate parameters (only if values are empty)

Example: Calculate values for DIF, only for records with empty values:

  1. Open the Calculator.
  2. Press the "Select data" button.
  3. Select DIF in the column list and press "Describe column".
  4. Double-click the DIF column (this brings the column into the expression box), press the "==" button, and double-click the "NaN" value in the unique values list. The final expression looks like this: col["DIF"] == NaN
  5. Press "Apply" and close the window after the message informs you about the selection. You have now selected all records with a missing DIF value.
  6. In the expression combo box, choose the column for calculation (col["DIF"]).
  7. Type the calculation formula into the expression box: col["GHI"] - col["DNI"] * sin(radians(sun_elevation))
  8. Press "Calculate" to launch the calculation. The calculation is applied only to the selected records.
  9. Press "Save" to save the result into your dataset.
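The steps above can be sketched as follows: DIF is computed only where it is missing (NaN), and existing readings stay as they are. The sample values are invented, and sun_elevation is held constant for brevity:

```python
import math

# Invented sample data; the first DIF record is missing.
ghi = [578.9, 600.0]
dni = [800.0, 820.0]
dif = [float("nan"), 115.0]
sun_elevation = 35.0  # degrees

for i in range(len(dif)):
    if math.isnan(dif[i]):  # plays the role of the col["DIF"] == NaN selection
        dif[i] = ghi[i] - dni[i] * math.sin(math.radians(sun_elevation))

print([round(v, 1) for v in dif])  # only the first value was recalculated
```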


Units converter

Converts between preselected types of data parameter units. You can do unit conversions in the Calculator tool as well, but this tool is handier, as it also takes care of the metadata. The conversion affects data values directly in the selected dataset, without creating a copy.


#

Component

Description

1

Column

Column name for which the units will be converted.

2

Target units

Select the desired units. Only supported unit conversions are provided.

If the original unit cannot be converted, the Convert button is disabled.

3

Table view

Opens the preview of data.

4

Convert

Runs the unit conversion.
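As a quick sanity check of the temperature conversion expressions listed in the Calculator section, the round trip below shows they are inverses (the input value is invented):

```python
# Round-trip the temperature conversions: Celsius -> Fahrenheit -> Celsius.
temp_c = 25.0
temp_f = temp_c * 1.8 + 32          # col["TEMP"] * 1.8 + 32
back_c = (temp_f - 32) * (5 / 9)    # (col["TEMP"] - 32) * (5/9)
print(temp_f, round(back_c, 10))    # 77.0 25.0
```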

Aggregate

Groups multiple data records into a single record. Grouping is based on timestamps. Different aggregation rules and functions are used for different data parameters; the rules can be customized by the user. The result of the aggregation procedure is written into a new output dataset. Typical usage of the tool is reducing, e.g., 5-minute data into an hourly dataset. Another example is the time step harmonization of different datasets before their comparison.
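The hourly reduction described above can be sketched in pure Python: readings are grouped by the hour of their timestamp and averaged (timestamps and values are invented):

```python
from collections import defaultdict
from datetime import datetime

# Invented 5-minute GHI readings to be aggregated into hourly means.
records = [
    ("2018-06-01 10:00", 400.0),
    ("2018-06-01 10:05", 420.0),
    ("2018-06-01 10:10", 440.0),
    ("2018-06-01 11:00", 500.0),
]

# Group values by the hour their timestamp falls into.
hourly = defaultdict(list)
for ts, value in records:
    hour = datetime.strptime(ts, "%Y-%m-%d %H:%M").replace(minute=0)
    hourly[hour].append(value)

# MEAN aggregation rule for each hourly interval.
means = {h.strftime("%H:%M"): sum(v) / len(v) for h, v in hourly.items()}
print(means)  # {'10:00': 420.0, '11:00': 500.0}
```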

#

Component

Description

1

Dataset

Input dataset to aggregate.

2

Table view

Opens the table viewer.

3

Aggregate to

Output time step.

4

Edit rules

Opens a separate window for modification of aggregation rules. See the 'Edit aggregation rules' section.

5

Required number of original records

Minimum required number of original records (in %) for summarizing the output interval. If this condition is not met (e.g., there are fewer than 30 one-minute records in an hourly output interval while a 50% minimum is required), the output interval is empty, i.e., filled with a NaN value.

6

Create empty records...

Determines whether empty records created during the aggregation process are removed or kept as part of the output dataset.

7

Aggregate period from, to

Restricts the period of the aggregation procedure. By default, the time range fits the input dataset.

8

Name of the output dataset

Output dataset name.

Database

Database for storing the output dataset.

9

Aggregate

Runs the aggregation.
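The "required number of original records" rule (component 5 above) can be sketched like this: with a 50% threshold, an hourly interval built from 1-minute data needs at least 30 records, otherwise its output becomes NaN. The counts and means are invented:

```python
import math

required_pct = 50
records_per_interval = 60  # 1-minute records in a full hour

counts = {"10:00": 60, "11:00": 25}        # records actually present
means = {"10:00": 431.0, "11:00": 455.0}   # invented hourly means

# Intervals below the threshold are emptied (NaN), the rest keep their mean.
threshold = records_per_interval * required_pct / 100
out = {h: means[h] if counts[h] >= threshold else float("nan") for h in counts}
print(out)  # 11:00 fails the 50% rule and becomes nan
```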

Edit aggregation rules

Dialog window for editing aggregation rules individually for every column. An aggregation rule can be a simple summarization function (SUM, MAX, MIN, MEAN, MEDIAN, MODE, FIRST, LAST) or a more specific method (MEAN_WEIGHTED, ANGULAR) applied by the aggregation procedure. Different rules can be set for hourly and N-minute intervals, and for daily, monthly or yearly intervals.

Opting for the 'None' rule in the combo boxes (hourly and lower, daily and higher) will result in empty aggregated intervals filled with NaN values.

#

Component

Description

1

Hourly and lower

Aggregation rule for hourly and sub-hourly aggregation (5, 10, 15, 20, 30 minutes).

2

Daily and higher

Aggregation rule for higher aggregation (daily, monthly, yearly).

3

Filter by flag values

'Use flags' enables filtering. By default, the filtering is not active. Only visible for flagged data columns.

In 'Flag filter', enter comma-separated flag values to be aggregated (the rest will be filtered out).

'Existing flags' shows the existing flag values for the given column.

4

Set default rules

Resets the rules to the most typical setup.

5

Time zone adjustment...

Before summarizing into days, months or years, records are adjusted to the local time zone. The reason is to collect all the data belonging to the local day.


Average aggregation for Albedo

The method of albedo (ALB) aggregation depends on the presence of Global horizontal irradiation (GHI) in the dataset. If GHI is available:

  • Analyst calculates RHI (Reflected horizontal irradiation) from GHI and albedo (by the formula RHI = GHI * ALB)
  • the mean is then calculated from the computed RHI
  • ALB is recalculated from the aggregated RHI (by the formula ALB = RHI / GHI)

If GHI is not available:

  • a simple mean aggregation of ALB is used
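The RHI-based averaging can be sketched as below, assuming the physical definition albedo = reflected / incident irradiation (so RHI = GHI * ALB and ALB = RHI / GHI); the sample values are invented:

```python
# Two invented records of GHI and albedo.
ghi = [600.0, 800.0]
alb = [0.20, 0.25]

# RHI per record, then the irradiance-weighted aggregated albedo.
rhi = [g * a for g, a in zip(ghi, alb)]                  # 120.0, 200.0
agg_alb = (sum(rhi) / len(rhi)) / (sum(ghi) / len(ghi))  # mean RHI / mean GHI
print(round(agg_alb, 4))  # 0.2286, vs. 0.225 for a simple mean of ALB
```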

Angular aggregation

For data parameters with circular quantities measured in degrees, e.g., wind direction or sun azimuth, it isn't appropriate to calculate arithmetic means. The arithmetic mean of two wind direction values measured in geographical azimuth, 1° (northern wind) and 359° (northern wind), is 180° (southern wind), which is misleading. The angular method converts all angles from polar to Cartesian coordinates and then computes the arithmetic mean of these values. The result is converted back to an angle.

When aggregating wind direction values (the Analyst parameter code is WD), Analyst will always use wind speed data (if the column exists) to weight the averaged wind directions. This approach gives preference to wind directions with a higher intensity of wind.
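The angular mean described above can be sketched directly: angles are converted to unit vectors in Cartesian coordinates, averaged, and converted back. For 1° and 359° the result is north (~0°), not the misleading arithmetic mean of 180°:

```python
import math

# Two northern wind directions in geographical azimuth (degrees).
angles = [1.0, 359.0]

# Average the unit vectors in Cartesian coordinates.
x = sum(math.cos(math.radians(a)) for a in angles) / len(angles)
y = sum(math.sin(math.radians(a)) for a in angles) / len(angles)

# Convert the mean vector back to an angle in [0, 360).
mean_angle = math.degrees(math.atan2(y, x)) % 360
print(mean_angle)
```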

Average weighted (Mean weighted) aggregation

The standard MEAN aggregation function works only with records within the edges of the desired output interval. A record positioned exactly on the right edge of a given interval falls into the next interval (see the picture 'Standard MEAN aggregation rule'). However, a typical measured value of irradiation represents a time interval rather than an instant moment in time.

For example, a record at 07:00 reads 227.7 W/m2. This value represents the average of all readings between 06:52:30 and 07:07:30. If you use the standard mean rule with data like this, your aggregation result will not represent the mean irradiation between 06:00 and 07:00. Instead, it will represent the mean irradiation between 05:52:30 and 06:52:30.

MEAN_WEIGHTED aggregation takes the situation around interval borders into consideration. At the beginning of the process, the weighted MEAN aggregation function densifies the original records (see the picture 'MEAN_WEIGHTED aggregation rule'). The rule makes ten new records of the same value from one original record, giving each value a weight according to its time of occurrence. Then the arithmetic mean is calculated.
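The densification idea can be sketched as follows, under the assumption that the ten sub-records are 1-minute steps centred on the original timestamp (the exact spacing used by Analyst is not specified here). A reading sitting exactly on the right edge of the hour (07:00) then contributes part of its weight to the 06:00-07:00 mean:

```python
# Invented records as (minutes past 06:00, value in W/m2).
records = [(30, 200.0), (60, 227.7)]

# Densify: ten sub-records of the same value around each timestamp.
dense = []
for minute, value in records:
    dense.extend((minute - 4.5 + k, value) for k in range(10))

# Mean over the 06:00-07:00 interval, i.e. sub-records in [0, 60).
in_hour = [v for m, v in dense if 0 <= m < 60]
mean_06_07 = sum(in_hour) / len(in_hour)
print(round(mean_06_07, 2))  # vs. 200.0 with the standard edge-exclusive mean
```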


Join

Adds columns of values from one dataset to another for all matching records. Matching is based on timestamps. As a result, a new output dataset is created in the selected database.

Select the first dataset, then the second dataset that you want to join columns from. The order of the datasets is important! The output dataset will contain a full copy of the first dataset's data (no loss), plus all column values from the second dataset (for matching records). If a column with the same name exists in both datasets, the column values from the first dataset will not be overwritten; instead, such a column will be renamed and added as a new column.

The number of rows after the join operation is the same as in the first dataset (the left table), unless 'Keep rows from both datasets' is checked. If there is not a single matching timestamp between the datasets, the tool will exit rather than create an exact copy of the first dataset. Typically, after the join operation, you will visit the Metadata editor to properly describe the newly added columns or drop unnecessary ones. For adding both rows and columns, use the Combine tool.
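A pure-Python sketch of this join: every row of the first dataset is kept, matching rows of the second contribute their columns, and a name clash on "GHI" is resolved with a suffix ("_2" here is an assumed naming scheme, and all data values are invented):

```python
# Two invented datasets keyed by timestamp.
first = {"2018-01-01 10:00": {"GHI": 500.0},
         "2018-01-01 11:00": {"GHI": 520.0}}
second = {"2018-01-01 10:00": {"GHI": 498.0, "TEMP": 21.5}}

joined = {}
for ts, row in first.items():
    out = dict(row)  # the first dataset is always fully kept
    for name, value in second.get(ts, {}).items():
        # Rename clashing columns instead of overwriting the first dataset.
        out[name if name not in row else name + "_2"] = value
    joined[ts] = out
print(joined)
```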



#

Component

Description

1

First dataset

First dataset - the one whose data is fully kept in the output dataset.

2

Second dataset

Second dataset - the dataset whose columns will be joined. Only column values for matching records will appear in the output dataset. Conflicting column names will be resolved by adding suffixes.

3

Keep rows from both datasets

In case of a timestamp mismatch, records from the second dataset will be appended to the first dataset.

4

Name of the output dataset

Output dataset name. The name is suggested to avoid duplicate names in the selected database.

5

Database

Database for storing the output dataset.

6

Join

Joins the two datasets.


Combine

Creates a new dataset from the combination of two different datasets. After combining, the output dataset will have all rows and columns from both datasets mixed together. The tool is useful for adding new records to an existing dataset, e.g., every month when new sensor readings are available.

This procedure can create undesirable duplicate records for identical timestamps found in both datasets. Therefore, the user must choose which overlapping records stay in the output dataset (those from either the first or the second dataset). Where the columns of matching records intersect, the output values are filled from the first or the second dataset, respectively; columns outside the intersection are filled with NaN values. Optionally, the second dataset can be restricted by time period, so you have more control over which records will be added. The same effect can be achieved during the file import procedure.
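The overlap handling can be sketched like this: the first dataset wins on overlapping timestamps, and columns a kept record does not have are filled with NaN. All names and values below are invented (TEMP exists only in the second dataset, so the overlapping record kept from the first dataset gets NaN there):

```python
import math

# Two invented datasets keyed by timestamp; "10:00" overlaps.
first = {"10:00": {"GHI": 500.0}}
second = {"10:00": {"GHI": 480.0, "TEMP": 20.0},
          "11:00": {"GHI": 510.0, "TEMP": 22.5}}

columns = ["GHI", "TEMP"]  # union of all columns in both datasets
combined = {}
# Apply the first dataset last, so it wins on overlapping timestamps.
for ts, row in list(second.items()) + list(first.items()):
    combined[ts] = {c: row.get(c, float("nan")) for c in columns}
print(sorted(combined))  # ['10:00', '11:00']
```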


#

Component

Description

1

First dataset

Select one from project datasets.

2

Second dataset

Select one from project datasets.

3

Restrict the second dataset

Optionally, the second dataset can be restricted by time period, so you have more control over which records will be added.

4

If data overlaps keep

For overlapping timestamps, select the dataset which will have the priority in the output dataset.

5

Name of the output dataset

Output dataset name. The name is suggested to avoid duplicate names in the selected database.

6

Database

Database for storing the output dataset.

7

Combine

Combines the two datasets.