...
The Analyst calculator is a powerful tool for data editing and calculation. The calculator includes tools for column creating creation and deletingdeletion, calculation tools with common functions, and data selection:
- Create new columns.
- Calculate and write values into columns. Choose a column for calculation, insert the expression, and click on the "Calculate" button. Values in the column will be rewritten by new values.
- Apply calculations on selected records only. Select the record in the table view, open the context menu (right-click), and choose "Add to selection". Write the expression into the box for selection and press the "Apply selection" button.
- Explore columns. Click on the column in the list and press the "Describe column" button.
Check your results before saving. Choose table view and explore table records
Save changes. Click on the "Save" button
Info |
---|
You can drag a column from the column list and drop it into one of the expression boxes. Column expression will be correctly transformed into processing form. This behavior is implemented in the unique values list too. Predefined functions are possible to drag and drop only into the calculation expression box. |
...
# | Component | Description |
1 | List of columns | List of columns available for expression construction. For inserting to the expression you can drag the list item and drop into one of the expression boxes or just double-click on the name and the column will be added to the expression with the cursor in the right form. |
2 | New | Opens the dialog for creating a new column in the dataset. |
Delete | Removes selected column from dataset | |
3 | Expression box for calculation | Insert expression for calculation. Calculation of values will be applied on selected records only or on all records. Below the expression box are buttons with operands. |
4 | Expression buttons | Click on the buttons to insert symbols in the expression box |
Clear | Clears expression field | |
5 | Functions | List of predefined functions which are possible to use in expression boxes |
6 | Table view | The table view shows selected records - highlighted by yellow color. Table view includes a context menu with functions for manual selection and unselection deselection of records. |
7 | Clear selection | Clears selected records |
8 | Calculate | Launch calculation |
...
Type | Description | Expression |
Missing solar irradiance components | GHI - Global horizontal irradiance | col["DNI"] * sin(radians(sun_elevation)) + col["DIF"] |
DNI - Direct normal irradiance | (col["GHI"] - col["DIF"]) / sin(radians(sun_elevation)) | |
DIF - Diffuse horizontal irradiance | col["GHI"] - col["DNI"] * sin(radians(sun_elevation)) | |
Kt - Clearness index | col["GHI"] / ETR | |
Ktb | col["DNI"] / ETR | |
D2G - Diffuse to global ratio | col["DIF"] / col["GHI"] | |
Unit conversions | Temperature Celzius Celsius to Fahrenheit | col["TEMP"] * 1.8 + 32 |
Temperature Fahrenheit to CelziusCelsius | (col["TEMP"] - 32)*(5/9) | |
Irradiation kWh/m2 to kJ/m2 | col["DNI"] * 3600 | |
Irradiation kJ/m2 to kWh/m2 | col["DNI"] / 3600 | |
Datetime selection | From a specific date (year) | col["datetime"] > "2018" |
From a specific date (year) | col["datetime"] > "2017-10" | |
From a specific date (year) | col["datetime"] > "2017-10-30" | |
From a specific date (year) | col["datetime"] > "2017-09-30 10:35:00" | |
Exact date (year)* | col["datetime"] == "2017-09-30" | |
Excludes date* | col["datetime"] != "2017-09-30" | |
Combination using logical operators | (col["datetime"] > "2017-09-30 10:35:00") And (col["datetime"] < "2017-09-30 12:15:00") | |
Combination using logical operators | (col["datetime"] > "2017-09-30") And (col["datetime"] < "2017-10-01") | |
Combination using logical operators | (col["datetime"] > "10:00") And (col["datetime"] < "15:00") | |
Column selection | Combination using logical operators | (( col["GHI"] + col["DNI"] ) > 1200) And (col["TEMP"] < 30.2) |
Combination using logical operators | (( col["GHI"] + col["DNI"] ) > 1200) And (Not( col["TEMP"] > 30.2)) | |
Mixed selection | Combination using logical operators | (Not((col["datetime"] > "09:00") And (col["datetime"] < "15:00"))) Or (col["GHI"] > 800 ) And (col["DNI"] < 500) |
...
MEAN_WEIGHTED aggregation takes into consideration the situation around interval borders. At the beginning of the process, weighted MEAN aggregation function densifies original records (see the picture 'MEAN_WEIGHTED aggregation rule'). The rule makes ten new records of the same value from one original record - giving the weight of each value by its occurrence. Then, the arithmetic mean is calculated.
Join
Adds columns of values from one dataset to another dataset for all matching records. Matching is based on timestamps. As a result of this tool, new output dataset is created in the selected database.
...
Combine datasets
Create a new dataset by combining two different datasets. The tool is useful for adding new records and columns to an existing dataset, e.g., every month, when new sensor readings are available.
Select the primary dataset and the secondary dataset you want to combine. The order of datasets is important! The output dataset will contain the full copy of the first primary dataset data (no loss) plus all column values or additional rows from the second secondary dataset (for matching records). If a column with the same name exists in both datasets, the column values in the first dataset will not be overwritten. Instead, such column will be renamed and added as a new column.
The number of rows after the join operation is always the same as in the first dataset (left table). If there is no single matching timestamp between the datasets, the tool will exit rather than create the exact copy of the first dataset. Typically, after the join operation, you will visit the Metadata editor to properly describe newly added columns or drop unnecessary columns. For adding both rows and columns, please use the Combine tool.
inc-drawio. As a result of this tool, a new output dataset is created in the selected database.
Drawio | ||||||
---|---|---|---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
# | Component | Description |
1 | First Primary datasetFirst dataset - the one having all data in the output dataset | Select a dataset in current project. |
2 | Second Secondary dataset | Second dataset - the dataset which columns will be joined. Only column values for matching records will appear in the output dataset. Conflicting column names will be resolved by adding suffixes. |
3 | Keep rows from both datasets | Records from the second dataset will be append to the first dataset in case of timestamp mismatch. |
4 | Name of the output dataset | Select a dataset in current project. |
3 | Clip secondary dataset | Optionally, the secondary dataset can be restricted by a period, so you have more control over which records will be added. |
4 | Name of combined dataset | Output dataset name. The suggested name is suggested to avoid avoids duplicate names in the selected database or in current project. |
5 | Database | Database for storing the output datasetSelect database to store the combined dataset. By default, it is the database where the primary dataset is stored. |
6 | Join | Joins the two datasets |
Combine
Creates new dataset from the combination of two different datasets. After combining, the output dataset will have all rows and columns from both datasets mixed together. The tool is useful for adding new records to an existing dataset, e.g., every month, when new sensor readings are available.
This procedure can create undesirable duplicate records for identical timestamps found in both datasets. Therefore the user must choose which overlapping records will stay in the output dataset (either from the first or from the second dataset). If columns in matching records intersect, the output values will be filled by first or second dataset respectively. Columns outside the intersection will be filled with NaN values. Optionally, the second dataset can be restricted by time period, so you have more control over which records will be added. The same effect can be achieved during the file import procedure.
Drawio | ||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
# | Component | Description | |
1 | First dataset | Select one from project datasets. | |
2 | Second dataset | Select one from project datasets. | |
3 | Restrict the second dataset | Optionally, the second dataset can be restricted by time period, so you have more control over which records will be added. | |
4 | If data overlaps keep | For overlapping timestamps, select the dataset which will have the priority in the output dataset. | |
5 | Name of the output dataset | Output dataset name. The name is suggested to avoid duplicate names in the selected database. | |
6 | Database | Database for storing the output dataset. | |
7 | Combine | Combines the two datasetsCombine records | Option to select which rows in the secondary dataset should be filtered out
|
7 | Combine columns | Option to select which columns in the secondary dataset should be filtered out
| |
8 | Combine columns with the same name | Option to handle columns with the same name
| |
9 | Note | In case the value for one particular time stamp exists in both primary and secondary datasets, value from the primary dataset is taken and the value from the secondary dataset is ignored. | |
10 | Combine | Combines the two datasets |
Measured values
There are multiple options how to combine datasets:
- skip columns or rows in the secondary dataset that do not exist in the primary dataset, or
- include all columns or all rows from the secondary dataset
There are two options how to handle case when a column with the same name exists in both datasets:
- columns from the secondary dataset will be renamed and added as a new column, or
- columns can be merged into one column
In case the value for one particular time stamp exists in both the primary and secondary dataset, value from the primary dataset is taken and the value from the secondary dataset is ignored.
Optionally, the secondary dataset can be restricted by a period, so you have more control over which records will be added.
Visualizations of different combined dataset options are below:
Flags
Flag columns are combined with the same logic as measured columns (parameters). Flag column names are taken from the primary dataset. In case there are flag columns with identical names, some columns will be automatically renamed.
In case there are flag columns that are not assigned to any parameter, these columns will be part of the combined dataset. Such flag columns will not be merged, when there are columns with identical names, they are always renamed.
Maintenance log
Maintenance logs are combined as well. Based on parametrization maintenance log from the secondary dataset can be clipped to a shorter date range.
QC status
QC statuses are combined too. QC status is removed in case both primary and secondary datasets do not have particular QC status. It means that one of the datasets didn't have quality control done, and therefore also combined dataset is considered that it didn't have quality control done.
When is combining datasets not allowed
In the following cases, it is not allowed to combine datasets:
Database of the primary, secondary, or combined dataset is not available
One of the datasets is empty
In case the option to merge columns is selected, the following scenarios are not allowed:
Merging columns of the same name but different parameter type
Merging GTI columns of the same name but different GTI configuration
Merging columns of the same name but values in different units
Warning while combining datasets
Users are informed about the following cases before datasets are combined. It is up to the user to reconsider whether to continue and combine datasets or to cancel the operation and check the dataset or metadata first.
Reason for a warning | Combined dataset |
---|---|
Site location that is not identical. | Combined dataset will have location from the primary dataset |
Merging GTI columns with same name, but one dataset has a missing GTI config. | Combined dataset will contain the existing GTI config on the column. |
The time step is not identical. | Combined dataset will have the smaller time step from both datasets. |
The instrument name is not the same. | Combined dataset will have instruments from the primary dataset. |
The primary or secondary dataset contains flag column(s) not linked to any parameter. | Combined dataset contains all the flag columns which are not linked to any column |
To see more details about which columns are affected, use the "Show Details..." option.