Importer
Importers (Main Menu / Admin / Importer) are used for automated imports from various Data Sources.
Introduction
Importers in Txture are a powerful tool to populate your instance with data from all sorts of data sources. There are four types of importer:
- Asset importer: Used to import assets of any asset type.
- Link importer: Once assets have been imported, a link importer can be used to create dependencies between them.
- Property importer: Oftentimes it is the case that several data sources have information on the same assets (e.g. a Hypervisor and a CMDB). In such a case a property importer can be used to augment data from another source to previously imported assets.
- User importer: User importers allow importing accounts from CSV, Active Directory or other means.
Synchronization
It's critical to understand that importers in Txture constantly keep your data in sync. Txture keeps track of the information source for each asset and each of its properties (e.g. this asset was imported from XY with the ID Z). When the importer runs again, Txture can therefore determine whether each asset is still present in the data source, and classifies it as one of the following:
- synced: This means an asset was found in the exact same state as already present in Txture. It will remain untouched.
- updated: One or more properties of an asset have changed in the data source. The asset will be updated accordingly.
- removed: The asset is not present in the data source any more and will hence be removed in Txture as well.
For each importer run, a small overview of how many assets have been processed is presented. In addition to the above states, the importer statistics might also show one of the following two values:
- loaded: Number of data sets found at the data source.
- imported: Assets that have been imported (created) during the importer run.
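The bookkeeping of a run can be pictured as follows (a simplified conceptual sketch, not Txture's actual implementation; assets are keyed by the ID they carry in the data source):

// hypothetical bookkeeping of one importer run
def classifyRun(Map existing, Map incoming) {
    def stats = [loaded: incoming.size(), imported: 0, synced: 0, updated: 0, removed: 0]
    incoming.each { id, data ->
        if (!existing.containsKey(id)) {
            stats.imported++          // new in the data source: the asset is created
        } else if (existing[id] == data) {
            stats.synced++            // unchanged: the asset remains untouched
        } else {
            stats.updated++           // changed: the asset's properties are updated
        }
    }
    // assets no longer present in the data source are removed from Txture
    stats.removed = existing.keySet().count { !incoming.containsKey(it) }
    return stats
}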
Importers and the Last Modified Date
As stated earlier, importers in Txture will always synchronize the data in Txture with the data in your data source.
Your data source is treated as the single (read-only) source of truth in this process.
Each importer run will try to minimize the set of changes to apply in Txture.
This means that if the data in Txture is equal to the data reported by your data source, the importer will not touch it.
Consequently, the Last Modified Date will also remain the same as before.
When importing data, the Last Modified Date therefore corresponds to the actual last modification, not the last synchronization.
Creating an importer
Importers use the connection of Data Sources to extract data and allow the user to map the imported data to the internal structure of the Txture instance. When creating a new importer, you first need to enter a descriptive name and hit save. In the following importer configuration view, you need to select the data source and the desired importer type (asset, link, user or property) from a drop-down menu.
In the image below you can see the configuration for a Property importer for Assets that uses a Text Data Source. For a text data source, you can choose whether the content is in CSV or JSON format.
For data sources which are compatible with only one kind of importer, the last step is done automatically in the background.
Once the data source and type resolution settings have been made, the preview on the right can be loaded to validate the input.
Transform data
After an importer has loaded a preview of the data, the processing can begin. The data transformation features offer a solid ETL layer capable of handling almost all required data transformation processes.
After configuring the steps, the user can load a transformed preview in which the results of the transformation steps are integrated. Some scenarios require the transformation steps to run in a certain order. In this case, you can drag the operations up and down using the dots to the left of each transformation step.
The available transformation steps are:
- Filter script: Filter out rows that are not meant to be imported by writing conditions in a Groovy script
- Dynamic columns: Transform data to the required format using a Groovy script
- Data enhancers: Augment imported data with additional columns from Txture's technology databases
- Parse column to date: Convert the values of a column to a date type (unix timestamp)
Filter script
The importer allows you to filter the data provided by the data source. This optional step allows you to import only a subset of the rows if the data source does not provide filter mechanisms on its own (e.g. import from CSV).
For filtering, a filter script is needed, which can be defined in the Transformation section after fetching the initial preview.
Filter scripts can be written in Groovy.
The script is applied once per row and must return a boolean value indicating whether a specific row should be processed further or not.
Therefore, the example return false would skip all rows, and return true would keep all rows (the same behaviour as with no script at all).
Skipped rows are handled the same way as if they had not been provided by the data source in the first place. For example, within an asset importer, any skipped asset row will not be imported. Additionally, if the asset has been imported in a previous run, it will be deleted.
The script has access to the values of the current row via the row object.
It provides access to the values of the other columns via getter methods (see API reference).
If a script fails to execute on a certain row, the error is logged within the importer log and the row is skipped.
In the data preview you can see the result of the filter script.
The column Filtered? tells you whether the row will be imported or not.
You can also use a filter script in combination with other data transformation steps, such as the data enhancer.
In the example below, the column Instance Name.dataCenterKey, added by the data enhancer in the first transformation step, is used to filter the dataset by data center.
In the data preview, you can see that only rows with the data center key "azure_europe-west" are imported.
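A matching filter script could look like this (a minimal sketch; the column name Instance Name.dataCenterKey and the key azure_europe-west are taken from the example above):

// keep only rows that belong to the desired data center
return row.getString('Instance Name.dataCenterKey') == 'azure_europe-west'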
Filter script examples
Filter out local ip addresses:
return !row.getString('ip').startsWith('127') && !row.getString('ip').startsWith('192.168')
Only process rows from Austria:
return row.getString('country') == 'AT'
Skip all rows where a deleted flag is set (i.e. keep only rows where the column has no value):
return row.isNull('deleted')
Dynamic columns
Dynamic columns can be used to create new columns whose value is defined by a simple Groovy script. They are available for asset, link and user importers.
The dynamic columns can/must be mapped to properties just like normal columns. Each dynamic column is defined by its name and a script that returns its value on execution. The name of each dynamic column must be unique. The provided script is executed once per row and must return the value of the column.
Valid return types are: String, numeric values (double, float, int, long or short), boolean, Date and Instant. For multi-value properties, simply return an Iterable or an Array of the types mentioned above.
The script has access to the values of the current row via the row object (see API reference below).
If a script fails to execute on a certain row, the error is logged within the importer log and the value is set to null.
Dynamic column examples
The name of a server should always be uppercase:
return row.getString('name').toUpperCase();
A column contains a comma separated list of names which should be split into a multi value property:
def list = row.getString('supporters').split(',');
// list now contains something like ['John Doe', 'Janie Roe', ...]
return list;
A column contains a comma separated list of active users which should be counted:
if (row.isNull('active_users')) {
    return 0;
} else {
    def list = row.getString('active_users').split(',');
    return list.length;
}
API reference
Groovy scripts for filter and dynamic columns have access to the row object.
To access the values of other columns, the following getter methods are provided.
The argument columnName is a string containing the case-sensitive name of the column (as seen in the preview).
String getString(String columnName)
Returns the column value as a String. Use return row.getString('firstName') + ' ' + row.getString('lastName') to have the full name as a new column, or return row.getString('ip').startsWith('127') == false to filter out local IP addresses. String get(String columnName) is a simple alias to getString.
Parameters:
columnName | name of the column (as seen in the preview) |
Returns:
The column value as a String.
boolean getBoolean(String columnName)
Interprets the column value as a boolean.
Numeric values are interpreted as true if they are != 0.
The literals true and false are parsed to their boolean equivalents.
Use return row.getBoolean('active') == true to import only active entities.
Parameters:
columnName | name of the column (as seen in the preview) |
Returns:
The column value interpreted as a boolean.
double getNumber(String columnName)
Returns the column value as a number of type double.
Use return row.getNumber('height') >= 10 to skip certain small items.
Parameters:
columnName | name of the column (as seen in the preview) |
Date getDate(String columnName)
Returns the column value as a Date. Numeric values are interpreted as milliseconds since 1970-01-01. Strings are parsed in ISO 8601 format.
Use return row.getDate('sold').getDay() == 1 to only import entities sold on Mondays.
Parameters:
columnName | name of the column (as seen in the preview) |
Object getRaw(String columnName)
Returns the value of the column without any further conversion or check. This should normally not be needed.
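For illustration, a dynamic column script could use getRaw to inspect the raw value before converting it (a sketch; the column name capacity is hypothetical, and the raw type depends on the data source):

// convert the raw value to a double, whether the data source
// delivered it as a number or as a string (illustrative only)
if (row.isNull('capacity')) {
    return 0 // illustrative fallback when the column has no value
}
def raw = row.getRaw('capacity')
if (raw instanceof Number) {
    return raw.doubleValue()
}
return Double.parseDouble(raw.toString())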
Note that null values must be handled explicitly with row.isNull('name of column'):
If a certain column can contain null (meaning no value), a call to getString, getNumber etc. would throw an Exception.
You should therefore test whether the value is null with row.isNull('name of column').
// the name of an imported server might not be set in the data source
if (row.isNull('serverName')) {
    return 'Unnamed Server'; // this can be used as name for all unnamed servers
} else {
    // now it is safe to call getString
    return row.getString('serverName');
}
Parameters:
columnName | name of the column (as seen in the preview) |
Map<String, String> getTags(String columnName)
Returns the value of the column after parsing it into a Map<String, String>.
This function assumes that the following characters are not part of the keys or values (which is true for the tags of all major cloud providers):
- JSON characters: {, }, [, ], "
- Line breaks
- : (except if escaped with \, i.e. \: will be parsed to :)
A variety of formats is supported:
- Plain text: assumes : as the separator between key and value, and line breaks between pairs. Use \: to include a colon as part of a key or value.
- JSON object: each field in the object turns into a key, with the field value as its associated value.
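For illustration, both of the following cell values (plain text and JSON) would parse to the same map (a sketch; the column name cloudTags matches the example below):

// plain text cell:       env:prod
//                        owner:alice
// equivalent JSON cell:  {"env": "prod", "owner": "alice"}
def tags = row.getTags('cloudTags')
assert tags == [env: 'prod', owner: 'alice']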
Note that null values must be handled explicitly with row.isNull('name of column'):
If a certain column can contain null (meaning no value), a call to getString, getNumber etc. would throw an Exception.
You should therefore test whether the value is null with row.isNull('name of column').
// gets the "cloudTags" as a Map...
def tags = row.getTags('cloudTags')
// ... and removes an undesired key from it
tags.removeKey("irrelevant")
// lowercase a value and re-assign it to the same key
tags["someKey"] = tags["someKey"].toLowerCase()
// return the tags map. It is possible to directly
// assign this to a multi-valued key-value property of an asset or link.
return tags
Parameters:
columnName | name of the column (as seen in the preview) |
API helpers
The row object also provides access to some helpers.
To create a value of type Range, the following helpers can be used:
Range createRange(double lowerBound, double upperBound)
Creates a range from the given lower and upper bounds.
If you want to model the allowed temperature range of a server but the data is provided as two separate columns from your data source:
def min = row.getNumber('minTemperature');
def max = row.getNumber('maxTemperature');
// return a Range from two numeric values
return row.createRange(min, max);
If you want to model the allowed temperature range of a server and the data is in a single column:
// column allowedTemperature contains range in the form "xxxxx-yyyyy", e.g. "40-90"
def value = row.getString('allowedTemperature');
def splitValues = value.split('-');
def min = Double.parseDouble(splitValues[0]);
def max = Double.parseDouble(splitValues[1]);
return row.createRange(min, max);
Parameters:
lowerBound | the lower bound of the range |
upperBound | the upper bound of the range |
Range createRangeOpenEnded(double lowerBound)
Creates an open-ended range from the given lower bound.
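For example, to model a minimum temperature with no upper limit (a sketch; the column name minTemperature follows the examples above):

return row.createRangeOpenEnded(row.getNumber('minTemperature'));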
Parameters:
lowerBound | the lower bound of the range |
Range createRangeOpenStart(double upperBound)
Creates an open-start range (no lower bound) from the given upper bound.
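For example, to model a maximum temperature with no lower bound (a sketch; the column name maxTemperature follows the examples above):

return row.createRangeOpenStart(row.getNumber('maxTemperature'));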
Parameters:
upperBound | the upper bound of the range |
CSV in filter scripts and dynamic columns
When working with a row object in a Dynamic Column or Filter script, you can fetch the content of a cell and parse it at the same time.
Please refer to our Scripting section.
Data enhancers
Enhancers can augment the data provided by a Data Source by introducing additional columns. Those additional columns can be used as if they were provided by the data source itself; they can be used in filter scripts, in dynamic columns, or for mapping them directly to Txture properties. The actual data added (and where this data comes from) depends on the enhancer.
Enhancers are configured in the Transformation section of an importer and provide the following functionality:
- Resolve technologies: to normalize and categorize technologies, based on Txture's Taxonomy.
- Resolve product instances: to retrieve additional instance specifications (e.g. RAM, CPU cores, ...) from the Taxonomy.
Technology enhancer
The Technology Enhancer is aimed mostly at cloud migration use cases, but is also useful for general technology management. It requires a target column as its configuration. The cell values in the target column will be collected and sent to the Txture Taxonomy, where they will be matched against the catalogue of known technologies. The result is three additional columns (per target column):
- targetColumnName.technologyName: The normalized and cleaned technology name which is used in the Txture Taxonomy. This column is useful if you would like to have normalized technology names, e.g. for general technology management. When mapping the technology property of an asset, we recommend using this column.
- targetColumnName.isRelevant: Boolean value that indicates whether the technology is known to be relevant for cloud migrations or not. For example, Apache Tomcat is a relevant technology, but Notepad is not. This column is intended to be used in the filter script in order to reduce the incoming rows to the relevant ones only.
- targetColumnName.type: The categorization of the technology according to the Txture Taxonomy. This additional column is useful if you want to restrict the imported data to certain technology types, e.g. only databases.
The screenshot below shows the "transformed preview" after applying the Technology Enhancer to the original "Technology" column, with added columns highlighted in orange:
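A filter script can use the added columns to keep only migration-relevant rows (a minimal sketch, assuming the target column is named Technology as in the screenshot above):

// keep only technologies that the Taxonomy marks as relevant for cloud migrations
if (row.isNull('Technology.isRelevant')) {
    return false // no Taxonomy match: skip the row
}
return row.getBoolean('Technology.isRelevant')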
Product instance enhancer
The Product Instance Enhancer retrieves technical specifications of product instances from the Txture Taxonomy.
To do so, the cell values in the target column and some additional context (provider name, product name, operating system, region name, high availability, license) are collected and sent to the Taxonomy, where they are matched against the catalogue of known product instances.
The result is a set of additional columns, some of which may be null, as the available properties differ depending on the asset type:
- targetColumnName.txtureId: The unique identifier assigned by the Taxonomy to the matched product instance (String).
- targetColumnName.taxonomyId: The taxonomy identifier associated with the product instance in the Taxonomy (String).
- targetColumnName.instanceName: The name of the product instance as retrieved from the Taxonomy (String).
- targetColumnName.providerName: The name of the service provider associated with the product instance (String).
- targetColumnName.productName: The name of the product associated with the product instance (String).
- targetColumnName.assetType: The type of asset corresponding to the product instance (String).
- targetColumnName.cpuCores: The number of CPU cores allocated to the product instance (Integer).
- targetColumnName.ram: The amount of RAM allocated to the product instance (Integer).
- targetColumnName.instanceType: The type or category of the product instance (String).
- targetColumnName.instanceFamily: The family or group to which the product instance belongs (String).
- targetColumnName.operatingSystem: The operating system running on the product instance (String).
- targetColumnName.instanceStorageAvailable: The available storage capacity of the product instance (Integer).
- targetColumnName.instanceStorageDriveType: The type of drive used for storage on the product instance (String).
- targetColumnName.instanceStorageRedundancy: The level of redundancy implemented in the storage of the product instance (String).
- targetColumnName.instanceStorageAccessType: The access type for storage on the product instance (String).
- targetColumnName.instanceStorageIsFlexible: Indicates whether the storage capacity of the product instance is flexible (Boolean).
- targetColumnName.instanceStorageFixedCapacity: The fixed storage capacity allocated to the product instance (Integer).
- targetColumnName.instanceStorageFlexibleMinVolumeSizeInMiB: The minimum volume size for flexible storage in Mebibytes (MiB) (Integer).
- targetColumnName.instanceStorageFlexibleMaxVolumeSizeInMiB: The maximum volume size for flexible storage in Mebibytes (MiB) (Integer).
- targetColumnName.isHighAvailable: Indicates whether the product instance is designed for high availability (Boolean).
- targetColumnName.license: The licensing information associated with the product instance (String).
Product instance enhancer example
The CSV data source used in the asset importer consists of five columns:
| ID | Product Name | Instance Name | Operating System | Region Name |
|---|---|---|---|---|
| 6wfv3d0d36 | Azure Database for PostgreSQL | basic-compute-g5-2 | | Azure: West Europe |
| ofe9mk0fw0 | Azure Database for PostgreSQL | basic-compute-g5-2 | | Azure: West Europe |
| x58ixy4b7s | Azure Virtual Machines | Standard B2ts v2 | Red Hat Enterprise Linux | Azure: Germany West Central |
| qgbg1aalnp | Azure Virtual Machines | Standard F4 | Microsoft Windows | Azure: West Europe |
To retrieve additional information, the product instance enhancer can be added and configured in the data transformation section of the asset importer:
Finally, the retrieved columns can be mapped to the Structure.
The column Instance Name.assetType can be used for the dynamic type resolution.
By clicking on Propose mapping, Txture suggests a mapping for the retrieved columns.
Before running the importer, add any additional mappings and make sure that all proposed mappings are correct.
If we don't find your Technology or Product Instance...
If the Txture Taxonomy does not contain an entry for a particular cell value, the added columns will have a default value of null.
You can use the row.isNull(<columnName>) method to check for this in your filter scripts and dynamic column scripts.
You can use the results of the enhancer in your dynamic columns and filter scripts. Here is a typical example:
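A dynamic column script along these lines combines the enhancer output into a sizing label (a sketch; the target column name Instance Name matches the example above, and the label format itself is illustrative):

// build a human-readable sizing label from the enhancer columns
if (row.isNull('Instance Name.cpuCores') || row.isNull('Instance Name.ram')) {
    return 'unknown sizing' // no match in the Taxonomy
}
def cores = (int) row.getNumber('Instance Name.cpuCores')
def ram = (int) row.getNumber('Instance Name.ram')
return "${cores} cores / ${ram} RAM".toString()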
Changing data sources of an importer
It is also possible to change the data source of an existing importer. This is useful if you want to keep all configurations you made to an importer. In the Data Source settings of an importer, you can simply change the selected Data source via the dedicated dropdown.
Note that if you have already imported assets before and want to update them using the new data source, you need to ensure that the new data source provides the same asset IDs. Otherwise, the importer will not be able to identify the already existing assets and will therefore create new ones and delete the old assets.
FAQ
What if my importer has a fixed schema, but the information I want to import is not listed?
Importers with a fixed schema are mostly vendor-specific importers, and they rely on the information that is accessible via the Data Source. When implementing these importers, we tried to extract as much information from the endpoints as possible, but sometimes the endpoints are updated and the data that can be extracted changes. If this is the case for you, please contact our Support and tell us which data you would like to import. We will check for updates from the vendor and, if possible, add a column for your desired data.