Description:
Convert records from one Avro schema to another, including support for flattening and simple type conversions.
Tags:
avro, convert, kite
Properties:
In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values and whether a property supports the Expression Language.
Name | Default Value | Allowable Values | Description |
Input Schema | | | Avro Schema of Input Flowfiles. Supports Expression Language: true |
Output Schema | | | Avro Schema of Output Flowfiles. Supports Expression Language: true |
Locale | default | | Locale to use for scanning data, or "default" for the JVM default |
Dynamic Properties:
Dynamic Properties allow the user to specify both the name and value of a property.
Name | Value | Description |
Field name from input schema | Field name for output schema | Explicit mapping from the input schema to the output schema. Supports renaming fields and stepping into nested records on the input schema using notation like parent.id |
Relationships:
Name | Description |
success | Avro content that converted successfully |
failure | Avro content that failed to convert |
Reads Attributes:
None specified.
Writes Attributes:
None specified.
Summary:
This processor converts data between two Avro formats, such as those coming from the ConvertCSVToAvro or ConvertJSONToAvro processors. The input and output content of the flow files should be Avro data files. The processor supports the following basic type conversions:
- Anything to String, using the data’s default String representation
- String types to numeric types int, long, double, and float
- Conversion to and from optional Avro types
In addition, fields can be renamed or unpacked from a record type by using the dynamic properties.
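The conversions listed above can be illustrated with a small sketch. This is plain Python written for illustration only, not the processor's implementation; the helper name `convert_value` and its `optional` flag are hypothetical, and the type names simply mirror the Avro primitives mentioned in the list.

```python
from typing import Any, Optional

# Hypothetical parsers for the numeric targets named in the list above.
_NUMERIC = {"int": int, "long": int, "float": float, "double": float}


def convert_value(value: Any, target: str, optional: bool = False) -> Optional[Any]:
    """Convert `value` to the primitive type named by `target` (illustrative only)."""
    if value is None:
        if optional:
            # An optional type (a union with "null") accepts a missing value.
            return None
        raise ValueError("null is not allowed for a non-optional target")
    if target == "string":
        # Anything to String, using the value's default string representation.
        return str(value)
    if target in _NUMERIC and isinstance(value, str):
        # String to numeric; raises ValueError if the text is not a valid number.
        return _NUMERIC[target](value.strip())
    raise TypeError(f"unsupported conversion of {type(value).__name__} to {target}")
```

For instance, `convert_value("7", "long")` yields the integer 7, while a string that cannot be parsed raises an error, which is loosely analogous to a record being routed to the failure relationship.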
Mapping Example:
Throughout this example, we will refer to input data with the following schema:
{
  "type": "record",
  "name": "CustomerInput",
  "namespace": "org.apache.example",
  "fields": [
    {
      "name": "id",
      "type": "string"
    },
    {
      "name": "companyName",
      "type": ["null", "string"],
      "default": null
    },
    {
      "name": "revenue",
      "type": ["null", "string"],
      "default": null
    },
    {
      "name": "parent",
      "type": ["null", {
        "type": "record",
        "name": "parent",
        "fields": [
          {
            "name": "name",
            "type": ["null", "string"],
            "default": null
          },
          {
            "name": "id",
            "type": "string"
          }
        ]
      }],
      "default": null
    }
  ]
}
Although the id and revenue fields are declared as string, they are logically long and double, respectively. By default, fields with matching names are mapped automatically, so the input can be converted to the following output schema without using dynamic properties:
{
  "type": "record",
  "name": "SimpleCustomerOutput",
  "namespace": "org.apache.example",
  "fields": [
    {
      "name": "id",
      "type": "long"
    },
    {
      "name": "companyName",
      "type": ["null", "string"],
      "default": null
    },
    {
      "name": "revenue",
      "type": ["null", "double"],
      "default": null
    }
  ]
}
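As a rough illustration of name-based mapping, the sketch below converts one input record to the SimpleCustomerOutput schema shown above. It is plain Python over dictionaries rather than real Avro data files, and it is not how the processor is implemented; the function name `convert_by_name` and the `PARSERS` table are assumptions made for this example.

```python
import json

# The simple output schema from the example above.
OUTPUT_SCHEMA = json.loads("""
{
  "type": "record",
  "name": "SimpleCustomerOutput",
  "namespace": "org.apache.example",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "companyName", "type": ["null", "string"], "default": null},
    {"name": "revenue", "type": ["null", "double"], "default": null}
  ]
}
""")

# Hypothetical parsers for the primitive types used in this example.
PARSERS = {"string": str, "int": int, "long": int, "float": float, "double": float}


def convert_by_name(record, schema):
    """Build an output record by matching field names and converting values."""
    out = {}
    for field in schema["fields"]:
        ftype = field["type"]
        # A union such as ["null", "double"] marks the field as optional.
        optional = isinstance(ftype, list) and "null" in ftype
        base = next(t for t in ftype if t != "null") if isinstance(ftype, list) else ftype
        value = record.get(field["name"])
        if value is None:
            if not optional:
                raise ValueError(f"missing required field {field['name']!r}")
            out[field["name"]] = field.get("default")
        else:
            out[field["name"]] = PARSERS[base](value)
    return out
```

With a made-up input record such as `{"id": "1001", "companyName": "Acme", "revenue": "1250000.50", "parent": None}`, the function returns `{"id": 1001, "companyName": "Acme", "revenue": 1250000.5}`. The parent field is dropped because the output schema has no field of that name.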
To rename companyName to name and to extract the parent’s id field, both a schema and dynamic properties must be provided. For example, to convert to the following schema:
{
  "type": "record",
  "name": "SimpleCustomerOutput",
  "namespace": "org.apache.example",
  "fields": [
    {
      "name": "id",
      "type": "long"
    },
    {
      "name": "name",
      "type": ["null", "string"],
      "default": null
    },
    {
      "name": "revenue",
      "type": ["null", "double"],
      "default": null
    },
    {
      "name": "parentId",
      "type": ["null", "long"],
      "default": null
    }
  ]
}
The following dynamic properties would be used:
"companyName" -> "name"
"parent.id" -> "parentId"
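The effect of these two dynamic properties can be sketched in plain Python. The helpers `resolve_path` and `apply_mappings` are hypothetical names used only for this illustration; they show renaming and stepping into a nested record with dotted notation, and deliberately leave out type conversion. This is not the processor's own code.

```python
# Dynamic properties from the example: input field path -> output field name.
MAPPINGS = {"companyName": "name", "parent.id": "parentId"}


def resolve_path(record, path):
    """Walk a dotted path such as 'parent.id'; return None if any step is missing."""
    current = record
    for part in path.split("."):
        if not isinstance(current, dict):
            # The parent record is null or absent, so the nested value is too.
            return None
        current = current.get(part)
    return current


def apply_mappings(record, mappings):
    """Return only the explicitly mapped output fields, keyed by their new names."""
    return {out_name: resolve_path(record, in_path) for in_path, out_name in mappings.items()}
```

For a made-up record whose parent is `{"name": "Holding", "id": "7"}` and whose companyName is "Acme", `apply_mappings` returns `{"name": "Acme", "parentId": "7"}`. When parent is null, parentId resolves to `None`, which fits the optional parentId field in the output schema.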