Description and usage of PutSolrRecord processor:
Indexes the Records from a FlowFile into Solr
Tags:
Apache, Solr, Put, Send, Record
Properties:
In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, whether a property supports the Expression Language, and whether a property is considered “sensitive”, meaning that its value will be encrypted. Before entering a value in a sensitive property, ensure that the nifi.properties file has an entry for the property nifi.sensitive.props.key.
Name | Default Value | Allowable Values | Description |
Solr Type | Standard | Cloud, Standard | The type of Solr instance, Cloud or Standard. |
Solr Location | | | The Solr URL for a Solr Type of Standard (ex: http://localhost:8984/solr/gettingstarted), or the ZooKeeper hosts for a Solr Type of Cloud (ex: localhost:9983). Supports Expression Language: true (will be evaluated using variable registry only) |
Collection | | | The Solr collection name, only used with a Solr Type of Cloud. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry) |
Solr Update Path | /update | | The path in Solr to post the FlowFile Records. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry) |
Record Reader | | Controller Service API: RecordReaderFactory. Implementations: AvroReader, SyslogReader, ScriptedReader, XMLReader, GrokReader, Syslog5424Reader, CSVReader, JsonTreeReader, JsonPathReader | Specifies the Controller Service to use for parsing incoming data and determining the data's schema. |
Fields To Index | | | Comma-separated list of field names to write. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry) |
Commit Within | 5000 | | The number of milliseconds before the given update is committed. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry) |
Kerberos Credentials Service | | Controller Service API: KerberosCredentialsService. Implementation: KeytabCredentialsService | Specifies the Kerberos Credentials Controller Service that should be used for authenticating with Kerberos. |
Username | | | The username to use when Solr is configured with basic authentication. Supports Expression Language: true (will be evaluated using variable registry only) |
Password | | | The password to use when Solr is configured with basic authentication. Sensitive Property: true. Supports Expression Language: true (will be evaluated using variable registry only) |
SSL Context Service | | Controller Service API: SSLContextService. Implementations: StandardRestrictedSSLContextService, StandardSSLContextService | The Controller Service to use in order to obtain an SSL Context. This property must be set when communicating with Solr over HTTPS. |
Solr Socket Timeout | 10 seconds | | The amount of time to wait for data on a socket connection to Solr. A value of 0 indicates an infinite timeout. |
Solr Connection Timeout | 10 seconds | | The amount of time to wait when establishing a connection to Solr. A value of 0 indicates an infinite timeout. |
Solr Maximum Connections | 10 | | The maximum number of total connections allowed from the Solr client to Solr. |
Solr Maximum Connections Per Host | 5 | | The maximum number of connections allowed from the Solr client to a single Solr host. |
ZooKeeper Client Timeout | 10 seconds | | The amount of time to wait for data on a connection to ZooKeeper, only used with a Solr Type of Cloud. |
ZooKeeper Connection Timeout | 10 seconds | | The amount of time to wait when establishing a connection to ZooKeeper, only used with a Solr Type of Cloud. |
Batch Size | 500 | | The number of Solr documents to index per batch. Supports Expression Language: true (will be evaluated using flow file attributes and variable registry) |
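The Batch Size property implies that records are grouped before they are sent to Solr. The grouping can be sketched in Python as follows. The `batches` helper is hypothetical (it is not part of NiFi or SolrJ), and its default of 500 simply mirrors the default in the table above.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batches(records: Iterable[dict], batch_size: int = 500) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most batch_size records (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    iterator = iter(records)
    while True:
        # Take the next batch_size records; an empty chunk means the input is exhausted.
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


# 1,200 records with the default batch size give batches of 500, 500 and 200.
sizes = [len(batch) for batch in batches({"id": i} for i in range(1200))]
```

A smaller batch size trades throughput for lower memory use per request; the right value depends on document size and is not prescribed here.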
Dynamic Properties:
Dynamic Properties allow the user to specify both the name and value of a property.
Name | Value | Description |
A Solr request parameter name | A Solr request parameter value | These parameters will be passed to Solr on the request. Supports Expression Language: false |
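To illustrate how such parameters might appear on an update request, the sketch below builds an equivalent HTTP URL from the Solr Location, Solr Update Path and Commit Within values. The `build_update_url` helper is hypothetical, and the processor itself sends requests through its Solr client rather than by assembling URLs, so this shows only the approximate HTTP form. `commitWithin` and `overwrite` are standard Solr update parameters; using `overwrite` as the dynamic property is an assumption made for the example.

```python
from typing import Dict, Optional
from urllib.parse import urlencode


def build_update_url(solr_location: str,
                     update_path: str = "/update",
                     commit_within_ms: int = 5000,
                     dynamic_properties: Optional[Dict[str, str]] = None) -> str:
    """Combine the location, update path and request parameters (hypothetical helper)."""
    params = {"commitWithin": commit_within_ms}
    # Dynamic properties are added as extra request parameters.
    params.update(dynamic_properties or {})
    return f"{solr_location.rstrip('/')}{update_path}?{urlencode(params)}"


url = build_update_url(
    "http://localhost:8984/solr/gettingstarted",
    dynamic_properties={"overwrite": "false"},
)
# url == "http://localhost:8984/solr/gettingstarted/update?commitWithin=5000&overwrite=false"
```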
Relationships:
Name | Description |
success | The original FlowFile |
failure | FlowFiles that failed for any reason other than Solr being unreachable |
connection_failure | FlowFiles that failed because Solr is unreachable |
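The routing described by these relationships can be sketched as a small decision function. This is an illustration only: the `route` helper is hypothetical, and Python's built-in `ConnectionError` stands in for whatever error the Solr client reports when Solr is unreachable.

```python
from typing import Optional


def route(error: Optional[Exception]) -> str:
    """Return the relationship name for an indexing outcome (hypothetical helper)."""
    if error is None:
        return "success"
    # Assumption: an unreachable Solr surfaces as a connection-level error.
    if isinstance(error, ConnectionError):
        return "connection_failure"
    # Every other error, for example a rejected document, goes to failure.
    return "failure"
```

Separating the two failure relationships lets a flow retry `connection_failure` FlowFiles later while sending `failure` FlowFiles to error handling.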
Reads Attributes:
None specified.
Writes Attributes:
None specified.
State management:
This component does not store state.
Restricted:
This component is not restricted.
Input requirement:
This component requires an incoming relationship.
System Resource Considerations:
None specified.
Summary
Usage Example
This processor reads each NiFi record and indexes it into Solr as a SolrDocument. Any properties added to this processor by the user are passed to Solr on the update request. A Record Reader must be specified for this processor. Additionally, if only selected fields of a record are to be indexed, specify their names as a comma-separated list in the Fields To Index property.
Example: To index only specific fields of the record:
- Fields To Index: field1,field2,field3
NOTE: For nested records, the field names should be prefixed with the parent field name.
- Fields To Index: parentField1,parentField2,parentField3_childField1,parentField3_childField2
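A minimal sketch of this selection follows, assuming the document keys already carry the parent-prefixed names. The `select_fields` helper is hypothetical, and treating an empty property as "index every field" is an assumption based on the property being optional.

```python
def select_fields(document: dict, fields_to_index: str = "") -> dict:
    """Keep only the fields named in a comma-separated list (hypothetical helper)."""
    wanted = [name.strip() for name in fields_to_index.split(",") if name.strip()]
    if not wanted:
        # Assumption: no list configured means all fields are indexed.
        return dict(document)
    return {name: value for name, value in document.items() if name in wanted}


doc = {
    "parentField1": "a",
    "parentField2": "b",
    "parentField3_childField1": 1,
    "parentField3_childField2": 2,
}
selected = select_fields(doc, "parentField1,parentField3_childField1")
# selected == {"parentField1": "a", "parentField3_childField1": 1}
```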
In case of nested records, this processor flattens all the nested records into a single Solr document. The name of a field taken from a child record follows the format {Parent Field Name}_{Child Field Name}.
Example: For a record created from the following JSON:
{
    "first": "John",
    "last": "R",
    "grade": 8,
    "exams": {
        "subject": "Maths",
        "test": "term1",
        "marks": 90
    }
}
The corresponding Solr document would be represented as follows:
{
    "first": "John",
    "last": "R",
    "grade": 8,
    "exams_subject": "Maths",
    "exams_test": "term1",
    "exams_marks": 90
}
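The flattening shown above can be sketched in a few lines of Python. The `flatten_record` helper is hypothetical and is not the processor's actual code. Chaining prefixes for records nested more than one level deep is an assumption; the example above only covers one level.

```python
def flatten_record(record: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into {parent}_{child} keys (hypothetical helper)."""
    flat = {}
    for name, value in record.items():
        key = f"{prefix}_{name}" if prefix else name
        if isinstance(value, dict):
            # A nested record contributes its fields under the parent's name.
            flat.update(flatten_record(value, key))
        else:
            flat[key] = value
    return flat


record = {
    "first": "John",
    "last": "R",
    "grade": 8,
    "exams": {"subject": "Maths", "test": "term1", "marks": 90},
}
flattened = flatten_record(record)
# flattened["exams_subject"] == "Maths" and flattened["exams_marks"] == 90
```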
Similarly, for an array of nested records, this processor flattens all the nested records into a single Solr document. The field names follow the same {Parent Field Name}_{Child Field Name} format, and each one becomes a multivalued field in the Solr document.
Example: For a record created from the following JSON:
{
    "first": "John",
    "last": "R",
    "grade": 8,
    "exams": [
        {
            "subject": "Maths",
            "test": "term1",
            "marks": 90
        },
        {
            "subject": "Physics",
            "test": "term1",
            "marks": 95
        }
    ]
}
The corresponding Solr document would be represented as follows:
{
    "first": "John",
    "last": "R",
    "grade": 8,
    "exams_subject": ["Maths", "Physics"],
    "exams_test": ["term1", "term1"],
    "exams_marks": [90, 95]
}
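The array case can be sketched the same way: values from each child record are appended to a list under the combined field name. The `flatten_with_arrays` helper is hypothetical and handles only a single level of nesting, as in the example above; it is not the processor's actual code.

```python
def flatten_with_arrays(record: dict) -> dict:
    """Collect fields from arrays of child records into multivalued fields (hypothetical helper)."""
    flat = {}
    for name, value in record.items():
        is_record_array = (
            isinstance(value, list)
            and len(value) > 0
            and all(isinstance(item, dict) for item in value)
        )
        if is_record_array:
            for item in value:
                for child_name, child_value in item.items():
                    # Each child value is appended, preserving array order.
                    flat.setdefault(f"{name}_{child_name}", []).append(child_value)
        else:
            flat[name] = value
    return flat


record = {
    "first": "John",
    "last": "R",
    "grade": 8,
    "exams": [
        {"subject": "Maths", "test": "term1", "marks": 90},
        {"subject": "Physics", "test": "term1", "marks": 95},
    ],
}
flattened = flatten_with_arrays(record)
# flattened["exams_subject"] == ["Maths", "Physics"]
# flattened["exams_marks"] == [90, 95]
```

Because values are appended in array order, the entries at the same position in `exams_subject`, `exams_test` and `exams_marks` come from the same child record.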