Description and usage of FetchGridFS processor:

Retrieves one or more files from a GridFS bucket by file name or by a user-defined query.

Tags:

fetch, gridfs, mongo

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The list also indicates any default values, the allowable values, and whether a property supports the NiFi Expression Language.

Name: Client Service
Allowable Values: Controller Service API: MongoDBClientService; Implementation: MongoDBControllerService
Description: The MongoDB client service to use for database connections.

Name: Mongo Database Name
Description: The name of the database to use.
Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)

Name: Bucket Name
Description: The GridFS bucket where the files are stored. If left blank, the default value 'fs' used by the MongoDB client driver applies.
Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)

Name: File Name
Description: The name of the file in the bucket that is the target of this processor.
Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)

Name: Query
Description: A valid MongoDB query to use to fetch one or more files from GridFS.
Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)

Name: Query Output Attribute
Description: If set, the query will be written to the specified attribute on the output flowfiles.
Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)

Name: Operation Mode
Default Value: all-at-once
Allowable Values:
* Full Query Fetch
* Stream Query Results
Description: This option controls when results are made available to downstream processors. If Stream Query Results is enabled and the query is started by an incoming flowfile, provenance will not be tracked relative to that input flowfile. In Stream Query Results mode, errors are handled by sending a new flowfile with the original content and attributes of the input flowfile to the failure relationship. Streaming should only be used if there is reliable connectivity between MongoDB and NiFi.
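
As a small illustration of the Bucket Name default described above: GridFS keeps each bucket in two collections, named after the bucket with the suffixes .files and .chunks. The sketch below derives those names and falls back to 'fs' when no bucket is given. The function name gridfs_collections and the constant DEFAULT_BUCKET are hypothetical helpers chosen for this example; they are not part of NiFi or of any MongoDB driver.

```python
from typing import Optional, Tuple

# Default GridFS bucket name used by the MongoDB drivers.
DEFAULT_BUCKET = "fs"


def gridfs_collections(bucket_name: Optional[str]) -> Tuple[str, str]:
    """Return the (files, chunks) collection names for a GridFS bucket.

    Hypothetical helper for illustration only: a blank or missing
    bucket name falls back to the driver default 'fs'.
    """
    bucket = bucket_name or DEFAULT_BUCKET
    return f"{bucket}.files", f"{bucket}.chunks"


if __name__ == "__main__":
    print(gridfs_collections(None))
    print(gridfs_collections("images"))
```

Leaving Bucket Name blank therefore points the processor at the same collections as naming the bucket 'fs' explicitly.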

Relationships:

Name: success
Description: When the operation succeeds, the flowfile is sent to this relationship.

Name: failure
Description: When processing the flowfile fails, the flowfile is sent to this relationship.

Name: original
Description: The original input flowfile is sent to this relationship if the query does not cause an error.

Reads Attributes:

None specified.

Writes Attributes:

Name: gridfs.file.metadata
Description: The custom metadata stored with a file, if any, is written to this attribute.

State management:

This component does not store state.

Restricted:

This component is not restricted.

Input requirement:

This component requires an incoming relationship.

System Resource Considerations:

None specified.

Summary:

This processor retrieves one or more files from GridFS. The query can be provided in one of three ways:

  • Set directly in the Query property.
  • Built for you from the File Name property. (Note: this value is treated as a plain file name; Mongo queries cannot be embedded in this field.)
  • Read from the content of the incoming flowfile.
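
The three query sources listed above can be sketched roughly as follows. This is a minimal illustration, not the processor's actual code: the function name resolve_query is hypothetical, the precedence shown (File Name, then Query, then flowfile content) is an assumption, and parsing with json.loads is a simplification because MongoDB queries may use extended JSON that plain JSON parsing does not cover.

```python
import json
from typing import Optional


def resolve_query(file_name: Optional[str],
                  query: Optional[str],
                  flowfile_content: bytes) -> dict:
    """Pick the GridFS query from one of three sources.

    Hypothetical sketch; the order of precedence is an assumption.
    """
    if file_name:
        # The file name is used as a literal value, never parsed as a query.
        return {"filename": file_name}
    if query:
        # Assumes the Query property holds plain JSON.
        return json.loads(query)
    # Fall back to the flowfile content, assumed to be UTF-8 encoded JSON.
    return json.loads(flowfile_content.decode("utf-8"))


if __name__ == "__main__":
    print(resolve_query("report.pdf", None, b""))
    print(resolve_query(None, '{"metadata.owner": "alice"}', b""))
    print(resolve_query(None, None, b'{"filename": "a.txt"}'))
```

Because the file name is wrapped in a filename match rather than parsed, a value such as {"length": 1} in File Name would be looked up as a literal file name, which is what the note in the list describes.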

The processor can also be configured, through the Operation Mode property, to commit either once at the end of a fetch operation or after each file that is retrieved. Multiple commits are generally necessary only when retrieving a large amount of data from GridFS, measured in total data size rather than file count, to ensure that the disks NiFi uses are not overloaded.
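
The trade-off between the two commit strategies can be illustrated with a toy model. The sketch below is not NiFi code: simulate_fetch is a hypothetical function that assumes the data written since the last commit is what loads the disks, and it only counts bytes to show why total data size matters more than file count.

```python
from typing import Iterable, Tuple


def simulate_fetch(file_sizes: Iterable[int],
                   stream_results: bool) -> Tuple[int, int]:
    """Return (number of commits, peak uncommitted bytes) for a fetch.

    Toy model under stated assumptions, not NiFi's implementation.
    """
    commits = 0
    pending = 0  # bytes fetched since the last commit
    peak = 0
    for size in file_sizes:
        pending += size
        peak = max(peak, pending)
        if stream_results:
            # Commit after every file, releasing the pending data.
            commits += 1
            pending = 0
    if not stream_results:
        # Single commit at the end of the whole fetch.
        commits += 1
    return commits, peak


if __name__ == "__main__":
    sizes = [100, 200, 300]
    print(simulate_fetch(sizes, stream_results=False))
    print(simulate_fetch(sizes, stream_results=True))
```

In this model a single commit holds the sum of all file sizes before anything is released, while committing per file holds at most the largest single file.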