Description and usage of FetchGCSObject:
Fetches a file from a Google Cloud Storage bucket. Designed to be used in tandem with ListGCSBucket.
Tags:
google cloud, google, storage, gcs, fetch
Properties:
In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, whether a property supports the NiFi Expression Language, and whether a property is considered “sensitive”, meaning that its value will be encrypted. Before entering a value in a sensitive property, ensure that the nifi.properties file has an entry for the property nifi.sensitive.props.key.
Name | Default Value | Allowable Values | Description |
GCP Credentials Provider Service | | Controller Service API: GCPCredentialsService; Implementations: GCPCredentialsControllerService | The Controller Service used to obtain Google Cloud Platform credentials. |
Project ID | | | Google Cloud Project ID. |
Number of retries | 6 | | How many retry attempts should be made before routing to the failure relationship. |
Bucket | ${gcs.bucket} | | Bucket of the object. Supports Expression Language: true |
Key | ${filename} | | Name of the object. Supports Expression Language: true |
Object Generation | | | The generation of the object to download. If null, the latest generation is downloaded. Supports Expression Language: true |
Server Side Encryption Key | | | An AES256 key (encoded in base64) with which the object has been encrypted. Sensitive Property: true. Supports Expression Language: true |
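The Server Side Encryption Key property expects an AES256 key encoded in base64. As a minimal sketch of what such a value looks like, the following Python snippet generates 32 random bytes and encodes them, and also checks whether an existing string has the right shape. The function names are hypothetical and are not part of NiFi or of any Google library; the key used to fetch an object must be the same key that was used when the object was uploaded.

```python
import base64
import secrets


def generate_aes256_key_b64() -> str:
    """Return 32 random bytes (256 bits) encoded as a base64 string."""
    raw_key = secrets.token_bytes(32)
    return base64.b64encode(raw_key).decode("ascii")


def is_valid_aes256_key_b64(value: str) -> bool:
    """Check that a string is valid base64 and decodes to exactly 32 bytes."""
    try:
        return len(base64.b64decode(value, validate=True)) == 32
    except ValueError:
        # binascii.Error (a ValueError subclass) is raised for malformed input.
        return False


if __name__ == "__main__":
    key = generate_aes256_key_b64()
    print(key, is_valid_aes256_key_b64(key))
```

A check like this can help catch a truncated or wrongly encoded key before it is entered into the sensitive property.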
Relationships:
Name | Description |
success | FlowFiles are routed to this relationship after a successful Google Cloud Storage operation. |
failure | FlowFiles are routed to this relationship if the Google Cloud Storage operation fails. |
Reads Attributes:
None specified.
Writes Attributes:
Name | Description |
filename | The name of the file, parsed if possible from the Content-Disposition response header |
gcs.bucket | Bucket of the object. |
gcs.key | Name of the object. |
gcs.size | Size of the object. |
gcs.cache.control | Data cache control of the object. |
gcs.component.count | The number of components which make up the object. |
gcs.content.disposition | The data content disposition of the object. |
gcs.content.encoding | The content encoding of the object. |
gcs.content.language | The content language of the object. |
mime.type | The MIME/Content-Type of the object |
gcs.crc32c | The CRC32C checksum of the object's data, encoded in base64 in big-endian order. |
gcs.create.time | The creation time of the object (milliseconds) |
gcs.update.time | The last modification time of the object (milliseconds) |
gcs.encryption.algorithm | The algorithm used to encrypt the object. |
gcs.encryption.sha256 | The SHA256 hash of the key used to encrypt the object |
gcs.etag | The HTTP 1.1 Entity tag for the object. |
gcs.generated.id | The service-generated ID for the object. |
gcs.generation | The data generation of the object. |
gcs.md5 | The MD5 hash of the object's data encoded in base64. |
gcs.media.link | The media download link to the object. |
gcs.metageneration | The meta generation of the object. |
gcs.owner | The owner (uploader) of the object. |
gcs.owner.type | The ACL entity type of the uploader of the object. |
gcs.uri | The URI of the object as a string. |
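Several of the attributes above are encoded values rather than plain text. The sketch below shows one way a downstream script might use two of them: comparing fetched content against the base64-encoded gcs.md5 value, and converting gcs.create.time to a readable timestamp. It assumes the millisecond values are counted from the Unix epoch, which the table does not state explicitly, and the function names are hypothetical.

```python
import base64
import hashlib
from datetime import datetime, timezone


def md5_matches(content: bytes, gcs_md5_b64: str) -> bool:
    """Compare the MD5 digest of content with a base64-encoded digest (gcs.md5)."""
    expected = base64.b64decode(gcs_md5_b64)
    return hashlib.md5(content).digest() == expected


def millis_to_utc(millis: str) -> datetime:
    """Convert a milliseconds attribute value (e.g. gcs.create.time) to a UTC datetime.

    Assumption: the value counts milliseconds since the Unix epoch.
    """
    return datetime.fromtimestamp(int(millis) / 1000, tz=timezone.utc)


if __name__ == "__main__":
    data = b"example content"
    # Build a sample attribute value the same way the table describes it.
    sample_md5 = base64.b64encode(hashlib.md5(data).digest()).decode("ascii")
    print(md5_matches(data, sample_md5))
    print(millis_to_utc("0").isoformat())
```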
State management:
This component does not store state.
Restricted:
This component is not restricted.
Input requirement:
This component requires an incoming relationship.
See Also:
ListGCSBucket, PutGCSObject, DeleteGCSObject
How to configure FetchGCSObject?
Step 1: Drag and drop the FetchGCSObject processor onto the canvas.
Step 2: Double-click the processor to open its configuration dialog.
Step 3: Review the usage of each property and update its value as needed.
Properties and usage:
GCP Credentials Provider Service: The Controller Service used to obtain Google Cloud Platform credentials.
Project ID: ID of the Google Cloud Project.
Number of retries: How many retry attempts should be made before routing to the failure relationship.
Bucket: Bucket of the object.
Key: Name of the object.
Object Generation: The generation of the Object to download.
Configure GCPCredentials Controller Service:
Step 1: To access Google Cloud Platform resources, configure a GCPCredentials controller service as follows.
Step 2: Open the GCPCredentials controller service, configure the following properties, and enable the controller service.
Use Application Default Credentials: Set this value to ‘False’.
Use Compute Engine Credentials: Set this value to ‘False’.
Service Account JSON File: Specifies the path to the service account JSON file that contains the private key.
To generate a private key in JSON format:
- Open the list of credentials in the Google Cloud Platform Console.
- Click Create credentials.
- Select Service account key.
- Click the drop-down box below Service account, then click New service account.
- Enter a name for the service account in Name.
- Use the default Service account ID or generate a different one.
- Select the Key type: JSON or P12.
- Click Create.
- A “Service account created” window is displayed, and the private key for the key type you selected is downloaded automatically. If you selected a P12 key, the private key’s password is displayed.
- Click Close.
Step 3: After obtaining the private key in JSON format, store it in a local directory and specify its file path in the Service Account JSON File property of the GCPCredentials controller service, as shown in the screenshot below.
Step 4: Alternatively, you can paste the private key details as JSON data into the Service Account JSON property of the GCPCredentials controller service, as shown in the screenshot below.
Step 5: After configuring the controller service, enable it and start the processor.
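Before supplying the key file path or the JSON data to the controller service, it can help to confirm that the downloaded key is well-formed. The following is a minimal sketch, not part of NiFi, that parses the JSON text and reports obvious problems. The list of checked fields is an assumption based on fields commonly present in Google service account key files, and the function name is hypothetical.

```python
import json
from typing import List

# Assumption: fields commonly present in a Google service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_json(text: str) -> List[str]:
    """Return a list of problems found; an empty list means no problems were detected."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]
    # Report every required field that is absent or empty.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not data.get(name)]
    # A key of a different credential type will not work as a service account key.
    if data.get("type") not in (None, "service_account"):
        problems.append(f"unexpected type: {data['type']}")
    return problems


if __name__ == "__main__":
    sample = json.dumps({"type": "service_account", "project_id": "my-project"})
    for problem in check_service_account_json(sample):
        print(problem)
```

To check a file on disk, read its contents with open(path).read() and pass the resulting string to the function. A check like this only validates the structure; it does not verify that the key is accepted by Google Cloud.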