PGFS functions for Pipelines

Using the PGFS functions

The PGFS extension provides a set of functions to create and manage storage locations.

Creating a storage location

We start with creating a storage location. A storage location is a reference to a location in an external file system. You can create a storage location with the pgfs.create_storage_location function.

select pgfs.create_storage_location('my_storage', 's3://my_bucket');

The create_strorage_location function takes a name for the storage location and then a URL for the location. The URL should be prefixed with s3: for an S3-compatible bucket or file: for a local file system.

The function also takes an optional msl_id parameter, which is not used.

Creating a storage location with options and credentials

There are two other optional parameters, options and credentials. The options parameter is a JSON object that can be used to pass additional options to the storage location. The credentials parameter is a JSON object that can be used to pass credentials to the storage location.

For example, you can create a storage location with options and credentials like this:

select pgfs.create_storage_location('my_storage', 's3://my_private_bucket', null, '{"region": "eu-west"}'::JSONB, '{"access_key_id": "youraccesskeyid", "secret_access_key":"yoursecretaccesskey"}'::JSONB);

Once you have created a storage location, you can use it to create foreign tables and access files in the external file system. To use it with aidb, you will need to create a volume from the storage location. To do that see creating a volume documentation.

Listing storage locations

You can list all storage locations with the pgfs.list_storage_locations function.

select * from pgfs.list_storage_locations();

This will return a table of currently defined storage locations. Credentials will only be shown if the user has the necessary permissions, otherwise the column will be NULL.

Getting a storage location

You can get the details of a specific storage location with the pgfs.get_storage_location function.

select * from pgfs.get_storage_location('my_storage');

This will return the details of the storage location with the name my_storage.

Updating a storage location

You can also update a storage location with the pgfs.update_storage_location function.

select pgfs.update_storage_location('my_storage', 's3://my_bucket', null, '{"region": "eu-west"}'

Dropping a storage location

You can drop a storage location with the pgfs.drop_storage_location function.

select pgfs.drop_storage_location('my_storage');

This will remove the storage location with the name my_storage from the database.


Could this page be better? Report a problem or suggest an addition!