Columbus Worker API

colorker package

A library for distributed execution of workflows submitted through Columbus. It connects to Google cloud services through the API methods provided in the service package. The library methods are intended to be used in the code composed for Components and Combiners of the Columbus platform. Several methods accept a user_settings parameter; the Columbus master sends these settings to the worker, so code composed inside Components and Combiners has access to them internally and can omit the parameter.

colorker.security

Includes functionality to obtain access to the appropriate cloud services based on a user's credentials.

class colorker.security.CredentialManager
static get_earth_engine(user_settings=None)

Obtains the Google Earth Engine object that can be used to do GIS computations.

Parameters:user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
Return type:object
Returns:The Earth Engine object

colorker.utils

Includes utility functions that can be used in the code composed in Components and Combiners of the Columbus platform

colorker.utils.caught(try_function, *args)

Tries a function and checks if it throws an exception.

Parameters:
  • try_function (Callable) – callable object representing the function that must be tried
  • args (list) – arguments to pass to the callable function
Return type:

bool

Returns:

True if an exception was caught, False otherwise
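
The contract described above can be illustrated with a small stand-in. This is a minimal sketch, not the library's implementation; it assumes that only subclasses of Exception are caught, which the documentation does not state.

```python
def caught(try_function, *args):
    """Return True if calling try_function(*args) raises, False otherwise.

    Assumption: only Exception subclasses are treated as "caught".
    """
    try:
        try_function(*args)
    except Exception:
        return True
    return False


# int("42") succeeds, int("forty-two") raises ValueError.
parsed_ok = caught(int, "42")
parsed_bad = caught(int, "forty-two")
```

A helper like this is convenient for validating a value before using it, for example checking whether a string can be parsed.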

colorker.utils.current_time_millis()

Gets the current time in milliseconds

Return type:int
Returns:current time in milliseconds
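
A function with this signature could be written as follows. This is a sketch under the assumption that the value is measured from the Unix epoch; the documentation does not say which reference point the library uses.

```python
import time


def current_time_millis():
    """Return the current time in whole milliseconds (assumed: since the Unix epoch)."""
    return int(round(time.time() * 1000))


# Typical use: measure how long a step of a Component takes.
started = current_time_millis()
elapsed_ms = current_time_millis() - started
```
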
colorker.utils.deep_update(source, overrides)

Updates a nested dictionary or similar mapping. Modifies source in place with the key-value pairs in overrides

Parameters:
  • source (dict) – a dictionary that needs to be updated
  • overrides (dict) – a dictionary that provides the new keys and values
Return type:

dict

Returns:

updated source dictionary
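
The difference between a nested update and a plain dict.update is easiest to see in code. The following is a minimal sketch of the described behavior, assuming that nested mappings are merged recursively and every other value is overwritten; the library's actual handling of edge cases may differ.

```python
from collections.abc import Mapping


def deep_update(source, overrides):
    """Recursively merge overrides into source, modifying source in place."""
    for key, value in overrides.items():
        if isinstance(value, Mapping) and isinstance(source.get(key), Mapping):
            # Both sides are mappings: merge instead of replacing.
            deep_update(source[key], value)
        else:
            source[key] = value
    return source


settings = {"db": {"host": "localhost", "port": 5432}, "debug": False}
deep_update(settings, {"db": {"port": 6543}, "debug": True})
```

With dict.update the whole "db" entry would be replaced and "host" would be lost; the nested merge keeps it.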

colorker.utils.dict_to_html_table(a_dict)

Converts a dictionary to an HTML table. Keys become the header of the table. Useful when sending email from code.

Parameters:a_dict (dict) – key value pairs in the form of a dictionary
Returns:HTML table representation corresponding to the values in the dictionary
Return type:str
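
To show the shape of the conversion, here is a minimal sketch. The exact markup produced by the library is not documented, so the tags and escaping below are assumptions chosen for illustration.

```python
from html import escape


def dict_to_html_table(a_dict):
    """Render a dictionary as a two-row HTML table: keys as header, values as one row."""
    header = "".join("<th>{}</th>".format(escape(str(k))) for k in a_dict)
    row = "".join("<td>{}</td>".format(escape(str(v))) for v in a_dict.values())
    return "<table><tr>{}</tr><tr>{}</tr></table>".format(header, row)


summary_html = dict_to_html_table({"rows": 120, "status": "done"})
```

A string like summary_html is the kind of value that could be passed as the html argument of colorker.service.email.send_mail.
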
colorker.utils.dicts_to_html_table(a_list)

Converts a list of dictionaries to an HTML table. Keys become the header of the table. Useful when sending email from code.

Parameters:a_list (list(dict)) – values in the form of list of dictionaries
Returns:HTML table representation corresponding to the values in the dictionaries
Return type:str

colorker.utils.is_number(s)

Checks if the argument is a number

Parameters:s (str) – Any string
Return type:bool
Returns:True if the string is a number, False otherwise
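
One common way to implement such a check is to attempt a float conversion. The sketch below assumes that approach; the library may use different rules. Note that under this assumption strings such as "nan" and "inf" also count as numbers.

```python
def is_number(s):
    """Return True if s can be parsed as a float, False otherwise."""
    try:
        float(s)
    except (TypeError, ValueError):
        return False
    return True


checks = [is_number("3.14"), is_number("-2e5"), is_number("abc")]
```
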
colorker.utils.lists_to_html_table(a_list)

Converts a list of lists to an HTML table. The first list becomes the header of the table. Useful when sending email from code.

Parameters:a_list (list(list)) – values in the form of list of lists
Returns:HTML table representation corresponding to the values in the lists
Return type:str
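
The header-then-rows layout can be sketched as below. As with the other table helpers, the markup is an assumption made for illustration, not the library's documented output.

```python
from html import escape


def lists_to_html_table(a_list):
    """Render a list of lists as an HTML table; the first inner list is the header."""
    if not a_list:
        return "<table></table>"
    header, *rows = a_list
    parts = ["<tr>" + "".join("<th>{}</th>".format(escape(str(c))) for c in header) + "</tr>"]
    for row in rows:
        parts.append("<tr>" + "".join("<td>{}</td>".format(escape(str(c))) for c in row) + "</tr>")
    return "<table>" + "".join(parts) + "</table>"


table_html = lists_to_html_table([["name", "count"], ["a", 1], ["b", 2]])
```
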
colorker.utils.mean(prop, ftc)

Finds the mean of a property in the given feature collection. NaN values are treated as zero.

Parameters:
  • prop (str) – name of the property in the feature collection
  • ftc (geojson.FeatureCollection) – the feature collection containing that property
Returns:

mean value of the property

Return type:

float
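
Because NaN values count as zero rather than being skipped, they still contribute to the denominator. The sketch below illustrates that on a plain dictionary shaped like a GeoJSON FeatureCollection (geojson objects behave like dictionaries). It is an illustrative stand-in, not the library's code, and it does not handle an empty collection.

```python
import math


def mean(prop, ftc):
    """Mean of a property over all features, counting NaN values as zero."""
    values = []
    for feature in ftc["features"]:
        value = float(feature["properties"][prop])
        values.append(0.0 if math.isnan(value) else value)
    return sum(values) / len(values)


# Hypothetical sample data; "elevation" is an arbitrary property name.
sample_ftc = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None, "properties": {"elevation": 1.0}},
        {"type": "Feature", "geometry": None, "properties": {"elevation": float("nan")}},
        {"type": "Feature", "geometry": None, "properties": {"elevation": 2.0}},
    ],
}
mean_elevation = mean("elevation", sample_ftc)  # (1.0 + 0.0 + 2.0) / 3
```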

colorker.utils.std(prop, ftc)

Finds the standard deviation of a property in the given feature collection. NaN values are treated as zero.

Parameters:
  • prop (str) – name of the property in the feature collection
  • ftc (geojson.FeatureCollection) – the feature collection containing that property
Returns:

standard deviation value of the property

Return type:

float
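
The same NaN-as-zero rule applies here. The documentation does not say whether the population or the sample standard deviation is computed; the sketch below assumes the population form (dividing by the number of features) and is only an illustration.

```python
import math


def std(prop, ftc):
    """Population standard deviation of a property, counting NaN values as zero."""
    values = []
    for feature in ftc["features"]:
        value = float(feature["properties"][prop])
        values.append(0.0 if math.isnan(value) else value)
    avg = sum(values) / len(values)
    variance = sum((v - avg) ** 2 for v in values) / len(values)
    return math.sqrt(variance)


# Hypothetical sample data; "elevation" is an arbitrary property name.
std_ftc = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None, "properties": {"elevation": 2.0}},
        {"type": "Feature", "geometry": None, "properties": {"elevation": float("nan")}},
        {"type": "Feature", "geometry": None, "properties": {"elevation": 4.0}},
    ],
}
std_elevation = std("elevation", std_ftc)  # values 2, 0, 4 -> sqrt(8 / 3)
```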

Subpackages

colorker.service package

The package provides APIs that help connect with Google cloud services. The library methods are intended to be used in the code composed for Components and Combiners of the Columbus platform. Several methods accept a user_settings parameter; the Columbus master sends these settings to the worker, so code composed inside Components and Combiners has access to them internally and can omit the parameter.

colorker.service.bigquery

Includes functions to integrate with Google BigQuery. The results and implementation are based on the Google BigQuery API:

https://developers.google.com/resources/api-libraries/documentation/bigquery/v2/python/latest/index.html

colorker.service.bigquery.get_all_tables(user_settings=None)

Obtains the names of all tables from all the BigQuery projects.

Parameters:user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
Returns:[{project_name:dataset_name : [table_name_1, table_name_2]}]
Return type:list(dict)
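
The returned structure can be flattened into the project_name:dataset_name.table_name form that the other functions in this module expect. The helper below is hypothetical (it is not part of colorker) and runs on hand-written sample data shaped like the documented return value.

```python
def qualified_table_names(all_tables):
    """Flatten [{"project:dataset": [tables]}] into "project:dataset.table" strings."""
    names = []
    for entry in all_tables:
        for dataset, tables in entry.items():
            names.extend("{}.{}".format(dataset, table) for table in tables)
    return names


# Sample data in the documented shape; project and table names are made up.
sample_tables = [
    {"my_project:weather": ["stations", "readings"]},
    {"my_project:census": ["tracts"]},
]
names = qualified_table_names(sample_tables)
```
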
colorker.service.bigquery.get_features(qualified_table_name, user_settings=None)

Obtains the columns of a BigQuery table

Parameters:
  • qualified_table_name (str) – table name, must be of the form project_name:dataset_name.table_name
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
Returns:

List of key value pairs where key is column name and value is column type

Return type:

list

colorker.service.bigquery.get_query_results(qualified_table_name, query, user_settings=None)

Obtains the results of a query. A call to this method will block until the results are obtained

Parameters:
  • qualified_table_name (str) – table name, must be of the form project_name:dataset_name.table_name
  • query (str) – a SQL query that conforms to BigQuery query syntax
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
Returns:

{fields: [{name: column_name_1, type:column_type_1}, ...], rows: [[{v:column_1_value}, {v:column_2_value}, ...], [{v:column_1_value}, {v:column_2_value}, ...]], total: total_number_of_rows, cached: boolean, whether the results returned were obtained from cache}

Return type:

dict
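
Since the column names live in fields and the values in rows, a small conversion step makes the result easier to work with. The helper below is hypothetical (not part of colorker) and is exercised on hand-written data in the documented shape; values are kept exactly as they appear in each {v: ...} cell.

```python
def rows_as_dicts(result):
    """Pair each cell value with its column name, producing one dict per row."""
    names = [field["name"] for field in result["fields"]]
    return [
        {name: cell["v"] for name, cell in zip(names, row)}
        for row in result["rows"]
    ]


# Sample data in the documented shape; column names and values are made up.
sample_result = {
    "fields": [{"name": "station", "type": "STRING"}, {"name": "temp", "type": "FLOAT"}],
    "rows": [[{"v": "KDEN"}, {"v": "21.5"}], [{"v": "KSFO"}, {"v": "17.0"}]],
    "total": 2,
    "cached": False,
}
records = rows_as_dicts(sample_result)
```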

colorker.service.drive

Includes functions to integrate with a user’s Google Drive. The results and implementation are based on the Google Drive API:

https://developers.google.com/drive/v3/reference/

https://developers.google.com/resources/api-libraries/documentation/drive/v3/python/latest/

colorker.service.drive.get_file_contents(file_id, meta_err=False, user_settings=None)

Obtains the contents of a file as a list of dictionaries. The requested file must be a CSV file or a Google Fusion Table.

Parameters:
  • file_id (str) – the identifier of the file whose content is needed
  • meta_err (bool) – optional, internal use only
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
Returns:

list of dictionaries where each dictionary is a row in the file

Return type:

list

colorker.service.drive.get_metadata(file_id, user_settings=None)

Obtains the metadata of a file

Parameters:
  • file_id (str) – the identifier of the file whose metadata is needed
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
Returns:

metadata of the file including id, mimeType, size, parents, kind, fileExtension, and webContentLink

colorker.service.email

Includes functions that facilitate sending an email to the requested people

colorker.service.email.send_mail(receivers, subject, message, html=None)

Sends an email to the recipients. Must be called from an EngineThread. This method will not raise any exception if it fails to send a message to the recipients.

Parameters:
  • receivers (list(str)) – list of recipient email addresses
  • subject (str) – subject of the email
  • message (str) – plain text message
  • html (str) – optional, HTML message

colorker.service.fusiontables

Includes functions to integrate with Google Fusion Tables. The results and implementation are based on the Google Fusion Tables API:

https://developers.google.com/resources/api-libraries/documentation/fusiontables/v2/python/latest/index.html

colorker.service.fusiontables.create_table(name, description, columns, data=None, share_with=None, admin=None, user_settings=None)

Creates a fusion table for the given data and returns the table id.

Parameters:
  • name (str) – Name of the fusion table to create
  • description (str) – Description of the table to be created
  • columns (list(dict)) – List of dictionaries having properties name and type
  • data (list(dict)) – List of dictionaries (optional)
  • share_with (str or list(str)) – a single email address or a list of user email addresses (gmail only) with which to share the created fusion table
  • admin (str) – email address of the administrator who should have edit access to the created fusion table
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
Return type:

str

Returns:

the table id of the created fusion table
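
The columns argument is a list of dictionaries with name and type, and data is a list of row dictionaries. One way to keep the two consistent is to derive the columns from the rows. The helper below is hypothetical (not part of colorker), and the type strings "NUMBER" and "STRING" are assumed to be what the function accepts, following the Fusion Tables column types.

```python
def infer_columns(rows):
    """Build a [{"name": ..., "type": ...}] list from the keys of the row dicts.

    Assumption: numeric values map to "NUMBER", everything else to "STRING".
    """
    columns = []
    seen = set()
    for row in rows:
        for name, value in row.items():
            if name in seen:
                continue
            seen.add(name)
            is_numeric = isinstance(value, (int, float)) and not isinstance(value, bool)
            columns.append({"name": name, "type": "NUMBER" if is_numeric else "STRING"})
    return columns


# Made-up rows for illustration.
data = [{"city": "Denver", "population": 715522}, {"city": "Boulder", "population": 108250}]
columns = infer_columns(data)
```

The resulting columns and data values could then be passed to create_table together with a name and description.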

colorker.service.fusiontables.delete_table(table_id, user_settings=None)

Deletes a fusion table

Parameters:
  • table_id (str) – identifier of the fusion table
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
Raises BaseException:

Any exception resulting from this operation

colorker.service.fusiontables.read_table(table_id, user_settings=None)

Reads a fusion table and returns its contents as a list of dictionaries

Parameters:
  • table_id (str) – identifier of the fusion table
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
Raises BaseException:

Any exception resulting from this operation

colorker.service.gee

Includes functions that integrate Google cloud storage and Google Earth Engine

colorker.service.gee.delete_object(bucket, filename, user_settings=None, access='storage_rw')

Deletes an object from the specified bucket

Parameters:
  • bucket (str) – name of the Google cloud storage bucket
  • filename (str) – path of the object in the bucket
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
  • access – must be ‘storage’. Other values are for internal use only
Returns:

Returns the response obtained from the API

colorker.service.gee.download_object(bucket, filename, out_file, user_settings=None, access='storage')

Downloads an object from the Google cloud storage bucket of the user specified on the account page of Columbus

Parameters:
  • bucket (str) – name of the Google cloud storage bucket
  • filename (str) – path of the object in the bucket
  • out_file (str) – path to store the downloaded file. If you don’t know the path, use /tmp
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
  • access (str) – must be ‘storage’. Other values are for internal use only

colorker.service.gee.get_bucket_metadata(bucket, user_settings=None, access='storage')

Retrieves metadata about the given bucket.

Parameters:
  • bucket (str) – name of the Google cloud storage bucket
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
  • access – must be ‘storage’. Other values are for internal use only
Returns:

Returns the response obtained from the API, containing the metadata of the bucket

colorker.service.gee.get_geojson(ftc)

Returns the geojson representation of the ee.FeatureCollection. This function must be called from an EngineThread

Parameters:ftc – an instance of ee.FeatureCollection
Raises Exception:

Any exception resulting from this operation

colorker.service.gee.list_bucket(bucket, user_settings=None, access='storage')

Returns a list of metadata of the objects within the given bucket.

Parameters:
  • bucket (str) – name of the Google cloud storage bucket
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
  • access – must be ‘storage’. Other values are for internal use only
Return type:

list

Returns:

List of object paths

colorker.service.gee.upload_object(bucket, filename, readers, owners, user_settings=None, access='storage_rw')

Uploads the specified file to the specified bucket. The object path in the bucket is the same as the path of the specified file.

Parameters:
  • bucket (str) – Name of the cloud storage bucket
  • filename (str) – fully qualified name of the file to upload
  • readers (list(str)) – list of email addresses
  • owners (list(str)) – list of email addresses
  • user_settings (dict) – optional, a dictionary of user credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
  • access (str) – must be ‘storage’. Other values are for internal use only
Returns:

Returns the response obtained from the API by uploading the object