Columbus Worker API


colorker package

A library for distributed execution of workflows submitted through Columbus. The library helps connect with Google Cloud services through the API methods provided in the service package. The library methods are intended to be used in the code written for Components and Combiners of the Columbus platform. Several methods take a user_settings parameter. The Columbus master sends these settings to the worker, so code running inside Components and Combiners already has access to them and can omit the parameter.

Includes functionality to obtain access to the appropriate cloud services based on a user's credentials.

static get_earth_engine(user_settings=None)

Obtains the Google Earth Engine object that can be used to do GIS computations.

Parameters:user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
Return type:object
Returns:The Earth Engine object


Includes utility functions that can be used in the code composed in Components and Combiners of the Columbus platform

colorker.utils.caught(try_function, *args)

Tries a function and checks if it throws an exception.

Parameters:
  • try_function (Callable) – callable object representing the function that must be tried
  • args (list) – arguments to pass to the callable function
Return type:bool
Returns:True if an exception was caught, False otherwise
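
The behavior described above can be pictured with a minimal sketch. The helper name caught_sketch is hypothetical, and this is an assumption about how such a check could work, not the library's actual implementation.

```python
def caught_sketch(try_function, *args):
    """Return True if calling try_function(*args) raises an exception."""
    try:
        try_function(*args)
    except Exception:
        # Any ordinary exception counts as "caught".
        return True
    return False


# int("42") succeeds, while int("forty-two") raises ValueError.
print(caught_sketch(int, "42"))
print(caught_sketch(int, "forty-two"))
```

A check like this is handy for guarding optional conversions inside a Component without writing a try/except block at every call site.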


Gets the current time in milliseconds

Return type:int
Returns:current time in milliseconds
colorker.utils.deep_update(source, overrides)

Updates a nested dictionary or similar mapping. Modifies source in place with the key-value pairs in overrides

Parameters:
  • source (dict) – a dictionary that needs to be updated
  • overrides (dict) – a dictionary that provides the new keys and values
Return type:dict
Returns:updated source dictionary
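
To illustrate what an in-place nested update means, here is a small sketch. The function name deep_update_sketch and the sample settings are hypothetical, and the recursion rule (merge only when both sides are mappings) is an assumption rather than the library's documented behavior.

```python
from collections.abc import Mapping


def deep_update_sketch(source, overrides):
    """Merge overrides into source in place, descending into nested mappings."""
    for key, value in overrides.items():
        if isinstance(value, Mapping) and isinstance(source.get(key), Mapping):
            # Both sides are mappings: merge them instead of replacing.
            deep_update_sketch(source[key], value)
        else:
            # Otherwise the override wins outright.
            source[key] = value
    return source


settings = {"bigquery": {"project": "demo", "timeout": 30}, "debug": False}
deep_update_sketch(settings, {"bigquery": {"timeout": 60}, "debug": True})
print(settings)
```

Unlike dict.update, the nested "project" key survives because only the overridden leaf is replaced.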


Converts a dictionary to a HTML table. Keys become the header of the table. Useful while sending email from the code

Parameters:a_dict (dict) – key value pairs in the form of a dictionary
Returns:HTML table representation corresponding to the values in the dictionary
Return type:str
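
As a rough picture of the output, the sketch below renders a dictionary as a two-row table with the keys in the header row. The function name and the exact markup are assumptions; the library's HTML may differ.

```python
from html import escape


def dict_to_html_table_sketch(a_dict):
    """Render a dictionary as an HTML table: keys in the header, values below."""
    header = "".join("<th>{}</th>".format(escape(str(k))) for k in a_dict)
    values = "".join("<td>{}</td>".format(escape(str(v))) for v in a_dict.values())
    return "<table><tr>{}</tr><tr>{}</tr></table>".format(header, values)


print(dict_to_html_table_sketch({"workflow": "ndvi", "status": "done"}))
```

Escaping the cell text keeps characters such as < and & from breaking the email body.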

Converts a list of dictionaries to a HTML table. Keys become the header of the table. Useful while sending email from the code

Parameters:a_list (list(dict)) – values in the form of list of dictionaries
Returns:HTML table representation corresponding to the values in the lists
Return type:str

Checks if the argument is a number

Parameters:s (str) – Any string
Return type:bool
Returns:True if the string is a number, False otherwise
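
One common way to implement such a check is to attempt a float conversion, as in the sketch below. The function name is hypothetical, and the rule that anything float() accepts counts as a number (including "nan" and "inf") is an assumption about the intended behavior.

```python
def is_number_sketch(s):
    """Return True if s can be parsed as a floating point number."""
    try:
        float(s)
    except (TypeError, ValueError):
        return False
    return True


for candidate in ["3.14", "1e3", "abc", ""]:
    print(repr(candidate), is_number_sketch(candidate))
```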

Converts a list of lists to a HTML table. First list becomes the header of the table. Useful while sending email from the code

Parameters:a_list (list(list)) – values in the form of list of lists
Returns:HTML table representation corresponding to the values in the lists
Return type:str
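
The header-from-first-list rule can be sketched as follows. The function name and the exact markup are assumptions made for illustration.

```python
from html import escape


def list_to_html_table_sketch(a_list):
    """Render a list of lists as an HTML table; the first list is the header."""
    if not a_list:
        return "<table></table>"
    header, body = a_list[0], a_list[1:]
    rows = ["<tr>" + "".join("<th>{}</th>".format(escape(str(c))) for c in header) + "</tr>"]
    for row in body:
        rows.append("<tr>" + "".join("<td>{}</td>".format(escape(str(c))) for c in row) + "</tr>")
    return "<table>" + "".join(rows) + "</table>"


print(list_to_html_table_sketch([["name", "count"], ["stations", 12]]))
```
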
colorker.utils.mean(prop, ftc)

Finds the mean of a property in the given feature collection. NaN values are treated as zero.

Parameters:
  • prop (str) – name of the property in the feature collection
  • ftc (geojson.FeatureCollection) – the feature collection containing that property
Returns:mean value of the property
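
The NaN handling can be made concrete with a short sketch. It uses a plain dictionary in GeoJSON layout (geojson.FeatureCollection is dictionary-like), the property name ndvi is made up, and the assumption is that NaN entries count as 0 and stay in the denominator. An empty collection is not handled here.

```python
import math


def mean_sketch(prop, ftc):
    """Mean of a feature property, counting NaN values as zero."""
    values = []
    for feature in ftc["features"]:
        value = float(feature["properties"][prop])
        values.append(0.0 if math.isnan(value) else value)
    return sum(values) / len(values)


ftc = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None, "properties": {"ndvi": 2.0}},
        {"type": "Feature", "geometry": None, "properties": {"ndvi": float("nan")}},
        {"type": "Feature", "geometry": None, "properties": {"ndvi": 4.0}},
    ],
}
print(mean_sketch("ndvi", ftc))  # (2.0 + 0.0 + 4.0) / 3 = 2.0
```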


colorker.utils.std(prop, ftc)

Finds the standard deviation of a property in the given feature collection. NaN values are treated as zero.

Parameters:
  • prop (str) – name of the property in the feature collection
  • ftc (geojson.FeatureCollection) – the feature collection containing that property
Returns:standard deviation value of the property
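
A comparable sketch for the standard deviation follows. Whether the library computes the population or the sample standard deviation is not stated, so the population form used here is an assumption, as are the function name and the sample property.

```python
import math
import statistics


def std_sketch(prop, ftc):
    """Population standard deviation of a property, counting NaN as zero."""
    values = []
    for feature in ftc["features"]:
        value = float(feature["properties"][prop])
        values.append(0.0 if math.isnan(value) else value)
    return statistics.pstdev(values)


sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None, "properties": {"ndvi": 2.0}},
        {"type": "Feature", "geometry": None, "properties": {"ndvi": float("nan")}},
        {"type": "Feature", "geometry": None, "properties": {"ndvi": 4.0}},
    ],
}
print(std_sketch("ndvi", sample))
```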



colorker.service package

The package provides APIs that help connect with Google Cloud services. The library methods are intended to be used in the code written for Components and Combiners of the Columbus platform. Several methods take a user_settings parameter. The Columbus master sends these settings to the worker, so code running inside Components and Combiners already has access to them and can omit the parameter.


Includes functions to integrate with Google BigQuery. The results and implementation are based on the Google BigQuery API.


Obtains all the table names from all the bigquery projects.

Parameters:user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
Returns:[{project_name:dataset_name : [table_name_1, table_name_2]}]
Return type:list(dict)
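
The return shape above can be flattened into qualified table names with a few lines of code. The sketch assumes each dictionary key is the string project_name:dataset_name, which is one reading of the notation above. The sample data and the helper name are hypothetical.

```python
def qualified_table_names(all_tables):
    """Flatten [{'project:dataset': [tables]}] into 'project:dataset.table' names."""
    names = []
    for entry in all_tables:
        for dataset_key, tables in entry.items():
            for table in tables:
                names.append("{}.{}".format(dataset_key, table))
    return names


# Hypothetical sample in the documented shape.
all_tables = [
    {"demo_project:weather": ["stations", "readings"]},
    {"demo_project:census": ["tracts"]},
]
print(qualified_table_names(all_tables))
```
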
colorker.service.bigquery.get_features(qualified_table_name, user_settings=None)

Obtains the columns of a bigquery table

Parameters:
  • qualified_table_name (str) – table name, must be of the form project_name:dataset_name.table_name
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
Returns:list of key-value pairs where the key is the column name and the value is the column type
Return type:list
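
Because the qualified name has a fixed form, it can be checked before a call is made. The helper below is a hypothetical sketch, not part of the library, and it only handles the simple project_name:dataset_name.table_name form.

```python
def split_qualified_table_name(qualified_table_name):
    """Split 'project:dataset.table' into its three parts or raise ValueError."""
    project, colon, rest = qualified_table_name.partition(":")
    dataset, dot, table = rest.partition(".")
    if not (colon and dot and project and dataset and table):
        raise ValueError(
            "expected project_name:dataset_name.table_name, got %r" % qualified_table_name
        )
    return project, dataset, table


print(split_qualified_table_name("demo_project:weather.stations"))
```
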


colorker.service.bigquery.get_query_results(qualified_table_name, query, user_settings=None)

Obtains the results of a query. A call to this method will block until the results are obtained

Parameters:
  • qualified_table_name (str) – table name, must be of the form project_name:dataset_name.table_name
  • query (str) – a SQL query that conforms to the BigQuery query syntax
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
Returns:{fields: [{name: column_name_1, type: column_type_1}, ...], rows: [[{v: column_1_value}, {v: column_2_value}, ...], [{v: column_1_value}, {v: column_2_value}, ...]], total: total_number_of_rows, cached: boolean indicating whether the results were obtained from cache}
Return type:dict
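
The result structure pairs each row's cells with the fields list by position, so it can be reshaped into one dictionary per row. The helper name and the sample result below are hypothetical. The sketch relies only on the keys fields, rows, name, and v described above.

```python
def rows_as_dicts(result):
    """Convert the documented result structure into a list of row dictionaries."""
    names = [field["name"] for field in result["fields"]]
    return [
        {name: cell["v"] for name, cell in zip(names, row)}
        for row in result["rows"]
    ]


# Hypothetical sample in the documented shape.
result = {
    "fields": [{"name": "station", "type": "STRING"}, {"name": "temp", "type": "FLOAT"}],
    "rows": [[{"v": "KDEN"}, {"v": "21.5"}], [{"v": "KFNL"}, {"v": "19.0"}]],
    "total": 2,
    "cached": False,
}
print(rows_as_dicts(result))
```
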


Includes functions to integrate with a user’s Google Drive. The results and implementation are based on the Google Drive API.

(file_id, meta_err=False, user_settings=None)

Obtains the contents of a file as a list of dictionaries. The requested file must be a CSV file or a Google Fusion Table.

Parameters:
  • file_id (str) – the identifier of the file whose content is needed
  • meta_err (bool) – optional, internal use only
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
Returns:list of dictionaries where each dictionary is a row in the file
Return type:list

(file_id, user_settings=None)

Obtains the metadata of a file

Parameters:
  • file_id (str) – the identifier of the file whose metadata is needed
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
Returns:metadata of the file including id, mimeType, size, parents, kind, fileExtension, and webContentLink

Includes functions that facilitate sending an email to the requested people.

(receivers, subject, message, html=None)

Sends an email to the recipients. Must be called from an EngineThread. This method will not raise any exception if it fails to send a message to the recipients.

Parameters:
  • receivers (list(str)) – list of recipient email addresses
  • subject (str) – subject of the email
  • message (str) – plain text message
  • html (str) – HTML message
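
To show how a plain text message and an optional HTML message can travel in one email, here is a sketch built on the Python standard library. It only constructs the message and does not send it. The function name and the sender parameter are hypothetical, and nothing here reflects how the library itself composes or delivers mail.

```python
from email.message import EmailMessage


def build_message_sketch(sender, receivers, subject, message, html=None):
    """Build an email with a plain text body and an optional HTML alternative."""
    msg = EmailMessage()
    msg["From"] = sender  # hypothetical: the library takes no sender argument
    msg["To"] = ", ".join(receivers)
    msg["Subject"] = subject
    msg.set_content(message)
    if html is not None:
        # Mail clients that support HTML show this part instead of the text.
        msg.add_alternative(html, subtype="html")
    return msg


msg = build_message_sketch(
    "worker@example.com",
    ["a@example.com", "b@example.com"],
    "Workflow finished",
    "The workflow finished.",
    html="<p>The workflow <b>finished</b>.</p>",
)
print(msg.get_content_type())
```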

Includes functions to integrate with Google Fusion Tables. The results and implementation are based on the Google Fusion Tables API.

colorker.service.fusiontables.create_table(name, description, columns, data=None, share_with=None, admin=None, user_settings=None)

Creates a fusion table for the given data and returns the table id.

Parameters:
  • name (str) – name of the fusion table to create
  • description (str) – description of the table to be created
  • columns (list(dict)) – list of dictionaries having properties name and type
  • data (list(dict)) – optional, list of dictionaries
  • share_with (str or list(str)) – a single email address or a list of user email addresses (Gmail only) with which to share the created fusion table
  • admin (str) – email address of the administrator who should have edit access to the created fusion table
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
Return type:str
Returns:the table id of the created fusion table
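
Since columns and data are supplied separately, it can help to confirm that every data row uses only declared column names before a table is created. The helper below is a hypothetical pre-check, not part of the library. The column type strings STRING and NUMBER are assumed values used only for illustration.

```python
def rows_with_unknown_keys(columns, data):
    """Return (row_index, unknown_keys) pairs for rows using undeclared columns."""
    names = {column["name"] for column in columns}
    problems = []
    for index, row in enumerate(data or []):
        unknown = sorted(set(row) - names)
        if unknown:
            problems.append((index, unknown))
    return problems


columns = [{"name": "station", "type": "STRING"}, {"name": "temp", "type": "NUMBER"}]
data = [{"station": "KDEN", "temp": 21.5}, {"station": "KFNL", "humidity": 40}]
print(rows_with_unknown_keys(columns, data))
```
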

colorker.service.fusiontables.delete_table(table_id, user_settings=None)

Deletes a fusion table

Parameters:
  • table_id (str) – identifier of the fusion table
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
Raises BaseException:Any exception resulting from this operation

colorker.service.fusiontables.read_table(table_id, user_settings=None)

Reads a fusion table and returns its contents as a list of dictionaries.

Parameters:
  • table_id (str) – identifier of the fusion table
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
Raises BaseException:Any exception resulting from this operation


Includes functions that integrate Google Cloud Storage and Google Earth Engine.

colorker.service.gee.delete_object(bucket, filename, user_settings=None, access='storage_rw')

Deletes an object from the specified bucket

Parameters:
  • bucket (str) – name of the Google Cloud Storage bucket
  • filename (str) – path of the object in the bucket
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
  • access – must be ‘storage’. Other values are for internal use only
Returns:the response obtained from the API

colorker.service.gee.download_object(bucket, filename, out_file, user_settings=None, access='storage')

Downloads an object from the Google cloud storage bucket of the user specified on the account page of Columbus

Parameters:
  • bucket (str) – name of the Google Cloud Storage bucket
  • filename (str) – path of the object in the bucket
  • out_file (str) – path to store the downloaded file. If you don’t know the path, use /tmp
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
  • access (str) – must be ‘storage’. Other values are for internal use only
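
When no particular destination is required, an out_file path can be derived from the object path and the system temporary directory. The helper below is a hypothetical convenience, not part of the library, and it only builds the path without downloading anything.

```python
import os
import posixpath
import tempfile


def default_out_file(filename, directory=None):
    """Build a local path for a bucket object, defaulting to the temp directory."""
    directory = directory or tempfile.gettempdir()
    # Object paths in a bucket use "/" separators regardless of platform.
    return os.path.join(directory, posixpath.basename(filename))


print(default_out_file("results/2016/output.csv"))
```
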
colorker.service.gee.get_bucket_metadata(bucket, user_settings=None, access='storage')

Retrieves metadata about the given bucket.

Parameters:
  • bucket (str) – name of the Google Cloud Storage bucket
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
  • access – must be ‘storage’. Other values are for internal use only
Returns:the response obtained from the API


Returns the geojson representation of the ee.FeatureCollection. This function must be called from an EngineThread

Parameters:ftc – an instance of ee.FeatureCollection
Raises Exception:Any exception resulting from this operation
colorker.service.gee.list_bucket(bucket, user_settings=None, access='storage')

Returns a list of metadata of the objects within the given bucket.

Parameters:
  • bucket (str) – name of the Google Cloud Storage bucket
  • user_settings (dict) – optional, A dictionary of settings specifying credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
  • access – must be ‘storage’. Other values are for internal use only
Return type:list
Returns:list of object paths

colorker.service.gee.upload_object(bucket, filename, readers, owners, user_settings=None, access='storage_rw')

Uploads the specified file to the specified bucket. The object path in the bucket is same as the path of the file specified.

Parameters:
  • bucket (str) – name of the Google Cloud Storage bucket
  • filename (str) – fully qualified name of the file to upload
  • readers (list(str)) – list of email addresses
  • owners (list(str)) – list of email addresses
  • user_settings (dict) – optional, a dictionary of user credentials for appropriate services. If one is not provided, then this method must be invoked by an EngineThread which defines the settings
  • access (str) – must be ‘storage’. Other values are for internal use only
Returns:the response obtained from the API by uploading the object