abtools.mongodb: Working with MongoDB

abtools.mongodb.get_connection(ip='localhost', port=27017, user=None, password=None)

Returns a pymongo MongoClient object.

Parameters:
  • ip (str) – IP address of the MongoDB server. Default is localhost.
  • port (int) – Port of the MongoDB server. Default is 27017.
  • user (str) – Username, if authentication is enabled on the MongoDB database. Default is None, in which case the connection is requested without authentication.
  • password (str) – Password, if authentication is enabled on the MongoDB database. Default is None, in which case the connection is requested without authentication.
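
As a rough illustration of how these four parameters relate to a standard MongoDB connection URI, here is a minimal sketch. The helper name build_mongo_uri is hypothetical and this is not necessarily how abtools builds its connection; it only shows the authenticated and unauthenticated cases described above.

```python
from urllib.parse import quote_plus


def build_mongo_uri(ip='localhost', port=27017, user=None, password=None):
    """Hypothetical helper (not part of abtools): build a MongoDB URI."""
    # Without credentials, the URI carries only the host and port.
    if user is None or password is None:
        return 'mongodb://{}:{}'.format(ip, port)
    # Credentials are percent-escaped so characters such as '@' or ':'
    # cannot be confused with the URI's own separators.
    return 'mongodb://{}:{}@{}:{}'.format(
        quote_plus(user), quote_plus(password), ip, port)


print(build_mongo_uri())
print(build_mongo_uri(user='ab', password='p@ss'))
```

A URI of this form is what pymongo's MongoClient accepts as its first argument.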
abtools.mongodb.get_db(db, ip='localhost', port=27017, user=None, password=None)

Returns a pymongo Database object.

Parameters:
  • db (str) – Name of the MongoDB database. Required.
  • ip (str) – IP address of the MongoDB server. Default is localhost.
  • port (int) – Port of the MongoDB server. Default is 27017.
  • user (str) – Username, if authentication is enabled on the MongoDB database. Default is None, in which case the connection is requested without authentication.
  • password (str) – Password, if authentication is enabled on the MongoDB database. Default is None, in which case the connection is requested without authentication.
abtools.mongodb.get_collections(db, collection=None, prefix=None, suffix=None)

Returns a sorted list of collection names found in db.

Parameters:
  • db (Database) – A pymongo Database object. Can be obtained with get_db.
  • collection (str) – Name of a collection. If the collection is present in the MongoDB database, a single-element list containing the collection name is returned. If not, an empty list is returned. This option is primarily a quick way to check whether a collection name is present. Default is None, which results in this option being ignored.
  • prefix (str) – If supplied, only collections that begin with prefix will be returned.
  • suffix (str) – If supplied, only collections that end with suffix will be returned.
Returns:

A sorted list of collection names.

Return type:

list
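
The filtering rules described above can be sketched on a plain list of names. This is an illustration only, not the abtools implementation: the helper name filter_collection_names is hypothetical, and it assumes that collection, when given, takes precedence over prefix and suffix.

```python
def filter_collection_names(names, collection=None, prefix=None, suffix=None):
    """Hypothetical helper mirroring the documented filtering behaviour."""
    names = sorted(names)
    # Assumption: an explicit collection name overrides prefix/suffix.
    if collection is not None:
        return [n for n in names if n == collection]
    if prefix is not None:
        names = [n for n in names if n.startswith(prefix)]
    if suffix is not None:
        names = [n for n in names if n.endswith(suffix)]
    return names


names = ['donor2_igg', 'donor1_igm', 'donor1_igg', 'control']
print(filter_collection_names(names, prefix='donor1'))
print(filter_collection_names(names, suffix='_igg'))
print(filter_collection_names(names, collection='missing'))
```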

abtools.mongodb.rename_collection(db, collection, new_name)

Renames a MongoDB collection.

Parameters:
  • db (Database) – A pymongo Database object. Can be obtained with get_db.
  • collection (str) – Name of the collection to be renamed.
  • new_name (str, func) –

    new_name can be one of two things:

    1. The new collection name, as a string.
    2. A function which, when passed the current collection name,
        returns the new collection name. If the function
        returns an empty string, the collection will not be
        renamed.
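
    A minimal sketch of the two accepted forms of new_name follows. The
    helper resolve_new_name and the example function drop_run_prefix are
    hypothetical; they only illustrate how a string and a function can be
    handled uniformly, with an empty string meaning "do not rename".

```python
def resolve_new_name(current, new_name):
    """Hypothetical helper: return the target name, or '' to skip renaming."""
    # A function receives the current name and returns the new one.
    if callable(new_name):
        return new_name(current)
    # Otherwise new_name is already the new collection name.
    return new_name


def drop_run_prefix(name):
    """Example naming function: strip everything up to the first '_'."""
    # Returning '' signals that the collection should be left as-is.
    return name.split('_', 1)[1] if '_' in name else ''


print(resolve_new_name('run1_donorA', 'donorA'))
print(resolve_new_name('run1_donorA', drop_run_prefix))
print(repr(resolve_new_name('donorA', drop_run_prefix)))
```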
    
abtools.mongodb.update(field, value, db, collection, match=None)

Updates MongoDB documents.

Sets field equal to value for all documents that meet the match criteria.

Parameters:
  • field (str) – Field to update.
  • value (str) – Update value.
  • db (Database) – A pymongo Database object.
  • collection (str) – Collection name.
  • match (dict) –

    A dictionary containing the match criteria, for example:

    {'seq_id': {'$in': ['a', 'b', 'c']}, 'cdr3_len': {'$gte': 18}}
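
    In MongoDB terms, setting a field on matching documents corresponds to
    a query filter plus a $set update document. The sketch below builds
    that pair; the helper name build_update_args is hypothetical, and
    treating match=None as "all documents" is an assumption rather than
    documented abtools behaviour.

```python
def build_update_args(field, value, match=None):
    """Hypothetical helper: return (filter, update) for a $set operation."""
    # Assumption: no match criteria means every document is selected.
    query = match if match is not None else {}
    update = {'$set': {field: value}}
    return query, update


match = {'seq_id': {'$in': ['a', 'b', 'c']}, 'cdr3_len': {'$gte': 18}}
query, update = build_update_args('reviewed', 'yes', match)
print(query)
print(update)
```

    A pair like this is what pymongo's Collection.update_many(filter, update)
    expects.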
    
abtools.mongodb.unset(db, collection, field, match=None)

Removes field from all records in collection that meet the match criteria.

Parameters:
  • field (str) – Field to be removed.
  • db (Database) – A pymongo Database object.
  • collection (str) – Collection name.
  • match (dict) –

    A dictionary containing the match criteria, for example:

    {'seq_id': {'$in': ['a', 'b', 'c']}, 'cdr3_len': {'$gte': 18}}
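
    Removing a field corresponds to MongoDB's $unset operator. The sketch
    below builds the filter and update documents for that case; the helper
    name build_unset_args is hypothetical, and treating match=None as "all
    documents" is an assumption.

```python
def build_unset_args(field, match=None):
    """Hypothetical helper: return (filter, update) for an $unset operation."""
    # Assumption: no match criteria means every document is selected.
    query = match if match is not None else {}
    # The value given to $unset is ignored by MongoDB; '' is conventional.
    update = {'$unset': {field: ''}}
    return query, update


query, update = build_unset_args('padding', {'cdr3_len': {'$gte': 18}})
print(query)
print(update)
```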
    
abtools.mongodb.mongoimport(json, database, ip='localhost', port=27017, user=None, password=None, delim='_', delim1=None, delim2=None, delim_occurance=1, delim1_occurance=1, delim2_occurance=1)

Performs mongoimport on one or more JSON files.

Parameters:
  • json

    Can be one of several things:

    • path to a single JSON file
    • an iterable (list or tuple) of one or more JSON file paths
    • path to a directory containing one or more JSON files
  • database (str) – Name of the database into which the JSON files will be imported.
  • ip (str) – IP address of the MongoDB server. Default is localhost.
  • port (int) – Port of the MongoDB server. Default is 27017.
  • user (str) – Username for the MongoDB database, if authentication is enabled. Default is None, which results in attempting connection without authentication.
  • password (str) – Password for the MongoDB database, if authentication is enabled. Default is None, which results in attempting connection without authentication.
  • delim (str) – Delimiter, when generating collection names using a single delimiter. Default is _.
  • delim_occurance (int) – Occurrence of delim at which to split the filename when using a single delimiter. Default is 1.
  • delim1 (str) – Left delimiter when splitting with two delimiters. Default is None.
  • delim1_occurance (int) – Occurrence of delim1 at which to split the filename. Default is 1.
  • delim2 (str) – Right delimiter when splitting with two delimiters. Default is None.
  • delim2_occurance (int) – Occurrence of delim2 at which to split the filename. Default is 1.
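
One plausible reading of the delimiter parameters is sketched below; it is not taken from the abtools source. The helpers nth_index and collection_name are hypothetical. The sketch assumes that, with a single delimiter, the collection name is the part of the filename before the chosen occurrence of delim, and that, with two delimiters, it is the part between the chosen occurrences of delim1 and delim2.

```python
import os


def nth_index(text, delim, n):
    """Return the index of the n-th occurrence of delim in text, or -1."""
    idx = -1
    for _ in range(n):
        idx = text.find(delim, idx + 1)
        if idx == -1:
            return -1
    return idx


def collection_name(path, delim='_', delim_occurance=1, delim1=None,
                    delim2=None, delim1_occurance=1, delim2_occurance=1):
    """Hypothetical helper: derive a collection name from a JSON file path."""
    # Work on the filename without its directory or extension.
    stem = os.path.splitext(os.path.basename(path))[0]
    if delim1 is not None and delim2 is not None:
        # Assumed two-delimiter mode: keep the text between the delimiters.
        left = nth_index(stem, delim1, delim1_occurance)
        right = nth_index(stem, delim2, delim2_occurance)
        if left == -1 or right == -1 or right <= left:
            return stem
        return stem[left + len(delim1):right]
    # Assumed single-delimiter mode: keep the text before the delimiter.
    cut = nth_index(stem, delim, delim_occurance)
    return stem if cut == -1 else stem[:cut]


print(collection_name('/data/donor1_run2_2016.json'))
print(collection_name('/data/donor1_run2_2016.json', delim_occurance=2))
print(collection_name('exp-donor1.run2.json', delim1='-', delim2='.'))
```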
abtools.mongodb.index(db, collection, fields, directions=None, desc=False, background=False)

Builds a simple (single-field) or complex (multiple-field, known in MongoDB as compound) index on a single collection in a MongoDB database.

Parameters:
  • db (Database) – A pymongo Database object.
  • collection (str) – Collection name.
  • fields

    Can be one of two things:

    • the name of a single field, as a string
    • an iterable (list/tuple) of one or more field names
  • desc (bool) – If True, all fields will be indexed in descending order. Default is False.
  • directions (list) – For compound (multiple-field) indexes that need different directions for different fields (ascending for some, descending for others), pass a list of pymongo direction objects (pymongo.ASCENDING and pymongo.DESCENDING) in the same order as the list of fields to be indexed. Must be the same length as the list of index fields. Default is None.
  • background (bool) – If True, the indexing operation will be processed in the background. When performing background indexes, the MongoDB database will not be locked. Default is False.
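
How fields, desc and directions might combine into an index specification is sketched below. The helper build_index_spec is hypothetical and is not the abtools implementation. The constants are defined locally so the example runs without pymongo; they mirror the values of pymongo.ASCENDING (1) and pymongo.DESCENDING (-1). The sketch assumes directions, when given, takes precedence over desc.

```python
ASCENDING = 1     # same value as pymongo.ASCENDING
DESCENDING = -1   # same value as pymongo.DESCENDING


def build_index_spec(fields, directions=None, desc=False):
    """Hypothetical helper: return a list of (field, direction) pairs."""
    # A single field name is treated as a one-element list.
    if isinstance(fields, str):
        fields = [fields]
    fields = list(fields)
    # Assumption: explicit directions override the desc flag.
    if directions is None:
        directions = [DESCENDING if desc else ASCENDING] * len(fields)
    if len(directions) != len(fields):
        raise ValueError('directions must be the same length as fields')
    return list(zip(fields, directions))


print(build_index_spec('seq_id'))
print(build_index_spec(['v_gene', 'cdr3_len'], desc=True))
print(build_index_spec(['v_gene', 'cdr3_len'],
                       directions=[ASCENDING, DESCENDING]))
```

A list of pairs in this form is what pymongo's Collection.create_index accepts as its keys argument, and background is passed to the same call as a keyword argument.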
abtools.mongodb.remove_padding(db, collection, field='padding')

Removes a padding field.

Parameters:
  • db (Database) – A pymongo Database object.
  • collection (str) – Collection name.
  • field (str) – Name of the padding field. Default is padding.