MongoDB connection pool and container implementation for Zope3
Project description
This package provides a MongoDB object mapper framework, including Zope transaction support, based on some core Zope component libraries. It can be used with or without zope.persistent and as a full replacement for the ZODB. The package does not depend heavily on Zope itself and can be used in any Python project that requires a bridge from MongoDB to Python objects.
README
IMPORTANT: If you run the tests with the --all option, a real MongoDB stub server will start on port 45017!
This package provides non-persistent MongoDB object implementations. They can simply be mixed with persistent.Persistent and contained.Contained if you want to use them in a mixed MongoDB/ZODB application setup. We currently use this framework as an ORM (object-relational mapper), mapping MongoDB objects to Python/Zope schema-based objects, including validation.
In our last project, we started with a mixed ZODB/MongoDB application where we mixed persistent.Persistent into IMongoContainer objects. Later we were so excited about the performance and stability that we removed the ZODB persistence layer entirely. We now use a ZODB-less setup in which a non-persistent item serves as our application root. All the tools we use for such a ZODB-less application setup are located in the p01.publisher and p01.recipe.setup packages.
NOTE: Some of these tests use a fake MongoDB located in m01/mongo/testing, and other tests use our MongoDB stub from the m01.stub package. Run the tests with the --all option if you want to run the full test suite, which will start and stop the MongoDB stub server.
NOTE: None of the mongo item interfaces provide ILocation or IContained, but the base mongo item implementations implement Location, which provides the ILocation interface directly. This makes permission declarations in ZCML simpler.
Setup
>>> import pymongo
>>> import zope.component
>>> from m01.mongo import interfaces
MongoClient
Setup a mongo client:
>>> client = pymongo.MongoClient('localhost', 45017)
>>> client
MongoClient(host=['127.0.0.1:45017'])

As you can see the client is able to access the database:

>>> db = client.m01MongoTesting
>>> db
Database(MongoClient(host=['127.0.0.1:45017']), u'm01MongoTesting')

A database can return a collection:

>>> collection = db['m01MongoTest']
>>> collection
Collection(Database(MongoClient(host=['127.0.0.1:45017']), u'm01MongoTesting'), u'm01MongoTest')
As you can see we can write to the collection:
>>> res = collection.update_one({'_id': '123'}, {'$inc': {'counter': 1}},
...     upsert=True)
>>> res
<pymongo.results.UpdateResult object at ...>

>>> res.raw_result
{'updatedExisting': False, 'nModified': 0, 'ok': 1, 'upserted': '123', 'n': 1}
And we can read from the collection:
>>> collection.find_one({'_id': '123'})
{u'_id': u'123', u'counter': 1}
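To make the upsert semantics used above more concrete, here is a minimal pure-Python emulation of what an upsert combined with $inc does to a single document. The function upsert_inc and the in-memory store dict are hypothetical stand-ins for illustration only; they are not part of pymongo or m01.mongo, and the sketch is written for Python 3.

```python
def upsert_inc(store, _id, field, amount=1):
    """Emulate update_one({'_id': _id}, {'$inc': {field: amount}}, upsert=True).

    Returns True if the document already existed, False if it was inserted.
    """
    existed = _id in store
    # Upsert: create the document if it is missing.
    doc = store.setdefault(_id, {'_id': _id})
    # $inc: a missing field counts as 0 before the increment.
    doc[field] = doc.get(field, 0) + amount
    return existed


store = {}  # hypothetical in-memory "collection", keyed by _id
first = upsert_inc(store, '123', 'counter')
second = upsert_inc(store, '123', 'counter')
```

The first call inserts the document (compare 'updatedExisting': False above), the second call only increments the existing counter.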
Remove the result from our test collection:
>>> res = collection.delete_one({'_id': '123'})
>>> res
<pymongo.results.DeleteResult object at ...>

>>> res.raw_result
{'ok': 1, 'n': 1}
tear down
Now tear down our MongoDB database with our current MongoDB connection:
>>> import time
>>> time.sleep(1)
>>> client.drop_database('m01MongoTesting')
MongoContainer
The MongoContainer can store IMongoContainerItem objects in a MongoDB. A MongoContainerItem must be able to dump its data to valid MongoDB data. This test shows how our MongoContainer works.
Condition
First import some components:
>>> import json
>>> import transaction
>>> import zope.interface
>>> import zope.schema
>>> import zope.location.interfaces
>>> import m01.mongo.item
>>> import m01.mongo.container
>>> import m01.mongo.testing
>>> from m01.mongo.fieldproperty import MongoFieldProperty
>>> from m01.mongo import interfaces

Before we start testing, check whether our thread local cache is empty or whether previous tests have left some junk behind:

>>> from m01.mongo import LOCAL
>>> m01.mongo.testing.pprint(LOCAL.__dict__)
{}
Setup
And set up a database root:
>>> root = {}
MongoContainerItem
>>> class ISampleContainerItem(interfaces.IMongoContainerItem,
...     zope.location.interfaces.ILocation):
...     """Sample item interface."""
...
...     title = zope.schema.TextLine(
...         title=u'Object Title',
...         description=u'Object Title',
...         required=True)

>>> class SampleContainerItem(m01.mongo.item.MongoContainerItem):
...     """Sample container item"""
...
...     zope.interface.implements(ISampleContainerItem)
...
...     title = MongoFieldProperty(ISampleContainerItem['title'])
...
...     dumpNames = ['title']
MongoContainer
>>> class ISampleContainer(interfaces.IMongoContainer):
...     """Sample container interface."""

>>> class SampleContainer(m01.mongo.container.MongoContainer):
...     """Sample container."""
...
...     zope.interface.implements(ISampleContainer)
...
...     @property
...     def collection(self):
...         db = m01.mongo.testing.getTestDatabase()
...         return db['test']
...
...     def load(self, data):
...         """Load data into the right mongo item."""
...         return SampleContainerItem(data)

>>> container = SampleContainer()
>>> root['container'] = container
Create an object tree
Now we can add a sample MongoContainerItem to our container using the mapping api:
>>> data = {'title': u'Title'}
>>> item = SampleContainerItem(data)
>>> container = root['container']
>>> container[u'item'] = item
Transaction
Zope provides transactions for storing objects in the database. We provide such a transaction and a transaction data manager for storing our objects in MongoDB. This means that nothing has been stored in our test database yet, because we didn’t commit the transaction:

>>> collection = m01.mongo.testing.getTestCollection()
>>> collection.count()
0

Let’s commit our transaction and store the container item in MongoDB:

>>> transaction.commit()

>>> collection = m01.mongo.testing.getTestCollection()
>>> collection.count()
1
After commit, the thread local storage is empty:
>>> LOCAL.__dict__
{}
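The commit behaviour described in this section (nothing reaches the database until the transaction commits, and the pending state is gone afterwards) can be sketched without Zope or MongoDB. The class PendingWrites below is a hypothetical stand-in written for Python 3; it is not the package's MongoTransactionDataManager and only illustrates the idea of deferring writes.

```python
class PendingWrites:
    """Hypothetical stand-in for a transaction data manager."""

    def __init__(self, collection):
        # A plain list stands in for a MongoDB collection here.
        self.collection = collection
        self.pending = []

    def add(self, doc):
        # Nothing reaches the collection yet; we only remember the document.
        self.pending.append(doc)

    def commit(self):
        # Flush everything that was registered, then forget it.
        self.collection.extend(self.pending)
        self.pending = []

    def abort(self):
        # Throw away the registered documents without writing them.
        self.pending = []


collection = []
tm = PendingWrites(collection)
tm.add({'__name__': 'item', 'title': 'Title'})
count_before = len(collection)   # still empty before the commit
tm.commit()
count_after = len(collection)    # the document arrives on commit
```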
Mongodb data
As you can see the following data gets stored in our MongoDB:

>>> data = collection.find_one({'__name__': 'item'})
>>> m01.mongo.testing.pprint(data)
{u'__name__': u'item',
 u'_id': ObjectId('...'),
 u'_pid': None,
 u'_type': u'SampleContainerItem',
 u'_version': 1,
 u'created': datetime.datetime(..., tzinfo=UTC),
 u'modified': datetime.datetime(..., tzinfo=UTC),
 u'title': u'Title'}
Object
We can get the item from our container, and the container will load the data from MongoDB:

>>> obj = container[u'item']
>>> obj
<SampleContainerItem u'item'>

>>> obj.title
u'Title'
Let’s tear down our test setup:
>>> transaction.commit()
>>> from m01.mongo import clearThreadLocalCache
>>> clearThreadLocalCache()

As you can see our cache items got removed:

>>> from m01.mongo import LOCAL
>>> m01.mongo.testing.pprint(LOCAL.__dict__)
{}
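For readers unfamiliar with thread local caches, the following Python 3 sketch shows the general pattern using only the standard library. The names LOCAL and clear_thread_local_cache are chosen to mirror the text; they are our own objects here, and we only assume that the package follows a comparable pattern.

```python
import threading

# Our own thread local object; each thread sees its own attribute dict.
LOCAL = threading.local()


def clear_thread_local_cache():
    """Drop everything cached for the current thread."""
    LOCAL.__dict__.clear()


LOCAL.item = object()
had_item = 'item' in LOCAL.__dict__

# Another thread starts with an empty cache of its own.
seen_in_other_thread = []
worker = threading.Thread(
    target=lambda: seen_in_other_thread.append(dict(LOCAL.__dict__)))
worker.start()
worker.join()

clear_thread_local_cache()
```

Checking LOCAL.__dict__ at the start and end of a test, as done throughout this document, is a cheap way to detect objects leaking from one test into the next.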
MongoStorage
The MongoStorage can store IMongoStorageItem objects in a MongoDB. A MongoStorageItem must be able to dump its data to valid mongo values. This test shows how our MongoStorage works and also shows its limitations.
Note: the mongo container implements a container/mapping pattern, like the storage implementation. The only difference is that the container only provides the mapping API, using container[key] = obj, container[key] and del container[key]. The storage API provides no explicit mapping key and offers add and remove methods instead. This means the container uses its own naming pattern, while the storage uses the MongoDB _id as its object name (obj.__name__).
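The difference between the two APIs can be sketched with plain dictionaries. Everything in this Python 3 example is hypothetical: SketchContainer and SketchStorage are illustrative classes, the signature of remove is an assumption, and a truncated uuid4 hex string merely stands in for a 24-character ObjectId string.

```python
import uuid


class SketchContainer:
    """Mapping API only: the caller picks the key."""

    def __init__(self):
        self._items = {}

    def __setitem__(self, key, obj):
        obj['__name__'] = key
        self._items[key] = obj

    def __getitem__(self, key):
        return self._items[key]

    def __delitem__(self, key):
        del self._items[key]


class SketchStorage:
    """add/remove API: the storage derives the name from the id."""

    def __init__(self):
        self._items = {}

    def add(self, obj):
        # Stand-in for a 24-character ObjectId string, not a real ObjectId.
        name = uuid.uuid4().hex[:24]
        obj['_id'] = obj['__name__'] = name
        self._items[name] = obj
        return name

    def remove(self, name):
        # Assumed signature; the real remove method may differ.
        del self._items[name]

    def __getitem__(self, name):
        return self._items[name]


container = SketchContainer()
container['item'] = {'title': 'Title'}   # we choose the name

storage = SketchStorage()
name = storage.add({'title': 'Title'})   # the storage chooses the name
```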
Condition
Before we start testing, check whether our thread local cache is empty or whether previous tests have left some junk behind:

>>> from m01.mongo import LOCAL
>>> from m01.mongo.testing import pprint
>>> pprint(LOCAL.__dict__)
{}
Setup
First import some components:
>>> import datetime
>>> import transaction
>>> from zope.container.interfaces import IReadContainer
>>> from m01.mongo import interfaces
>>> from m01.mongo import testing
And set up a database root:
>>> root = {}
MongoStorageItem
By default the mongo item provides an ObjectId stored as _id. If none is given when the object is created, we set one:

>>> data = {}
>>> obj = testing.SampleStorageItem(data)
>>> obj._id
ObjectId('...')

The ObjectId is also used as our __name__ value. See the MongoContainer and MongoContainerItem implementations if you need to choose your own names:

>>> obj.__name__
u'...'

>>> obj.__name__ == unicode(obj._id)
True

A mongo item also provides created and modified date attributes. If we initialize an object without a given created date, a new UTC datetime instance gets used:

>>> obj.created
datetime.datetime(..., tzinfo=UTC)

>>> obj.modified is None
True
A mongo storage item knows whether its state has changed. This means we can find out whether we should write the item back to the MongoDB. The MongoItem stores the state in a _m_changed value, like persistent objects do in _p_changed. As you can see the initial state is None:

>>> obj._m_changed is None
True

The MongoItem also has a version number which we increment each time a changed item gets saved. This version is stored as the _version attribute and defaults to 0 (zero):

>>> obj._version
0

If we change a value in a MongoItem, the state gets changed:

>>> obj.title = u'New Title'
>>> obj._m_changed
True

but the version does not get incremented. We only increment the version if we save the item in MongoDB:

>>> obj._version
0
We also change the _m_changed marker if we remove a value:

>>> obj = testing.SampleStorageItem(data)
>>> obj._m_changed is None
True

>>> obj.title
u''

>>> obj.title = u'New Title'
>>> obj._m_changed
True

>>> obj.title
u'New Title'

Now let’s set the _m_changed property to False before we delete the attribute:

>>> obj._m_changed = False
>>> obj._m_changed
False

>>> del obj.title

As you can see we can delete an attribute but it only falls back to the default schema field value. This seems fine.

>>> obj.title
u''

>>> obj._m_changed
True
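The change tracking shown above (setting or deleting a field flips _m_changed, and a deleted field falls back to its default) can be approximated with a small descriptor. This is a hedged Python 3 sketch: TrackedField and SketchItem are hypothetical names, and the real MongoFieldProperty additionally works with zope.schema fields and validation, which is left out here.

```python
class TrackedField:
    """Hypothetical descriptor that flags its owner as changed."""

    def __init__(self, name, default):
        self.name = name
        self.default = default

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        # Fall back to the default if no value is stored.
        return obj.__dict__.get(self.name, self.default)

    def __set__(self, obj, value):
        obj.__dict__[self.name] = value
        obj._m_changed = True

    def __delete__(self, obj):
        # Deleting only drops the stored value; reads return the default.
        obj.__dict__.pop(self.name, None)
        obj._m_changed = True


class SketchItem:
    _m_changed = None                      # initial state, as in the text
    title = TrackedField('title', '')


item = SketchItem()
initial_state = item._m_changed
item.title = 'New Title'
state_after_set = item._m_changed
item._m_changed = False
del item.title                             # marks the item as changed again
```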
MongoStorage
Now we can add a MongoStorage to the Zope database:

>>> storage = testing.SampleStorage()
>>> root['storage'] = storage
>>> transaction.commit()

Now we can add a sample MongoStorageItem to our storage. Note that we can only use the add method, which returns the newly generated __name__. Using your own names is not supported by this implementation. As you can see, the name is the 24-character hex string representation of a MongoDB ObjectId.

>>> data = {'title': u'Title',
...         'description': u'Description'}
>>> item = testing.SampleStorageItem(data)
>>> storage = root['storage']

Our storage provides the IMongoStorage and IReadContainer interfaces:

>>> interfaces.IMongoStorage.providedBy(storage)
True

>>> IReadContainer.providedBy(storage)
True
add
We can add a mongo item to our storage by using the add method.
>>> __name__ = storage.add(item)
>>> __name__
u'...'

>>> len(__name__)
24

>>> transaction.commit()

After adding our item, the item provides a created date:

>>> item.created
datetime.datetime(..., tzinfo=UTC)
__len__
>>> storage = root['storage']
>>> len(storage)
1
__getitem__
>>> item = storage[__name__]
>>> item
<SampleStorageItem ...>

As you can see our MongoStorageItem provides the following data. We can dump the item. Note, you probably have to implement a custom dump method which dumps the right data for your MongoStorageItem.

>>> pprint(item.dump())
{'__name__': '...',
 '_id': ObjectId('...'),
 '_pid': None,
 '_type': 'SampleStorageItem',
 '_version': 1,
 'comments': [],
 'created': datetime.datetime(..., tzinfo=UTC),
 'date': None,
 'description': 'Description',
 'item': None,
 'modified': datetime.datetime(..., tzinfo=UTC),
 'number': None,
 'numbers': [],
 'title': 'Title'}

The object also provides a name, which is the name we got when adding the object:

>>> item.__name__ == __name__
True
keys
The container can also return keys:

>>> tuple(storage.keys())
(u'...',)
values
The container can also return values:

>>> tuple(storage.values())
(<SampleStorageItem ...>,)
items
The container can also return items:

>>> tuple(storage.items())
((u'...', <SampleStorageItem ...>),)
__delitem__
Next we remove the item:

>>> del storage[__name__]
>>> storage.get(__name__) is None
True

>>> transaction.commit()
Object modification
If we get a mongo item from a storage and modify the item, the version gets increased by one and the current datetime gets set as the modified date.
Let’s add a new item:

>>> data = {'title': u'A Title',
...         'description': u'A Description'}
>>> item = testing.SampleStorageItem(data)
>>> __name__ = storage.add(item)
>>> transaction.commit()

Now get the item:

>>> item = storage[__name__]
>>> item.title
u'A Title'

and change the title:

>>> item.title = u'New Title'
>>> item.title
u'New Title'

As you can see the item gets marked as changed:

>>> item._m_changed
True

Now get the mongo item version. This should be set to 1 (one) since we only added the object and didn’t change it since then:

>>> item._version
1

If we now commit the transaction, the version gets increased by one:

>>> transaction.commit()
>>> item._version
2

If you now load the mongo item from the MongoDB again, you can see that the title got changed:

>>> item = storage[__name__]
>>> item.title
u'New Title'

And that the version got updated to 2:

>>> item._version
2

>>> transaction.commit()
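The version bookkeeping walked through above can be summarized in a few lines. This Python 3 sketch uses a hypothetical save function operating on a plain dict; the assumption that unchanged items are left untouched on commit is ours and is not confirmed by the tests above.

```python
import datetime


def save(doc, changed):
    """Hypothetical save step: bump the version only for changed items."""
    if changed:
        doc['_version'] = doc.get('_version', 0) + 1
        doc['modified'] = datetime.datetime.now(datetime.timezone.utc)
    return doc


doc = {'title': 'A Title', '_version': 0}
save(doc, changed=True)                    # first store: version 0 -> 1
version_after_add = doc['_version']

doc['title'] = 'New Title'                 # in-memory change only
version_before_commit = doc['_version']    # still 1, nothing was saved

save(doc, changed=True)                    # commit: version 1 -> 2
save(doc, changed=False)                   # assumed no-op for unchanged items
```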
Check our thread local cache before we leave this test:
>>> pprint(LOCAL.__dict__)
{}
MongoObject
A MongoObject can be stored in a MongoDB independently of anything else. Such a MongoObject can be used together with a field property called MongoObjectProperty. The field property is responsible for setting and getting such a MongoObject to and from MongoDB. A persistent item which provides such a MongoObject within a MongoObjectProperty only has to provide an oid attribute with a unique value. You can use the m01.oid package for such a unique oid or implement your own pattern.
The MongoObject uses the __parent__._moid and the attribute (field) name as its unique MongoDB key.
Note, this test uses a fake MongoDB server setup. This fake server is far from complete. We will add more features to it as we need them in other projects. See testing.py for more information.
Condition
Before we start testing, check whether our thread local cache is empty or whether previous tests have left some junk behind:

>>> from m01.mongo.testing import pprint
>>> from m01.mongo import LOCAL
>>> pprint(LOCAL.__dict__)
{}
Setup
First import some components:

>>> import datetime
>>> import transaction
>>> from m01.mongo import interfaces
>>> from m01.mongo import testing

First, we need to set up a persistent object:

>>> content = testing.Content(42)
>>> content._moid
42

And add it to the ZODB:

>>> root = {}
>>> root['content'] = content
>>> transaction.commit()

>>> content = root['content']
>>> content
<Content 42>
MongoObject
Now let’s add a MongoObject instance to our sample content object:
>>> data = {'title': u'Mongo Object Title',
...         'description': u'A Description',
...         'item': {'text':u'Item'},
...         'date': datetime.date(2010, 2, 28).toordinal(),
...         'numbers': [1,2,3],
...         'comments': [{'text':u'Comment 1'}, {'text':u'Comment 2'}]}
>>> obj = testing.SampleMongoObject(data)
>>> obj._id
ObjectId('...')

>>> obj.title
u'Mongo Object Title'

>>> obj.description
u'A Description'

>>> obj.item
<SampleSubItem u'...'>

>>> obj.item.text
u'Item'

>>> obj.numbers
[1, 2, 3]

>>> obj.comments
[<SampleSubItem u'...'>, <SampleSubItem u'...'>]

>>> tuple(obj.comments)[0].text
u'Comment 1'

>>> tuple(obj.comments)[1].text
u'Comment 2'

Our MongoObject doesn’t provide a __parent__ or __name__ right now:

>>> obj.__parent__ is None
True

>>> obj.__name__ is None
True

But after adding the mongo object to our content, which uses a MongoObjectProperty, the mongo object gets located and receives the attribute name as its _field value. If the object didn’t provide a __name__, the same value also gets applied as __name__:

>>> content.obj = obj
>>> obj.__parent__
<Content 42>

>>> obj.__name__
u'obj'

>>> obj.__name__
u'obj'
After adding our mongo object, there should be a reference in our thread local cache:
>>> pprint(LOCAL.__dict__)
{u'42:obj': <SampleMongoObject u'obj'>,
 'MongoTransactionDataManager': <m01.mongo.tm.MongoTransactionDataManager object at ...>}

A MongoObject provides an _oid attribute which is used as the MongoDB key. This value is built from the __parent__._moid and the mongo object’s attribute name:

>>> obj._oid == '%s:%s' % (content._moid, obj.__name__)
True

>>> obj._oid
u'42:obj'
Now check if we can get the mongo object again and if we still get the same values:
>>> obj = content.obj
>>> obj.title
u'Mongo Object Title'

>>> obj.description
u'A Description'

>>> obj.item
<SampleSubItem u'...'>

>>> obj.item.text
u'Item'

>>> obj.numbers
[1, 2, 3]

>>> obj.comments
[<SampleSubItem u'...'>, <SampleSubItem u'...'>]

>>> tuple(obj.comments)[0].text
u'Comment 1'

>>> tuple(obj.comments)[1].text
u'Comment 2'
Now let’s commit the transaction which will store the obj in our fake mongo DB:
>>> transaction.commit()
After we committed to the MongoDB, the mongo object and our transaction data manager reference should be gone from the thread local cache:

>>> pprint(LOCAL.__dict__)
{}
Now check our mongo object values again. If your content item is stored in a ZODB, you would get the content item from a ZODB connection root:
>>> content = root['content']
>>> content
<Content 42>

>>> obj = content.obj
>>> obj
<SampleMongoObject u'obj'>

>>> obj.title
u'Mongo Object Title'

>>> obj.description
u'A Description'

>>> obj.item
<SampleSubItem u'...'>

>>> obj.item.text
u'Item'

>>> obj.numbers
[1, 2, 3]

>>> obj.comments
[<SampleSubItem u'...'>, <SampleSubItem u'...'>]

>>> tuple(obj.comments)[0].text
u'Comment 1'

>>> tuple(obj.comments)[1].text
u'Comment 2'

>>> pprint(obj.dump())
{'__name__': u'obj',
 '_field': u'obj',
 '_id': ObjectId('...'),
 '_oid': u'42:obj',
 '_type': u'SampleMongoObject',
 '_version': 1,
 'comments': [{'_id': ObjectId('...'),
               '_type': u'SampleSubItem',
               'created': datetime.datetime(...),
               'modified': None,
               'text': u'Comment 1'},
              {'_id': ObjectId('...'),
               '_type': u'SampleSubItem',
               'created': datetime.datetime(...),
               'modified': None,
               'text': u'Comment 2'}],
 'created': datetime.datetime(...),
 'date': 733831,
 'description': u'A Description',
 'item': {'_id': ObjectId('...'),
          '_type': u'SampleSubItem',
          'created': datetime.datetime(...),
          'modified': None,
          'text': u'Item'},
 'modified': datetime.datetime(...),
 'number': None,
 'numbers': [1, 2, 3],
 'removed': False,
 'title': u'Mongo Object Title'}

>>> transaction.commit()

>>> pprint(LOCAL.__dict__)
{}

Now let’s replace the existing item with a new one and add another item to the item lists. Also make sure we can use append instead of re-applying the full list like zope widgets do:

>>> content = root['content']
>>> obj = content.obj

>>> obj.item = testing.SampleSubItem({'text': u'New Item'})

>>> newItem = testing.SampleSubItem({'text': u'New List Item'})
>>> obj.comments.append(newItem)

>>> obj.numbers.append(4)

>>> transaction.commit()
Check again:

>>> content = root['content']
>>> obj = content.obj

>>> obj.title
u'Mongo Object Title'

>>> obj.description
u'A Description'

>>> obj.item
<SampleSubItem u'...'>

>>> obj.item.text
u'New Item'

>>> obj.numbers
[1, 2, 3, 4]

>>> obj.comments
[<SampleSubItem u'...'>, <SampleSubItem u'...'>]

>>> tuple(obj.comments)[0].text
u'Comment 1'

>>> tuple(obj.comments)[1].text
u'Comment 2'

And now re-apply a full list of values to the list field:

>>> comOne = testing.SampleSubItem({'text': u'First List Item'})
>>> comTwo = testing.SampleSubItem({'text': u'Second List Item'})
>>> comments = [comOne, comTwo]
>>> obj.comments = comments
>>> obj.numbers = [1,2,3,4,5]
>>> transaction.commit()

Check again:

>>> content = root['content']
>>> obj = content.obj

>>> len(obj.comments)
2

>>> obj.comments
[<SampleSubItem u'...'>, <SampleSubItem u'...'>]

>>> len(obj.numbers)
5

>>> obj.numbers
[1, 2, 3, 4, 5]

Also check if we can remove list items:

>>> obj.numbers.remove(1)
>>> obj.numbers.remove(2)

>>> obj.comments.remove(comTwo)

>>> transaction.commit()

Check again:

>>> content = root['content']
>>> obj = content.obj

>>> len(obj.comments)
1

>>> obj.comments
[<SampleSubItem u'...'>]

>>> len(obj.numbers)
3

>>> obj.numbers
[3, 4, 5]

>>> transaction.commit()

We can also remove items from the item list by their __name__:

>>> content = root['content']
>>> obj = content.obj

>>> del obj.comments[comOne.__name__]

>>> transaction.commit()

Check again:

>>> content = root['content']
>>> obj = content.obj

>>> len(obj.comments)
0

>>> obj.comments
[]

>>> transaction.commit()

Or we can add items to the item list by name:

>>> content = root['content']
>>> obj = content.obj

>>> obj.comments[comOne.__name__] = comOne

>>> transaction.commit()

Check again:

>>> content = root['content']
>>> obj = content.obj

>>> len(obj.comments)
1

>>> obj.comments
[<SampleSubItem u'...'>]

>>> transaction.commit()
Coverage
Our items list also provides the following methods:
>>> obj.comments.__contains__(comOne.__name__)
True

>>> comOne.__name__ in obj.comments
True

>>> obj.comments.get(comOne.__name__)
<SampleSubItem u'...'>

>>> obj.comments.keys() == [comOne.__name__]
True

>>> obj.comments.values()
<generator object ...>

>>> tuple(obj.comments.values())
(<SampleSubItem u'...'>,)

>>> obj.comments.items()
<generator object ...>

>>> tuple(obj.comments.items())
((u'...', <SampleSubItem u'...'>),)

>>> obj.comments == obj.comments
True

Let’s test some internals to increase coverage:

>>> obj.comments._m_changed
Traceback (most recent call last):
...
AttributeError: _m_changed is a write only property

>>> obj.comments._m_changed = False
Traceback (most recent call last):
...
ValueError: Can only dispatch True to __parent__

>>> obj.comments.locate(42)

Our simple value type list also provides the following methods:

>>> obj.numbers.__contains__(3)
True

>>> 3 in obj.numbers
True

>>> obj.numbers == obj.numbers
True

>>> obj.numbers.pop()
5

>>> del obj.numbers[0]

>>> obj.numbers[0] = 42

>>> obj.numbers._m_changed
Traceback (most recent call last):
...
AttributeError: _m_changed is a write only property

>>> obj.numbers._m_changed = False
Traceback (most recent call last):
...
ValueError: Can only dispatch True to __parent__
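The write only _m_changed property seen in the tracebacks above can be pictured as follows. This is a minimal Python 3 sketch with hypothetical names (Parent, TrackedList); it only reproduces the dispatching idea and the two error messages, not the package's actual list implementations.

```python
class Parent:
    """Hypothetical owner item that remembers whether it changed."""
    _m_changed = None


class TrackedList(list):
    """Hypothetical list that reports every change to its parent."""

    def __init__(self, values, parent):
        super().__init__(values)
        self.__parent__ = parent

    def _get_changed(self):
        raise AttributeError('_m_changed is a write only property')

    def _set_changed(self, value):
        # The list keeps no state of its own; it only forwards True.
        if value is not True:
            raise ValueError('Can only dispatch True to __parent__')
        self.__parent__._m_changed = True

    _m_changed = property(_get_changed, _set_changed)

    def append(self, value):
        super().append(value)
        self._m_changed = True


parent = Parent()
numbers = TrackedList([1, 2, 3], parent)
numbers.append(4)          # marks the parent as changed

try:
    numbers._m_changed = False
    rejected = False
except ValueError:
    rejected = True
```

Keeping the changed state only on the parent means a commit has to inspect a single flag per item, regardless of how deeply the modified list is nested.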
Check our thread local cache before we leave this test:
>>> pprint(LOCAL.__dict__)
{}
GeoLocation
The GeoLocation item can store a geo location and is used in an item as a kind of sub item providing longitude and latitude. In addition to these fields, a GeoLocation provides the _m_changed dispatching concept and is able to notify the __parent__ item if lon/lat get changed. The item also provides ILocation for security lookup support. The field property is responsible for applying a __parent__ and __name__.
The GeoLocation item supports the order longitude, latitude and preserves them.
Condition
Before we start testing, check whether our thread local cache is empty or whether previous tests have left some junk behind:

>>> from m01.mongo.testing import pprint
>>> from m01.mongo import LOCAL
>>> from m01.mongo.testing import reNormalizer
>>> pprint(LOCAL.__dict__)
{}
Setup
First import some components:
>>> import datetime
>>> import transaction

>>> import m01.mongo
>>> import m01.mongo.base
>>> import m01.mongo.geo
>>> import m01.mongo.container
>>> from m01.mongo import interfaces
>>> from m01.mongo import testing

We also need an application root object. Let’s define a static MongoContainer as our application database root item.

>>> class MongoRoot(m01.mongo.container.MongoContainer):
...     """Mongo application root"""
...
...     _id = m01.mongo.getObjectId(0)
...
...     def __init__(self):
...         pass
...
...     @property
...     def collection(self):
...         return testing.getRootItems()
...
...     @property
...     def cacheKey(self):
...         return 'root'
...
...     def load(self, data):
...         """Load data into the right mongo item."""
...         return testing.GeoSample(data)
...
...     def __repr__(self):
...         return '<%s %s>' % (self.__class__.__name__, self._id)
The following method allows us to generate new MongoRoot item instances. This allows us to show that we generate different root items like we would do on a server restart.
>>> def getRoot():
...     return MongoRoot()

Here is our database root item:

>>> root = getRoot()
>>> root
<MongoRoot 000000000000000000000000>

>>> root._id
ObjectId('000000000000000000000000')
indexing
First set up an index:

>>> collection = testing.getRootItems()

>>> from pymongo import GEO2D
>>> collection.create_index([('lonlat', GEO2D)])
u'lonlat_2d'
GeoSample
As you can see, we can initialize a GeoLocation with a list of lon/lat values or with a lon/lat dict:

>>> data = {'name': u'sample', 'lonlat': {'lon': 1, 'lat': 3}}
>>> sample = testing.GeoSample(data)
>>> sample.lonlat
<GeoLocation lon:1.0, lat:3.0>

>>> data = {'name': u'sample', 'lonlat': [1, 3]}
>>> sample = testing.GeoSample(data)
>>> sample.lonlat
<GeoLocation lon:1.0, lat:3.0>

>>> root[u'sample'] = sample

>>> transaction.commit()
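Accepting both input shapes boils down to normalizing them into one ordered pair. The helper normalize_lonlat below is a hypothetical Python 3 sketch of that step and not the GeoLocation implementation; it only shows how a dict and a list can end up as the same (lon, lat) float pair.

```python
def normalize_lonlat(value):
    """Return (lon, lat) as floats from a dict or a two-item sequence."""
    if isinstance(value, dict):
        # A dict has no meaningful order, so pick the keys explicitly.
        return float(value['lon']), float(value['lat'])
    # A sequence is taken to be in [lon, lat] order already.
    lon, lat = value
    return float(lon), float(lat)


from_dict = normalize_lonlat({'lat': 3, 'lon': 1})
from_list = normalize_lonlat([1, 3])
```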
Let’s check our item in Mongo:
>>> data = collection.find_one({'name': 'sample'})
>>> reNormalizer.pprint(data)
{u'__name__': u'sample',
 u'_id': ObjectId('...'),
 u'_pid': ObjectId('...'),
 u'_type': u'GeoSample',
 u'_version': 1,
 u'created': datetime.datetime(..., tzinfo=UTC),
 u'lonlat': [1.0, 3.0],
 u'modified': datetime.datetime(..., tzinfo=UTC),
 u'name': u'sample'}

We can also use a GeoLocation as lonlat data:

>>> geo = m01.mongo.geo.GeoLocation({u'lat': 4, u'lon': 2})
>>> data = {'name': u'sample2', 'lonlat': geo}
>>> sample2 = testing.GeoSample(data)
>>> root[u'sample2'] = sample2

>>> transaction.commit()

>>> data = collection.find_one({'name': 'sample2'})
>>> reNormalizer.pprint(data)
{u'__name__': u'sample2',
 u'_id': ObjectId('...'),
 u'_pid': ObjectId('...'),
 u'_type': u'GeoSample',
 u'_version': 1,
 u'created': datetime.datetime(..., tzinfo=UTC),
 u'lonlat': {u'lat': 4.0, u'lon': 2.0},
 u'modified': datetime.datetime(..., tzinfo=UTC),
 u'name': u'sample2'}

We can also set a GeoLocation as lonlat value:

>>> sample2 = root[u'sample2']
>>> geo = m01.mongo.geo.GeoLocation({'lon': 4, 'lat': 6})
>>> sample2.lonlat = geo

>>> transaction.commit()

>>> data = collection.find_one({'name': 'sample2'})
>>> reNormalizer.pprint(data)
{u'__name__': u'sample2',
 u'_id': ObjectId('...'),
 u'_pid': ObjectId('...'),
 u'_type': u'GeoSample',
 u'_version': 2,
 u'created': datetime.datetime(..., tzinfo=UTC),
 u'lonlat': {u'lat': 6.0, u'lon': 4.0},
 u'modified': datetime.datetime(..., tzinfo=UTC),
 u'name': u'sample2'}
search
Let’s test some geo location search queries and make sure our lon/lat order fits and is preserved during the MongoDB roundtrip.
Now search for a geo location:

>>> def printFind(collection, query):
...     for data in collection.find(query):
...         reNormalizer.pprint(data)

Using the geospatial index we can find documents near another point:

>>> printFind(collection, {'lonlat': {'$near': [0, 2]}})
{u'__name__': u'sample',
 u'_id': ObjectId('...'),
 u'_pid': ObjectId('...'),
 u'_type': u'GeoSample',
 u'_version': 1,
 u'created': datetime.datetime(..., tzinfo=UTC),
 u'lonlat': [1.0, 3.0],
 u'modified': datetime.datetime(..., tzinfo=UTC),
 u'name': u'sample'}
{u'__name__': u'sample2',
 u'_id': ObjectId('...'),
 u'_pid': ObjectId('...'),
 u'_type': u'GeoSample',
 u'_version': 2,
 u'created': datetime.datetime(..., tzinfo=UTC),
 u'lonlat': {u'lat': 6.0, u'lon': 4.0},
 u'modified': datetime.datetime(..., tzinfo=UTC),
 u'name': u'sample2'}

It’s also possible to query for all items within a given rectangle (specified by lower-left and upper-right coordinates):

>>> printFind(collection, {'lonlat': {'$within': {'$box': [[1,2], [2,3]]}}})
{u'__name__': u'sample',
 u'_id': ObjectId('...'),
 u'_pid': ObjectId('...'),
 u'_type': u'GeoSample',
 u'_version': 1,
 u'created': datetime.datetime(..., tzinfo=UTC),
 u'lonlat': [1.0, 3.0],
 u'modified': datetime.datetime(..., tzinfo=UTC),
 u'name': u'sample'}
As you can see if we use the wrong order for lon/lat (lat/lon), we will not get a value:
>>> printFind(collection, {'lonlat': {'$within': {'$box': [[10,20], [20,30]]}}})
We can also search for a circle (specified by center point and radius):
>>> printFind(collection, {'lonlat': {'$within': {'$center': [[0, 0], 2]}}})

>>> printFind(collection, {'lonlat': {'$within': {'$center': [[0, 0], 4]}}})
{u'__name__': u'sample',
 u'_id': ObjectId('...'),
 u'_pid': ObjectId('...'),
 u'_type': u'GeoSample',
 u'_version': 1,
 u'created': datetime.datetime(..., tzinfo=UTC),
 u'lonlat': [1.0, 3.0],
 u'modified': datetime.datetime(..., tzinfo=UTC),
 u'name': u'sample'}

>>> printFind(collection, {'lonlat': {'$within': {'$center': [[0, 0], 10]}}})
{u'__name__': u'sample',
 u'_id': ObjectId('...'),
 u'_pid': ObjectId('...'),
 u'_type': u'GeoSample',
 u'_version': 1,
 u'created': datetime.datetime(..., tzinfo=UTC),
 u'lonlat': [1.0, 3.0],
 u'modified': datetime.datetime(..., tzinfo=UTC),
 u'name': u'sample'}
{u'__name__': u'sample2',
 u'_id': ObjectId('...'),
 u'_pid': ObjectId('...'),
 u'_type': u'GeoSample',
 u'_version': 2,
 u'created': datetime.datetime(..., tzinfo=UTC),
 u'lonlat': {u'lat': 6.0, u'lon': 4.0},
 u'modified': datetime.datetime(..., tzinfo=UTC),
 u'name': u'sample2'}

Also check if the lat/lon order matters:

>>> printFind(collection, {'lonlat': {'$within': {'$center': [[1, 2], 1]}}})
{u'__name__': u'sample',
 u'_id': ObjectId('...'),
 u'_pid': ObjectId('...'),
 u'_type': u'GeoSample',
 u'_version': 1,
 u'created': datetime.datetime(..., tzinfo=UTC),
 u'lonlat': [1.0, 3.0],
 u'modified': datetime.datetime(..., tzinfo=UTC),
 u'name': u'sample'}

>>> printFind(collection, {'lonlat': {'$within': {'$center': [[2, 1], 1]}}})
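The box and circle results above can be checked by hand with flat plane geometry, which is what a 2d index works with. The helpers in_center and in_box in this Python 3 sketch are hypothetical and only re-compute the arithmetic; treating points on the border as matches is an assumption that happens to agree with the results shown above.

```python
import math


def in_center(point, center, radius):
    """Flat-plane check: is point within radius of center (border included)?"""
    return math.hypot(point[0] - center[0], point[1] - center[1]) <= radius


def in_box(point, lower_left, upper_right):
    """Is point inside the axis-aligned box (borders included)?"""
    return (lower_left[0] <= point[0] <= upper_right[0]
            and lower_left[1] <= point[1] <= upper_right[1])


sample = (1.0, 3.0)     # stored as [lon, lat]
sample2 = (4.0, 6.0)
points = (sample, sample2)

# sample is sqrt(10) (about 3.16) away from the origin, sample2 sqrt(52) (about 7.21).
radius_2 = [p for p in points if in_center(p, (0, 0), 2)]
radius_4 = [p for p in points if in_center(p, (0, 0), 4)]
radius_10 = [p for p in points if in_center(p, (0, 0), 10)]

# Swapping the center coordinates changes the distance from 1.0 to about 2.24.
right_order = in_center(sample, (1, 2), 1)
wrong_order = in_center(sample, (2, 1), 1)

box_hit = in_box(sample, (1, 2), (2, 3))
```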
And check if we can store real lon/lat values by using a float:
>>> data = {'name': u'sample', 'lonlat': {'lon': 20.123, 'lat': 29.123}}
>>> sample3 = testing.GeoSample(data)
>>> root[u'sample3'] = sample3

>>> transaction.commit()

>>> printFind(collection, {'lonlat': {'$within': {'$center': [[25, 25], 4]}}})

>>> printFind(collection, {'lonlat': {'$within': {'$center': [[25, 25], 10]}}})
{u'__name__': u'sample3',
 u'_id': ObjectId('...'),
 u'_pid': ObjectId('...'),
 u'_type': u'GeoSample',
 u'_version': 1,
 u'created': datetime.datetime(..., tzinfo=UTC),
 u'lonlat': {u'lat': 29.123, u'lon': 20.123},
 u'modified': datetime.datetime(..., tzinfo=UTC),
 u'name': u'sample'}
tear down
>>> from m01.mongo import clearThreadLocalCache
>>> clearThreadLocalCache()

As you can see our cache items got removed:

>>> from m01.mongo import LOCAL
>>> pprint(LOCAL.__dict__)
{}
GeoPoint
The GeoPoint item can store a geo location and is used in an item as a kind of sub item providing longitude, latitude and type. In addition to these fields, a GeoPoint provides the _m_changed dispatching concept and is able to notify the __parent__ item if lon/lat get changed. The item also provides ILocation for security lookup support. The MongoGeoPointProperty field property is responsible for applying a __parent__ and __name__ and for using the right class factory.
The GeoPoint item supports the order longitude, latitude and preserves them.
Condition
Before we start testing, check whether our thread local cache is empty or whether previous tests have left some junk behind:

>>> from m01.mongo.testing import pprint
>>> from m01.mongo import LOCAL
>>> from m01.mongo.testing import reNormalizer
>>> pprint(LOCAL.__dict__)
{}
Setup
First import some components:
>>> import datetime
>>> import transaction

>>> import m01.mongo
>>> import m01.mongo.base
>>> import m01.mongo.geo
>>> import m01.mongo.container
>>> from m01.mongo import interfaces
>>> from m01.mongo import testing

We also need an application root object. Let’s define a static MongoContainer as our application database root item.

>>> class MongoRoot(m01.mongo.container.MongoContainer):
...     """Mongo application root"""
...
...     _id = m01.mongo.getObjectId(0)
...
...     def __init__(self):
...         pass
...
...     @property
...     def collection(self):
...         return testing.getRootItems()
...
...     @property
...     def cacheKey(self):
...         return 'root'
...
...     def load(self, data):
...         """Load data into the right mongo item."""
...         return testing.GeoPointSample(data)
...
...     def __repr__(self):
...         return '<%s %s>' % (self.__class__.__name__, self._id)
The following method allows us to generate new MongoRoot item instances. This allows us to show that we generate different root items like we would do on a server restart.
>>> def getRoot(): ... return MongoRoot()
Here is our database root item:
>>> root = getRoot() >>> root <MongoRoot 000000000000000000000000> >>> root._id ObjectId('000000000000000000000000')
indexing
First setup an index:
>>> collection = testing.getRootItems() >>> from pymongo import GEOSPHERE >>> collection.create_index([('lonlat', GEOSPHERE)]) u'lonlat_2dsphere'
GeoPointSample
As you can see, we can initialize a GeoPoint with a lon/lat dict or with a list of lon/lat values:
>>> data = {'name': u'sample', 'lonlat': {'lon': 1, 'lat': 3}} >>> sample = testing.GeoPointSample(data) >>> sample.lonlat <GeoPoint lon:1.0, lat:3.0>>>> data = {'name': u'sample', 'lonlat': [1, 3]} >>> sample = testing.GeoPointSample(data) >>> sample.lonlat <GeoPoint lon:1.0, lat:3.0>>>> root[u'sample'] = sample>>> transaction.commit()
Let’s check our item in Mongo:
>>> data = collection.find_one({'name': 'sample'}) >>> reNormalizer.pprint(data) {u'__name__': u'sample', u'_id': ObjectId('...'), u'_pid': ObjectId('...'), u'_type': u'GeoPointSample', u'_version': 1, u'created': datetime.datetime(..., tzinfo=UTC), u'lonlat': {u'coordinates': [1.0, 3.0], u'type': u'Point'}, u'modified': datetime.datetime(..., tzinfo=UTC), u'name': u'sample'}
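The stored lonlat value shown above is a GeoJSON Point with the coordinates in lon, lat order. The following minimal sketch shows how both input styles could be normalized to that shape. The helper name to_geojson_point is hypothetical and not part of m01.mongo; the real conversion is done by GeoPoint and MongoGeoPointProperty.

```python
def to_geojson_point(value):
    """Hypothetical helper: normalize lon/lat input to a GeoJSON Point dict."""
    if isinstance(value, dict):
        lon, lat = value['lon'], value['lat']
    else:
        # a sequence is expected in [lon, lat] order
        lon, lat = value
    # GeoJSON positions always list longitude first, then latitude
    return {'type': 'Point', 'coordinates': [float(lon), float(lat)]}


from_dict = to_geojson_point({'lon': 1, 'lat': 3})
from_list = to_geojson_point([1, 3])
```

Both calls produce the same coordinates, [1.0, 3.0], which matches the lonlat value of the document dumped above.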
We can also use a GeoPoint as lonlat data:
>>> geo = m01.mongo.geo.GeoPoint({u'lat': 4, u'lon': 2}) >>> data = {'name': u'sample2', 'lonlat': geo} >>> sample2 = testing.GeoPointSample(data) >>> root[u'sample2'] = sample2>>> transaction.commit()>>> data = collection.find_one({'name': 'sample2'}) >>> reNormalizer.pprint(data) {u'__name__': u'sample2', u'_id': ObjectId('...'), u'_pid': ObjectId('...'), u'_type': u'GeoPointSample', u'_version': 1, u'created': datetime.datetime(..., tzinfo=UTC), u'lonlat': {u'coordinates': [2.0, 4.0], u'type': u'Point'}, u'modified': datetime.datetime(..., tzinfo=UTC), u'name': u'sample2'}
We can also set a GeoPoint as lonlat value:
>>> sample2 = root[u'sample2'] >>> geo = m01.mongo.geo.GeoPoint({'lon': 4, 'lat': 6}) >>> sample2.lonlat = geo>>> transaction.commit()>>> data = collection.find_one({'name': 'sample2'}) >>> reNormalizer.pprint(data) {u'__name__': u'sample2', u'_id': ObjectId('...'), u'_pid': ObjectId('...'), u'_type': u'GeoPointSample', u'_version': 2, u'created': datetime.datetime(..., tzinfo=UTC), u'lonlat': {u'coordinates': [4.0, 6.0], u'type': u'Point'}, u'modified': datetime.datetime(..., tzinfo=UTC), u'name': u'sample2'}
index
>>> pprint(collection.index_information()) {'_id_': {'key': [('_id', 1)], 'ns': 'm01_mongo_testing.items', 'v': 1}, 'lonlat_2dsphere': {'2dsphereIndexVersion': 2, 'key': [('lonlat', '2dsphere')], 'ns': 'm01_mongo_testing.items', 'v': 1}}
search
Let’s run some geo location search queries and make sure our lon/lat order fits and is preserved during the MongoDB roundtrip.
Now search for a geo location. The following helper prints all documents matching a query:
>>> def printFind(collection, query): ... for data in collection.find(query): ... reNormalizer.pprint(data)
Using the geospatial index we can find documents within a polygon:
>>> point = {"type": "Polygon", ... "coordinates": [[[0,0], [0,6], [2,6], [2,0], [0,0]]]} >>> query = {"lonlat": {"$within": {"$geometry": point}}} >>> printFind(collection, query) {u'__name__': u'sample', u'_id': ObjectId('...'), u'_pid': ObjectId('...'), u'_type': u'GeoPointSample', u'_version': 1, u'created': datetime.datetime(..., tzinfo=UTC), u'lonlat': {u'coordinates': [1.0, 3.0], u'type': u'Point'}, u'modified': datetime.datetime(..., tzinfo=UTC), u'name': u'sample'}
Using the geospatial index we can find documents near another point:
>>> point = {'type': 'Point', 'coordinates': [0, 2]} >>> printFind(collection, {'lonlat': {'$near': {'$geometry': point}}}) {u'__name__': u'sample', u'_id': ObjectId('...'), u'_pid': ObjectId('...'), u'_type': u'GeoPointSample', u'_version': 1, u'created': datetime.datetime(..., tzinfo=UTC), u'lonlat': {u'coordinates': [1.0, 3.0], u'type': u'Point'}, u'modified': datetime.datetime(..., tzinfo=UTC), u'name': u'sample'} {u'__name__': u'sample2', u'_id': ObjectId('...'), u'_pid': ObjectId('...'), u'_type': u'GeoPointSample', u'_version': 2, u'created': datetime.datetime(..., tzinfo=UTC), u'lonlat': {u'coordinates': [4.0, 6.0], u'type': u'Point'}, u'modified': datetime.datetime(..., tzinfo=UTC), u'name': u'sample2'}
It’s also possible to query for all items within a given rectangle (specified by lower-left and upper-right coordinates):
>>> printFind(collection, {'lonlat': {'$within': {'$box': [[1,2], [2,3]]}}}) {u'__name__': u'sample', u'_id': ObjectId('...'), u'_pid': ObjectId('...'), u'_type': u'GeoPointSample', u'_version': 1, u'created': datetime.datetime(..., tzinfo=UTC), u'lonlat': {u'coordinates': [1.0, 3.0], u'type': u'Point'}, u'modified': datetime.datetime(..., tzinfo=UTC), u'name': u'sample'}
As you can see if we use the wrong order for lon/lat (lat/lon), we will not get a value:
>>> printFind(collection, {'lonlat': {'$within': {'$box': [[2,1], [3,2]]}}})
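Why the swapped order returns nothing can be checked with plain Python. The sketch below is only an illustration of the rectangle test, assuming inclusive boundaries as the doctest output above suggests; the helper in_box is hypothetical and does not query MongoDB.

```python
def in_box(point, box):
    """Return True if a [lon, lat] point lies inside the given rectangle.

    The box is given as [[lower-left], [upper-right]], both in lon/lat order.
    """
    (min_lon, min_lat), (max_lon, max_lat) = box
    lon, lat = point
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


sample = [1.0, 3.0]                               # stored as [lon, lat]
right_order = in_box(sample, [[1, 2], [2, 3]])    # box given as lon/lat
wrong_order = in_box(sample, [[2, 1], [3, 2]])    # box given as lat/lon
```

With the box in lat/lon order the longitude 1.0 falls outside the range 2 to 3, so the point is not found.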
We can also search for a circle (specified by center point and radius):
>>> printFind(collection, {'lonlat': {'$within': {'$center': [[0, 0], 2]}}})>>> printFind(collection, {'lonlat': {'$within': {'$center': [[0, 0], 4]}}}) {u'__name__': u'sample', u'_id': ObjectId('...'), u'_pid': ObjectId('...'), u'_type': u'GeoPointSample', u'_version': 1, u'created': datetime.datetime(..., tzinfo=UTC), u'lonlat': {u'coordinates': [1.0, 3.0], u'type': u'Point'}, u'modified': datetime.datetime(..., tzinfo=UTC), u'name': u'sample'}>>> printFind(collection, {'lonlat': {'$within': {'$center': [[0, 0], 10]}}}) {u'__name__': u'sample', u'_id': ObjectId('...'), u'_pid': ObjectId('...'), u'_type': u'GeoPointSample', u'_version': 1, u'created': datetime.datetime(..., tzinfo=UTC), u'lonlat': {u'coordinates': [1.0, 3.0], u'type': u'Point'}, u'modified': datetime.datetime(..., tzinfo=UTC), u'name': u'sample'} {u'__name__': u'sample2', u'_id': ObjectId('...'), u'_pid': ObjectId('...'), u'_type': u'GeoPointSample', u'_version': 2, u'created': datetime.datetime(..., tzinfo=UTC), u'lonlat': {u'coordinates': [4.0, 6.0], u'type': u'Point'}, u'modified': datetime.datetime(..., tzinfo=UTC), u'name': u'sample2'}
Also check if the lat/lon order matters:
>>> printFind(collection, {'lonlat': {'$within': {'$center': [[1, 2], 1]}}}) {u'__name__': u'sample', u'_id': ObjectId('...'), u'_pid': ObjectId('...'), u'_type': u'GeoPointSample', u'_version': 1, u'created': datetime.datetime(..., tzinfo=UTC), u'lonlat': {u'coordinates': [1.0, 3.0], u'type': u'Point'}, u'modified': datetime.datetime(..., tzinfo=UTC), u'name': u'sample'}>>> printFind(collection, {'lonlat': {'$within': {'$center': [[2, 1], 1]}}})
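The circle results above can be reproduced with a plain distance check. This is a sketch under the assumption that $center measures flat (planar) distance in coordinate units and treats the boundary as inclusive, as the output above suggests. The helper in_circle and the points dict are hypothetical.

```python
import math


def in_circle(point, center, radius):
    """Return True if a [lon, lat] point lies within radius of the center."""
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    return math.hypot(dx, dy) <= radius


# the two points stored so far, in [lon, lat] order
points = {'sample': [1.0, 3.0], 'sample2': [4.0, 6.0]}


def find_names(center, radius):
    """Return the sorted names of all points inside the circle."""
    return sorted(name for name, point in points.items()
                  if in_circle(point, center, radius))


radius_2 = find_names([0, 0], 2)
radius_4 = find_names([0, 0], 4)
radius_10 = find_names([0, 0], 10)
lon_lat_center = find_names([1, 2], 1)
lat_lon_center = find_names([2, 1], 1)
```

The point [1.0, 3.0] is about 3.16 units away from [0, 0] and [4.0, 6.0] is about 7.21 units away, which explains which radius finds which sample.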
And check if we can store real lon/lat values by using a float:
>>> data = {'name': u'sample', 'lonlat': {'lon': 20.123, 'lat': 29.123}} >>> sample3 = testing.GeoPointSample(data) >>> root[u'sample3'] = sample3>>> transaction.commit()>>> printFind(collection, {'lonlat': {'$within': {'$center': [[25, 25], 4]}}})>>> printFind(collection, {'lonlat': {'$within': {'$center': [[25, 25], 10]}}}) {u'__name__': u'sample3', u'_id': ObjectId('...'), u'_pid': ObjectId('...'), u'_type': u'GeoPointSample', u'_version': 1, u'created': datetime.datetime(..., tzinfo=UTC), u'lonlat': {u'coordinates': [20.123, 29.123], u'type': u'Point'}, u'modified': datetime.datetime(..., tzinfo=UTC), u'name': u'sample'}
tear down
>>> from m01.mongo import clearThreadLocalCache >>> clearThreadLocalCache()
As you can see our cache items get removed:
>>> from m01.mongo import LOCAL >>> pprint(LOCAL.__dict__) {}
Batching
The MongoMappingBase base class used by MongoStorage and MongoContainer can return batched data or items and batch information.
Note: this test runs in level 2 because it uses a working MongoDB. This is needed because we want to test the real sort and limit functions in a MongoDB.
Condition
Before we start testing, check if our thread local cache is empty or if we have left over some junk from previous tests:
>>> from m01.mongo.testing import pprint >>> from m01.mongo import LOCAL >>> pprint(LOCAL.__dict__) {}
Setup
First import some components:
>>> import datetime >>> import transaction >>> from m01.mongo import testing
setup
Now we can add a MongoStorage to the database. Let’s just use a simple dict as database root:
>>> root = {} >>> storage = testing.SampleStorage() >>> root['storage'] = storage >>> transaction.commit()
Now let’s add 1000 MongoItems:
>>> storage = root['storage'] >>> for i in range(1000): ... data = {u'title': u'Title %i' % i, ... u'description': u'Description %i' % i, ... u'number': i} ... item = testing.SampleStorageItem(data) ... __name__ = storage.add(item)>>> transaction.commit()
After we committed to the MongoDB, the mongo object and our transaction data manager reference should be gone from the thread local cache:
>>> from m01.mongo import LOCAL >>> pprint(LOCAL.__dict__) {}
As you can see, our collection contains 1000 items:
>>> storage = root['storage'] >>> len(storage) 1000
batching
Note, this method does not return items, it only returns the MongoDB data. This is what you should use. If this doesn’t fit because you need a list of the real MongoItem, this would be complicated because we could have items marked as removed in our LOCAL cache which the MongoDB doesn’t know about.
Let’s get the batch information:
>>> storage.getBatchData() (<...Cursor object at ...>, 1, 40, 1000)
As you can see, we’ve got a cursor with mongo data, the current page, the number of pages and the total amount of items. Note, the first page starts at 1 (one) and not zero. Let’s show another sample with different values:
>>> storage.getBatchData(page=5, size=10) (<...Cursor object at ...>, 5, 100, 1000)
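The numbers in both results are consistent with a page count of ceil(total / size). The sketch below shows that arithmetic. The helper batch_info is hypothetical, and the default size of 25 is an assumption derived from the 40 pages reported for 1000 items; it is not taken from the getBatchData source.

```python
import math


def batch_info(total, page=1, size=25):
    """Hypothetical helper returning (page, pages, skip) for a 1-based page."""
    # the last page may be partially filled, so round up
    pages = int(math.ceil(total / float(size)))
    # number of documents to skip before the requested page starts
    skip = (page - 1) * size
    return page, pages, skip


default_batch = batch_info(1000)
small_batch = batch_info(1000, page=5, size=10)
last_batch = batch_info(1000, page=334, size=3)
```

For 1000 items this gives 40 pages with the assumed default size and 100 pages with size=10, matching the tuples above.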
As you can see we can iterate our cursor:
>>> cursor, page, pages, total = storage.getBatchData(page=1, size=3) >>> pprint(tuple(cursor)) ({'__name__': '...', '_id': ObjectId('...'), '_pid': None, '_type': 'SampleStorageItem', '_version': 1, 'comments': [], 'created': datetime.datetime(..., tzinfo=UTC), 'date': None, 'description': 'Description ...', 'item': None, 'modified': datetime.datetime(..., tzinfo=UTC), 'number': ..., 'numbers': [], 'title': 'Title ...'}, {'__name__': '...', '_id': ObjectId('...'), '_pid': None, '_type': 'SampleStorageItem', '_version': 1, 'comments': [], 'created': datetime.datetime(..., tzinfo=UTC), 'date': None, 'description': 'Description ...', 'item': None, 'modified': datetime.datetime(..., tzinfo=UTC), 'number': ..., 'numbers': [], 'title': 'Title ...'}, {'__name__': '...', '_id': ObjectId('...'), '_pid': None, '_type': 'SampleStorageItem', '_version': 1, 'comments': [], 'created': datetime.datetime(..., tzinfo=UTC), 'date': None, 'description': 'Description ...', 'item': None, 'modified': datetime.datetime(..., tzinfo=UTC), 'number': ..., 'numbers': [], 'title': 'Title ...'})
As you can see, the cursor counts the total amount of items:
>>> cursor.count() 1000
But we can force the count to respect the limit and skip arguments by passing True as argument:
>>> cursor.count(True) 3
As you can see, batching or any other object lookup will leave items behind in our thread local cache. We can use our thread local cache cleanup event handler, which is normally registered as an EndRequestEvent subscriber:
>>> from m01.mongo import LOCAL >>> pprint(LOCAL.__dict__) {u'm01_mongo_testing.test...': {'added': {}, 'removed': {}}}
Let’s use our subscriber:
>>> from m01.mongo import clearThreadLocalCache >>> clearThreadLocalCache()
As you can see our cache items get removed:
>>> from m01.mongo import LOCAL >>> pprint(LOCAL.__dict__) {}
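The idea of a thread local cache which gets wiped at the end of a request can be sketched with the standard library. This is an illustration only and not the m01.mongo code: SKETCH_LOCAL, remember and clear_sketch_cache are hypothetical names, and the added/removed layout is copied from the output shown above.

```python
import threading

# hypothetical stand-in for m01.mongo.LOCAL
SKETCH_LOCAL = threading.local()


def remember(cache_key, name, obj):
    """Store an added object in the per-thread cache of one container."""
    cache = SKETCH_LOCAL.__dict__.setdefault(
        cache_key, {'added': {}, 'removed': {}})
    cache['added'][name] = obj


def clear_sketch_cache():
    """Drop everything cached for the current thread."""
    SKETCH_LOCAL.__dict__.clear()


remember('root', u'sample', object())
before = sorted(SKETCH_LOCAL.__dict__)
clear_sketch_cache()
after = dict(SKETCH_LOCAL.__dict__)
```

Since the data lives on a threading.local instance, each thread only ever sees and clears its own cache.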
order
An important part of batching is ordering. As you can see, we can limit the batch size and get a slice of data from a sequence. It is very important that the data gets ordered in MongoDB before we slice the data into a batch. Let’s test if this works based on our orderable number value and a sort order where the lowest value comes first. Start with the first page (page=1):
>>> cursor, page, pages, total = storage.getBatchData(page=1, size=3, ... sortName='number', sortOrder=1) >>> cursor <pymongo.cursor.Cursor object at ...> >>> page 1 >>> pages 334 >>> total 1000
When ordering is done right, the first item should have a number value 0 (zero):
>>> pprint(tuple(cursor)) ({u'__name__': u'...', u'_id': ObjectId('...'), '_pid': None, u'_type': u'SampleStorageItem', u'_version': 1, u'comments': [], u'created': datetime.datetime(..., tzinfo=UTC), 'date': None, u'description': u'Description 0', 'item': None, u'modified': datetime.datetime(..., tzinfo=UTC), u'number': 0, u'numbers': [], u'title': u'Title 0'}, {u'__name__': u'...', u'_id': ObjectId('...'), '_pid': None, u'_type': u'SampleStorageItem', u'_version': 1, u'comments': [], u'created': datetime.datetime(..., tzinfo=UTC), 'date': None, u'description': u'Description 1', 'item': None, u'modified': datetime.datetime(..., tzinfo=UTC), u'number': 1, u'numbers': [], u'title': u'Title 1'}, {u'__name__': u'...', u'_id': ObjectId('...'), '_pid': None, u'_type': u'SampleStorageItem', u'_version': 1, u'comments': [], u'created': datetime.datetime(..., tzinfo=UTC), 'date': None, u'description': u'Description 2', 'item': None, u'modified': datetime.datetime(..., tzinfo=UTC), u'number': 2, u'numbers': [], u'title': u'Title 2'})
The second page (page=2) should start with number == 3:
>>> cursor, page, pages, total = storage.getBatchData(page=2, size=3, ... sortName='number', sortOrder=1) >>> pprint(tuple(cursor)) ({u'__name__': u'...', u'_id': ObjectId('...'), '_pid': None, u'_type': u'SampleStorageItem', u'_version': 1, u'comments': [], u'created': datetime.datetime(..., tzinfo=UTC), 'date': None, u'description': u'Description 3', 'item': None, u'modified': datetime.datetime(..., tzinfo=UTC), u'number': 3, u'numbers': [], u'title': u'Title 3'}, {u'__name__': u'...', u'_id': ObjectId('...'), '_pid': None, u'_type': u'SampleStorageItem', u'_version': 1, u'comments': [], u'created': datetime.datetime(..., tzinfo=UTC), 'date': None, u'description': u'Description 4', 'item': None, u'modified': datetime.datetime(..., tzinfo=UTC), u'number': 4, u'numbers': [], u'title': u'Title 4'}, {u'__name__': u'...', u'_id': ObjectId('...'), '_pid': None, u'_type': u'SampleStorageItem', u'_version': 1, u'comments': [], u'created': datetime.datetime(..., tzinfo=UTC), 'date': None, u'description': u'Description 5', 'item': None, u'modified': datetime.datetime(..., tzinfo=UTC), u'number': 5, u'numbers': [], u'title': u'Title 5'})
As you can see, the number of pages is 334. Let’s show the last batch slice. Since 1000 items don’t divide evenly by 3, the last batch contains only a single item, which should have number == 999:
>>> pages 334 >>> cursor, page, pages, total = storage.getBatchData(page=334, size=3, ... sortName='number', sortOrder=1) >>> pprint(tuple(cursor)) ({u'__name__': u'...', u'_id': ObjectId('...'), '_pid': None, u'_type': u'SampleStorageItem', u'_version': 1, u'comments': [], u'created': datetime.datetime(..., tzinfo=UTC), 'date': None, u'description': u'Description 999', 'item': None, u'modified': datetime.datetime(..., tzinfo=UTC), u'number': 999, u'numbers': [], u'title': u'Title 999'},)
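The order of operations matters: the data must be sorted first and sliced afterwards. The following sketch shows the principle on a plain list. The helper get_page is hypothetical; with pymongo the same effect would presumably be reached by chaining sort, skip and limit on the cursor.

```python
def get_page(docs, page, size, sort_name):
    """Return one 1-based page of docs, sorted before slicing."""
    ordered = sorted(docs, key=lambda doc: doc[sort_name])
    start = (page - 1) * size
    return ordered[start:start + size]


# 1000 documents in reverse insertion order to make the sort visible
docs = [{'number': n} for n in reversed(range(1000))]

first = [doc['number'] for doc in get_page(docs, 1, 3, 'number')]
second = [doc['number'] for doc in get_page(docs, 2, 3, 'number')]
last = [doc['number'] for doc in get_page(docs, 334, 3, 'number')]
```

Slicing the unsorted list first and sorting the three-item slice afterwards would return the numbers 997 to 999 for the first page instead of 0 to 2.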
teardown
Call transaction commit which will cleanup our LOCAL caches:
>>> transaction.commit()
Again, clear thread local cache:
>>> clearThreadLocalCache()
Check our thread local cache before we leave this test:
>>> pprint(LOCAL.__dict__) {}
Testing
Let’s test some testing methods.
>>> import re >>> import datetime >>> import bson.tz_util >>> import m01.mongo >>> import m01.mongo.testing >>> from m01.mongo.testing import pprint
RENormalizer
The RENormalizer is able to normalize text and produce comparable output. You can set up the RENormalizer with a list of (input, output) expressions. This is useful if you dump MongoDB data which contains dates or other data that is not easily reproducible. Such a dump result can get normalized before the unit test compares the output. Also see zope.testing.renormalizing for the same pattern, which is usable as a doctest checker.
>>> normalizer = m01.mongo.testing.RENormalizer([ ... (re.compile('[0-9]*[.][0-9]* seconds'), '... seconds'), ... (re.compile('at 0x[0-9a-f]+'), 'at ...'), ... ])>>> text = """ ... <object object at 0xb7f14438> ... completed in 1.234 seconds. ... ... ... <object object at 0xb7f14450> ... completed in 1.234 seconds. ... """>>> print normalizer(text) <BLANKLINE> <object object at ...> completed in ... seconds. ... <object object at ...> completed in ... seconds. <BLANKLINE>
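The pattern behind such a normalizer is small. The sketch below is not the m01.mongo.testing.RENormalizer source; SketchNormalizer is a hypothetical class which only shows how a list of (pattern, replacement) pairs can be applied one after another.

```python
import re


class SketchNormalizer(object):
    """Hypothetical normalizer applying (pattern, replacement) pairs in order."""

    def __init__(self, patterns):
        self.patterns = patterns

    def __call__(self, text):
        for pattern, replacement in self.patterns:
            # replace every match of the compiled pattern
            text = pattern.sub(replacement, text)
        return text


normalize = SketchNormalizer([
    (re.compile(r'[0-9]*[.][0-9]* seconds'), '... seconds'),
    (re.compile(r'at 0x[0-9a-f]+'), 'at ...'),
])
result = normalize('<object object at 0xb7f14438> completed in 1.234 seconds.')
```

Because the patterns are applied sequentially, a later pattern sees the output of the earlier ones, so the order of the list can matter.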
Now let’s test some mongodb relevant stuff:
>>> from bson.dbref import DBRef >>> from bson.min_key import MinKey >>> from bson.max_key import MaxKey >>> from bson.objectid import ObjectId >>> from bson.timestamp import Timestamp>>> oid = m01.mongo.getObjectId(42) >>> oid ObjectId('0000002a0000000000000000')>>> data = {'oid': oid, ... 'dbref': DBRef("foo", 5, "db"), ... 'date': datetime.datetime(2011, 5, 7, 1, 12), ... 'utc': datetime.datetime(2011, 5, 7, 1, 12, tzinfo=bson.tz_util.utc), ... 'min': MinKey(), ... 'max': MaxKey(), ... 'timestamp': Timestamp(4, 13), ... 're': re.compile("a*b", re.IGNORECASE), ... 'string': 'string', ... 'unicode': u'unicode', ... 'int': 42}
Now let’s pretty print the data:
>>> pprint(data) {'date': datetime.datetime(...), 'dbref': DBRef('foo', 5, 'db'), 'int': 42, 'max': MaxKey(), 'min': MinKey(), 'oid': ObjectId('...'), 're': <_sre.SRE_Pattern object at ...>, 'string': 'string', 'timestamp': Timestamp('...'), 'unicode': 'unicode', 'utc': datetime.datetime(..., tzinfo=UTC)}
reNormalizer
As you can see our predefined reNormalizer will convert the values using our given patterns:
>>> import m01.mongo.testing >>> res = m01.mongo.testing.reNormalizer(data) >>> print res {'date': datetime.datetime(...), 'dbref': DBRef('foo', 5, 'db'), 'int': 42, 'max': MaxKey(), 'min': MinKey(), 'oid': ObjectId('...'), 're': <_sre.SRE_Pattern object at ...>, 'string': 'string', 'timestamp': Timestamp('...'), 'unicode': u'unicode', 'utc': datetime.datetime(..., tzinfo=UTC)}
pprint
>>> m01.mongo.testing.reNormalizer.pprint(data) {'date': datetime.datetime(...), 'dbref': DBRef('foo', 5, 'db'), 'int': 42, 'max': MaxKey(), 'min': MinKey(), 'oid': ObjectId('...'), 're': <_sre.SRE_Pattern object at ...>, 'string': 'string', 'timestamp': Timestamp('...'), 'unicode': u'unicode', 'utc': datetime.datetime(..., tzinfo=UTC)}
UTC
The pymongo library offers a custom UTC implementation including pickle support used by deepcopy. Let’s test if this implementation works and replace our custom timezone with bson.tz_util.utc:
>>> dt = data['utc'] >>> dt datetime.datetime(2011, 5, 7, 1, 12, tzinfo=UTC)>>> import copy >>> copy.deepcopy(dt) datetime.datetime(2011, 5, 7, 1, 12, tzinfo=UTC)
Speedup your implementation
Since no single strategy is the best for every application and we can’t implement all concepts in this package, we list some improvements here.
values and items
The MongoContainer and MongoStorage implementations will load all data within the values and items methods, even if we already cached them in our thread local cache. Here is an optimized method which could get used if you need to load a large set of data.
The original implementation of MongoMappingBase.values looks like:
    def values(self):
        # join transaction handling
        self.ensureTransaction()
        for data in self.doFind(self.collection):
            __name__ = data['__name__']
            if __name__ in self._cache_removed:
                # skip removed items
                continue
            obj = self._cache_loaded.get(__name__)
            if obj is None:
                try:
                    # load, locate and cache if not cached
                    obj = self.doLoad(data)
                except (KeyError, TypeError):
                    continue
            yield obj
        # also return items not stored in MongoDB yet
        for k, v in self._cache_added.items():
            yield v
If you want to avoid loading all data, you could load only the keys and look up the data for items which didn’t get cached yet. This would reduce network traffic and could look like:
    def values(self):
        # join transaction handling
        self.ensureTransaction()
        # only get __name__ and _id
        for data in self.doFind(self.collection, {}, ['__name__', '_id']):
            __name__ = data['__name__']
            if __name__ in self._cache_removed:
                # skip removed items
                continue
            obj = self._cache_loaded.get(__name__)
            if obj is None:
                try:
                    # now we can load data from mongo
                    d = self.doFindOne(self.collection, data)
                    # load, locate and cache if not cached
                    obj = self.doLoad(d)
                except (KeyError, TypeError):
                    continue
            yield obj
        # also return items not stored in MongoDB yet
        for k, v in self._cache_added.items():
            yield v
Note: the same concept can get used for the items method.
Note: I don’t recommend calling keys, values or items for large collections at any time. Take a look at the batching concept we implemented. The getBatchData method is probably what you need to use with a large set of data.
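The effect of the key-only lookup can be tried without a database. The following sketch replaces the MongoDB calls with plain callables: iter_values, names and fetch are hypothetical stand-ins for the key-only query and the per-item lookup, and the three cache dicts play the role of _cache_loaded, _cache_removed and _cache_added.

```python
def iter_values(names, fetch, cache_loaded, cache_removed, cache_added):
    """Yield items, fetching full data only for names missing in the cache."""
    for name in names:
        if name in cache_removed:
            # skip removed items
            continue
        obj = cache_loaded.get(name)
        if obj is None:
            # the expensive lookup only happens for uncached items
            obj = fetch(name)
            cache_loaded[name] = obj
        yield obj
    # also return items not stored yet
    for obj in cache_added.values():
        yield obj


stored = {'a': 'A', 'b': 'B', 'c': 'C'}
fetched = []


def fetch(name):
    """Record which names needed a full lookup."""
    fetched.append(name)
    return stored[name]


result = list(iter_values(sorted(stored), fetch,
                          cache_loaded={'a': 'A'},
                          cache_removed={'b': None},
                          cache_added={'d': 'D'}))
```

Only the uncached item 'c' triggers a full lookup here, which is the traffic saving the optimized values method aims for.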
AdvancedConverter
The class below shows an advanced implementation which is able to convert a nested data structure.
Normally a converter can convert attribute values. If the attribute value is a list of items which contains another list of items, then you need to use another converter which is able to convert this nested structure. But normally it is the responsibility of the first level item to convert its values. This is the reason why we didn’t implement this concept by default.
Remember, a default converter definition looks like:
    def itemConverter(value):
        _type = value.get('_type')
        if _type == 'Car':
            return Car
        if _type == 'House':
            return House
        else:
            return value
And the class defines something like:
converters = {'myItems': itemConverter}
Our advanced converter sample can convert a nested data structure and looks like:
    def toCar(value):
        return Car(value)

    def toHouse(value):
        return House(value)

    converters = {'myItems': {'House': toHouse, 'Car': toCar}}

    class AdvancedConverter(object):

        converters = {} # attr-name/converter or {_type:converter}

        def convert(self, key, value):
            """This convert method knows how to handle nested converters."""
            converter = self.converters.get(key)
            if converter is not None:
                if isinstance(converter, dict):
                    if isinstance(value, (list, tuple)):
                        res = []
                        for o in value:
                            if isinstance(o, dict):
                                # pick the converter matching the item _type
                                conv = converter.get(o.get('_type'))
                                if conv is not None:
                                    o = conv(o)
                            res.append(o)
                        value = res
                    elif isinstance(value, dict):
                        conv = converter.get(value.get('_type'))
                        if conv is not None:
                            value = conv(value)
                    # any other value has no _type and is left unchanged
                else:
                    if isinstance(value, (list, tuple)):
                        # convert list values
                        value = [converter(d) for d in value]
                    else:
                        # convert simple values
                        value = converter(value)
            return value
I’m sure if you understand what we implemented, you will find a lot of room to improve it and write your own special methods which do the right thing for your use cases.
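As a starting point for such experiments, here is a condensed, self-contained variant of the nested conversion idea. It is a sketch and not part of m01.mongo: the Car and House classes and the function convert_nested are hypothetical, and items with an unknown _type are simply passed through.

```python
class Car(object):
    """Hypothetical item class wrapping the raw data."""

    def __init__(self, data):
        self.data = data


class House(object):
    """Hypothetical item class wrapping the raw data."""

    def __init__(self, data):
        self.data = data


def convert_nested(converters, key, value):
    """Convert value with a plain converter or a {_type: converter} dict."""
    converter = converters.get(key)
    if converter is None:
        return value
    if not isinstance(converter, dict):
        # plain converter: apply to each list item or to the value itself
        if isinstance(value, (list, tuple)):
            return [converter(v) for v in value]
        return converter(value)

    def convert_one(item):
        # choose the converter by the _type marker of the item
        if isinstance(item, dict):
            typed = converter.get(item.get('_type'))
            if typed is not None:
                return typed(item)
        return item

    if isinstance(value, (list, tuple)):
        return [convert_one(v) for v in value]
    return convert_one(value)


converters = {'myItems': {'Car': Car, 'House': House}}
raw = [{'_type': 'Car'}, {'_type': 'House'}, {'_type': 'Boat'}]
items = convert_nested(converters, 'myItems', raw)
```

Looking up the per-type converter in a local variable keeps the {_type: converter} dict intact for the next list item.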
CHANGES
3.3.5 (2024-01-09)
bugfix: adjust MONGODB_TLS_CERTIFICATE_SELECTOR setup. It was incorrectly using the MONGODB_TLS_ALLOW_INVALID_HOSTNAME option.
added a hook for fixing the datetime now method. The hook is used for setting the created and modified dates, which allows us to set up a date hook in a complex testing setup.
3.3.4 (2021-03-26)
bugfix: removed _version and ICreatedModified from the IMongoSubObject interface. _version, modified and created are not always relevant for sub objects because this information is available in the parent object. Feel free to support this additional information in your own implementation.
3.3.3 (2021-03-23)
feature: implemented a MongoSubObject object and a MongoSubObjectProperty. The MongoSubObjectProperty provides a converter and factory and can get used to apply an object as an attribute. This allows traversing to the object via the attribute name. Compared to the MongoObject, which stores the data in its own collection, the MongoSubObject implementation stores its data in the parent object.
3.3.2 (2021-01-14)
added TLS options for pymongo client setup
bugfix: fix order method in MongoItemsData compare with set. The existing implementation was using pop with values instead of indexes for validating the new order names.
added tests for bugfix
3.3.1 (2020-04-22)
bugfix: register MongoListData class and allow interface IMongoListData. This allows accessing the internal implementation like a simple built-in type. Note: the object property using this implementation is still protected. We just let our instance act like a simple built-in Python type.
3.3.0 (2018-02-04)
use new p01.env package for pymongo client environment setup
3.2.3 (2018-02-04)
bugfix: removed FakeMongoConnectionPool from mongo client testing setup
set MONGODB_CONNECT to False as default because client setup takes too long for testing setup. Add MONGODB_CONNECT to your os environment if you need to connect on application startup.
3.2.2 (2018-01-29)
bugfix: fix timeout milliseconds and MONGODB_REVOCATION_LIST attr usage
3.2.1 (2018-01-29)
bugfix: multiply MONGODB_SERVER_SELECTION_TIMEOUT by 1000 because it’s used as milliseconds
3.2.0 (2018-01-29)
feature: implemented pymongo client setup based on environment variables and default settings.py file
3.1.0 (2017-01-22)
bugfix: make sure we override existing mongodb values with None if None is given as value in python object. Previous versions didn’t override existing values with None. The new implementation will use the default schema value as mongodb value even if default is None. Note, this will break existing test output.
bugfix: fix performance test setup, conditional include ZODB for performance tests. Supported with extras_require in setup.py.
3.0.0 (2015-11-11)
Use 3.0.0 as package version and reflect pymongo > 3.0.0 compatibility.
feature: change internal doFind, doInsert and doRemove methods, remove old method arguments like safe etc..
feature: reflect changes in pymongo > 3.0.0. Replace disconnect with close method like the MongoClient does.
removed MongoConnectionPool, replace them with MongoClient in your code. There is no need for a thread safe connection pool since pymongo is thread safe. Also replace MongoConnection with MongoClient in your test code.
switch from m01.mongofake to m01.fake including pymongo >= 3.0.0 support
remove write_concern options in mapping base class. The MongoClient should define the right write concern.
1.0.0 (2015-03-17)
improve AttributeError handling on object setup. Additionally catch ValueError and zope.interface.Invalid and raise AttributeError with detailed attribute and value information
0.11.1 (2014-04-10)
feature: changed mongo client max_pool_size value from 10MB to 100MB which reflects changes in pymongo >= 2.6.
0.11.0 (2013-01-23)
implement GeoPoint used for 2dsphere geo location indexes. Also provide a MongoGeoPointProperty which is able to create such GeoPoint items.
0.10.2 (2013-01-04)
support _m_insert_write_concern, _m_update_write_concern, _m_remove_write_concern in MongoObject
0.10.1 (2012-12-19)
feature: implemented MongoDatetime schema field supporting timezone info attribute (tzinfo=UTC).
0.10.0 (2012-12-16)
switch from Connection to MongoClient recommended since pymongo 2.4. Replaced safe with write concern options. By default pymongo will now use safe writes.
use MongoClient as factory in MongoConnectionPool. We didn’t rename the class MongoConnectionPool, we will keep them as is. We also don’t rename the IMongoConnectionPool interface.
replaced _m_safe_insert, _m_safe_update, _m_safe_remove with _m_insert_write_concern, _m_update_write_concern, _m_remove_write_concern. This new mapping base class options are an empty dict and can get replaced with the new write concern settings. The default empty dict will force to use the write concern defined in the connection.
0.9.0 (2012-12-10)
use m01.mongofake for fake mongodb, collection and friends
0.8.0 (2012-11-18)
bugfix: add missing security declaration for dump data
switch to bson import
reflect changes in test output based on pymongo 2.3
remove p01.i18n package dependency
improve, prevent mark items as changed for same values
improve sort, support key or list as sortName and allow to skip sortOrder if sortName is given
added MANIFEST.in file
0.7.0 (2012-05-22)
bugfix: FakeCollection.remove: use find to find documents
preserve order by using SON for query filter and dump methods
implemented m01.mongo.dictify which can recursively replace all bson.son.SON with plain dict instances.
0.6.2 (2012-03-12)
bugfix: left out a method
0.6.1 (2012-03-12)
bugfix: return self in FakeMongoConnection __call__ method. This lets an instance act similar to the original pymongo Connection class __init__ method.
feature: Add sort parameter for FakeMongoConnection.find()
0.6.0 (2012-01-17)
bugfix: During a query, if a spec key is missing from the doc, the doc is always ignored.
bugfix: correctly generate an object id in UTC. It was relying on GMT+1 (i.e. Roger’s timezone).
bugfix: allow to use None as MongoDateProperty value
bugfix: set __parent__ in MongoSubItem __init__ method if given
implemented _m_initialized as a marker for finding out when we need to trace changed attributes
implemented clear method in MongoListData and MongoItemsData which allows removing all sequence items at once without popping each item from the sequence
improve MongoObject implementation, implemented _field which stores the parent field name which the MongoObject is stored at. Also adjust the MongoObjectProperty and support backward compatibility by applying the previously stored __name__ as _field if not given. This new _field and __name__ separation allows us to use explicit names e.g. the _id or custom names which we can use for traversing to a MongoObject via traverser or other container like implementations.
Implemented __getattr__ in FakeCollection. This allows to get a sub collection like in pymongo which is a part of the gridfs concept.
0.5.5 (2011-10-14)
Implement filtering with dot notation
0.5.4 (2011-09-27)
Fix: a real mongo DB accepts tuple as the fields parameter of find.
0.5.3 (2011-09-20)
Fix minimum filtering expressions (Albertas)
0.5.2 (2011-09-19)
Added minimum filtering expressions (Albertas)
moved created and modified to an own interface called ICreatedModified
implemented simple and generic initial geo location support
0.5.1 (2011-09-09)
fix performance test
Added database_names and collection_names
0.5.0 (2011-08-19)
initial release