Skip to main content

A framework for indexing and querying the ZODB

Project description

ObjectQuery

ObjectQuery is licensed under ZPL 2.1.

Copyright 2007-2009 by gocept gmbh & co. kg.

ObjectQuery enables you to query for persistent objects (e.g. objects in the ZODB). This is done with a XPath-like language named Regular Path Expressions (RPE). ObjectQuery also includes indexstructures for performance reasons.

It uses SimpleParse to parse the RPE-query.

Please report bugs to gocept project portal.

Querying objects with regular path expressions

Initialization

First load the test database. For more information about that, please have a look inside objects.py.

>>> from gocept.objectquery.tests.objects import *
>>> import ZODB.MappingStorage
>>> import ZODB
>>> from gocept.objectquery.collection import ObjectCollection
>>> storage = ZODB.MappingStorage.MappingStorage()
>>> db = ZODB.DB(storage)
>>> conn = db.open()
>>> dbroot = conn.root()
>>> dbroot['_oq_collection'] = objects = ObjectCollection(conn)
>>> import transaction
>>> import gocept.objectquery.indexsupport
>>> index_synch = gocept.objectquery.indexsupport.IndexSynchronizer()
>>> transaction.manager.registerSynch(index_synch)
>>> p_orwell = Person(name="George Orwell")
>>> p_lotze = Person(name="Thomas Lotze")
>>> p_goethe = Person(name="Johann Wolfgang von Goethe")
>>> p_weitershausen = Person(name="Philipp von Weitershausen")
>>> b_1984 = Book(author=p_orwell,
...               title="1984",
...               written=1990,
...               isbn=3548234100)
>>> b_plone = Book(author=p_lotze,
...                title="Plone-Benutzerhandbuch",
...                written=2008,
...                isbn=3939471038)
>>> b_faust = Book(author=p_goethe,
...                title="Faust",
...                written=1811,
...                isbn=3406552501)
>>> b_farm = Book(author=p_orwell,
...               title="Farm der Tiere",
...               written=2002,
...               isbn=3257201184)
>>> b_zope = Book(author=p_weitershausen,
...               title="Web Component Development with Zope 3",
...               written=2007,
...               isbn=3540338071)
>>> l_halle = Library(location="Halle",
...                   books=[b_1984, b_plone, b_farm, b_zope])
>>> l_berlin = Library(location="Berlin",
...                    books=[b_1984, b_plone, b_faust, b_farm, b_zope])
>>> l_chester = Library(location="Chester",
...                     books=[b_1984, b_faust, b_farm])
>>> dbroot['librarydb'] = persistent.list.PersistentList()
>>> dbroot['librarydb'].extend([l_halle, l_berlin, l_chester])
>>> librarydb = dbroot['librarydb']
>>> transaction.commit()
>>> from pprint import pprint

Create QueryProcessor and initialize the ObjectCollection

You create a QueryProcessor like this:

>>> from gocept.objectquery.pathexpressions import RPEQueryParser
>>> from gocept.objectquery.processor import QueryProcessor
>>> parser = RPEQueryParser()
>>> query = QueryProcessor(parser, objects)
>>> query
<gocept.objectquery.processor.QueryProcessor object at 0x...>

Some example usecases

Root joins:

>>> r = query('/PersistentList/Library')
>>> sorted(elem.location for elem in r)
['Berlin', 'Chester', 'Halle']

Search for the authors of all Books named “Faust”:

>>> r = query('/PersistentList/Library/Book[@title="Faust"]/Person')
>>> sorted(elem.name for elem in r)
['Johann Wolfgang von Goethe']

Search for all books written after year 2000:

>>> r = query('/PersistentList/Library/Book[@written>=2000]')
>>> len(r)
3
>>> pprint(sorted(elem.title for elem in r))
['Farm der Tiere',
 'Plone-Benutzerhandbuch',
 'Web Component Development with Zope 3']

Search for all authors of books written after year 2000:

>>> r = query('/PersistentList/Library/Book[@written>=2000]/Person')
>>> len(r)
3
>>> pprint(sorted(elem.name for elem in r))
['George Orwell', 'Philipp von Weitershausen', 'Thomas Lotze']

Search for all Books, that have are located in Halle and have been written in 2007:

>>> r = query('/PersistentList/Library[@location="Halle"]/Book[@written==2007]')
>>> sorted((elem.title, elem.isbn) for elem in r)
[('Web Component Development with Zope 3', 3540338071L)]

Handle Wildcards correctly:

>>> r = query('/PersistentList/Library/_/Person')
>>> pprint(sorted(elem.name for elem in r))
['George Orwell',
 'Johann Wolfgang von Goethe',
 'Philipp von Weitershausen',
 'Thomas Lotze']

Instead of only providing the classname, it is also possible to provide the class with its full module:

>>> r = query('/PersistentList/gocept.objectquery.tests.objects.Library')
>>> pprint([library.location for library in r])
['Halle', 'Berlin', 'Chester']
>>> query('/PersistentList/gocept.objectquery.tests.objects2.Library')
[]

What about precedence:

>>> r = query('/PersistentList/Library[@location="Halle"]/Book/Person')
>>> pprint(sorted(elem.name for elem in r))
['George Orwell', 'Philipp von Weitershausen', 'Thomas Lotze']
>>> r = query('(/PersistentList/Library[@location="Halle"]/Book)/Person')
>>> pprint(sorted(elem.name for elem in r))
['George Orwell', 'Philipp von Weitershausen', 'Thomas Lotze']
>>> r = query('/PersistentList/Library[@location="Halle"]/(Book/Person)')
>>> len(r)
0
>>> r = query('(/PersistentList/Library/Book[@title="Faust"])/(Book/Person)')
>>> sorted(elem.name for elem in r)
['Johann Wolfgang von Goethe']

But pay attention. If you change the query from ..”]/(Book.. to ..”](/Book.. you get an Library-Result with location in “Halle”. This is, because the subquery (in brakets) returns no results:

>>> r = query('/PersistentList/Library[@location="Halle"](/Book/Person)')
>>> len(r)
1
>>> r[0].location
'Halle'

Unions:

>>> r = query('(/PersistentList/Library[@location="Halle"])|(Book/Person)')
>>> len(r)
5
>>> pprint(sorted(elem for elem in r))
[<gocept.objectquery.tests.objects.Library object at 0x...>,
 <gocept.objectquery.tests.objects.Person object at 0x...>,
 <gocept.objectquery.tests.objects.Person object at 0x...>,
 <gocept.objectquery.tests.objects.Person object at 0x...>,
 <gocept.objectquery.tests.objects.Person object at 0x...>]
>>> r = query('(/PersistentList/Library)|(Book[@written=1990])')
>>> len(r)
4
>>> pprint(sorted(elem for elem in r))
[<gocept.objectquery.tests.objects.Book object at 0x...>,
 <gocept.objectquery.tests.objects.Library object at 0x...>,
 <gocept.objectquery.tests.objects.Library object at 0x...>,
 <gocept.objectquery.tests.objects.Library object at 0x...>]
>>> transaction.commit()

Kleene Closure

First we need a new database:

>>> doc1 = Document()
>>> doc2 = Document()
>>> doc3 = Document()
>>> fol4 = Folder([doc2])
>>> fol3 = Folder([doc1])
>>> fol2 = Folder([fol3])
>>> fol1 = Folder([fol2])
>>> plo1 = Plone([fol1, fol4, doc3])
>>> root = Root([plo1])
>>> dbroot['test'] = root
>>> transaction.commit()

Now there should be one Plone object under root:

>>> r = query('/Root/Plone')
>>> len(r) == 1 and r[0] == plo1
True
>>> r = query('/Root/Plone/Folder/Document')
>>> len(r)
1
>>> r[0] == doc2
True

Get all Documents which are under any number of Folders:

>>> r = query('/Root/Plone/Folder*/Document')
>>> r[0] != r[1] != r[2] and isinstance(r[0], Document)
True
>>> r = query('Plone/Folder*/Document')
>>> r[0] != r[1] != r[2] and isinstance(r[0], Document)
True
>>> r = query('Folder*/Document')
>>> r[0] != r[1] != r[2] and isinstance(r[0], Document)
True

Get all Documents which are under one or zero number of Folders:

>>> r = query('/Root/Plone/Folder?/Document')
>>> len(r) == 2 and (r[0] == doc2 or r[1] == doc2) and (r[0] == doc3 or r[1] == doc3) and r[0] != r[1]
True
>>> r = query('Folder?/Document')
>>> len(r) == 3 and r[0] != r[1] != r[2]
True

Get all Documents which are under one or more number of Folders:

>>> r = query('/Root/Plone/Folder+/Document')
>>> len(r) == 2 and (r[0] == doc1 or r[1] == doc1) and (r[0] == doc2 or r[1] == doc2) and r[0] != r[1]
True
>>> r = query('Folder+/Document')
>>> len(r) == 2 and (r[0] == doc1 or r[1] == doc1) and (r[0] == doc2 or r[1] == doc2) and r[0] != r[1]
True

You may also query absolute path lengths:

>>> len(query('Plone/Document'))
1
>>> len(query('Plone/Folder/Document'))
1
>>> len(query('Plone/Folder/Folder/Document'))
0
>>> len(query('Plone/Folder/Folder/Folder/Document'))
1

Furthermore, it is possible to query all Documents, which are located under 2 or more Folders:

>>> r = query('Plone/Folder+/Folder/Document')
>>> len(r) == 1 and r[0] == doc1
True
>>> r = query('Plone/Folder/Folder+/Document')
>>> len(r) == 1 and r[0] == doc1
True

A special case is the combination of wildcard and ‘*’ closure:

>>> r = query('Plone/_*/Document')
>>> len(r) == 3
True

CHANGES

0.1b1 (2009-08-13)

  • Add support for windows by adding a SimpleParse egg.

  • Use sw.objectinspection instead of ObjectParser to inspect objects for attributes and children. This brings much more flexibility in inspecting custom objects.

0.1b (2009-07-23)

  • Small API refactorings (#5780)

  • Add support for querying for classes of a given module (#5778).

  • Add support for querying for base classes of objects (#4880).

0.1a2 (2009-06-17)

  • Better handling of unpersistent objects.

0.1a1 (2009-06-05)

  • Stop ignoring callable objects (e.g. a Plone site) for indexing, just once ignore methods.

  • Do not break during indexing if added object is not added to the ZODB (doesn’t have the _p_oid attribute). Those objects are ignored for now and not added to the index structures.

  • Add rindex method for adding objects to the collection recursively.

  • Add SimpleParse as a 3rdparty egg because it can’t be retrieved from pypi for some months now.

0.1pre (2009-02-04)

  • first alpha release

0.1pre (2008-08-19)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gocept.objectquery-0.1b1.tar.gz (793.2 kB view hashes)

Uploaded Source

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page