5.2 The Database System

The file handlers we have seen so far are not attached to a database:

    >>> from itools.handlers import get_handler
    >>>
    >>> file = get_handler('itools.pdf')
    >>> print file.database
    None

In this section we look at the database system for file handlers, which adds two features: caching and transactions.

    >>> from itools.handlers import Database
    >>> 
    >>> db = Database()
    >>> file = db.get_handler('itools.pdf')
    >>> print file.database
    <itools.handlers.database.Database object at 0x2b138fde6910>

5.2.1 Caching

The get_handler function does not support caching. Every time it is called, it creates a new handler:

    >>> get_handler('itools.pdf')
    <itools.handlers.file.File object at 0x2b1392fdd590>
    >>> get_handler('itools.pdf')
    <itools.handlers.file.File object at 0x2b1392fdd550>

With the database, however, we always get the same file handler, because it is stored in the cache:

    >>> db.get_handler('itools.pdf')
    <itools.handlers.file.File object at 0x2b1392fdd510>
    >>> db.get_handler('itools.pdf')
    <itools.handlers.file.File object at 0x2b1392fdd510>

We can inspect the cache:

    >>> for key in db.cache:
    ...     print key
    ...     print db.cache[key]
    ...     print
    ... 
    file:///home/jdavid/sandboxes/itools-docs/itools.pdf
    <itools.handlers.file.File object at 0x2b1392fdd510>

The cache is just a mapping from URI to file handler. Because the key is a URI, the database can also keep remote handlers.
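
To make the idea of a URI-keyed cache concrete, here is a minimal sketch in plain Python. It does not use itools at all: the SketchHandler and SketchDatabase classes are hypothetical stand-ins, and building the key from os.path.abspath is an assumption about how a reference could be normalised, not a description of what itools does internally.

```python
import os


class SketchHandler(object):
    """Hypothetical stand-in for a file handler; not an itools class."""

    def __init__(self, uri):
        self.uri = uri


class SketchDatabase(object):
    """Hypothetical cache keyed by absolute URI; not the itools Database."""

    def __init__(self):
        self.cache = {}

    def get_handler(self, reference):
        # Normalise the reference to an absolute file URI, so that
        # different spellings of the same path share one cache entry.
        uri = 'file://' + os.path.abspath(reference)
        if uri not in self.cache:
            # Cache miss: build the handler once and remember it.
            self.cache[uri] = SketchHandler(uri)
        return self.cache[uri]


db = SketchDatabase()
first = db.get_handler('itools.pdf')
second = db.get_handler('./itools.pdf')
print(first is second)
print(len(db.cache))
```

Both references normalise to the same key, so the second call returns the object created by the first and the cache holds a single entry.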

5.2.2 Programming Interface

This is the programming interface provided by the database:

All modification methods make their changes in memory. The changes can later be aborted or saved, which makes up a transaction. Section 5.2.4 explains the details.

5.2.3 Folders

The whole itools.handlers package is about files, not folders. Files are the things that contain data; folders are there just to simplify our lives.

When the get_handler method is called for a folder resource, a folder handler is returned:

    >>> db.get_handler('/tmp')
    <itools.handlers.folder.Folder object at 0x2b1392fdd690>
    >>> db.get_handler('/tmp')
    <itools.handlers.folder.Folder object at 0x2b1392fdd5d0>

The first difference from file handlers is that folders are not cached: every time we ask for a folder resource, a different handler is returned. Since folders don’t keep any data, there is no point in caching them. And the lack of state means they do not have the timestamp and dirty variables either.

Folders are just a URI in a database:

    >>> tmp = db.get_handler('/tmp')
    >>> print tmp.database
    <itools.handlers.database.Database object at 0x2afa17af4910>
    >>> print tmp.uri
    file:///tmp

The folder’s API is basically the same as the database API we have seen in Section 5.2.2. The difference is that with the database API, relative URI references are resolved against the current working directory, while with folders they are resolved against the folder’s URI.

So these lines are equivalent:

    # Database: URI references relative to working directory
    >>> print db.has_handler('/tmp/test.txt')
    False
    # Folder: URI references relative to folder's URI
    >>> print tmp.has_handler('test.txt')
    False
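
The two resolution rules can be illustrated with the standard library alone. The sketch below is not itools code: the resolve function is a hypothetical helper, and it assumes that a base URI naming a directory must end with a slash for urljoin to treat its last segment as a folder rather than as a file name.

```python
import os
from urllib.parse import urljoin


def resolve(base_uri, reference):
    """Hypothetical helper: resolve a URI reference against a folder URI."""
    # Without the trailing slash, urljoin would replace the last path
    # segment of the base instead of appending to it.
    if not base_uri.endswith('/'):
        base_uri += '/'
    return urljoin(base_uri, reference)


# Database-style resolution: the base is the current working directory.
cwd_uri = 'file://' + os.getcwd()
via_database = resolve(cwd_uri, '/tmp/test.txt')

# Folder-style resolution: the base is the folder's own URI.
folder_uri = 'file:///tmp'
via_folder = resolve(folder_uri, 'test.txt')

print(via_database)
print(via_folder)
```

Because the first reference is an absolute path, the working directory does not influence the result, and both calls produce the same URI.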

5.2.4 Transactions

As explained above, changes made to the database are kept in memory, so they can later be aborted or saved. This makes up a transaction:

    >>> from itools.handlers import TextFile
    >>>
    # Create a new file
    >>> test = TextFile()
    >>> test.set_data(u'hello world\n')
    # Add the new file
    >>> tmp.set_handler('test.txt', test)
    >>> print tmp.has_handler('test.txt')
    True
    # Copy the file
    >>> tmp.copy_handler('test.txt', 'test2.txt')
    >>> copy = tmp.get_handler('test2.txt')
    # Modify the first file
    >>> test.set_data(u'First post\n')
    # Check the files' content
    >>> print test.data
    First post

    >>> print copy.data
    hello world

If you check the file system, you will see that there is no file named test.txt or test2.txt in the temporary folder. At this point you can either abort the changes:

    >>> db.abort_changes()
    >>> print tmp.has_handler('test.txt')
    False
    >>> print tmp.has_handler('test2.txt')
    False

Or save them:

    >>> db.save_changes()
    >>> print tmp.has_handler('test.txt')
    True
    >>> print tmp.has_handler('test2.txt')
    True
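
The abort-or-save behaviour can be pictured with a small stand-alone sketch. It only illustrates the idea of keeping changes in memory until they are committed. The SketchTransaction class is hypothetical, it borrows the method names used in this section purely for readability, and it says nothing about how itools implements its database.

```python
import os
import tempfile


class SketchTransaction(object):
    """Hypothetical illustration; not how itools implements its database."""

    def __init__(self, root):
        self.root = root
        self.added = {}  # file name -> text not yet written to disk

    def set_handler(self, name, text):
        # Record the change in memory only.
        self.added[name] = text

    def has_handler(self, name):
        # Pending additions count as present, as do files already on disk.
        path = os.path.join(self.root, name)
        return name in self.added or os.path.exists(path)

    def abort_changes(self):
        # Forget every pending change; the file system was never touched.
        self.added.clear()

    def save_changes(self):
        # Write every pending change to disk, then start a clean transaction.
        for name, text in self.added.items():
            with open(os.path.join(self.root, name), 'w') as f:
                f.write(text)
        self.added.clear()


root = tempfile.mkdtemp()
db = SketchTransaction(root)

db.set_handler('test.txt', 'hello world\n')
in_memory = db.has_handler('test.txt')
on_disk_before = os.path.exists(os.path.join(root, 'test.txt'))

db.abort_changes()
after_abort = db.has_handler('test.txt')

db.set_handler('test.txt', 'hello world\n')
db.save_changes()
after_save = os.path.exists(os.path.join(root, 'test.txt'))

print(in_memory, on_disk_before, after_abort, after_save)
```

The sketch handles only additions. A fuller version would also have to track modified and removed files so that aborting could restore them.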

The programming interface for transactions is pretty simple: