z3c.sharedmimeinfo 0.1.0
pip install z3c.sharedmimeinfo
MIME type guessing framework for Zope, based on shared-mime-info
Verified details
These details have been verified by PyPI
Maintainers
agroszer baijum davisagli faassen gary hannosch J1m menesis nadako projekt01 srichter tlotze
Unverified details
These details have not been verified by PyPI
Meta
- License: ZPL 2.1
- Author: Dan Korostelev and Zope Community
Project description
z3c.sharedmimeinfo
This package provides a utility for guessing the MIME type of a file from its name, its contents, or both. It’s based on freedesktop.org’s shared-mime-info database.
Shared MIME info database
shared-mime-info is an extensible database of common MIME types. It provides a powerful MIME type detection mechanism as well as multilingual type descriptions.
This package requires shared-mime-info to be installed and accessible. The easiest way to do that is to install it system-wide, for example installing the shared-mime-info package on Ubuntu. The specification also describes other ways to install and extend the database.
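To check whether a database is accessible, it can help to list the directories where the freedesktop.org layout would place one. The following is a minimal sketch, assuming the usual XDG lookup (`XDG_DATA_HOME`, then `XDG_DATA_DIRS`, each with a `mime` subdirectory); the function names are hypothetical and are not part of z3c.sharedmimeinfo:

```python
import os


def mime_database_dirs(environ=None):
    """Return candidate MIME database directories, user directory first.

    Assumption: the XDG defaults (~/.local/share and
    /usr/local/share/:/usr/share/) apply when the variables are unset.
    """
    env = os.environ if environ is None else environ
    home = env.get("XDG_DATA_HOME") or os.path.join(
        os.path.expanduser("~"), ".local", "share")
    dirs = env.get("XDG_DATA_DIRS") or "/usr/local/share/:/usr/share/"
    bases = [home] + [d for d in dirs.split(":") if d]
    return [os.path.join(base, "mime") for base in bases]


def installed_databases(environ=None):
    """Keep only the directories that actually contain a globs file."""
    found = []
    for directory in mime_database_dirs(environ):
        has_globs = any(
            os.path.isfile(os.path.join(directory, name))
            for name in ("globs", "globs2"))
        if has_globs:
            found.append(directory)
    return found


if __name__ == "__main__":
    print(installed_databases())
```

An empty result suggests that shared-mime-info is not installed in any of the standard locations.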
Thread-safety
Note that this package is currently not thread-safe, because the data are meant to be loaded only once, on module import. If this causes problems, it may be changed in the future.
MIME type guessing
The easiest way to use this package is to import the getType function from the root module:
>>> from z3c.sharedmimeinfo import getType
This function tries to guess the MIME type as described in the shared-mime-info specification and always returns a usable MIME type, falling back to application/octet-stream or text/plain. It can detect the MIME type by file name, by contents, or both, so it accepts two arguments: filename (a string) and file (a file-like object). At least one of them must be given.
As said above, it needs at least one argument, so you can’t call it with no arguments:
>>> getType()
Traceback (most recent call last):
...
TypeError: Either filename or file should be provided or both of them
The file name is passed via the filename argument:
>>> print getType(filename='document.doc')
application/msword
File contents are passed via the file argument, which accepts a file-like object. Let’s use our testing helper function, openSample, to open a sample file and guess its type:
>>> print getType(file=openSample('png'))
image/png
If the MIME type cannot be detected, either text/plain or application/octet-stream is returned. The function decides whether the data is text or binary by checking the first 32 bytes:
>>> print getType(filename='somefile', file=openSample('text'))
text/plain
>>> print getType(filename='somefile', file=openSample('binary'))
application/octet-stream
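The text-or-binary fallback can be pictured with a small standalone function. This is only a sketch of one plausible heuristic, assuming that "binary" means a control character other than common whitespace appears in the first 32 bytes. The name `fallback_type` and the exact set of allowed bytes are assumptions, not the package's code:

```python
import io


def fallback_type(file, sample_size=32):
    """Guess text/plain or application/octet-stream from the first bytes."""
    position = file.tell()
    head = file.read(sample_size)
    file.seek(position)  # leave the caller's file position untouched

    # Assumption: tab, line feed, form feed and carriage return count as text.
    allowed_controls = {9, 10, 12, 13}
    for byte in bytearray(head):
        if byte < 32 and byte not in allowed_controls:
            return "application/octet-stream"
    return "text/plain"


if __name__ == "__main__":
    print(fallback_type(io.BytesIO(b"plain text\n")))
    print(fallback_type(io.BytesIO(b"\x00\x01\x02\x03")))
```

Only the first `sample_size` bytes are inspected, so a file that starts with readable text and turns binary later would still be reported as text under this heuristic.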
MIME type objects
Objects returned by getType and other functions (see below) are actually extended unicode string objects that provide additional information about the MIME type. They provide the IMIMEType interface:
>>> from zope.interface.verify import verifyObject
>>> from z3c.sharedmimeinfo.interfaces import IMIMEType
>>> mt = getType(filename='document.doc')
>>> verifyObject(IMIMEType, mt)
True
As they are actually unicode objects, they can be compared like strings:
>>> mt == 'application/msword'
True
They also provide the media and subtype attributes:
>>> mt.media
u'application'
>>> mt.subtype
u'msword'
And finally, they provide the title attribute that is a translatable message:
>>> mt.title
u'application/msword'
>>> from zope.i18nmessageid.message import Message
>>> isinstance(mt.title, Message)
True
Let’s check the i18n features that come with shared-mime-info and are supported by this package. As seen above, the message ID of the MIME type title is simply <media>/<subtype>, but if we translate it, we get a human-friendly string:
>>> from zope.i18n import translate
>>> translate(mt.title)
u'Word document'
>>> translate(mt.title, target_language='ru')
u'\u0434\u043e\u043a\u0443\u043c\u0435\u043d\u0442 Word'
We can also create IMIMEType objects by hand, using the MIMEType class:
>>> from z3c.sharedmimeinfo.mimetype import MIMEType
We can create them by specifying media and subtype as two arguments, or as a single argument in the “media/subtype” form:
>>> MIMEType('text/plain')
<MIMEType text/plain>
>>> MIMEType('image', 'png')
<MIMEType image/png>
Note that MIMEType objects are cached, so if you create another object for the same MIME type, you’ll get the same object:
>>> mt = MIMEType('text/plain')
>>> mt2 = MIMEType('text/plain')
>>> mt2 is mt
True
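The combination of string behaviour, media and subtype attributes, and caching can be illustrated with a small string subclass. This is a hedged sketch in Python 3, not the package's MIMEType implementation: the class name `CachedMimeType` and the dictionary-based cache are assumptions made for illustration.

```python
class CachedMimeType(str):
    """Hypothetical stand-in: a string with media/subtype and a cache."""

    _cache = {}  # maps (media, subtype) to the single shared instance

    def __new__(cls, media, subtype=None):
        # Accept either ("text", "plain") or the single "text/plain" form.
        if subtype is None:
            media, subtype = media.split("/", 1)
        key = (media, subtype)
        if key in cls._cache:
            return cls._cache[key]
        obj = super().__new__(cls, media + "/" + subtype)
        obj.media = media
        obj.subtype = subtype
        cls._cache[key] = obj
        return obj

    def __repr__(self):
        return "<CachedMimeType %s>" % str(self)


if __name__ == "__main__":
    first = CachedMimeType("text/plain")
    second = CachedMimeType("text", "plain")
    print(first is second, first == "text/plain", first.media, first.subtype)
```

Doing the lookup in `__new__` rather than `__init__` is what allows the same object to be handed back: by the time `__init__` runs, a new instance would already exist.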
Advanced usage
The getType function described above is actually a method of the IMIMETypesUtility object. IMIMETypesUtility is the core component for guessing MIME types.
Let’s import the utility directly and play with it:
>>> from z3c.sharedmimeinfo.utility import mimeTypesUtility
>>> from z3c.sharedmimeinfo.interfaces import IMIMETypesUtility
>>> verifyObject(IMIMETypesUtility, mimeTypesUtility)
True
It has three methods for getting a MIME type: getType (described above), getTypeByFileName, and getTypeByContents.
Detection by file name
The getTypeByFileName method of the MIME types utility looks up the type by filename:
>>> mt = mimeTypesUtility.getTypeByFileName('example.doc')
The shared-mime-info database can even detect the MIME type of file names like Makefile:
>>> print mimeTypesUtility.getTypeByFileName('Makefile')
text/x-makefile
It also distinguishes letter case in extensions. For example, .C is detected as a C++ file, while .c is a plain C file:
>>> print mimeTypesUtility.getTypeByFileName('hello.C')
text/x-c++src
>>> print mimeTypesUtility.getTypeByFileName('main.c')
text/x-csrc
The method returns None if it fails to determine the type from the file name:
>>> print mimeTypesUtility.getTypeByFileName('somefilename')
None
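The behaviour shown above (literal names such as Makefile, case-sensitive extensions, and None for unknown names) can be approximated with a two-pass glob lookup. The sketch below uses a tiny made-up table and a hypothetical function name. The real database is far larger and the package's matching rules may differ:

```python
from fnmatch import fnmatchcase

# Hypothetical subset of a glob table: (pattern, MIME type).
GLOBS = [
    ("Makefile", "text/x-makefile"),
    ("*.C", "text/x-c++src"),
    ("*.c", "text/x-csrc"),
    ("*.doc", "application/msword"),
]


def type_by_file_name(filename):
    """Return the MIME type for a file name, or None if nothing matches."""
    # Pass 1: case-sensitive, so "hello.C" and "main.c" stay distinct.
    for pattern, mime in GLOBS:
        if fnmatchcase(filename, pattern):
            return mime
    # Pass 2: case-insensitive retry, e.g. for "REPORT.DOC".
    lowered = filename.lower()
    for pattern, mime in GLOBS:
        if fnmatchcase(lowered, pattern.lower()):
            return mime
    return None


if __name__ == "__main__":
    for name in ("hello.C", "main.c", "Makefile", "REPORT.DOC", "somefilename"):
        print(name, type_by_file_name(name))
```

`fnmatchcase` is used instead of `fnmatch` because the latter normalizes case on some platforms, which would hide the .C versus .c distinction.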
Detection by contents
The getTypeByContents method accepts a file-like object and two optional arguments, min_priority and max_priority, which specify the range of “magic” rules to be used. By default, min_priority is 0 and max_priority is 100, so all rules are used. See the shared-mime-info specification for details.
We have some sample files that should be detected by contents:
>>> fdoc = openSample('doc')
>>> print mimeTypesUtility.getTypeByContents(fdoc)
application/msword
>>> fhtml = openSample('html')
>>> print mimeTypesUtility.getTypeByContents(fhtml)
text/html
>>> fpdf = openSample('pdf')
>>> print mimeTypesUtility.getTypeByContents(fpdf)
application/pdf
>>> fpng = openSample('png')
>>> print mimeTypesUtility.getTypeByContents(fpng)
image/png
If we pass a file without any known magic bytes, it returns None:
>>> funknown = openSample('binary')
>>> print mimeTypesUtility.getTypeByContents(funknown)
None
>>> del fdoc, fhtml, fpdf, fpng, funknown
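To make the role of min_priority and max_priority more concrete, here is a simplified sketch of priority-filtered magic matching. The rule table, its priority values, and the function name are invented for illustration. Real shared-mime-info magic rules are richer (masks, nested rules, offset ranges), and this is not the package's code:

```python
import io

# Hypothetical rules: (priority, offset, magic bytes, MIME type).
MAGIC_RULES = [
    (50, 0, b"\x89PNG\r\n\x1a\n", "image/png"),
    (50, 0, b"%PDF-", "application/pdf"),
    (40, 0, b"<!DOCTYPE html", "text/html"),
]


def type_by_contents(file, min_priority=0, max_priority=100):
    """Return the first matching MIME type within the priority range."""
    start = file.tell()
    needed = max(rule[1] + len(rule[2]) for rule in MAGIC_RULES)
    data = file.read(needed)
    file.seek(start)  # restore the position for the caller

    # Try higher-priority rules first, skipping those outside the range.
    ordered = sorted(MAGIC_RULES, key=lambda rule: rule[0], reverse=True)
    for priority, offset, magic, mime in ordered:
        if not (min_priority <= priority <= max_priority):
            continue
        if data[offset:offset + len(magic)] == magic:
            return mime
    return None


if __name__ == "__main__":
    html = io.BytesIO(b"<!DOCTYPE html><html></html>")
    print(type_by_contents(html))
    print(type_by_contents(html, min_priority=50))
```

Raising min_priority restricts matching to the more trusted rules, so a file that only matches a lower-priority rule is reported as unknown (None) instead.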
CHANGES
0.1.0 (2009-09-08)
Initial release.
Download files
Source Distribution
File details
Details for the file z3c.sharedmimeinfo-0.1.0.tar.gz.
File metadata
- Download URL: z3c.sharedmimeinfo-0.1.0.tar.gz
- Upload date:
- Size: 29.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 7c982338a9ef38e3c6bfef10787dff9aacec3a288c5ca75317483a8de41d30cd
MD5 | 9f40c6aed4e14a0c7f13d9cc9f39f89e
BLAKE2b-256 | 5ccb8e4a885bd7107d804a2ca05b53d85bcccef6791a0daddc72d585266c7943