Turn external feed entries into content items
Feedfeeder
Feedfeeder has just a few things it needs to do:
Read in a few ATOM feeds (not too many).
Create FeedFeederItems out of the entries pulled from the ATOM feeds. Any feed items that contain enclosures will have the enclosures pulled down and added as File items to the feed item.
This means figuring out which items are new, which also means having a good ID generating mechanism.
Wait, no existing product?
There’s a whole slew of RSS/ATOM reading products for Zope and Plone. None of them seemed to be a good fit. There was only one product that actually stored the entries in the Zope database, but that was aimed at a lot of users individually adding a lot of feeds, so it needed either a separate ZEO process (old version) or a standalone MySQL database (new version).
The other products either didn’t store the entries in the database or were old, unmaintained, etc.
In a sense, we’re using an existing product as we use Mark Pilgrim’s excellent feedparser (http://feedparser.org) that’ll do the actual ATOM reading for us.
Product name
The product feeds the content of ATOM feeds to Plone as document/file content types. So “feedfeeder” sort of suggested itself as a funny name. Fun is important :-)
Product structure
I’m using archgenxml to generate the boilerplate. There’s a ‘generate.sh’ shell script that’ll call archgenxml for you. Nothing fancy.
The feedfeeder’s content types are:
- folder.FeedfeederFolder
- item.FeedFeederItem
How it works
A feedfeeder is a folder which contains all the previously-added feed entries as documents or files. It has a ‘feeds’ attribute that contains a list of feeds to read.
Feedparser is called periodically (through a cron job?) to parse the feeds. The ID of each entry in the feed is converted to a suitable filename (the md5 hex hash of the entry’s atom id); that way you can detect whether there are new items.
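As a rough illustration of the id-to-filename step, here is a minimal sketch using Python’s standard hashlib. The function name entry_filename and the UTF-8 encoding are assumptions made for this example; they are not taken from the feedfeeder code.

```python
import hashlib


def entry_filename(atom_id):
    """Return the md5 hex digest of an entry's atom id.

    Hypothetical helper: the name and the UTF-8 encoding are
    assumptions for this sketch, not feedfeeder's actual code.
    """
    # hashlib works on bytes, so encode text ids first.
    if isinstance(atom_id, str):
        atom_id = atom_id.encode("utf-8")
    return hashlib.md5(atom_id).hexdigest()


if __name__ == "__main__":
    # The same atom id always yields the same 32-character name.
    print(entry_filename("tag:example.org,2011:entry-1"))
```

Because the digest depends only on the atom id, an entry that shows up again on a later run maps to the same name, which is what makes the "is this new?" check possible.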
New items are turned into feed items. The feed data is stored on each feed item (see the field named objectInfo).
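The new-item check could look something like the sketch below. It assumes entries are plain dicts with an "id" key and that the folder’s existing object ids are available as a collection of strings; find_new_entries is a hypothetical name, and the real product works on feedparser entries and Plone content rather than dicts.

```python
import hashlib


def find_new_entries(entries, existing_ids):
    """Return (object_id, entry) pairs for entries not yet stored.

    Hypothetical sketch: `entries` is a list of dicts with an "id"
    key, `existing_ids` the object ids already in the feed folder.
    """
    seen = set(existing_ids)
    new = []
    for entry in entries:
        atom_id = entry.get("id")
        if not atom_id:
            # Without an id the entry cannot be matched later, so skip it.
            continue
        object_id = hashlib.md5(atom_id.encode("utf-8")).hexdigest()
        if object_id in seen:
            # Already stored, or a duplicate within this feed.
            continue
        seen.add(object_id)
        new.append((object_id, entry))
    return new
```

Only the pairs returned here would need to be turned into feed items; everything else is already in the folder.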
Scheduled updates for feed folders
Zope can be configured to periodically trigger a url call. In zope.conf you can use the <clock-server> directive to define a schedule and url with the following data:
    <clock-server>
      method /path_to_feedfolder/update_feed_items
      # period is in seconds
      period 3600
      user admin
      password 123
      host localhost:8080
    </clock-server>
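If you would rather trigger the update from outside Zope (from a cron job, say), a similar call can be made over HTTP. The sketch below only builds the request, using the same example host, path and credentials as the clock-server settings. It assumes the instance listens on plain HTTP and accepts basic authentication for this URL; build_update_request is a hypothetical helper, not part of feedfeeder.

```python
import base64
import urllib.request


def build_update_request(host, path, user, password):
    """Build an HTTP request that calls the update method on a feed folder.

    Hypothetical helper; assumes plain HTTP and basic authentication.
    """
    url = "http://%s%s" % (host, path)
    credentials = ("%s:%s" % (user, password)).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request


if __name__ == "__main__":
    request = build_update_request(
        "localhost:8080",
        "/path_to_feedfolder/update_feed_items",
        "admin",
        "123",
    )
    print(request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```

The request is built separately from sending it so that the URL and credentials can be checked without a running Zope instance.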
Updating all feeds at once
If your site has several feed folders and you want to update them all at once, you can do:
    <clock-server>
      method /yoursiteid/feed-mega-update
      # period is in seconds
      period 3600
      user admin
      password 123
      host localhost:8080
    </clock-server>
Dependencies
Targeted at Plone 3.3 or Plone 4 (might work on earlier Plone 3 versions, but this is not tested).
Tests
The look-here-first test is the doctest at ‘doc/feedfeeder-integration.txt’.
Assuming you have a buildout, testing is best done something like this:
bin/instance test -s Products.feedfeeder
or if you have a bin/test command properly set up:
bin/test -s Products.feedfeeder
History of feedfeeder
2.0.5 (2011-09-03)
Use feed-item.pt on Plone 4, filling the content-core slot, and feed-item3.pt on Plone 3, filling the body slot as before. Fixes http://plone.org/products/feedfeeder/issues/36 [maurits]
Register our own documentbyline viewlet for feed items, which displays the feed item author as creator. Refs http://plone.org/products/feedfeeder/issues/36 [maurits]
Fixed possible UnicodeDecodeError when updating feed items. Refs http://plone.org/products/feedfeeder/issues/37 [maurits]
Fixed Plone 4.1 compatibility [iElectric]
2.0.4 (2011-03-24)
Avoid DeprecationWarning on python2.6 by preferring hashlib over md5 when available. [maurits]
Do not reindex the feed item when nothing has changed. Only update the objectInfo field when there has been a change. Fixes http://plone.org/products/feedfeeder/issues/34 [maurits]
2.0.3 (2011-01-17)
Respect the Plone setting on the ‘about’ information: only show the document byline if the user is logged in or anonymous users are allowed to view the about information. [markvl]
2.0.2 (2010-12-17)
Modified the RSS import and added a new field on feed items named objectInfo. All feed data will be stored on this field, as a python dict. Just by changing the remote RSS template, you will be able to store additional info without having to modify the feed item schema. [dmoro]
Added an option on the feed folder that lets you choose to redirect automatically to remote resources. If you have modify permissions on feed items there will not be any redirect. [dmoro]
Added new tests [sithmel]
2.0.1 (2010-11-26)
Added @@feed-mega-update view so you can update all feed folders at once, for example in a clock server. [miohtoma]
Import HTMLParseError from the standard python HTMLParser instead of BeautifulSoup. This makes feedfeeder compatible with BeautifulSoup 3.0.x again. [maurits]
2.0 (2010-07-05)
Solve some Plone 4 compatibility issues. [sureshvv]
Ignore unidentifiable entries without id or link, instead of throwing an AttributeError. Fixes http://plone.org/products/feedfeeder/issues/26 [maurits]
1.0.1 (2010-04-02)
Fix errors when viewing a folder or item on Plone 4, while still keeping Plone 2.5 and Plone 3 compatibility. Refs http://plone.org/products/feedfeeder/issues/25 [maurits]
1.0 (2009-12-23)
Some summaries are a snippet from the full content, and then they can contain broken html; in this case we are now saving the raw broken html, parsing it only when possible. [lucmult]
1.0rc7 (2009-11-06)
Improved the translations. [lucmult]
Changed the way to translate xml/html entities from summary, now using BeautifulSoup. Old way was breaking with some non ascii characters. [lucmult]
When setting the text of a feed item during updating, store the mimetype as well if it is a supported one. Refs http://plone.org/products/feedfeeder/issues/24 [maurits]
1.0rc6 (2009-09-21)
Bug fix: curly quotes getting mangled when Descriptions are built. Fixes http://plone.org/products/feedfeeder/issues/7 (Merged branch maurits-cleaner-entityrefs-in-description.) [maurits]
1.0rc5 (2009-07-02)
Do not add our skin layer to Plone Default and certainly not to Plone Tableless, but just to all (*). [maurits]
1.0rc4 (2009-06-18)
When both the updated and published dates of an item are not known, take today as the date when first adding it. When updating, do not change the original item. Fixes http://plone.org/products/feedfeeder/issues/21 [maurits]
Read tags/categories/keywords of feed items and store them on the created content item. No Archetypes field, just a simple getter and setter called feed_tags. Idea: Robin Harms Oredsson. [maurits]
DateTime.SyntaxError is thrown with some very common US Daylight Saving zones, such as EDT. We now wrap the DateTime parsing of feeds, to try to recognise those zones before politely giving up, using maurits’ fix, below. [russf]
Catch DateTime.SyntaxError when parsing the updated and published dates of an entry and continue with the next entry. Fixes http://plone.org/products/feedfeeder/issues/18 [maurits]
Avoid swallowing too many exceptions when applying our GenericSetup profile. Fixes http://plone.org/products/feedfeeder/issues/19 [maurits]
1.0rc3 (2008-10-04)
Moved profile definition from python to GenericSetup. Profile is now not ‘profile-feedfeeder:default’ but ‘profile-Products.feedfeeder:default’. [maurits]
In the Extensions/ dir: removed Install.py and renamed AppInstall.py to install.py. [maurits]
Made feed item updated date available for Collections/Smart Folders. [maurits]
Extensions/AppInstall.py: first try installing our own profile in the Plone 3 way and when that fails try the Plone 2.5 way. [maurits]
Removed own feedparser.py. Instead added an install_requires dependency on FeedParser in setup.py. [maurits]
Moved fix for feeds starting with ‘feed:’ instead of ‘http:’ from feedparser.py to utilities.py, so we use an unchanged feedparser.py again. [maurits]
1.0 rc 2 (2008-07-23)
Re-release of rc1: rc1 was missing all .txt files, making install impossible as setup.py reads version.txt. [reinout]
1.0 rc 1 (2008-07-15)
Accept entries without a title, which is allowed in rss. See http://cyber.law.harvard.edu/rss/rss.html#hrelementsOfLtitemgt [maurits]
1.0 beta 4 (2008-05-20)
Eggification: you can now install it as the Products.feedfeeder egg. [maurits]
1.0 beta 3 (2008-05-13)
In the tests, use plone_workflow explicitly, so it is easier to test on both Plone 2.5 and 3.0. [maurits]
Make update_feed_items available in the object_buttons for Plone 3, using new small @@is_feedcontainer as condition. [maurits]
Avoid deprecation warnings for events and interfaces. [maurits]
Remove semicolon in page template that broke in Plone 3. [maurits]
Fix imports so they work in Plone 3 as well, without deprecation warnings. [derstappenit]
1.0 beta 2 (2008-01-02)
History begins.