Feed reader App for Opps CMS

opps-feedcrawler
================

FeedCrawler takes a **feed** of any type and runs the feed's custom processor to create CMS entries.


Feed
====

A feed is usually a URL plus some configuration: **url**, **credentials**, **processor** and **actions**.

The simplest example is an RSS feed:

- url = 'http://site.com/feed.rss'
- processor = 'opps.feedcrawler.processors.rss.RSSProcessor'
- actions = ['opps.feedcrawler.actions.rss.RSSActions']

In the example above, **url** points to the feed entries to read. feedcrawler comes with a built-in processor for RSS feeds:
**RSSProcessor** takes the feed URL and does all the work of fetching, reading and creating **entries** in the database.

> You can replace RSSProcessor with your own processor class, following the processor API.

> Example: 'yourproject.yourmodule.processors.MyProcessor'

> The processor API is documented in the **Processor API** section.
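
The processor is configured as a dotted path string, so something has to turn that string into a class. How feedcrawler does this internally is not shown here; the following is only a minimal sketch of the usual approach, assuming a hypothetical helper named `load_object`. A standard library class stands in for the processor so the sketch runs without Opps installed.

```python
from importlib import import_module


def load_object(dotted_path):
    """Resolve 'package.module.Name' to the object it names (hypothetical helper)."""
    module_path, _, attr_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError("expected 'module.attr', got %r" % dotted_path)
    module = import_module(module_path)
    return getattr(module, attr_name)


# Hypothetical feed configuration. 'collections.OrderedDict' is a stand-in
# for a real path such as 'yourproject.yourmodule.processors.MyProcessor'.
feed_config = {
    "url": "http://site.com/feed.rss",
    "processor": "collections.OrderedDict",
}
processor_class = load_object(feed_config["processor"])
```

Keeping the processor as a string means a feed can be pointed at a different class without changing any code that imports it.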


Your **feed** also takes **actions**: each one is a path to a callable that returns a list of Django admin actions in the form of functions.
An example action is "Create posts", which takes the selected entries and converts them into Opps Posts.
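
As a rough illustration of such an actions callable, the sketch below returns a list with one function that follows the standard Django admin action signature `(modeladmin, request, queryset)`. The names `my_actions` and `create_posts`, the `converted` attribute, and the assumption that the callable takes no arguments are all hypothetical; the real conversion into Opps Posts is replaced by a placeholder.

```python
def create_posts(modeladmin, request, queryset):
    """Hypothetical admin action: turn each selected entry into a post."""
    for entry in queryset:
        # Placeholder for the real conversion into an Opps Post.
        entry.converted = True


# Label that Django shows in the admin's action dropdown.
create_posts.short_description = "Create posts"


def my_actions():
    """Hypothetical callable referenced from the feed's actions setting."""
    return [create_posts]
```

A feed would then reference it with a path such as 'yourproject.yourmodule.actions.my_actions' (again, a made-up path).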

Processor API
=============

feedcrawler provides a **BaseProcessor** class for you to extend; you have to override some of its methods.



    from opps.feedcrawler.processors.base import BaseProcessor

    class MyProcessor(BaseProcessor):
        """
        BaseProcessor.__init__ receives the **feed** object as a parameter

        def __init__(self, feed, entry_model, *args, **kwargs):
            self.feed = feed
            self.entry_model = entry_model

        Override it if you need to, but be careful.
        """

        def process(self):
            url = self.feed.source_url
            max_entries = self.feed.max_entries
            ...

            # here you have access to the **feed** object in **self.feed**
            entries = read_and_parse_rss_feed(url)  # example function which fetches and parses the XML feed

            # Now you have access to **self.entry_model**, which you will use to create CMS entries.
            for entry in entries:
                # remember to implement your own logic to avoid duplicates
                self.entry_model.objects.get_or_create(
                    title=entry['title'],
                    # ...
                )

            # this method should return the count of entries read and created, or 0
            return len(entries)



The processor above is executed by the management command **manage.py process_feeds -f feed_slug**. You can also schedule this command to run with **cron** or **celery**.
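
The `read_and_parse_rss_feed` function in the processor example is only a placeholder. One way to write its parsing step, sketched here with the standard library alone, is shown below. The name `parse_rss_items`, the returned dict keys, and the choice to skip items that repeat a link are assumptions made for illustration; fetching the URL is left out so the sketch runs offline.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(xml_text, max_entries=None):
    """Return a list of {'title', 'link'} dicts, one per <item> in an RSS document."""
    root = ET.fromstring(xml_text)
    entries = []
    seen_links = set()
    for item in root.iter("item"):
        link = item.findtext("link", default="").strip()
        if link and link in seen_links:
            continue  # skip items whose link was already seen in this feed
        if link:
            seen_links.add(link)
        entries.append({
            "title": item.findtext("title", default="").strip(),
            "link": link,
        })
        if max_entries is not None and len(entries) >= max_entries:
            break
    return entries


# Small inline feed used as sample input.
SAMPLE = """<rss version="2.0"><channel>
<item><title>First</title><link>http://site.com/1</link></item>
<item><title>First again</title><link>http://site.com/1</link></item>
<item><title>Second</title><link>http://site.com/2</link></item>
</channel></rss>"""

entries = parse_rss_items(SAMPLE)
```

Skipping repeated links only guards against duplicates within one fetch. Duplicates across runs still need a stable lookup key when creating entries in the database.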
