Feed reader App for Opps CMS
opps-feedcrawler
================
FeedCrawler takes a **feed** of any type and runs the feed's custom processor to create CMS entries.
Feed
====
A feed is usually a URL plus some configuration: **url**, **credentials**, **processor** and **actions**.
The simplest example is an RSS feed:
- url = 'http://site.com/feed.rss'
- processor = 'opps.feedcrawler.processors.rss.RSSProcessor'
- actions = ['opps.feedcrawler.actions.rss.RSSActions']
In the example above, **url** is where the feed entries are read from, and feedcrawler ships with a built-in processor for RSS feeds.
**RSSProcessor** takes the feed URL and does all the work: fetching, reading, and creating **entries** in the database.
> You can replace RSSProcessor with your own processor class, following the processor API.
> Example: 'yourproject.yourmodule.processors.MyProcessor'
> The processor API is documented in the **Processor API** section below.
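To make the dotted-path convention concrete, here is a minimal sketch of how a string such as 'yourproject.yourmodule.processors.MyProcessor' can be resolved to a class. The helper name `load_object` is hypothetical and this is not necessarily how feedcrawler loads processors internally; it only illustrates the general technique with the standard library.

```python
from importlib import import_module


def load_object(dotted_path):
    """Resolve 'package.module.Name' to the object it names.

    Hypothetical helper for illustration; feedcrawler's own loader may differ.
    """
    module_path, _, attr_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError("expected a dotted path, got %r" % dotted_path)
    module = import_module(module_path)
    return getattr(module, attr_name)
```

The path must be importable from the Python path of the project that runs the feed, otherwise `import_module` raises `ImportError`.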
Your **feed** also takes **actions**, a path to a callable that returns a list of Django admin actions in the form of functions.
An example action is "Create posts", which takes the selected entries and converts them into Opps Posts.
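As a rough sketch of what such a callable could look like: the function below returns a list with one admin action. Django admin actions are plain functions that receive `(modeladmin, request, queryset)`. The names `create_posts` and `my_actions` are hypothetical, the conversion step is left as a placeholder, and it is an assumption that the callable is invoked without arguments.

```python
def create_posts(modeladmin, request, queryset):
    """Hypothetical admin action: handle the entries selected in the admin."""
    selected = 0
    for entry in queryset:
        # Placeholder: the real action would convert `entry` into an Opps Post.
        selected += 1
    # message_user is the standard ModelAdmin method for showing a notice.
    modeladmin.message_user(request, "%d entries selected for conversion" % selected)


# Label shown in the admin's action dropdown.
create_posts.short_description = "Create posts"


def my_actions():
    """Hypothetical actions callable: returns the list of admin actions."""
    return [create_posts]
```

The feed's **actions** setting would then point at the dotted path of `my_actions`, for example 'yourproject.yourmodule.actions.my_actions'.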
Processor API
=============
feedcrawler provides a **BaseProcessor** class for you to extend; you have to override some of its methods:
    from opps.feedcrawler.processors.base import BaseProcessor


    class MyProcessor(BaseProcessor):
        """
        BaseProcessor.__init__ receives the **feed** object as a parameter:

            def __init__(self, feed, entry_model, *args, **kwargs):
                self.feed = feed
                self.entry_model = entry_model

        Override it if you need to, but be careful.
        """

        def process(self):
            # The **feed** object is available as **self.feed**.
            url = self.feed.source_url
            max_entries = self.feed.max_entries
            ...

            # Example function that fetches and parses the XML feed.
            entries = read_and_parse_rss_feed(url)

            # **self.entry_model** is the model used to create CMS entries.
            for entry in entries:
                # Remember to implement your own logic to avoid duplicates.
                self.entry_model.objects.get_or_create(
                    title=entry['title'],
                    # ...
                )

            # This method should return the count of entries read and created, or 0.
            return len(entries)
The processor above is executed by the management command **manage.py process_feeds -f feed_slug**. You can also schedule this command with **cron** or **celery**.
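The note about avoiding duplicates can be sketched as follows. This is a self-contained illustration, not the real feedcrawler classes: `FakeManager`, `FakeEntry`, `FakeFeed` and `DedupProcessor` are stand-ins so the example runs without Django, and the `link` field is an assumed unique key. The idea is to look entries up by a stable key, pass the remaining fields as `defaults`, and count only the rows that were actually created.

```python
class FakeManager:
    """In-memory stand-in for a Django model manager (illustration only)."""

    def __init__(self):
        self.rows = {}

    def get_or_create(self, defaults=None, **lookup):
        # Same call shape as Django's QuerySet.get_or_create.
        key = tuple(sorted(lookup.items()))
        if key in self.rows:
            return self.rows[key], False
        row = dict(lookup, **(defaults or {}))
        self.rows[key] = row
        return row, True


class FakeEntry:
    objects = FakeManager()


class FakeFeed:
    source_url = "http://site.com/feed.rss"
    max_entries = 3


class DedupProcessor:
    """Mirrors the attributes set by BaseProcessor; not the real base class."""

    def __init__(self, feed, entry_model):
        self.feed = feed
        self.entry_model = entry_model

    def fetch(self):
        # Hypothetical parsed items; a real processor would read self.feed.source_url.
        return [
            {"link": "http://site.com/a", "title": "A"},
            {"link": "http://site.com/a", "title": "A (repeated)"},
            {"link": "http://site.com/b", "title": "B"},
        ]

    def process(self):
        created_count = 0
        for item in self.fetch()[: self.feed.max_entries]:
            # Look up by the stable key; other fields only apply on creation.
            _, created = self.entry_model.objects.get_or_create(
                link=item["link"], defaults={"title": item["title"]}
            )
            created_count += int(created)
        return created_count
```

Running `process()` a second time against the same storage creates nothing new, which is the behaviour you want when the command runs repeatedly from a scheduler.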