API for adding content to the Kolibri content curation server

ricecooker

The ricecooker library is a framework for creating Kolibri content channels and uploading them to Kolibri Studio, which is the central content server that Kolibri applications talk to when they import content.

The Kolibri content pipeline is pictured below:

[Figure: The Kolibri Content Pipeline]

This ricecooker framework is the "main actor" in the first part of the content pipeline, and touches all aspects of the pipeline within the region highlighted in blue in the above diagram.

Before we continue, let's have some definitions:

  • A Kolibri channel is a tree-like data structure that consists of the following content nodes:
    • Topic nodes (folders)
    • Content types:
      • Document (pdf and epub files)
      • Audio (mp3 files)
      • Video (mp4 files and subtitles)
      • HTML5App (h5p files and zip files; a zip file is a generic container for web content: HTML+JS+CSS)
      • Exercises
  • A sushi chef is a Python script that uses the ricecooker library to import content from various sources, organize content into Kolibri channels and upload the channel to Kolibri Studio.
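
To make the "tree-like data structure" concrete, here is a minimal sketch in plain Python. The Node class and the count_by_kind helper are hypothetical stand-ins invented for illustration; they are not part of ricecooker, whose real node classes appear in the chef example further down.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for a channel tree node (not a ricecooker class)."""
    title: str
    kind: str  # e.g. "channel", "topic", "document", "audio", "video"
    children: List["Node"] = field(default_factory=list)

    def add_child(self, node: "Node") -> "Node":
        """Attach a child node and return it so calls can be chained."""
        self.children.append(node)
        return node


def count_by_kind(root: Node) -> Dict[str, int]:
    """Walk the whole tree and count how many nodes of each kind it holds."""
    counts: Dict[str, int] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        counts[node.kind] = counts.get(node.kind, 0) + 1
        stack.extend(node.children)
    return counts


# A channel with one topic (folder) that holds two content nodes.
channel = Node("Potatoes info channel", "channel")
topic = channel.add_child(Node("Potatoes!", "topic"))
topic.add_child(Node("Growing potatoes", "document"))
topic.add_child(Node("Potato harvest song", "audio"))

print(count_by_kind(channel))
```

Topic nodes play the role of folders and may contain further topics or content nodes, which is why a single recursive (here, stack-based) walk reaches every node in the channel.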

Installation

We'll assume you have a Python 3 installation on your computer and are familiar with best practices for working with Python code (e.g. virtualenv or pipenv). If this is not the case, you can consult the Kolibri developer docs as a guide for setting up a Python virtualenv.

The ricecooker library is a standard Python library distributed through PyPI:

  • Run pip install ricecooker to install the library. You can then use import ricecooker in your chef script.
  • Some of the functions in ricecooker.utils require additional software:
    • Make sure you install the command line tool ffmpeg.
    • Running JavaScript code while scraping webpages requires the PhantomJS browser. You can run npm install phantomjs-prebuilt in your chef's working directory.

For more details and install options, see docs/installation.md.
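
As a quick sanity check after installation, a sketch like the following can confirm that ffmpeg is reachable on your PATH. It uses only the Python standard library; the missing_tools helper is a hypothetical name chosen for this example and is not provided by ricecooker.

```python
import shutil


def missing_tools(tools=("ffmpeg",)):
    """Return the command line tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing command line tools:", ", ".join(missing))
else:
    print("All required command line tools were found.")
```

PhantomJS is left out of the default list because an npm install places it under node_modules in the working directory rather than on PATH.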

Simple chef example

This is a sushi chef script that uses the ricecooker library to create a Kolibri channel with a single topic node (Folder), and puts a single PDF content node inside that folder.

#!/usr/bin/env python
from ricecooker.chefs import SushiChef
from ricecooker.classes.nodes import ChannelNode, TopicNode, DocumentNode
from ricecooker.classes.files import DocumentFile
from ricecooker.classes.licenses import get_license


class SimpleChef(SushiChef):
    channel_info = {
        'CHANNEL_TITLE': 'Potatoes info channel',
        'CHANNEL_SOURCE_DOMAIN': '<domain.org>',         # where you got the content (change me!!)
        'CHANNEL_SOURCE_ID': '<unique id for channel>',  # channel's unique id (change me!!)
        'CHANNEL_LANGUAGE': 'en',                        # le_utils language code
        'CHANNEL_THUMBNAIL': 'https://upload.wikimedia.org/wikipedia/commons/b/b7/A_Grande_Batata.jpg', # (optional)
        'CHANNEL_DESCRIPTION': 'What is this channel about?',      # (optional)
    }

    def construct_channel(self, **kwargs):
        channel = self.get_channel(**kwargs)
        potato_topic = TopicNode(title="Potatoes!", source_id="<potatos_id>")
        channel.add_child(potato_topic)
        doc_node = DocumentNode(
            title='Growing potatoes',
            description='An article about growing potatoes on your rooftop.',
            source_id='pubs/mafri-potatoe',
            license=get_license('CC BY', copyright_holder='University of Alberta'),
            language='en',
            files=[DocumentFile(path='https://www.gov.mb.ca/inr/pdf/pubs/mafri-potatoe.pdf',
                                language='en')],
        )
        potato_topic.add_child(doc_node)
        return channel


if __name__ == '__main__':
    """
    Run this script on the command line using:
        python simple_chef.py -v --reset --token=YOURTOKENHERE9139139f3a23232
    """
    simple_chef = SimpleChef()
    simple_chef.main()

Let's assume the above code snippet is saved as the file simple_chef.py.

You can run the chef script by passing the appropriate command line arguments:

python simple_chef.py -v --reset --token=YOURTOKENHERE9139139f3a23232

The most important argument when running a chef script is --token, which passes in the Studio Access Token that you can obtain from your profile's settings page.

The flags -v (verbose) and --reset are generally useful in development: --reset makes the chef script start the process from scratch, and -v displays useful debugging information on the command line.

To see all the ricecooker command line options, run python simple_chef.py -h. For more details about running chef scripts see the chefops page.

If you get an error when running the chef, make sure you've replaced YOURTOKENHERE9139139f3a23232 with the token you obtained from Studio. Also make sure you've changed the values of channel_info['CHANNEL_SOURCE_DOMAIN'] and channel_info['CHANNEL_SOURCE_ID'] instead of using the default values.
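
One way to catch forgotten placeholder values before running the chef is a small pre-flight check like the sketch below. The unchanged_placeholders helper and the PLACEHOLDERS table are hypothetical, written for this example only; the placeholder strings are the ones used in the simple_chef.py listing above.

```python
# Placeholder values copied from the simple_chef.py example above.
PLACEHOLDERS = {
    "CHANNEL_SOURCE_DOMAIN": "<domain.org>",
    "CHANNEL_SOURCE_ID": "<unique id for channel>",
}


def unchanged_placeholders(channel_info):
    """Return the channel_info keys that still hold their example placeholder."""
    return [
        key
        for key, placeholder in PLACEHOLDERS.items()
        if channel_info.get(key) == placeholder
    ]


# Example: the source ID was updated, but the source domain was not.
channel_info = {
    "CHANNEL_TITLE": "Potatoes info channel",
    "CHANNEL_SOURCE_DOMAIN": "<domain.org>",
    "CHANNEL_SOURCE_ID": "potatoes-info-channel",
    "CHANNEL_LANGUAGE": "en",
}

print(unchanged_placeholders(channel_info))
```

A chef script could call such a check at the top of its main block and exit early with a readable message when the returned list is not empty.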

Next steps

  • See the usage docs for more explanations about the above code.
  • See nodes to learn how to create different content node types.
  • See file to learn about the file types supported, and how to create them.

History

0.6.40 (2020-02-07)

  • Changed default behaviour to upload the staging tree instead of the main tree
  • Added --deploy flag to reproduce old behavior (upload to main tree)
  • Added thumbnail generating methods for audio, HTML5, PDF, and ePub nodes. Set derive_thumbnail=True when creating the Node instance, or pass the command line argument --thumbnails to generate thumbnails for all nodes. Note: automatic thumbnail generation will only work if thumbnail is None.

0.6.38 (2019-12-27)

  • Added support for the h5p content kind and h5p file type
  • Removed monkey-patching of localStorage and document.cookie in the helper method download_static_assets
  • Added validation logic for tags
  • Improved error reporting

0.6.36 (2019-09-25)

  • Added support for tags using the JsonChef workflow
  • Added validation step to ensure subtitle files are unique for each language code
  • Documented the new SlidesShow content kind coming in Kolibri 0.13
  • Added docs with detailed instructions for content upload and update workflows
  • Bugfixes to file extension logic and improved error handling around subtitles

0.6.32 (2019-08-01)

  • Updated documentation to use top-level headings
  • Removed support for Python 3.4
  • Removed support for the "sous chef" workflow

0.6.31 (2019-07-01)

  • Handle more subtitle convertible formats: SRT, TTML, SCC, DFXP, and SAMI

0.6.30 (2019-05-01)

  • Updated docs build scripts to make ricecooker docs available on Read the Docs
  • Added corrections command line script for making bulk edits to content metadata
  • Added StudioApi client to support CRUD (create, read, update, delete) Studio actions
  • Added pdf-splitting helper methods (see ricecooker/utils/pdf.py)

0.6.23 (2018-11-08)

  • Updated le-utils and pressurecooker dependencies to latest version
  • Added support for ePub files (EPubFile objects can be added to DocumentNode objects)
  • Added tag support
  • Changed default value for STUDIO_URL to api.studio.learningequality.org
  • Added aggregator and provider fields for content nodes
  • Various bugfixes to image processing in exercises
  • Changed validation logic to use self.filename to check file format is in self.allowed_formats
  • Added is_youtube_subtitle_file_supported_language helper function to support importing youtube subs
  • Added srt2vtt subtitles conversion
  • Added static assets downloader helper method in utils.downloader.download_static_assets
  • Added LineCook chef functions to --generate CSV from directory structure
  • Fixed the always randomize=True bug
  • Docs: general content node metadata guidelines
  • Docs: video compression instructions and helper scripts convertvideo.bat and convertvideo.sh

0.6.17 (2018-04-20)

0.6.15 (2018-03-06)

  • Added support for non-mp4 video files, with auto-conversion using ffmpeg. See git diff b1d15fa 87f2528
  • Added CSV exercises workflow support to LineCook chef class
  • Added --nomonitor CLI argument to disable sushibar functionality
  • Defined new ENV variables:
    • PHANTOMJS_PATH: set this to a phantomjs binary (instead of assuming one in node_modules)
    • STUDIO_URL (alias CONTENTWORKSHOP_URL): set to URL of Kolibri Studio server where to upload files
  • Various fixes to support sushi chefs
  • Removed minimize_html_css_js utility function from ricecooker/utils/html.py to remove dependency on css_html_js_minify and support Py3.4 fully.

0.6.9 (2017-11-14)

  • Changed default logging level to --verbose
  • Added support for cronjobs scripts via --cmdsock (see docs/daemonization.md)
  • Added tools for creating HTML5Zip files in utils/html_writer.py
  • Added utility for downloading HTML with optional js support in utils/downloader.py
  • Added utils/path_builder.py and utils/data_writer.py for creating souschef archives (zip archive that contains files in a folder hierarchy + Channel.csv + Content.csv)

0.6.7 (2017-10-04)

  • Sibling content nodes are now required to have unique source_id
  • The field copyright_holder is required for all licenses other than public domain

0.6.6 (2017-09-29)

  • Added JsonTreeChef class for creating channels from ricecooker json trees
  • Added LineCook chef class to support souschef-based channel workflows

0.6.4 (2017-08-31)

  • Added language attribute for ContentNode (string key in internal repr. defined in le-utils)
  • Made language a required attribute for ChannelNode
  • Enabled sushibar.learningequality.org progress monitoring by default. Set the SUSHIBAR_URL env. var to control where progress is reported (e.g. http://localhost:8001)
  • Updated le-utils and pressurecooker dependencies to latest

0.6.2 (2017-07-07)

  • Clarify ricecooker is Python3 only (for now)
  • Use https:// and wss:// for SushiBar reporting

0.6.0 (2017-06-28)

  • Remote progress reporting and logging to SushiBar (MVP version)
  • New API based on the SushiChef classes
  • Support existing old-API chefs in compatibility mode

0.5.13 (2017-06-15)

  • Last stable release before SushiBar functionality was added
  • Renamed --do-not-activate argument to --stage

0.1.0 (2016-09-30)

  • First release on PyPI.
