Python MediaWiki Bot Framework

Project description

Pywikibot

The Pywikibot framework is a Python library that interfaces with the MediaWiki API version 1.23 or higher.

Also included are various general function scripts that can be adapted for different tasks.

For further information about the library excluding scripts see the full code documentation.

Quick start

pip install requests
git clone https://gerrit.wikimedia.org/r/pywikibot/core.git
cd core
git submodule update --init
python pwb.py script_name

Alternatively, install from PyPI (excluding scripts):

pip install -U setuptools
pip install pywikibot
pwb <scriptname>

In addition, a MediaWiki markup parser is required. Please install one of the following:

pip install mwparserfromhell

or

pip install wikitextparser

Our installation guide has more details for advanced usage.

Basic Usage

If you wish to write your own script it’s very easy to get started:

import pywikibot
site = pywikibot.Site('en', 'wikipedia')  # The site we want to run our bot on
page = pywikibot.Page(site, 'Wikipedia:Sandbox')
page.text = page.text.replace('foo', 'bar')
page.save('Replacing "foo" with "bar"')  # Saves the page

Wikibase Usage

Wikibase is flexible knowledge base software that drives Wikidata. A sample Pywikibot script for getting data from Wikibase:

import pywikibot
site = pywikibot.Site('wikipedia:en')
repo = site.data_repository()  # the Wikibase repository for given site
page = repo.page_from_repository('Q91')  # create a local page for the given item
item = pywikibot.ItemPage(repo, 'Q91')  # a repository item
data = item.get()  # get all item data from repository for this item

Script example

Pywikibot provides bot classes to develop your own script easily:

import pywikibot
from pywikibot import pagegenerators
from pywikibot.bot import ExistingPageBot

class MyBot(ExistingPageBot):

    update_options = {
        'text': 'This is a test text',
        'summary': 'Bot: a bot test edit with Pywikibot.'
    }

    def treat_page(self):
        """Load the given page, do some changes, and save it."""
        text = self.current_page.text
        text += '\n' + self.opt.text
        self.put_current(text, summary=self.opt.summary)

def main(*args):
    """Parse command line arguments and invoke bot."""
    options = {}
    gen_factory = pagegenerators.GeneratorFactory()
    # Option parsing
    local_args = pywikibot.handle_args(args)  # global options
    local_args = gen_factory.handle_args(local_args)  # generators options
    for arg in local_args:
        opt, sep, value = arg.partition(':')
        if opt in ('-summary', '-text'):
            options[opt[1:]] = value
    MyBot(generator=gen_factory.getCombinedGenerator(), **options).run()

if __name__ == '__main__':
    main()
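
The option loop in main() relies on str.partition to split each remaining argument at its first colon. The following is a minimal standalone sketch of that step using only the standard library. The helper name parse_local_args and the sample arguments are made up for illustration and are not part of Pywikibot.

```python
def parse_local_args(local_args):
    """Collect -summary and -text values the way the loop in main() does."""
    options = {}
    for arg in local_args:
        # partition() splits at the first ':' only, so a value may itself
        # contain further colons.
        opt, sep, value = arg.partition(':')
        if opt in ('-summary', '-text'):
            options[opt[1:]] = value  # drop the leading '-'
    return options


# Hypothetical command line arguments left over after global options.
sample = ['-summary:Bot: test edit', '-text:Hello', '-unknown:ignored']
print(parse_local_args(sample))
```

Arguments that are not recognised are silently skipped here, which mirrors the example script; a real bot might prefer to report them.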

For more documentation on Pywikibot see our docs.

Required external programs

Pywikibot may require the following programs to function properly:

  • 7za: To extract 7z files

Roadmap

Current release changes

Improvements

  • i18n updates for date.py

  • Add number transliteration of ‘lo’, ‘ml’, ‘pa’, ‘te’ to NON_LATIN_DIGITS

  • Detect range blocks with Page.is_blocked() method (T301282)

  • to_latin_digits() function was added to textlib as counterpart of to_local_digits() function

  • api.Request.submit now handles search-title-disabled and search-text-disabled API Errors

  • A show_diff parameter was added to Page.put() and Page.change_category()

  • Allow categories when saving IndexPage (T299806)

  • Add a new function case_escape to textlib

  • Support inheritance of the __STATICREDIRECT__

  • Avoid non-deterministic behavior in removeDisableParts

  • Update isbn dependency and require python-stdnum >= 1.17

  • Synchronize Page.linkedPages() parameters with Site.pagelinks() parameters

  • Scripts hash bang was changed from python to python3

  • i18n.bundles(), i18n.known_languages and i18n._get_bundle() functions were added

  • Raise ConnectionError immediately if urllib3.NewConnectionError occurs (T297994, T298859)

  • Make pywikibot messages available with site package (T57109, T275981)

  • Add support for API:Redirects

  • Enable shell script with Pywikibot site package

  • Enable generate_user_files.py and generate_family_file with site-package (T107629)

  • Add support for Python 3.11

  • Pywikibot supports PyPy 3 (T101592)

  • A new method User.is_locked() was added to determine whether the user is currently locked globally (T249392)

  • A new method APISite.is_locked() was added to determine whether a given user or user id is locked globally (T249392)

  • APISite.get_globaluserinfo() method was added to retrieve globaluserinfo for any user or user id (T163629)

  • APISite.globaluserinfo attribute may be deleted to force reload

  • APISite.is_blocked() method has a force parameter to reload that info

  • Allow family files in base_dir by default

  • Make pwb wrapper script a pywikibot entry point for scripts (T139143, T270480)

  • Enable -version and --version with pwb wrapper or code entry point (T101828)

  • Add title_delimiter_and_aliases attribute to family files to support WikiHow family (T294761)

  • BaseBot has a public collections.Counter for reading, writing and skipping a page

  • Upload: Retry upload if ‘copyuploadbaddomain’ API error occurs (T294825)

  • Update invisible characters from unicodedata 14.0.0

  • Add support for Wikimedia OCR engine with proofreadpage

  • Rewrite tools.intersect_generators, which makes it run up to 10,000 times faster (T85623, T293276)

  • The cached output functionality from compat release was re-implemented (T151727, T73646, T74942, T132135, T144698, T196039, T280466)

  • L10N updates

  • Adjust groupsize within pagegenerators.PreloadingGenerator (T291770)

  • New “maxlimit” property was added to APISite (T291770)
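
The digit transliteration items above (to_latin_digits() as the counterpart of to_local_digits(), and the new ‘lo’, ‘ml’, ‘pa’, ‘te’ entries) can be illustrated with a small standard-library sketch. This is not the textlib implementation: the function names carry a _sketch suffix, and the ZERO_CODE_POINTS table is an assumption built from the Unicode code points of each script's digit zero.

```python
# Assumed table for this sketch: code point of the digit zero per language.
ZERO_CODE_POINTS = {
    'lo': 0x0ED0,  # Lao
    'ml': 0x0D66,  # Malayalam
    'pa': 0x0A66,  # Gurmukhi (Punjabi)
    'te': 0x0C66,  # Telugu
}


def local_digits(lang):
    """Return the ten digits 0-9 of the given language as one string."""
    return ''.join(chr(ZERO_CODE_POINTS[lang] + d) for d in range(10))


def to_local_digits_sketch(phrase, lang):
    """Replace Latin digits in phrase with the digits used by lang."""
    table = str.maketrans('0123456789', local_digits(lang))
    return str(phrase).translate(table)


def to_latin_digits_sketch(phrase, langs=None):
    """Replace digits of the given languages (default: all) with 0-9."""
    table = {}
    for lang in (langs or ZERO_CODE_POINTS):
        table.update(str.maketrans(local_digits(lang), '0123456789'))
    return phrase.translate(table)


localized = to_local_digits_sketch('2022-03', 'te')
print(localized, '->', to_latin_digits_sketch(localized))
```

Both directions use str.translate, so characters that are not digits pass through unchanged.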

Bugfixes

  • Don’t raise an exception if BlockEntry initializer found a hidden title (T78152)

  • Fix KeyError in create_warnings_list (T301610)

  • Enable similar script call of pwb.py on toolforge (T298846)

  • Remove question mark character from forbidden file name characters (T93482)

  • Enable -interwiki option with pagegenerators (T57099)

  • Don’t assert login result (T298761)

  • Allow title placeholder $1 in the middle of an url (T111513, T298078)

  • Don’t create a Site object if pywikibot is not fully imported (T298384)

  • Use page.site.data_repository when creating a _WbDataPage (T296985)

  • Fix mysql AttributeError for sock.close() on toolforge (T216741)

  • Only search user_script_paths inside config.base_dir (T296204)

  • pywikibot.argv has been fixed for pwb.py wrapper if called with global args (T254435)

  • Only ignore FileExistsError when creating the api cache (T295924)

  • Only handle query limit if query module is limited (T294836)

  • Upload: Only set filekey/offset for files with names (T294916)

  • Make site parameter of textlib.replace_links() mandatory (T294649)

  • Raise a generic ServerError if the http status code is unofficial (T293208)

Breaking changes

  • Support of Python 3.5.0 - 3.5.2 has been dropped (T286867)

  • generate_user_files.py, generate_family_file.py, shell.py and version.py were moved to pywikibot/scripts and must be used with pwb wrapper script

  • See also Code cleanups below

Code cleanups

  • Deprecated http.get_fake_user_agent() function was removed

  • FilePage.fileIsShared() was removed in favour of FilePage.file_is_shared()

  • Page.canBeEdited() was removed in favour of Page.has_permission()

  • BaseBot.stop() method was removed in favour of BaseBot.generator.close()

  • showHelp() function was removed in favour of show_help()

  • CombinedPageGenerator pagegenerator was removed in favour of itertools.chain

  • Remove deprecated echo.Notification.id

  • Remove APISite.newfiles() method (T168339)

  • Remove APISite.page_exists() method

  • Raise a TypeError if BaseBot.init_page returns None

  • Remove private upload parameters in UploadRobot.upload_file(), FilePage.upload() and APISite.upload() methods

  • Raise an Error exception if ‘titles’ is still used as where parameter in Site.search()

  • Deprecated version.get_module_version() function was removed

  • Deprecated setOptions/getOptions OptionHandler methods were removed

  • Deprecated from_page() method of CosmeticChangesToolkit was removed

  • Deprecated diff attribute of CosmeticChangesToolkit was removed in favour of show_diff

  • Deprecated namespace and pageTitle parameter of CosmeticChangesToolkit were removed

  • Remove deprecated BaseSite namespace shortcuts

  • Remove deprecated Family.get_cr_templates method in favour of Site.category_redirects()

  • Remove deprecated Page.put_async() method (T193494)

  • Ignore baserevid parameter for several DataSite methods

  • Remove deprecated preloaditempages method

  • Remove disable_ssl_certificate_validation kwargs in http functions in favour of verify parameter (T265206)

  • Deprecated PYWIKIBOT2 environment variables were removed

  • version.ParseError was removed in favour of exceptions.VersionParseError

  • specialbots.EditReplacement and specialbots.EditReplacementError were removed in favour of exceptions.EditReplacementError

  • site.PageInUse exception was removed in favour of exceptions.PageInUseError

  • page.UnicodeToAsciiHtml and page.unicode2html were removed in favour of tools.chars.string_to_ascii_html and tools.chars.string2html

  • interwiki_graph.GraphImpossible and login.OAuthImpossible exceptions were removed in favour of ImportError

  • i18n.TranslationError was removed in favour of exceptions.TranslationError

  • WikiaFamily was removed in favour of FandomFamily

  • data.api exceptions were removed in favour of exceptions module

  • cosmetic_changes CANCEL_ALL/PAGE/METHOD/MATCH constants were removed in favour of CANCEL enum

  • pywikibot.__release__ was removed in favour of pywikibot.__version__

  • TextfilePageGenerator was replaced by TextIOPageGenerator

  • PreloadingItemGenerator was replaced by PreloadingEntityGenerator

  • DuplicateFilterPageGenerator was replaced by tools.filter_unique

  • ItemPage.concept_url method was replaced by ItemPage.concept_uri

  • Outdated parameter names have been dropped

  • Deprecated pywikibot.Error exceptions were removed in favour of pywikibot.exceptions.Error classes (T280227)

  • Deprecated exception identifiers were removed (T280227)

  • Deprecated date.FormatDate class was removed in favour of date.format_date function

  • language_by_size property of wowwiki Family was removed in favour of codes attribute

  • availableOptions was removed in favour of available_options

  • config2 was removed in favour of config

  • tools.RotatingFileHandler was removed in favour of logging.handlers.RotatingFileHandler

  • tools.DotReadableDict, tools.LazyRegex and tools.DeprecatedRegex classes were removed

  • tools.frozenmap was removed in favour of types.MappingProxyType

  • tools.empty_iterator() was removed in favour of iter(())

  • tools.concat_options() function was removed in favour of bot_choice.Option

  • tools.is_IP was removed in favour of tools.is_ip_address()

  • textlib.unescape() function was removed in favour of html.unescape()

  • APISite.deletepage() and APISite.deleteoldimage() methods were removed in favour of APISite.delete()

  • APISite.undeletepage() and APISite.undelete_file_versions() were removed in favour of APISite.undelete() method
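
Several of the removals above point to standard-library replacements. The following sketch shows how those replacements are used on their own. The sample values (page titles, settings, the escaped string) are invented for illustration, and plain strings stand in for page objects.

```python
import html
import itertools
import types

# CombinedPageGenerator -> itertools.chain: iterate several generators in turn.
first = iter(['Page A', 'Page B'])
second = iter(['Page C'])
combined = list(itertools.chain(first, second))

# tools.frozenmap -> types.MappingProxyType: a read-only view of a dict.
settings = types.MappingProxyType({'family': 'wikipedia', 'code': 'en'})

# tools.empty_iterator() -> iter(()): an iterator that yields nothing.
empty = iter(())

# textlib.unescape() -> html.unescape(): decode HTML character references.
plain = html.unescape('Tom &amp; Jerry &lt;3')

print(combined)
print(settings['code'])
print(plain)
```

Assigning to a key of the MappingProxyType raises a TypeError, which is the read-only behaviour code that relied on a frozen mapping would expect.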

Deprecations

  • 7.0.0: The i18n identifier ‘cosmetic_changes-append’ will be removed in favour of ‘pywikibot-cosmetic-changes’

  • 7.0.0: User.isBlocked() method is renamed to is_blocked for consistency

  • 7.0.0: Require mysql >= 0.7.11 (T216741)

  • 7.0.0: Private BaseBot counters _treat_counter, _save_counter, _skip_counter will be removed in favour of collections.Counter counter attribute

  • 7.0.0: A boolean watch parameter in Page.save() is deprecated and will be desupported

  • 7.0.0: baserevid parameter of editSource(), editQualifier(), removeClaims(), removeSources(), remove_qualifiers() DataSite methods will be removed

  • 7.0.0: Values of APISite.allpages() parameter filterredir other than True, False and None are deprecated

  • 6.5.0: OutputOption.output() method will be removed in favour of OutputOption.out property

  • 6.5.0: Infinite rotating file handler with logfilecount of -1 is deprecated

  • 6.4.0: ‘allow_duplicates’ parameter of tools.intersect_generators as positional argument is deprecated, use keyword argument instead

  • 6.4.0: ‘iterables’ of tools.intersect_generators given as a list or tuple is deprecated, either use consecutive iterables or use ‘*’ to unpack

  • 6.2.0: outputter of OutputProxyOption without out property is deprecated

  • 6.2.0: ContextOption.output_range() and HighlightContextOption.output_range() are deprecated

  • 6.2.0: Error messages with ‘%’ style are deprecated in favour of str.format() style

  • 6.2.0: page.url2unicode() function is deprecated in favour of tools.chars.url2string()

  • 6.2.0: Throttle.multiplydelay attribute is deprecated

  • 6.2.0: SequenceOutputter.format_list() is deprecated in favour of ‘out’ property

  • 6.0.0: config.register_family_file() is deprecated

  • 5.5.0: APISite.redirectRegex() is deprecated in favour of APISite.redirect_regex()

  • 4.0.0: Revision.parent_id is deprecated in favour of Revision.parentid

  • 4.0.0: Revision.content_model is deprecated in favour of Revision.contentmodel
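
The two 6.4.0 deprecations for tools.intersect_generators concern the calling convention: pass the iterables as consecutive arguments (or unpack a list with ‘*’) and pass allow_duplicates by keyword. The sketch below uses a hypothetical stand-in, intersect_sketch, to show that convention without requiring Pywikibot. Its set-based logic and its handling of duplicates are simplifications chosen for this example and are not the library's implementation.

```python
def intersect_sketch(*iterables, allow_duplicates=False):
    """Yield items of the first iterable that occur in all other iterables.

    Hypothetical stand-in for illustration only. Items must be hashable.
    """
    if not iterables:
        return
    first, *rest = iterables
    others = [set(it) for it in rest]
    seen = set()
    for item in first:
        if not all(item in other for other in others):
            continue
        # Without allow_duplicates each common item is yielded only once.
        if allow_duplicates or item not in seen:
            seen.add(item)
            yield item


groups = [[1, 2, 2, 3], [2, 3, 4], [2, 3, 5]]

# Unpack the list with '*' instead of passing it as a single argument.
print(list(intersect_sketch(*groups)))

# allow_duplicates is keyword-only here, so it cannot be given positionally.
print(list(intersect_sketch(*groups, allow_duplicates=True)))
```

Declaring the flag after *iterables makes it keyword-only in Python, so a stray positional True would be treated as another iterable rather than as the flag.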

Release history

See https://github.com/wikimedia/pywikibot/blob/stable/HISTORY.rst

Contributing

Our code is maintained on Wikimedia’s Gerrit installation; learn how to get started.

Code of Conduct

The development of this software is covered by a Code of Conduct.


Download files

Source Distribution

pywikibot-7.0.0.tar.gz (557.9 kB)
