A base library for writing your own log scraper, i.e. something that runs regexes over files and gives you meaningful information such as stats. Add your own regexes and it's plug and play. See the README for more information.


LogScraper
==========

A generic library for gathering stats from log files by running regexes
on them. Things you can do:

- Create and run any number of regexes on any number of files in parallel
- Aggregate stats by creating named regex groups in your regexes
- Grab archived logs (so long as you tell it where your archives live)
- Grab files from remote boxes
- Print stats to console
- Print regex matches to console
- Search on gzipped files

Installation
------------

The easiest manner of installation is to grab the package from the PyPI
repository.

::

    pip install log_scraper

Usage
-----

Base Usage
^^^^^^^^^^

For off-the-cuff usage, you can just create a ``LogScraper`` object and
tell it which regexes to run and where to look for files, e.g.:

::

    from log_scraper.base import LogScraper
    import log_scraper.consts as LSC

    filepath = '/path/to/file'
    filename = 'filename.ext'
    scraper = LogScraper(default_filepath={LSC.DEFAULT_PATH: filepath,
                                           LSC.DEFAULT_FILENAME: filename})
    scraper.add_regex(name='regex1', pattern=r'your_regex_here')

    # To get aggregated stats
    data = scraper.get_log_data()

    # To print all the stats
    scraper.print_total_stats(data)

    # To print each file's individual stats
    scraper.print_stats_per_file(data)

    # To view log lines matching the regex
    scraper.view_regex_matches(scraper.get_regex_matches())
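
Named regex groups are what drive the stat aggregation. As a minimal
sketch (the log format and the ``level`` group name below are
hypothetical, not part of the library), a pattern like this lets the
scraper bucket match counts by group value:

::

    # Hypothetical pattern: bucket matches by log level. The named
    # group 'level' is what the scraper aggregates on.
    scraper.add_regex(name='log_levels',
                      pattern=r'^\[(?P<level>ERROR|WARN|INFO)\]')

    # The aggregated stats then break down per group value, e.g. how
    # many ERROR vs. WARN vs. INFO lines were matched.
    data = scraper.get_log_data()
    scraper.print_total_stats(data)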

The real power, though, lies in creating your own class derived from
``LogScraper`` that presets the paths and the regexes to run, so that
anyone can then use it anywhere to mine data from a process's logs.
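
For example, here is a minimal sketch of such a subclass (the class
name, paths, and regex are hypothetical; only the constructor arguments
and ``add_regex`` mirror the usage shown above):

::

    from log_scraper.base import LogScraper
    import log_scraper.consts as LSC

    class MyAppScraper(LogScraper):
        """Preset scraper for a hypothetical app's logs."""

        def __init__(self):
            super(MyAppScraper, self).__init__(
                default_filepath={LSC.DEFAULT_PATH: '/var/log/myapp',
                                  LSC.DEFAULT_FILENAME: 'myapp.log'})
            # Preset the regexes so every caller gets them for free.
            self.add_regex(name='errors', pattern=r'ERROR (?P<code>\d+)')

Anyone can then instantiate ``MyAppScraper`` and call ``get_log_data()``
on it without needing to know where the logs live or what to look for.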

Development
-----------

Dependencies
~~~~~~~~~~~~

- Python 2.7
- `paramiko <http://paramiko-www.readthedocs.org/en/latest/index.html>`_

Testing
~~~~~~~

To test successfully, you must set up a virtual environment. On Unix, in
the root folder for the package, do the following:

::

    python -m virtualenv .
    source ./bin/activate
    ./bin/python setup.py develop

Now you can make any changes you want and then run the unit-tests by
doing:

::

    ./bin/python setup.py test
