semgrep

Fast and syntax-aware semantic code pattern search for many languages: like grep but for code

These details have been verified by PyPI

Maintainers

brendongo r2c r2c-drewdennison underyx

These details have not been verified by PyPI

Project links

Homepage

Project description

Lightweight static analysis for many languages.
Find and block bug variants with rules that look like source code.

Installation · Motivation · Overview · Usage
Resources · Contributing · Commercial Support

Semgrep is a command-line tool for offline static analysis. Use pre-built or custom rules to enforce code and security standards in your codebase. You can try it now with our interactive live editor.

Semgrep combines the convenient and iterative style of grep with the powerful features of an Abstract Syntax Tree (AST) matcher and limited dataflow. Easily find function calls, class or method definitions, and more without having to understand ASTs or wrestle with regexes.

Visit Installation and Usage to get started.

Installation

Want to skip installation? You can run Semgrep via our interactive live editor at semgrep.live.

On macOS, binaries are available via Homebrew:

$ brew install semgrep

On Ubuntu, an install script is available with each release:

$ ./semgrep-v0.12.0-ubuntu-generic.sh

To try Semgrep without installation, you can also run it via Docker:

$ docker run --rm -v "${PWD}:/home/repo" returntocorp/semgrep --help

See Usage to learn about running pre-built rules and writing custom ones.

Motivation

Semgrep exists because:

Insecure code is easy to write
The future of security involves automatically guiding developers towards a “paved road” made of default-safe frameworks (e.g. React or object-relational mappers)
grep isn’t expressive enough and traditional static analysis tools (SAST) are too complicated/slow for paved road automation

The AppSec, Developer, and DevOps communities deserve a static analysis tool that is fast, easy to use, code-aware, multi-lingual, and open source!

Overview

Semgrep is optimized for:

Speed: Fast enough to run on every build, commit, or file save
Finding bugs that matter: Run your own specialized rules or choose OWASP Top 10 checks from the Semgrep Registry. Rules match source code at the Abstract Syntax Tree (AST) level, unlike regexes, which match strings and aren't semantically aware.
Ease of customization: Rules look like the code you’re searching, no static analysis PhD required. They don't require compiled code, only source, reducing iteration time.
Ease of integration: Semgrep is highly portable, and many CI and git-hook integrations already exist. Use the --json flag to pipe results into your existing systems.
Polyglot environments: Don't learn and maintain multiple tools for your polyglot environment (e.g. ESLint, find-sec-bugs, RuboCop, Gosec). Use the same syntax and concepts independent of language.

Language Support

Python	JavaScript	Go	Java	C	JSON	Ruby	OCaml	TypeScript	PHP
✅	✅	✅	✅	✅	✅	🚧	🚧	Coming...	Coming...

Missing support for a language? Let us know by filing a ticket, joining our Slack, or emailing support@r2c.dev.

Pattern Syntax Teaser

One of the most useful things about Semgrep is how easy it is to write and iterate on queries.

The goal is to make it as easy as possible to go from an idea in your head to finding the code patterns you have in mind.

Example: Say you want to find all calls to a function named exec, and you don't care about the arguments. With Semgrep, you could simply supply the pattern exec(...) and you'd match:

# Simple cases grep finds
exec("ls")
exec(some_var)

# But you don't have to worry about whitespace
exec (foo)

# Or calls across multiple lines
exec (
    bar
)

Importantly, Semgrep would not match the following:

# grep would match this, but Semgrep ignores it because
# it doesn't have the right function name
other_exec(bar)

# Semgrep ignores commented out lines
# exec(foo)

# and hard-coded strings
print("exec(bar)")

Semgrep will even match aliased imports:

# Semgrep knows that safe_function refers to exec so it
# will still match!
#   Oof, try finding this with grep
import exec as safe_function
safe_function(tricksy)

Play with this example in your browser here, or copy the above code into a file locally (exec.py) and run:

$ semgrep -l python -e "exec(...)" /path/to/exec.py
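To see why matching on the syntax tree skips comments, strings, and similarly named functions, here is a minimal sketch using Python's standard `ast` module. It is an illustration of the idea only, not how Semgrep is implemented; the helper name `find_calls` is hypothetical, and unlike Semgrep it does not resolve aliased imports.

```python
import ast
from typing import List

# Sample input: two real calls to exec, plus lookalikes that grep would hit.
SOURCE = '''
exec("ls")
exec (
    bar
)
other_exec(bar)
# exec(foo)
print("exec(bar)")
'''


def find_calls(source: str, name: str) -> List[int]:
    """Return the line numbers of calls whose callee is the bare name `name`."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == name
    )


print(find_calls(SOURCE, "exec"))  # prints [2, 3]
```

The comment and the string literal never become call nodes in the tree, so they cannot match, and whitespace or line breaks inside the call make no difference.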

More example patterns:

Pattern	Matches
`$X == $X`	`if (node.id == node.id): ...`
`requests.get(..., verify=False, ...)`	`requests.get(url, timeout=3, verify=False)`
`os.system(...)`	`from os import system; system('echo semgrep')`
`$ELEMENT.innerHTML`	el.innerHTML = "<img src='x' onerror='alert(`XSS`)'>";
`$TOKEN.SignedString([]byte("..."))`	`ss, err := token.SignedString([]byte("HARDCODED KEY"))`

→ see more example patterns in the Semgrep Registry.

For more info on what you can do in patterns, see the pattern features docs.
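The `$X == $X` pattern in the table relies on a metavariable binding to the same expression on both sides. As a rough illustration of that idea (not Semgrep's matching engine), the sketch below uses Python's `ast` module to flag equality comparisons whose two operands are structurally identical. The function name `self_comparisons` is hypothetical.

```python
import ast
from typing import List


def self_comparisons(source: str) -> List[int]:
    """Return line numbers of `a == a` style comparisons."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Compare)
            and len(node.ops) == 1
            and isinstance(node.ops[0], ast.Eq)
            # ast.dump omits line/column info by default, so two
            # structurally equal expressions produce the same string.
            and ast.dump(node.left) == ast.dump(node.comparators[0])
        ):
            hits.append(node.lineno)
    return sorted(hits)


code = "if node.id == node.id:\n    pass\nif a == b:\n    pass\n"
print(self_comparisons(code))  # prints [1]
```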

Usage

Semgrep supports three primary workflows:

Run pre-built rules
Writing custom rules
Run Semgrep continuously in CI

The following sections cover each in more detail.

Run Pre-Built Rules

The easiest way to get started with Semgrep (other than semgrep.live) is to scan your code with pre-built rules.

The Semgrep Registry contains rules for many programming errors, including security issues and correctness bugs. Security rules are annotated with CWE and OWASP metadata when applicable. OWASP rule coverage per language is displayed below.

You can use pre-built Rule Packs that contain sets of rules grouped by language and/or framework:

$ semgrep --config=https://semgrep.live/c/p/java
$ semgrep --config=https://semgrep.live/c/p/python
$ semgrep --config=https://semgrep.live/c/p/golang
$ semgrep --config=https://semgrep.live/c/p/javascript
...

Or you can run all of Semgrep's default rules for all languages as appropriate (note: each rule says what language it's for, so Semgrep won't try to run a Python rule on Java code).

$ semgrep --config=r2c

You can also run a specific rule or group of rules:

# Run a specific rule
$ semgrep --config=https://semgrep.live/c/r/java.spring.security.audit.cookie-missing-samesite

# Run a set of rules
$ semgrep --config=https://semgrep.live/c/r/java.spring.security

All public Semgrep rules can be viewed on the Registry, which pulls the rules from YAML files defined in the semgrep-rules GitHub repo.

Here are some sample vulnerable repos to test on:

Writing Custom Rules

One of the strengths of Semgrep is how easy it is to write rules.

This makes it possible to:

Quickly port rules from other tools.
Think of an interesting code pattern, and then find instances of it in your code.
Find code base or org-specific bugs and antipatterns - things that built-in checks for existing tools won't find because they're unique to you.
and more!

Simple Rules

For iterating on simple patterns, you can use the --lang and --pattern flags.

$ semgrep --lang javascript --pattern 'eval(...)' path/to/file.js

The --lang flag tells Semgrep which language you're targeting and --pattern is the code pattern to search for.
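If you want to drive these one-off searches from a script, a small helper can assemble the command line. The sketch below only builds and prints the argument list, using the `--lang`, `--pattern`, and `--json` flags mentioned in this README; the helper name `build_command` is hypothetical, and it assumes `semgrep` is on your PATH when you actually run the result.

```python
import shlex
from typing import List


def build_command(lang: str, pattern: str, target: str,
                  json_output: bool = False) -> List[str]:
    """Assemble a semgrep invocation as an argument list (no shell quoting needed)."""
    cmd = ["semgrep", "--lang", lang, "--pattern", pattern]
    if json_output:
        cmd.append("--json")
    cmd.append(target)
    return cmd


cmd = build_command("javascript", "eval(...)", "path/to/file.js")
# Show the equivalent shell command, quoted safely for copy and paste.
print(" ".join(shlex.quote(part) for part in cmd))
```

Passing the list to `subprocess.run(cmd)` avoids shell-quoting problems with patterns that contain parentheses or dollar signs.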

Advanced Rules

Some rules cannot be expressed in a single pattern. For example, you may want to say: X must match AND Y must match too, or X but NOT Y, or X must occur inside a block of code that Y matches.

For these cases, Semgrep has a more powerful and flexible YAML syntax.

You can run a single rule or directory of rules specified in YAML by:

$ semgrep --config my_rule.yml path/to/dir_or_file

$ semgrep --config yaml_dir/ path/to/dir_or_file

Example Advanced Rule

Say you are building a financial trading application in which every Transaction object must first be passed to verify_transaction() before being passed to make_transaction(), or it's a business logic bug.

You can express this behavior with the following Semgrep YAML pattern:

rules:
- id: find-unverified-transactions
  patterns:
    - pattern: |
        public $RETURN $METHOD(...){
            ...
            make_transaction($T);
            ...
        }
    - pattern-not: |
        public $RETURN $METHOD(...){
            ...
            verify_transaction($T);
            ...
            make_transaction($T);
            ...
        }
  message: |
    In $METHOD, there's a call to make_transaction() without first calling verify_transaction() on the Transaction object.

$RETURN, $METHOD, and $T are metavariables, an abstraction that Semgrep provides when you want to match something but you don't know exactly what it is ahead of time.
- You can think of metavariables like a capture group in regular expressions.
The pattern clause defines what we're looking for: any method that calls make_transaction().
The pattern-not clause filters out matches we don't want; in this case, methods where a transaction ($T) is passed to verify_transaction() before make_transaction().
The message is what's returned in Semgrep output, either to STDOUT or as a comment on the pull request on GitHub or other systems.
- Note that metavariables can be used to customize messages and make them contextually relevant. Here we're helpfully telling the user the method where we've identified the bug.

You can play with this transaction example here: https://semgrep.live/4b4g.
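To make the "pattern minus pattern-not" logic concrete, here is a loose Python analogue of the rule above, written with the standard `ast` module. It is a sketch under simplifying assumptions, not Semgrep's algorithm: it checks Python functions rather than Java methods, orders calls by source position, and ignores control flow. The name `unverified_functions` is hypothetical.

```python
import ast
from typing import List

SOURCE = '''
def good(t):
    verify_transaction(t)
    make_transaction(t)

def bad(t):
    make_transaction(t)
'''


def unverified_functions(source: str) -> List[str]:
    """Names of functions calling make_transaction(x) with no earlier verify_transaction(x)."""
    flagged = []
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, ast.FunctionDef):
            continue
        # Keep only simple calls of the form name(variable).
        calls = [
            n for n in ast.walk(func)
            if isinstance(n, ast.Call)
            and isinstance(n.func, ast.Name)
            and len(n.args) == 1
            and isinstance(n.args[0], ast.Name)
        ]
        # ast.walk is breadth-first, so sort into source order explicitly.
        calls.sort(key=lambda n: (n.lineno, n.col_offset))
        verified = set()
        for call in calls:
            arg = call.args[0].id
            if call.func.id == "verify_transaction":
                verified.add(arg)
            elif call.func.id == "make_transaction" and arg not in verified:
                flagged.append(func.name)
                break
    return flagged


print(unverified_functions(SOURCE))  # prints ['bad']
```

The variable name tracked in `verified` plays the role of the `$T` metavariable: the verify and make calls only cancel out when they refer to the same thing.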

Learn More

See the pattern features docs for more info and examples on the flexibility and power of Semgrep patterns.
See the YAML configuration file docs for details on all of the keys that can be used and how they work.

Run Semgrep Continuously in CI

Semgrep can be run via CLI or Docker and output results as JSON (via the --json flag), so it can be inserted into any CI pipeline and have its results processed by whatever tools you're using.
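As a sketch of what "processing the results" might look like, the snippet below turns `--json` output into one line per finding. The field names (`results`, `check_id`, `path`, `start.line`, `extra.message`) are assumptions about the JSON layout and may differ between Semgrep versions, so the code falls back to placeholders when a key is missing. The sample data and the function name `summarize` are made up for illustration.

```python
import json
from typing import List

# Hypothetical sample shaped like Semgrep's JSON output (assumed layout).
SAMPLE = '''
{"results": [
  {"check_id": "find-unverified-transactions",
   "path": "src/Trade.java",
   "start": {"line": 12, "col": 5},
   "end": {"line": 12, "col": 30},
   "extra": {"message": "unverified transaction"}}
 ],
 "errors": []}
'''


def summarize(raw: str) -> List[str]:
    """Format each finding as 'path:line: [rule] message'."""
    report = json.loads(raw)
    lines = []
    for finding in report.get("results", []):
        path = finding.get("path", "?")
        line = finding.get("start", {}).get("line", "?")
        rule = finding.get("check_id", "?")
        message = finding.get("extra", {}).get("message", "").strip()
        lines.append(f"{path}:{line}: [{rule}] {message}")
    return lines


for entry in summarize(SAMPLE):
    print(entry)
```

In a pipeline you would replace `SAMPLE` with the captured standard output of the Semgrep run and, for example, fail the build when the returned list is not empty.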

Semgrep is aware of diffs, so it can report only findings that occur in newly added code, for example, in a commit or pull request.

Currently, the easiest way to integrate Semgrep into CI is via a GitHub action we've built. See the integrations docs for more details.

Semgrep can also output results in the standardized Static Analysis Results Interchange Format (SARIF) with the --sarif flag, if you use tools that accept this format.
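If you consume the SARIF output yourself rather than handing it to another tool, a reader along these lines may be enough. It relies only on fields defined by SARIF 2.1.0 (`runs`, `results`, `ruleId`, `locations`, `physicalLocation`); which optional fields Semgrep actually populates is an assumption, so missing values become placeholders. The sample document and the name `sarif_findings` are hypothetical.

```python
import json
from typing import List, Tuple

# Hypothetical SARIF 2.1.0 document with a single finding.
SAMPLE_SARIF = '''
{"version": "2.1.0",
 "runs": [{"results": [{
   "ruleId": "find-unverified-transactions",
   "message": {"text": "unverified transaction"},
   "locations": [{"physicalLocation": {
     "artifactLocation": {"uri": "src/Trade.java"},
     "region": {"startLine": 12}}}]}]}]}
'''


def sarif_findings(raw: str) -> List[Tuple[str, str, object]]:
    """Return (rule id, file URI, start line) for every result in every run."""
    findings = []
    for run in json.loads(raw).get("runs", []):
        for result in run.get("results", []):
            # A result may carry no locations; use an empty one in that case.
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            uri = physical.get("artifactLocation", {}).get("uri", "?")
            line = physical.get("region", {}).get("startLine", "?")
            findings.append((result.get("ruleId", "?"), uri, line))
    return findings


print(sarif_findings(SAMPLE_SARIF))
```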

Upgrading

How you upgrade Semgrep will depend on how you installed it.

From Homebrew:

$ brew upgrade semgrep

From PyPI:

$ python -m pip install --upgrade semgrep

From Docker:

$ docker pull returntocorp/semgrep:latest

Resources

Learn more:

Semgrep presentation and slides from the Bay Area OWASP meetup.
Check out the r2c YouTube channel for more videos.
More detailed Semgrep docs

Get in touch:

Submit a bug report
Join our community Slack to say "hi" or ask questions

Contributing

Semgrep is LGPL-licensed. Feel free to help out: see CONTRIBUTING.

Semgrep is a frontend to a larger program analysis library named pfff. pfff began and was open-sourced at Facebook but is now archived. The primary maintainer now works at r2c. Semgrep was originally named sgrep and was renamed to avoid collisions with existing projects.

Commercial Support

Semgrep is proudly supported by r2c. We're hiring!

Interested in a fully-supported, hosted version of semgrep? Drop your email and we'll ping you!

Release history

1.97.0

Nov 20, 2024

1.96.0

Nov 7, 2024

1.95.0

Oct 31, 2024

1.94.0

Oct 31, 2024

1.93.0

Oct 23, 2024

1.92.0

Oct 17, 2024

1.91.0

Oct 11, 2024

1.90.0

Sep 26, 2024

1.89.0

Sep 20, 2024

1.88.0

Sep 19, 2024

1.87.0

Sep 13, 2024

1.86.0

Sep 4, 2024

1.85.0

Aug 15, 2024

1.84.1

Aug 7, 2024

1.84.0

Aug 6, 2024

1.83.0

Aug 2, 2024

1.82.0

Jul 30, 2024

1.81.0

Jul 24, 2024

1.80.0

Jul 18, 2024

1.79.0

Jul 10, 2024

1.78.0

Jun 27, 2024

1.77.0

Jun 24, 2024

1.76.0

Jun 17, 2024

1.75.0

Jun 3, 2024

1.74.0

May 23, 2024

1.73.0

May 16, 2024

1.72.0

May 8, 2024

1.71.0

May 3, 2024

1.70.0

Apr 24, 2024

1.69.0

Apr 16, 2024

1.68.0

Apr 9, 2024

1.67.0

Mar 28, 2024

1.66.2

Mar 26, 2024

1.66.1

Mar 25, 2024

1.66.0

Mar 19, 2024

1.65.0

Mar 11, 2024

1.64.0

Mar 7, 2024

1.63.0

Feb 27, 2024

1.62.0

Feb 22, 2024

1.61.1

Feb 14, 2024

1.61.0

Feb 13, 2024

1.60.1

Feb 9, 2024

1.60.0

Feb 8, 2024

1.59.1

Feb 2, 2024

1.59.0

Jan 30, 2024

1.58.0

Jan 23, 2024

1.57.0

Jan 18, 2024

1.56.0

Jan 10, 2024

1.55.2

Jan 5, 2024

1.55.1

Jan 4, 2024

1.55.0

Jan 2, 2024

1.54.3

Dec 22, 2023

1.54.2

Dec 21, 2023

1.54.1

Dec 20, 2023

1.54.0

Dec 20, 2023

1.53.0

Dec 12, 2023

1.52.0

Dec 5, 2023

1.51.0

Nov 29, 2023

1.50.0

Nov 17, 2023

1.49.0

Nov 15, 2023

1.48.0

Nov 6, 2023

1.46.0

Oct 24, 2023

1.45.0

Oct 18, 2023

1.44.0

Oct 11, 2023

1.43.0

Oct 3, 2023

1.42.0

Sep 29, 2023

1.41.0

Sep 19, 2023

1.40.0

Sep 14, 2023

1.39.0

Sep 7, 2023

1.38.3

Sep 2, 2023

1.38.2

Sep 1, 2023

1.38.1

Sep 1, 2023

1.38.0

Aug 31, 2023

1.37.0

Aug 25, 2023

1.36.0

Aug 14, 2023

1.35.0

Aug 9, 2023

1.34.1

Jul 29, 2023

1.34.0

Jul 27, 2023

1.33.2

Jul 21, 2023

1.33.1

Jul 21, 2023

1.32.0

Jul 13, 2023

1.31.2

Jul 7, 2023

1.31.1

Jul 7, 2023

1.31.0

Jul 7, 2023

1.30.0

Jun 28, 2023

1.29.0

Jun 26, 2023

1.28.0

Jun 21, 2023

1.27.0

Jun 13, 2023

1.26.0

Jun 9, 2023

1.25.0

Jun 6, 2023

1.24.1

Jun 1, 2023

1.24.0

May 31, 2023

1.23.0

May 24, 2023

1.22.0

May 16, 2023

1.21.0

May 4, 2023

1.20.0

Apr 28, 2023

1.19.0

Apr 21, 2023

1.18.0

Apr 14, 2023

1.17.1

Apr 5, 2023

1.17.0

Apr 5, 2023

1.16.0

Mar 31, 2023

1.15.0

Mar 15, 2023

1.14.0

Mar 1, 2023

1.13.0

Feb 24, 2023

1.12.1

Feb 17, 2023

1.12.0

Feb 14, 2023

1.11.0

Feb 10, 2023

1.10.0

Feb 9, 2023

1.9.0

Feb 2, 2023

1.8.0

Feb 1, 2023

1.7.0

Feb 1, 2023

1.6.0

Jan 27, 2023

1.5.1

Jan 20, 2023

1.3.0

Jan 6, 2023

1.2.1

Dec 16, 2022

1.2.0

Dec 15, 2022

1.1.0

Dec 5, 2022

1.0.0

Dec 1, 2022

0.123.0

Nov 29, 2022

0.122.0

Nov 16, 2022

0.121.2

Nov 10, 2022

0.121.1

Nov 8, 2022

0.121.0

Nov 7, 2022

0.120.0

Nov 2, 2022

0.118.0

Oct 19, 2022

0.117.0

Oct 12, 2022

0.116.0

Oct 6, 2022

0.115.0

Sep 27, 2022

0.114.0

Sep 19, 2022

0.113.0

Sep 15, 2022

0.112.1

Sep 8, 2022

0.112.0

Sep 7, 2022

0.111.1

Aug 23, 2022

0.111.0

Aug 22, 2022

0.110.0

Aug 15, 2022

0.109.0

Aug 11, 2022

0.108.0

Aug 4, 2022

0.107.0

Jul 29, 2022

0.106.0

Jul 21, 2022

0.105.0

Jul 20, 2022

0.104.0

Jul 13, 2022

0.103.0

Jul 5, 2022

0.102.0

Jun 30, 2022

0.101.1

Jun 28, 2022

0.101.0

Jun 27, 2022

0.100.0

Jun 22, 2022

0.98.0

Jun 15, 2022

0.97.0

Jun 8, 2022

0.96.0

Jun 4, 2022

0.95.0

Jun 2, 2022

0.94.0

May 25, 2022

0.93.0

May 17, 2022

0.92.1

May 13, 2022

0.92.0

May 11, 2022

0.91.0

May 3, 2022

0.90.0

Apr 27, 2022

0.89.0

Apr 20, 2022

0.88.0

Apr 13, 2022

0.87.0

Apr 8, 2022

0.86.5

Mar 28, 2022

0.86.3

Mar 25, 2022

0.86.2

Mar 25, 2022

0.86.1

Mar 25, 2022

0.86.0

Mar 24, 2022

0.85.0

Mar 16, 2022

0.84.0

Mar 9, 2022

0.83.0

Feb 25, 2022

0.82.0

Feb 9, 2022

0.81.0

Feb 2, 2022

0.80.0

Jan 26, 2022

0.79.0

Jan 20, 2022

0.78.0

Jan 13, 2022

0.77.0

Dec 17, 2021

0.76.2

Dec 8, 2021

0.76.1

Dec 7, 2021

0.76.0

Dec 7, 2021

0.75.0

Nov 23, 2021

0.74.0

Nov 19, 2021

0.73.0

Nov 12, 2021

0.72.0

Nov 10, 2021

0.71.0

Nov 1, 2021

0.70.0

Oct 20, 2021

0.69.1

Oct 14, 2021

0.69.0

Oct 13, 2021

0.68.2

Oct 8, 2021

0.68.1

Oct 7, 2021

0.68.0

Oct 7, 2021

0.67.0

Sep 30, 2021

0.66.0

Sep 22, 2021

0.65.0

Sep 14, 2021

0.64.0

Sep 1, 2021

0.63.0

Aug 25, 2021

0.62.0

Aug 17, 2021

0.61.0

Aug 4, 2021

0.60.0

Jul 27, 2021

0.59.0

Jul 20, 2021

0.58.2

Jul 15, 2021

0.58.1

Jul 15, 2021

0.58.0

Jul 14, 2021

0.57.0

Jun 30, 2021

0.56.0

Jun 15, 2021

0.55.1

Jun 9, 2021

0.55.0

Jun 8, 2021

0.54.0

Jun 2, 2021

0.53.0

May 26, 2021

0.52.0

May 18, 2021

0.51.0

May 13, 2021

0.50.1

May 7, 2021

0.50.0

May 6, 2021

0.49.0

Apr 28, 2021

0.48.0

Apr 20, 2021

0.47.0

Apr 16, 2021

0.46.0

Apr 9, 2021

0.45.0

Mar 31, 2021

0.44.0

Mar 25, 2021

0.43.0

Mar 16, 2021

0.42.0

Mar 10, 2021

0.41.1

Feb 24, 2021

0.41.0

Feb 24, 2021

0.40.0

Feb 18, 2021

0.39.1

Jan 27, 2021

0.39.0

Jan 27, 2021

0.38.0

Jan 20, 2021

0.37.0

Jan 13, 2021

0.36.0

Jan 6, 2021

0.35.0

Dec 16, 2020

0.34.0

Dec 9, 2020

0.33.0

Dec 2, 2020

0.32.0

Nov 19, 2020

0.31.1

Nov 11, 2020

0.31.0

Nov 10, 2020

0.30.0

Nov 4, 2020

0.29.0

Oct 27, 2020

0.28.0

Oct 21, 2020

0.27.0

Oct 6, 2020

0.26.0

Sep 30, 2020

0.25.0

Sep 23, 2020

0.24.0

Sep 16, 2020

0.23.0

Sep 10, 2020

0.22.0

Sep 1, 2020

0.21.0

Aug 25, 2020

0.20.0

Aug 19, 2020

0.19.1

Aug 13, 2020

0.19.0

Aug 12, 2020

0.18.0

Aug 6, 2020

0.17.0

Jul 29, 2020

0.16.0

Jul 22, 2020

0.15.0

Jul 16, 2020

0.15.0b1 pre-release

Jul 15, 2020

0.14.0

Jul 8, 2020

0.13.0

Jul 1, 2020

This version

0.12.0

Jun 24, 2020

0.11.0

Jun 17, 2020

0.11.0b1 pre-release

Jun 17, 2020

0.10.1

Jun 10, 2020

0.10.0

Jun 10, 2020

0.9.0

Jun 3, 2020

0.8.1

May 26, 2020

0.8.0

May 21, 2020

0.8.0b1 pre-release

May 20, 2020

0.6.0

May 6, 2020

0.0.0

Feb 17, 2021

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

semgrep-0.12.0.tar.gz (55.3 kB)

Uploaded Jun 24, 2020 Source

Built Distributions

semgrep-0.12.0-cp36.cp37.cp38.py36.py37.py38-none-manylinux1_x86_64.whl (2.3 MB)

Uploaded Jun 24, 2020 CPython 3.6 CPython 3.7 CPython 3.8 Python 3.6 Python 3.7 Python 3.8

semgrep-0.12.0-cp36.cp37.cp38.py36.py37.py38-none-macosx_10_14_x86_64.whl (1.9 MB)

Uploaded Jun 24, 2020 CPython 3.6 CPython 3.7 CPython 3.8 Python 3.6 Python 3.7 Python 3.8 macOS 10.14+ x86-64

File details

Details for the file semgrep-0.12.0.tar.gz.

File metadata

Download URL: semgrep-0.12.0.tar.gz
Upload date: Jun 24, 2020
Size: 55.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/46.0.0 requests-toolbelt/0.8.0 tqdm/4.46.1 CPython/3.7.7

File hashes

Hashes for semgrep-0.12.0.tar.gz
Algorithm	Hash digest
SHA256	`18f87b1eba07997a6ab9ecf4c6c061c4b061f756502e3ab082bbf4ac04c5e3ca`
MD5	`14fe80acecc9a1c2e38f8d9c7c1a29ff`
BLAKE2b-256	`479ce0358eb5ca801925282f737b86ac25bbd9af8c91bef343039ad94c656ee3`

See more details on using hashes here.

File details

Details for the file semgrep-0.12.0-cp36.cp37.cp38.py36.py37.py38-none-manylinux1_x86_64.whl.

File metadata

Download URL: semgrep-0.12.0-cp36.cp37.cp38.py36.py37.py38-none-manylinux1_x86_64.whl
Upload date: Jun 24, 2020
Size: 2.3 MB
Tags: CPython 3.6, CPython 3.7, CPython 3.8, Python 3.6, Python 3.7, Python 3.8
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/46.0.0 requests-toolbelt/0.8.0 tqdm/4.46.1 CPython/3.7.7

File hashes

Hashes for semgrep-0.12.0-cp36.cp37.cp38.py36.py37.py38-none-manylinux1_x86_64.whl
Algorithm	Hash digest
SHA256	`bf4eab3f696c4ba33bb6a2d300a747e64ccc23338a0586d1f9961af3897a3320`
MD5	`4089843c8a4606c39a673cf6677890cd`
BLAKE2b-256	`d02cfd578e644a7f4cb7fe0b973d824f27aa344a630ae09f837cf043dd444a79`

See more details on using hashes here.

File details

Details for the file semgrep-0.12.0-cp36.cp37.cp38.py36.py37.py38-none-macosx_10_14_x86_64.whl.

File metadata

Download URL: semgrep-0.12.0-cp36.cp37.cp38.py36.py37.py38-none-macosx_10_14_x86_64.whl
Upload date: Jun 24, 2020
Size: 1.9 MB
Tags: CPython 3.6, CPython 3.7, CPython 3.8, Python 3.6, Python 3.7, Python 3.8, macOS 10.14+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/46.0.0 requests-toolbelt/0.8.0 tqdm/4.46.1 CPython/3.7.7

File hashes

Hashes for semgrep-0.12.0-cp36.cp37.cp38.py36.py37.py38-none-macosx_10_14_x86_64.whl
Algorithm	Hash digest
SHA256	`79e91403ba1db64377d4e1dbbc6a2011dd7d7511806e106a4b172e5d6b5ad5c2`
MD5	`8c19d8d51547b4728b26926e60195754`
BLAKE2b-256	`37b0b38b3acf4209282cd2651d9fce84ebf4035c3c98b7e2e5b7d750e99a55f9`