35 projects
opencivicdata
Python opencivicdata library
probablepeople
Parse romanized names & companies using advanced NLP methods
usaddress
Parse US addresses using conditional random fields
parserator
Create parsers
census-area
Census data for arbitrary geographies
dedupe
A Python library for accurate and scalable data deduplication and entity resolution
django-councilmatic
Core functions for councilmatic.org family
dedupe-variable-address
Address variable type for dedupe
dedupe-variable-datetime
DateTime variable type for dedupe
dedupe-variable-name
Name variable type for dedupe
parseratorvariable
Structured variable type for dedupe
pyhacrf-datamade
Hidden alignment conditional random field, a discriminative string edit distance
PyLBFGS
LBFGS and OWL-QN optimization algorithms
census
A wrapper for the US Census Bureau's API
pupa
scraping framework for municipal data
affinegap
A Cython implementation of the affine gap string distance
rlr
Case weighted L2 regularized logistic regression
DoubleMetaphone
Python wrapper for C++ Double Metaphone
dedupe-hcluster
Hierarchical Clustering Algorithms (Information Theory)
dedupe-variable-ilcs
Dedupe variable for Illinois Compiled Statute (ILCS) codes
ilcs-parser
Probabilistic parser for tagging data that references the Illinois Compiled Statutes (ILCS).
csvdedupe
Command line tools for deduplicating and merging csv files
dedupe-variable-number
Number variable type for dedupe
django-councilmatic-notifications
Core functions for councilmatic.org family
datetime-distance
Compare string distances between dates, timestamps, or datetime objects.
simplecosine
Simple cosine distance
highered
Learnable Edit Distance Using PyHacrf
categorical-distance
Compare two categorical variables
dedupe-variable-person
Variable type for American Person Names
companyparser
UNKNOWN
probableparsing
Common methods for probable parsers
dedupe-variable-employer
Employer variable type for dedupe
dedupe-variable-fuzzycategory
Fuzzy Category variable type for dedupe
fuzzycategory
A context comparison
canonicalize
canonicalize a cluster of records