A fast tool to calculate Hamming distances
A small C++ tool to calculate pairwise Hamming distances between gene sequences given in FASTA format.
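To make the quantity concrete, here is a minimal pure-Python sketch of a pairwise Hamming distance calculation. It only illustrates the definition (the number of positions at which two equal-length sequences differ) and is not how the C++ tool is implemented. The helper names `hamming` and `pairwise_distances` are hypothetical, and the library's handling of special or invalid characters may differ from this plain character comparison.

```python
from itertools import combinations


def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def pairwise_distances(seqs):
    """Return the distance for every index pair (i, j) with i < j."""
    return {
        (i, j): hamming(seqs[i], seqs[j])
        for i, j in combinations(range(len(seqs)), 2)
    }


sequences = ["ACGTACGT", "ACGTAGGT", "ATTTACGT"]
distances = pairwise_distances(sequences)
print(distances)  # {(0, 1): 1, (0, 2): 2, (1, 2): 3}
```

The number of pairs grows quadratically with the number of sequences, which is why a compiled implementation is useful for large inputs.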
Python interface
To use the Python interface, you should install it from PyPI:
python -m pip install hammingdist
Distance matrix
You can then use it from Python, for example:
import hammingdist
# To see the different optional arguments available:
help(hammingdist.from_fasta)
# To import all sequences from a fasta file
data = hammingdist.from_fasta("example.fasta")
# To import only the first 100 sequences from a fasta file
data = hammingdist.from_fasta("example.fasta", n=100)
# To import all sequences and remove any duplicates
data = hammingdist.from_fasta("example.fasta", remove_duplicates=True)
# To import all sequences from a fasta file, also treating 'X' as a valid character
data = hammingdist.from_fasta("example.fasta", include_x=True)
# The distance data can be accessed point-wise, though looping over all distances might be quite inefficient
print(data[14,42])
# The data can be written to disk in csv format (default `distance` Ripser format) and retrieved:
data.dump("backup.csv")
retrieval = hammingdist.from_csv("backup.csv")
# It can also be written in lower triangular format (comma-delimited row-major, `lower-distance` Ripser format):
data.dump_lower_triangular("lt.txt")
retrieval = hammingdist.from_lower_triangular("lt.txt")
# If the `remove_duplicates` option was used, the sequence indices can also be written.
# For each input sequence, this prints the corresponding index in the output:
data.dump_sequence_indices("indices.txt")
# Finally, we can pass the data as a list of strings in Python:
data = hammingdist.from_stringlist(["ACGTACGT", "ACGTAGGT", "ATTTACGT"])
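As an illustration of the lower triangular layout mentioned above, the following sketch writes and reads a small symmetric distance matrix in a comma-delimited, row-major form. It assumes that the diagonal is omitted and that each row sits on its own line, as in Ripser's `lower-distance` input. The exact output of `dump_lower_triangular` may differ in detail, so check a real file before relying on this. The functions `write_lower_triangular` and `read_lower_triangular` are hypothetical helpers, not part of hammingdist.

```python
import io


def write_lower_triangular(matrix, stream):
    """Write rows 1..n-1, keeping only the entries left of the diagonal."""
    for i in range(1, len(matrix)):
        stream.write(",".join(str(matrix[i][j]) for j in range(i)) + "\n")


def read_lower_triangular(stream):
    """Rebuild the full symmetric matrix (zero diagonal) from the rows."""
    rows = [[int(v) for v in line.split(",")] for line in stream if line.strip()]
    n = len(rows) + 1
    matrix = [[0] * n for _ in range(n)]
    for i, row in enumerate(rows, start=1):
        for j, value in enumerate(row):
            matrix[i][j] = matrix[j][i] = value
    return matrix


full = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
buffer = io.StringIO()
write_lower_triangular(full, buffer)
text = buffer.getvalue()
restored = read_lower_triangular(io.StringIO(text))
print(text)              # rows "1" and "2,3"
print(restored == full)  # True
```

For n sequences this layout stores n(n-1)/2 values instead of n², which roughly halves the file size compared with the full matrix.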
Distances from reference sequence
The distance of each sequence in a fasta file from a given reference sequence can be calculated using:
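import hammingdist
sequence = "ACGTACGT"  # placeholder reference sequence
fasta_file = "example.fasta"  # placeholder path to the fasta file
distances = hammingdist.fasta_reference_distances(sequence, fasta_file, include_x=True)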
This function returns a numpy array that contains the distance of each sequence from the reference sequence.
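Because the result is a plain numpy array, the usual numpy tools apply to it. The sketch below uses made-up placeholder values in place of a real result, since no input file is available here, and shows a few common queries: the closest sequences, exact matches, and a count within a threshold.

```python
import numpy as np

# Placeholder values standing in for the array returned by
# hammingdist.fasta_reference_distances(...); not real data.
distances = np.array([3, 0, 7, 1, 0, 12])

# Indices of the sequences, ordered from closest to farthest.
# A stable sort keeps the original order among equal distances.
order = np.argsort(distances, kind="stable")
closest = order[:3]

# Indices of sequences with distance 0 from the reference.
identical = np.flatnonzero(distances == 0)

# Number of sequences within distance 2 of the reference.
within_two = np.count_nonzero(distances <= 2)

print(closest.tolist())    # [1, 4, 3]
print(identical.tolist())  # [1, 4]
print(within_two)          # 3
```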
You can also calculate the distance between two individual sequences:
import hammingdist
distance = hammingdist.distance("ACGTX", "AAGTX", include_x=True)
OpenMP on Linux
The latest versions of hammingdist on Linux are built with OpenMP (multithreading) support. If this causes any issues, you can install an earlier version of hammingdist without OpenMP support:
python -m pip install hammingdist==0.11.0