Trigram-based algorithm for Addok.

Addok-trigrams

Alternative indexation pattern for Addok, based on trigrams.

Installation

pip install addok-trigrams

Configuration

In your local configuration file:

  • remove the token-reducing, autocomplete and fuzzy collectors from RESULTS_COLLECTORS_PYPATHS:

      from addok.config.default import RESULTS_COLLECTORS_PYPATHS
      RESULTS_COLLECTORS_PYPATHS.remove('addok.helpers.collectors.extend_results_reducing_tokens')
      RESULTS_COLLECTORS_PYPATHS.remove('addok.autocomplete.only_commons_but_geohash_try_autocomplete_collector')
      RESULTS_COLLECTORS_PYPATHS.remove('addok.autocomplete.no_meaningful_but_common_try_autocomplete_collector')
      RESULTS_COLLECTORS_PYPATHS.remove('addok.autocomplete.only_commons_try_autocomplete_collector')
      RESULTS_COLLECTORS_PYPATHS.remove('addok.autocomplete.autocomplete_meaningful_collector')
      RESULTS_COLLECTORS_PYPATHS.remove('addok.fuzzy.fuzzy_collector')
    
  • with all autocomplete and fuzzy collectors removed, add the trigram-based ones to RESULTS_COLLECTORS_PYPATHS:

      RESULTS_COLLECTORS_PYPATHS += [
          'addok_trigrams.extend_results_removing_numbers',
          'addok_trigrams.extend_results_removing_one_whole_word',
          'addok_trigrams.extend_results_removing_successive_trigrams',
      ]
    
  • add trigramize to PROCESSORS_PYPATHS:

      from addok.config.default import PROCESSORS_PYPATHS
      PROCESSORS_PYPATHS += [
          'addok_trigrams.trigramize',
      ]
    
  • remove pairs and autocomplete indexers from INDEXERS_PYPATHS:

      from addok.config.default import INDEXERS_PYPATHS
      INDEXERS_PYPATHS.remove('addok.pairs.PairsIndexer')
      INDEXERS_PYPATHS.remove('addok.autocomplete.EdgeNgramIndexer')
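
The removals above rely on list.remove, which raises ValueError when the entry is not in the list. That could happen if the installed Addok ships different defaults than the ones assumed here. A minimal sketch of a more tolerant variant follows; remove_if_present is a hypothetical helper, not part of Addok or addok-trigrams, and the demo list is a stand-in for the one imported from addok.config.default.

```python
def remove_if_present(paths, unwanted):
    """Remove each unwanted entry from paths in place; return the ones not found."""
    missing = []
    for path in unwanted:
        if path in paths:
            paths.remove(path)
        else:
            missing.append(path)
    return missing


# Stand-in for the list imported from addok.config.default (assumption for the demo).
INDEXERS_PYPATHS = [
    'example.KeptIndexer',  # hypothetical placeholder entry
    'addok.pairs.PairsIndexer',
]

missing = remove_if_present(INDEXERS_PYPATHS, [
    'addok.pairs.PairsIndexer',
    'addok.autocomplete.EdgeNgramIndexer',
])
print(INDEXERS_PYPATHS)  # ['example.KeptIndexer']
print(missing)           # ['addok.autocomplete.EdgeNgramIndexer']
```

Returning the entries that were not found makes it easy to log them, so a silently changed default does not go unnoticed.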
    

By default, digit-only words are not turned into trigrams. To split them into trigrams as well, set TRIGRAM_SKIP_DIGIT=False.
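
To make the effect of this setting concrete, here is a minimal sketch of splitting a word into trigrams. It is an illustration, not the plugin's actual code: compute_trigrams and its skip_digit parameter are hypothetical names standing in for the processor and the TRIGRAM_SKIP_DIGIT setting, and keeping words shorter than three characters whole is an assumption.

```python
def compute_trigrams(word, skip_digit=True):
    """Split a word into overlapping three-character chunks."""
    # Digit-only words (house numbers, postcodes) stay whole when skip_digit is on.
    if skip_digit and word.isdigit():
        return [word]
    # Assumption: words too short to form a trigram are kept as they are.
    if len(word) < 3:
        return [word]
    return [word[i:i + 3] for i in range(len(word) - 2)]


print(compute_trigrams("lille"))                    # ['lil', 'ill', 'lle']
print(compute_trigrams("59000"))                    # ['59000']
print(compute_trigrams("59000", skip_digit=False))  # ['590', '900', '000']
```

Because the chunks overlap, a prefix or a slightly misspelled word still shares most of its trigrams with the indexed word, which is presumably why the separate autocomplete and fuzzy collectors are removed in the configuration.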

Usage

Import documents with addok batch, just as with stock Addok. There is no need to run addok ngrams afterwards, because the trigrams are already built as part of the indexing strategy.
