"An open framework and dataset for building a distributed-agent chatbot based on _Natural Language Processing in Action_."
qary
Use NLP in Action to build a virtual assistant that actually assists! Most bots manipulate you to make money for their corporate masters. Your bot can help protect you and amplify your abilities and prosocial instincts.
This hybrid chatbot combines 4 techniques explained in NLP in Action:
1. search: [chatterbot](https://github.com/gunthercox/ChatterBot), [will](https://github.com/skoczen/will)
2. pattern matching and response templates: Alexa, [AIML](https://github.com/keiffster/program-y)
3. generative deep learning: [robot-bernie](https://github.com/nlpia/robot-bernie), [movie-bot](https://github.com/totalgood/nlpia/blob/master/src/nlpia/book/examples/ch10_movie_dialog_chatbot.py)
4. grounding: [snips](https://github.com/snipsco/snips-nlu)
The presentations for San Diego Python User Group are in docs/
Install
You’ll want to install and use the conda package manager within Anaconda3, especially if your development environment is not an open standard operating system like Linux.
git clone git@github.com:nlpia/qary
cd qary
conda env create -n nlpia -f environment.yml # or environment-windoze.yml
conda activate nlpia
pip install --editable .
Usage
$ bot --help
usage: bot [-h] [--version] [--name STR] [-p] [-b STR] [-v] [-vv]
[words [words ...]]
Command line bot application, e.g. bot how do you work?
positional arguments:
words Words to pass to bot as an utterance or conversational
statement requiring a bot reply or action.
optional arguments:
-h, --help show this help message and exit
--version show program's version number and exit
--name STR IRC nick or CLI command name for the bot
-p, --persist Don't exit. Retain language model in memory and
maintain dialog until user says 'exit', 'quit' or 'bye'
(this is the default behavior if you do not provide a statement)
-b STR, --bots STR comma-separated list of bot personalities to load
default: pattern,parul,search_fuzzy,time,eliza
-v, --verbose set loglevel to INFO
-vv, --very-verbose set loglevel to DEBUG
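The help text above can be approximated with a small `argparse` parser. This is only a sketch reconstructed from the help output, not qary's actual source: the function names `build_parser` and `parse_args`, the default for `--name`, and the default log level are assumptions, and `--version` is left out because the version string is not shown here.

```python
import argparse
import logging

# Default personalities, copied from the help text above.
DEFAULT_BOTS = "pattern,parul,search_fuzzy,time,eliza"


def build_parser():
    """Build a parser that mirrors the options listed in `bot --help`."""
    parser = argparse.ArgumentParser(
        prog="bot",
        description="Command line bot application, e.g. bot how do you work?",
    )
    # The default nick "bot" is an assumption made for this sketch.
    parser.add_argument("--name", metavar="STR", default="bot",
                        help="IRC nick or CLI command name for the bot")
    parser.add_argument("-p", "--persist", action="store_true",
                        help="Don't exit. Maintain dialog until the user says "
                             "'exit', 'quit' or 'bye'")
    parser.add_argument("-b", "--bots", metavar="STR", default=DEFAULT_BOTS,
                        help="comma-separated list of bot personalities to load")
    parser.add_argument("-v", "--verbose", dest="loglevel",
                        action="store_const", const=logging.INFO,
                        help="set loglevel to INFO")
    parser.add_argument("-vv", "--very-verbose", dest="loglevel",
                        action="store_const", const=logging.DEBUG,
                        help="set loglevel to DEBUG")
    parser.add_argument("words", nargs="*",
                        help="Words to pass to bot as an utterance")
    # WARNING as the fallback log level is an assumption.
    parser.set_defaults(loglevel=logging.WARNING)
    return parser


def parse_args(argv):
    """Parse argv and normalize the values the bot would need."""
    args = build_parser().parse_args(argv)
    # Turn "eliza, time" into ["eliza", "time"].
    args.bots = [name.strip() for name in args.bots.split(",") if name.strip()]
    # Persistent mode is the default when no statement is given.
    args.persist = args.persist or not args.words
    return args
```

Calling `parse_args(["-b", "eliza,time", "hello"])` would load only two personalities and answer a single statement, while `parse_args([])` falls through to persistent mode.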
Examples
You can run bot just like any other command line app, giving it your statement/query as an argument.
$ bot hello
# 2019-11-21 12:42:13,620 WARNING:nlpia.constants:107: <module> Starting logger in nlpia.constants...
# 100%|█████████████████████████████████████████████████████████████████████████████████████████████| 64350/64350 [00:00<00:00, 540679.58it/s]
BOT: Hello
And if you want quicker turnaround on your bot you can just run it in persistent mode (without any positional arguments for your statement or words).
Travis’s probabilistic reply selector is working nicely to choose a reply from multiple sources (default settings):
pattern_bots.py: regex patterns and greeting templates
fuzzy_search_bots.py: movie dialog fuzzy matching
parul_bots.py: Wikipedia searches using conventional TFIDF like a search engine
eliza_bots.py: A python port of the ELIZA therapist bot
time_bots.py: A time and productivity tracker that parses your git logs and bash history
$ bot
(nlpia) hobs@Hobsons-MacBook-Air:~/code/chatbot/qary$ bot
# 2019-11-21 12:59:05,854 WARNING:nlpia.constants:107: <module> Starting logger in nlpia.constants...
# 100%|█████████████████████████████████████████████████████████████████████████████████████████████| 64350/64350 [00:00<00:00, 495935.48it/s]
YOU: Hi
BOT: hey there. tired of breathing?
YOU: Hello
BOT: How do you do. Please state your problem.
YOU: What is an ELIZA chatbot?
BOT: joey never told you we went out, did he?
YOU: ^[[A
BOT: I am sorry! I dont understand you
YOU: What is an ELIZA chatbot?
BOT: Does that question interest you ?
YOU: What is a chatbot?
BOT: hello barbie is an internet-connected version of the doll that uses a chatbot provided by the company toytalk, which previously used the chatbot for a range of smartphone-based characters for children.
YOU: Hello
BOT: hello.
YOU: Hello
BOT: How do you do. Please state your problem.
YOU: bye
$
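In the session above, the same statement ("Hello") gets different replies on different turns, which is what you would expect from a probabilistic selector. One way such a selector could work is confidence-weighted random sampling over the candidate replies from each bot. The sketch below is an illustration under that assumption, not the selector that ships with qary; the function name `choose_reply` and the `(confidence, reply)` pair format are hypothetical.

```python
import random


def choose_reply(candidates, rng=random):
    """Pick one reply at random, weighted by each bot's confidence.

    candidates: list of (confidence, reply_text) pairs, one or more per bot.
    rng: the random module or a random.Random instance (for repeatable runs).
    """
    if not candidates:
        return None
    # Negative confidences are clipped to zero so they can never be chosen.
    weights = [max(float(confidence), 0.0) for confidence, _ in candidates]
    replies = [reply for _, reply in candidates]
    if sum(weights) == 0:
        # No bot is confident at all, so fall back to a uniform choice.
        return rng.choice(replies)
    return rng.choices(replies, weights=weights, k=1)[0]
```

With candidates such as `[(0.6, "hello."), (0.4, "How do you do. Please state your problem.")]`, repeated calls return either reply, with the higher-confidence one appearing more often.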
Work in Progress
Travis (@travis-harper): model management, context filtering, and the addition of more conversational agents
Nima (@hulkgeek): question answering bot based on his state-of-the-art question classifier
Xavier (@spirovanni): employment counselor for workforce.org and the city of San Diego
Hobson (@hobson): infrastructure (CI, webapp) and framework features (nltk->spacy, USE vectors)
Erturgrul: Turkish wikipedia QA bot (parul bot)
You: What big chatbot idea would you like to make a reality?
Ideas
Please submit your feature ideas as GitHub issues. Here are a few ideas to get you started.
movie dialog in django database to hold the statement->response pairs
graph schema compatible with MxGraph (draw.io) and other js libraries for editing graphs/flow charts.
ubuntu dialog corpus in db
mindfulness faq corpus in db
famous quotes as responses to the statement “tell me something inspiring”
jokes for “tell me a joke”
data science faq
nlpia faq
psychology/self-help faq
html django template so there is a web interface to the app rather than just the command line command bot
use Django Rest Framework to create a basic API that returns JSON containing a reply to any request sent to the localhost URL, like http://localhost:8000/api?statement='Hello world' might return {"reply": "Hello human!"}
have the command line app use the REST API from #3 rather than the slow reloading of the csv file every time you talk to the bot
use database full text search to find appropriate statements in the database that we have a response for
use semantic search instead of text similarity (full text search or fuzzywuzzy text matches)
add embedding vectors (300D document vectors from spacy) to each statement and response in the db
create a semantic index of the document vectors using annoy so “approximate nearest neighbors” (semantic matches) can be found quickly
load the annoy index of the document vectors every time the server is started and use it to find the best reply in the database.
use universal sentence encodings instead of docvecs from spacy.
create a UX for dialog graph creation/design:
install mxgraph in the django app
create a basic page based on this mxgraph example so the user can build and save dialog to the db as a graph: tutorial, example app
convert the dialog graph into a set of records/rows in the qary db so it acts
tag different dialog graphs in the db so the user can turn them on/off for their bot
allow the user to prioritize some dialogs/models over others
allow the user to create their own weighting function to prioritize individual statements produced by the api
train a character-based generative model
decoder half of autoencoder to generate text based on docvecs from spacy
decoder part of autoencoder to generate text based on universal sentence encodings
train model to generate reply embeddings (doc vecs and/or use vecs) using statement embeddings (dialog engine encoder-decoder using docvecs or use vecs for the encoder half)
add a therapy/mindfulness-coach feature to respond with mindfulness ideas to some queries/statements
add the “translate ‘this text’ to spanish” feature
train character-based LSTM models on english-spanish, english-french, english-german, english<->whatever
add module for this to the django app/api
AIML engine fallback
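The semantic search idea in the list above (embed each statement, then find the nearest stored statement and return its reply) can be sketched with brute-force cosine similarity. This is a minimal illustration, not part of qary: the 3-D vectors are toy stand-ins for the 300-D spaCy document vectors, the exhaustive scan stands in for an annoy index, and the names `cosine_similarity` and `best_reply` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_reply(query_vector, records):
    """Return the reply whose statement vector is most similar to the query.

    records: list of (statement_vector, reply_text) pairs, e.g. rows from the db.
    This scans every record; an approximate nearest neighbor index would
    replace the scan once the table gets large.
    """
    if not records:
        return None
    return max(records, key=lambda rec: cosine_similarity(query_vector, rec[0]))[1]


# Toy data: made-up vectors and replies for illustration only.
records = [
    ([1.0, 0.0, 0.0], "Hello human!"),
    ([0.0, 1.0, 0.0], "Here is something inspiring."),
]
reply = best_reply([0.9, 0.1, 0.0], records)
```

The scan costs one similarity computation per stored statement on every query, which is the cost the annoy index idea is meant to avoid.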
Inspiration
A lot of the patterns and ideas were gleaned from other awesome prosocial chatbots and modular open source frameworks.
Mental Health Coaches
WYSA from London is free
open source (touchkin)?
ionic?
passive sensing of sleep patterns (accelerometers?)
guided meditation
exercise suggestions
free text dialog with buttons to suggest replies
based on open source touchkin/mindlogger ?
Replika from US is paywalled
personality profile test
pay to unlock “skills” training
Youper (thank you Maria and tangibleai.com)
Open Source Frameworks
-
lang: python
web: zeromq
db: redis, couchbase, flat file, user-defined
integrations: hipchat, rocketchat, shell, slack
-
lang: python
web: flask
orm: flask?
db: mongodb
nice general json syntax for specifying intent/goals for conversation manager (agent)
-
lang: python
web: sanic (async)
orm: sqlalchemy
db: sqlite
rich, complex, mature framework
-
javascript (typescript)
meta-framework allowing you to write your own modules in javascript
-
python
web: flask (rest), sanic (async)
db: aiml flat files (XML)
integrations: facebook messenger, google search, kik, line, alexa, webchat, viber