Current architecture's limitations
- Created for one big client, not in line with our current needs
- Requires re-training whenever new data is added
- Training process is hard to fully automate
- Categories and sub-categories are hard to define correctly
- Many problems arise when using datasets with overlapping meanings
Guidelines for an evolution
- Avoid specific training whenever possible
- Do not require huge amounts of data to start working
- By default, no data leaks between projects
- Use categories as contextual hints instead of arbitrary silos
Ex: taken out of context, the question "How many dead during the landing?"
is not equal to
"How many dead during the landing of the Bay of Pigs?"
nor to
"How many dead during the landing on D-Day?"
Basically, the aim of the new architecture is to build an NLP search engine able to tell whether a piece of information is present or not.
Architecture Overview
Impacted Containers*
*Not Docker! In the C4 model, a container represents an application or a data store.
A container is something that needs to be running in order for the overall software system to work.
Rasa(s)
Each project has its own instance of Maestro, so we must also have one Rasa instance per Maestro, with a dedicated RabbitMQ queue.
Ex: tbott-rpc.domain.project
Neural Networks
As explained in the paper Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks, pretrained BERT models perform poorly for sentence embeddings, while BERT models fine-tuned on sentence matching are slow and resource-hungry.
So, to build an efficient search engine, we must use two different models:
- one for sentence encoding, whose vectors are fed to an efficient search index such as FAISS to quickly output candidates,
- and another one for sentence matching, able to tell the difference between a best match and a true match.
Sentence Encoder
- Encodes a sentence as a contextual vector
- ex: Universal Sentence Encoder Multilingual from Google
- Used in a FAISS index to find the best candidates
- Trained only once
- Follows SOTA improvements
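The candidate-retrieval stage could be sketched as follows. This is a minimal sketch, not the actual implementation: `encode` is a toy hashed bag-of-words standing in for a real sentence encoder, and `top_candidates` is a NumPy brute-force inner-product search standing in for a FAISS index (e.g. `IndexFlatIP`) over the same L2-normalised vectors.

```python
import zlib

import numpy as np

def encode(sentences, dim=32):
    """Toy stand-in for a real sentence encoder: hashed bag-of-words,
    L2-normalised so that inner product equals cosine similarity."""
    vecs = np.zeros((len(sentences), dim), dtype="float32")
    for i, sentence in enumerate(sentences):
        for token in sentence.lower().split():
            vecs[i, zlib.crc32(token.encode()) % dim] += 1.0
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs / np.maximum(norms, 1e-9)

def top_candidates(query_vec, corpus_vecs, k=3):
    """Brute-force inner-product search, standing in for a FAISS
    IndexFlatIP.search over the same normalised vectors."""
    scores = corpus_vecs @ query_vec
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order]

corpus = [
    "How many dead during the landing of the Bay of Pigs ?",
    "How many dead during the landing of the D-Day ?",
    "When did the landing of the D-Day happen ?",
]
corpus_vecs = encode(corpus)
query_vec = encode(["How many dead during the landing ?"])[0]
for idx, score in top_candidates(query_vec, corpus_vecs):
    print(f"{score:.2f}  {corpus[idx]}")
```

Note that the vague query still retrieves candidates: this stage always outputs a *best* match, which is exactly why the matching stage below is needed.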
Sentence Matching
- Tells if two sentences are the same or not
- Provides a relevance score
- Does not use the sentence encoder's vectors
- Trained only once
- Follows SOTA improvements
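The matching stage could then re-score the candidates, for example along these lines. This is a minimal sketch under stated assumptions: `match_score` is a hypothetical stand-in for the fine-tuned sentence-matching model (note it takes the raw sentence pair, not the encoder's vectors), and the `threshold` value is illustrative.

```python
from difflib import SequenceMatcher

def match_score(a: str, b: str) -> float:
    """Hypothetical stand-in for a sentence-matching model (e.g. a
    BERT cross-encoder): relevance score in [0, 1] for a raw pair."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def answer_or_absent(query, candidates, threshold=0.9):
    """Re-rank candidates and decide whether the information is
    actually present: the best match is only a true match if its
    relevance score clears the threshold."""
    scored = sorted(((match_score(query, c), c) for c in candidates), reverse=True)
    best_score, best = scored[0]
    return (best, best_score) if best_score >= threshold else (None, best_score)

candidates = [
    "How many dead during the landing of the Bay of Pigs ?",
    "How many dead during the landing of the D-Day ?",
]
# The vague query has a *best* match but no *true* match:
print(answer_or_absent("How many dead during the landing ?", candidates))
```

This is what lets the engine say "this information is not present" instead of always returning its least-bad candidate.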
Maestro
- Maestro is split into 3 parts for maintainability and efficiency.
- All 3 containers are grouped within the same Pod
- A Pod is dedicated to a Target (ex: prod or test) attached to a Project within a Domain, referred to in this doc as Domain.Project.Target
- For redundancy and scalability, multiple Pods can share the same Domain.Project.Target
- All communications must be integrated within the Messaging library
- Messages between containers use a JSON format containing NumPy arrays.
- Communications within a Pod are done through ZeroMQ on 127.0.0.1
-> Containers within a Pod share the same network namespace, enabling the use of the localhost address.
-> ZeroMQ is fast, flexible, and indifferent to the startup order of clients and servers.
-> More or less acts as IPC within the Pod
- Example of NumPy serialization with ZeroMQ: https://pyzmq.readthedocs.io/en/latest/serialization.html
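One possible shape for the JSON-with-NumPy messages is to base64-encode the raw array bytes and ship dtype and shape as metadata. This is a sketch only: the field names are illustrative, and the pyzmq link above documents a zero-copy multipart alternative that avoids the base64 overhead.

```python
import base64
import json

import numpy as np

def pack(array: np.ndarray) -> bytes:
    """Serialise a NumPy array into a JSON message: raw bytes are
    base64-encoded; dtype and shape travel alongside as metadata."""
    return json.dumps({
        "dtype": str(array.dtype),
        "shape": array.shape,
        "data": base64.b64encode(np.ascontiguousarray(array).tobytes()).decode("ascii"),
    }).encode("utf-8")

def unpack(message: bytes) -> np.ndarray:
    """Rebuild the array on the receiving container."""
    msg = json.loads(message)
    data = base64.b64decode(msg["data"])
    return np.frombuffer(data, dtype=msg["dtype"]).reshape(msg["shape"])

vec = np.arange(6, dtype="float32").reshape(2, 3)
assert np.array_equal(unpack(pack(vec)), vec)
```

Such a payload is a plain `bytes` object, so it can be sent as-is over the Pod's ZeroMQ sockets (e.g. `socket.send(pack(vec))` on one side, `unpack(socket.recv())` on the other).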