How Chatdesk Created BrandScript Tech Using Transformer Models to Ensure Perfectly On-Brand Customer Service

Sander Land
October 27, 2022
Chatdesk combines human agents with Natural Language Processing (NLP) to deliver fast and personalized customer service

With the power of technology combined with human care, BrandScript tech helps brands deliver on the promise of fast and personalized customer service. Don’t mistake it for a chatbot, though: we at Chatdesk believe customer service should put the human connection first.

What (or who) is Chatdesk?

Chatdesk is a customer experience solution that helps brands efficiently scale their customer support and drive conversions on social, email, chat, and SMS. Our platform integrates with leading helpdesks like Zendesk and Gorgias.

What makes Chatdesk unique is our community of 8,000+ trained US-based CX experts. We identify and train passionate fans of your company to supercharge your social and support channels so every response is written by a real person and always personal. In addition, the Chatdesk Experts use our BrandScript technology to ensure responses match your brand voice.

Doing this has allowed our clients to experience dramatic results. Instead of the usual 4-8 weeks to recruit and train a customer service agent, brands can get up and running in as little as 2-3 days. Not only that, our experts convert 10-15% of customer conversations to sales by personalizing their interactions with customers, while achieving 90%+ customer satisfaction.

Let’s find out how it works.

How does BrandScript Tech work?

To help brands scale their customer experience (CX) efficiently with confidence, BrandScript was created as an in-house technology for Chatdesk CX experts to use. For every message from a customer, our platform provides up to 10 suggested responses a CX Expert can choose from. The suggested responses have 90%+ accuracy for social media messages and 70%+ accuracy for email. 

Experts then edit and personalize the message before sending it out. This ensures every response is 100% accurate and personalized to the customer.

Here’s a screenshot from our interface for the CX Experts.

The Chatdesk Expert interface showing suggested responses

How Chatdesk BrandScript uses Transformer Models

BrandScript uses a type of machine learning called transformer models. In the last few years, transformer models have taken the Natural Language Processing (NLP) world by storm. These models look at the entire input text at once, and are therefore better at understanding words and sentences in context, which is crucial for accurately ranking replies to complex queries. In addition, they can be pre-trained in a general way to understand many things about language, and then quickly fine-tuned for specific tasks. This reduces the amount of time and data needed to produce a high-quality model for a specific task.

An efficient siamese network for scoring responses

Chatdesk uses generative AI models for other tasks, which we will share more about in the future. For BrandScript, we only suggest responses that have already been sent to customers in the past, to ensure that they are 100% on brand. We use a classification-based approach: given a query and a possible response, the model outputs a score between 0 and 1 indicating how likely the response is to be correct.

By using a siamese network, we can save the embeddings for the possible responses, and respond quickly despite the computational complexity of these models.
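The caching idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `toy_embed` is a stand-in for the real sentence encoder (a hashed bag of words, not a transformer), and `ReplyCache` is a hypothetical name, not part of the Chatdesk system. The point is only that each candidate reply is encoded once, so a new customer message costs a single encoder call.

```python
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def toy_embed(text: str, dim: int = 8) -> Vector:
    """Stand-in for the sentence encoder (assumption): a hashed bag of words."""
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[sum(ord(c) for c in word) % dim] += 1.0
    return vec


class ReplyCache:
    """Stores one embedding per past reply so replies are never re-encoded."""

    def __init__(self, embed: Callable[[str], Vector]) -> None:
        self.embed = embed
        self.vectors: Dict[str, Vector] = {}
        self.encoder_calls = 0  # counts how often the (expensive) encoder runs

    def _encode(self, text: str) -> Vector:
        self.encoder_calls += 1
        return self.embed(text)

    def add(self, reply: str) -> None:
        # Encode a reply only the first time it is seen.
        if reply not in self.vectors:
            self.vectors[reply] = self._encode(reply)

    def query_and_candidates(self, query: str) -> Tuple[Vector, List[Tuple[str, Vector]]]:
        # At request time only the query goes through the encoder.
        return self._encode(query), list(self.vectors.items())


cache = ReplyCache(toy_embed)
for past_reply in ["Thanks for reaching out!", "Your order has shipped."]:
    cache.add(past_reply)
query_vec, candidates = cache.query_and_candidates("Where is my order?")
```

Because both sides go through the same encoder, the reply embeddings do not depend on the query and can be computed ahead of time.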

For the embeddings we use MiniLM6, which is relatively small and fast and optimized for sentence embeddings. As the language in the query and the reply can be quite different and we want to ensure that the classification is dependent on the particular brand, we add a classification head on top of the embeddings to combine the embeddings and information about message type and source.

Here the embeddings u and v are outputs of the same network, one for the customer query and one for the candidate reply.

We follow the best-performing strategies outlined in the paper “Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks”: taking the embedding as the mean over all non-padding tokens, and adding absolute differences as features. We tested this last option, and it roughly halved training loss compared with using only the two embeddings separately.
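As a rough sketch of these two steps, the snippet below averages token embeddings over non-padding positions and then builds the feature vector (u, v, |u - v|). The function names and the tiny 2-dimensional vectors are illustrative assumptions; a real encoder produces hundreds of dimensions per token.

```python
from typing import List

Vector = List[float]


def mean_pool(token_embeddings: List[Vector], attention_mask: List[int]) -> Vector:
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, x in enumerate(vec):
                total[i] += x
    if count == 0:
        return total  # all padding: return zeros rather than divide by zero
    return [t / count for t in total]


def pair_features(u: Vector, v: Vector) -> Vector:
    """Concatenate u, v and their element-wise absolute difference."""
    return u + v + [abs(a - b) for a, b in zip(u, v)]


# Two real tokens followed by one padding token that must be ignored.
u = mean_pool([[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]], [1, 1, 0])
features = pair_features(u, [3.0, 0.0])
```

With an embedding of size d, the classification head therefore sees 3d input features for each query and reply pair, before any metadata is added.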

The classification head is small, allowing us to quickly run inference on hundreds of options as long as we have already pre-calculated the embeddings for each reply.
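To show why scoring hundreds of cached replies is cheap, here is a simplified sketch in which the head is a single logistic layer over (u, v, |u - v|). This is an assumption for illustration only: the actual head, its weights, and the message type and source features it also uses are not described in detail in this post and are omitted here.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def score_pair(u: Vector, v: Vector, weights: Vector, bias: float) -> float:
    """Score a (query, reply) embedding pair between 0 and 1."""
    features = u + v + [abs(a - b) for a, b in zip(u, v)]
    logit = bias + sum(w * f for w, f in zip(weights, features))
    return sigmoid(logit)


def top_suggestions(
    query_vec: Vector,
    cached: Dict[str, Vector],
    weights: Vector,
    bias: float,
    k: int = 10,
) -> List[Tuple[float, str]]:
    """Score every cached reply and keep the k highest-scoring ones."""
    scored = [(score_pair(query_vec, vec, weights, bias), reply)
              for reply, vec in cached.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical weights that only penalize the |u - v| features.
weights = [0.0, 0.0, 0.0, 0.0, -1.0, -1.0]
cached = {"reply a": [1.0, 0.0], "reply b": [0.0, 1.0]}
best = top_suggestions([1.0, 0.0], cached, weights, bias=2.0, k=1)
```

Each score is a handful of multiplications on pre-computed vectors, so the per-request cost is dominated by encoding the query once.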

Retrieving rare replies

One weakness of the above approach is retrieving rare replies. Our cached set of recent responses is optimized for recent replies and variety, but does not capture every answer.

For example, one of our Fintech brands may get a customer question like:

We have just established a living trust and we would like to put our accounts into the trust. 

Here we have the answer in our database, but it is a much rarer question. A human agent would naturally use the search option to look for keywords. In this case, “trust” is the important word, and it is also a very context-dependent keyword.

Detecting keywords in the message

To improve our system, we give the model a second task: find keywords in the original query. We train it on sufficiently rare words that are shared between query and reply. For example, for the customer query “Do you sell watercolor postcards?” with the reply “Unfortunately we don’t sell watercolor postcards”, the keywords would include ‘watercolor’ and possibly ‘postcards’.
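One way such training targets could be built is sketched below. The rarity rule (a word counts as rare if it appears in at most 5% of messages), the tokenizer, and the function names are all assumptions for illustration; the post does not specify how "sufficiently rare" is defined.

```python
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z]+", text.lower())


def document_frequency(texts: List[str]) -> Counter:
    """Count in how many messages each word appears."""
    df: Counter = Counter()
    for text in texts:
        df.update(set(tokenize(text)))
    return df


def keyword_labels(query: str, reply: str, df: Counter, n_docs: int,
                   max_fraction: float = 0.05) -> Dict[str, int]:
    """Label a query word 1 if it also occurs in the reply and is rare overall."""
    reply_words = set(tokenize(reply))
    return {
        word: int(word in reply_words and df[word] / n_docs <= max_fraction)
        for word in tokenize(query)
    }


query = "Do you sell watercolor postcards?"
reply = "Unfortunately we don't sell watercolor postcards"
# Hypothetical corpus: many generic messages plus the example pair.
corpus = ["do you sell this item"] * 98 + [query, reply]
labels = keyword_labels(query, reply, document_frequency(corpus), len(corpus))
```

In this toy corpus 'sell' is shared between query and reply but far too common to be a keyword, while 'watercolor' and 'postcards' are both shared and rare.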

Training the model on both targets not only helps the model understand our data better, but the resulting keywords can be used to retrieve a few extra replies to score in real-time.

The results clearly show that the model does better than simply memorizing keywords: it can score the same word differently based on context, and it can identify words it has never seen before.

We have just established a living trust and we would like to put our accounts into the trust. 

{'established': 0.260, 'trust': 0.998}

I don’t trust this message, it might be phishing. How can I protect my account?

{'trust': 0.154, 'protect': 0.070}

Here we see the word ‘trust’ is scored very high in the first case, but not in the second.

Do you sell tear resistant business cards with zentangle patterns?

{'tear': 0.673, 'resistant': 0.438, 'zentangle': 0.473, 'patterns': 0.120}

Despite there being no mention of ‘zentangle’ in the entire data set, it is detected as a salient keyword to search for based on its context.
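Keyword scores like the ones above could then drive a simple search for extra candidates to score. The sketch below assumes a plain inverted index over the reply database, a score threshold of 0.5, and a few made-up replies; none of these details come from the actual system.

```python
import re
from collections import Counter, defaultdict
from typing import Dict, List, Set


def build_index(replies: List[str]) -> Dict[str, Set[int]]:
    """Map each word to the positions of the replies that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for i, reply in enumerate(replies):
        for word in set(re.findall(r"[a-z]+", reply.lower())):
            index[word].add(i)
    return index


def extra_candidates(keyword_scores: Dict[str, float], index: Dict[str, Set[int]],
                     replies: List[str], threshold: float = 0.5,
                     limit: int = 5) -> List[str]:
    """Return replies matching confident keywords, strongest matches first."""
    hits: Counter = Counter()
    for word, score in keyword_scores.items():
        if score >= threshold:
            for i in index.get(word, ()):
                hits[i] += score
    return [replies[i] for i, _ in hits.most_common(limit)]


# Hypothetical reply database for illustration.
replies = [
    "To move accounts into a living trust, please send us the trust documents.",
    "Please reset your password to protect your account.",
    "We never ask for your password by email.",
]
index = build_index(replies)
found = extra_candidates({"established": 0.260, "trust": 0.998}, index, replies)
none_found = extra_candidates({"trust": 0.154, "protect": 0.070}, index, replies)
```

With the scores from the first example, only 'trust' passes the threshold and the living-trust reply is retrieved; with the phishing message no keyword is confident enough, so no extra replies are added.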


We use 10 incorrect responses for each correct answer, selecting random replies which are sufficiently different from the true reply and from each other, and train the model to simply classify correct vs incorrect replies. This allows us to side-step the complexity of training on relative scores. 
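A minimal sketch of this kind of negative sampling follows. The similarity measure (Jaccard overlap of words) and the 0.5 cutoff are assumptions chosen for illustration; the post does not say how "sufficiently different" is measured.

```python
import random
from typing import List, Tuple


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two texts, from 0 to 1."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa and not wb:
        return 1.0
    return len(wa & wb) / len(wa | wb)


def sample_negatives(true_reply: str, pool: List[str], n: int = 10,
                     max_similarity: float = 0.5, seed: int = 0) -> List[str]:
    """Pick up to n random replies that differ from the true reply and each other."""
    rng = random.Random(seed)
    candidates = list(pool)
    rng.shuffle(candidates)
    chosen: List[str] = []
    for cand in candidates:
        if len(chosen) == n:
            break
        if all(jaccard(cand, other) <= max_similarity
               for other in [true_reply] + chosen):
            chosen.append(cand)
    return chosen


def training_rows(query: str, true_reply: str,
                  pool: List[str]) -> List[Tuple[str, str, int]]:
    """One positive row (label 1) followed by the sampled negatives (label 0)."""
    rows = [(query, true_reply, 1)]
    rows += [(query, neg, 0) for neg in sample_negatives(true_reply, pool)]
    return rows
```

Filtering out near-duplicates of the true reply matters because a reply that says the same thing in slightly different words would otherwise be labeled incorrect.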


The main complexity of the system is in choosing an appropriate set of messages to pre-calculate embeddings on, and to deploy a system with such a cache. Describing this in detail is beyond the scope of this post, but our solution uses a small cache service on Kubernetes in combination with a REST API on Cloud Run, to give a system which is scalable, responsive, and low cost.

Join our team

Chatdesk is always looking for talented individuals who are ready to make a difference. If you're passionate about improving customer experience, come join us!

If you’d like to pilot Chatdesk with your brand, you can schedule a demo here.
