AI Unpacked: How Your Business Can Benefit from Machine Learning
Before I get too deep into embeddings, let's acknowledge a few things about AI and machine learning. The first and most obvious is that they only work with numbers; more accurately, they are far more effective and efficient when working with numbers.
In 2019, I did 100 days of Machine Learning, and the thing that took 80% of the effort was finding and manipulating the data into a form that could be used to train and test a machine learning model.
Next, what is a neural network? It is a computational model inspired by the human brain’s structure. It consists of interconnected nodes (analogous to neurons) that process information in layers. Neural networks are a subset of machine learning (ML), a branch of artificial intelligence (AI).
In the AI/ML landscape, neural networks are especially popular for tasks like image and speech recognition. When they are applied to language tasks, such as translation or sentiment analysis, that work falls under the domain of Natural Language Processing (NLP).
Embedding is a technique associated with NLP in which words or phrases are converted into vectors of numbers, allowing the computer to understand and process language. By representing words as vectors, we can capture their meaning and their relationships with other words. This is crucial for tasks like text analysis, translation, and sentiment detection. Embeddings make it possible for machines to grasp the nuances of human language.
Embedding is mapping discrete or categorical variables to a vector of continuous numbers …
What does that actually mean, you ask?! 🙂
Well, a discrete or categorical variable can take on one of a limited set of values, like days of the week or colours of a rainbow. It’s like choosing a flavour at an ice cream shop; you pick one specific option from the available choices.
And “a vector of continuous numbers”?!
I have got you … 🙂
Think of a row of water bottles lined up on a table. Each bottle can be filled to any level, not just full or empty. One might be half-full, another 1/4 full, and another almost to the top. A vector of continuous numbers is like a list of the water levels in each bottle. It captures the exact amount in each bottle, even if it’s somewhere in between full and empty.
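Putting those two ideas together, here is a tiny sketch in Python. The colours and the numbers are made up purely to show the shape of the idea; real embeddings are learned by a model, not written by hand.

```python
# A toy illustration: mapping a discrete variable (a colour) to a
# vector of continuous numbers. These values are invented for the
# example, not learned embeddings.
colour_embeddings = {
    "red":    [0.91, 0.12, 0.05],
    "orange": [0.85, 0.45, 0.07],
    "blue":   [0.10, 0.20, 0.95],
}

# Each colour (one pick from a limited set, like the ice cream flavour)
# is now a list of "water levels": continuous values between 0 and 1.
print(colour_embeddings["red"])  # → [0.91, 0.12, 0.05]
```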
OK, so we have a collection of numbers, but what does that mean with regard to embeddings?
Imagine you visit a vast library with millions of books. Each book represents a unique category or topic. Now, if you had to describe each book’s content, it would be a lengthy process. Instead, what if you could summarise each book with a small set of keywords or a short summary? That would make things much more manageable!
So, each book represents a discrete or categorical variable. The set of keywords or the short summary for each book represents the embedding — a simpler, condensed representation of the book. Instead of describing the entire book, you use a vector (a list) of continuous numbers (the keywords or summary) to represent it; a book's genre is a simple example of such a summary.
When we talk about neural networks, summarising each book is like training a neural network to understand and represent each category meaningfully.
Doing this reduces the library’s vastness (or dimensionality) to just a set of keywords or summaries, making it easier to work with and draw insights from.
In essence, embeddings help us take something vast and complex (like the content of a book) and represent it in a simpler, more digestible format.
OK, hopefully you are still with me! 🙂
So, how do embeddings help us? Here is an example of a shallow (i.e. simple) embedding.
Say I have three words: "apple", "pear", and "hammer".
I convert all of these into embeddings, then supply a fourth word to see whether any of the three are similar to it. Let's take "food", for example. Through some maths (cosine similarity — the details are not important at this stage, just for interest), we can determine which words are similar to "food" by comparing their vectors, and it would return "apple" and "pear". Similarly, if I asked for "fruit" I would get the same result, but if I asked for "tool" I would only get "hammer" back.
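The comparison above can be sketched in a few lines of Python. The three-number vectors here are invented for illustration (real embeddings have hundreds or thousands of dimensions), but the cosine similarity maths is the real thing.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, invented so that the two fruits point
# in a similar direction and the tool points somewhere else entirely.
words = {
    "apple":  [0.90, 0.80, 0.10],
    "pear":   [0.85, 0.82, 0.12],
    "hammer": [0.10, 0.05, 0.95],
}
query = [0.88, 0.79, 0.15]  # pretend this is the embedding for "food"

# Rank the three words by similarity to the query.
ranked = sorted(words, key=lambda w: cosine_similarity(words[w], query),
                reverse=True)
print(ranked)  # "apple" and "pear" rank above "hammer"
```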
So now let's imagine a business with a large number of documents — support documentation, policies, compliance material and so on — and you want your team to be able to ask general questions about a topic this documentation covers and get a meaningful, "correct" answer back without having to trawl through every document… enter embeddings.
The concept is exactly the same: we “chunk up” the documents, let’s say into 200-word blocks, and turn them into embeddings. Then, a team member can ask a question. We turn the question into an embedding and then find the “chunks” similar to the question.
Because embeddings are vectors, we are able to visualise them. Below is such a visualisation:
Each dot on the graph represents a single "chunk" (200 words) of embedding data. The question was "Give me a detailed overview of your DR and BCP policies". The RED dot indicates the question's embedding. The BLUE dots represent the chunks closest (most similar) to my question.
Now sit and soak in that for a second …
What this means is that people can ask a question and get the "correct" answer back! Now, I can hear the questions flying in: "they are just chunks of info" and "they aren't useful in that form"!
If that were it, you would be correct.
So, what has been released in the last 6–12 months that has been the catalyst for so much innovation? Large Language Models, GPT in particular.
OpenAI has a very easy-to-use Embedding API endpoint to help you with embedding. You can then use GPT-4 to formulate a response.
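Here is a hedged sketch of those two calls using the `openai` Python library. The model names and client usage reflect one version of the SDK, which has changed over time, so treat this as a shape to follow rather than copy-paste code, and check the current API reference.

```python
def embed(text):
    """Turn a piece of text into an embedding via OpenAI's Embeddings API."""
    from openai import OpenAI  # imported here so the sketch stays optional
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.embeddings.create(
        model="text-embedding-ada-002",
        input=text,
    )
    return response.data[0].embedding

def answer(question, chunks):
    """Ask GPT-4 to formulate a response using the retrieved chunks."""
    from openai import OpenAI
    client = OpenAI()
    context = "\n\n".join(chunks)
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "Answer using only the provided context."},
            {"role": "user",
             "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content
```

In practice you would embed all your chunks once, store them, and then for each incoming question: embed the question, find the most similar chunks, and pass those chunks plus the question to `answer()`.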
So, to finish up, here is a basic prompt I have been playing with that has helped format the response from GPT-4.
I then inject the following:
In the coming weeks, I will share a little more on HOW this is done technically, but hopefully this has piqued your interest, and you can see the huge potential this could have for just about every business. There are already fully fledged services that do this for you out of the box, e.g. https://www.chatbase.co/ (disclaimer: I haven't tested this one). I was interested in building one myself so I could see the end-to-end process.
Stay tuned for the next article from my learnings on this 100 days of AI journey. 🙂
Originally published at https://justinhennessy.com.