
Watch the video I made of my first short look HERE

Mistral has updated their LLM chat interface called Le Chat, which I understand is French for The Cat.

Anyway, it is a chat interface to their generative artificial intelligence model, much like OpenAI's ChatGPT, and Anthropic's Claude. You can read more about the new capabilities of Le Chat HERE.

All you need is a free account (for now). I do a quick first test of Le Chat to look at how it produces simple Python code. Le Chat does not allow for the upload of a spreadsheet file, so instead, I tried to use it to solve a simple system of two linear equations in two unknowns.

I do stress test the chat interface a bit by using LaTeX in my prompts, and I also get it to use the symbolic Python package, sympy, a package that I absolutely love but that is not that commonly used in the broader context of Python use cases.

I copy and paste the code into Visual Studio Code (after having set up a virtual Python environment in which I installed numpy, matplotlib, and sympy beforehand).

Le Chat did a good job in this small first test. It generated the two lines in the plane using matplotlib to show the unique solution (at the intersection of the lines). It generated the augmented matrix as I instructed, but then solved the system of linear equations using the solve method in sympy. After instructing Le Chat to rather calculate the reduced row-echelon form of the matrix using the sympy rref method, it did indeed do so.
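For readers who want to replicate the test, the sketch below shows roughly the kind of code Le Chat produced. The specific system (x + 2y = 4 and 3x - y = 5) is a stand-in example of my own and not necessarily the one from the video.

    # A rough sketch of the kind of code Le Chat generated for this test.
    # The system x + 2y = 4 and 3x - y = 5 is a stand-in example of my own.
    import numpy as np
    import matplotlib.pyplot as plt
    from sympy import Matrix, Eq, solve, symbols

    x, y = symbols("x y")

    # Le Chat first solved the system symbolically with the solve function
    solution = solve([Eq(x + 2 * y, 4), Eq(3 * x - y, 5)], [x, y])
    print(solution)  # {x: 2, y: 1}

    # The augmented matrix of the system
    augmented = Matrix([[1, 2, 4],
                        [3, -1, 5]])

    # The reduced row-echelon form shows the same unique solution in the last column
    rref_matrix, pivot_columns = augmented.rref()
    print(rref_matrix)  # Matrix([[1, 0, 2], [0, 1, 1]])

    # Plot the two lines; their intersection is the solution (2, 1)
    xs = np.linspace(-1, 4, 100)
    plt.plot(xs, (4 - xs) / 2, label="x + 2y = 4")
    plt.plot(xs, 3 * xs - 5, label="3x - y = 5")
    plt.scatter([2], [1], color="red", zorder=3, label="solution (2, 1)")
    plt.legend()
    plt.show()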

Check out the new Le Chat for yourself HERE or watch the short video I made of my first test below.

An extension to the ChatGPT Desktop App for macOS

The GitHub Copilot extension in Visual Studio Code (VSC) provides a great pair-programming experience. This is especially true when using Jupyter notebooks and Python or R code blocks to generate short scripts (in code cells). This form of coding is great for research and teaching tasks.

The GitHub Copilot extension now also allows for the choice between OpenAI and Anthropic models, with Google models to follow soon. The GitHub Copilot extension opens in the right sidebar and allows us to chat with our favorite large language model (LLM). The image of the right sidebar in VSC below shows the GPT 4o model as the currently selected LLM.

The ChatGPT Desktop App for macOS can now also integrate with VSC (and other programming interfaces such as the Terminal and Xcode). This negates the need to go back and forth between the ChatGPT Desktop App and the coding environment.

The integration requires the installation of an extension which can be downloaded from the OpenAI instruction page HERE. Once the required file is downloaded, it must be installed as an extension in VSC. To do this, simply hold down the command and shift keys and hit the P key. This opens the Command Palette in VSC. Start typing Extensions: Install from VSIX… and select that command. Navigate to the downloaded file and install it.

When opening the ChatGPT Desktop App on a Mac, a new selection button will appear at the bottom of the page.

Connect to IDE

Clicking on this button will allow for the selection of code editors or terminal apps to integrate into your chat.

Add IDE

Now the ChatGPT Desktop App will be aware of the code being written in VSC, making for a better experience.

Anthropic releases their newest large language model, Claude 3, in three versions.

March 4 saw the release of Claude 3, the newest large language model (LLM) from Anthropic. Claude 3 competes with other LLMs such as GPT-4 in ChatGPT from OpenAI.

Claude 3 comes in three versions: Haiku (the smallest and fastest model, which is yet to be released as of this writing and is intended for use as a corporate chatbot and for other similar tasks), Sonnet (the free version, similar to GPT-3 from OpenAI), and Opus (the paid version). Opus is the largest model and unsurprisingly scores best on most benchmarks. It is not clear whether the benchmarks in the release notes compare Claude 3 Opus against GPT-4 or the newer GPT-4 Turbo. The benchmarks (and release notes) can be found HERE. Take a closer look; it makes for interesting reading.

An exciting advancement is the larger context window in Claude 3. Anthropic's models already have some of the largest context windows. The context window, measured in tokens, bounds the size of the input (which includes prompts and other data) together with the model's responses. As an example, a larger context window means that we can upload larger documents in our prompts. The model can interact with these documents when returning a response. It must be noted that the very largest context windows are only available to select customers.
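As a rough illustration of what "measured in tokens" means, the short snippet below counts tokens with OpenAI's open-source tiktoken library as a stand-in. Anthropic's models use their own tokenizer, so the exact counts for Claude 3 will differ, but the idea is the same.

    # Counting tokens with OpenAI's tiktoken library as a stand-in illustration.
    # Claude uses its own tokenizer, so the exact counts will differ.
    import tiktoken

    encoding = tiktoken.get_encoding("cl100k_base")
    text = "The context window is measured in tokens, not in characters or words."
    tokens = encoding.encode(text)
    print(len(text.split()), "words ->", len(tokens), "tokens")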

A noted problem with large input windows is the needle-in-a-haystack problem. This problem refers to an LLM's inability to recall information in the middle of large inputs. Companies have devised tests for this problem. Such a needle-in-a-haystack test verifies a model's recall ability by inserting a target sentence (the needle) into the corpus of one or more documents (the haystack) and then asking a question that can only be answered by using the information in the needle. Company officials note surprise at how good the new Claude 3 model is at this test. Claude can actually recognize that it is being tested in this way and can include this in its response.
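A toy sketch of how such a test can be constructed is shown below. The filler text, the needle sentence, and the question are all made up for illustration; real tests use documents that approach the model's full context window.

    # A toy illustration of constructing a needle-in-a-haystack test.
    # The filler text, needle, and question are invented for this sketch.
    import random

    filler = "The committee met again to review routine administrative matters. "
    haystack = filler * 2000  # stand-in for a very long document

    needle = "The secret access code for the archive is 7342. "
    question = "What is the secret access code for the archive?"

    # Insert the needle at a random position (recall is often weakest in the middle)
    position = random.randint(0, len(haystack))
    document = haystack[:position] + needle + haystack[position:]

    prompt = f"{document}\n\nUsing only the document above, answer: {question}"

    # The model passes if its response contains the fact from the needle (7342).
    print(len(prompt), "characters in the test prompt")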

A new computer chip to speed up large language model responses.
Groq company logo

Proprietary large language models (LLMs) such as GPT-4 and open-source models such as Llama 2 are trained on parallel processors such as the graphics processing units provided by Nvidia. Similar processing architectures are used during inference. Here, inference refers to the situation where the trained model is called upon to generate text in response to a prompt.

While parallel processors used in training are well-suited to the task of optimizing billions or even trillions of parameters, a different architecture is required for speedy inference. We have all experienced the slow typed response of OpenAI’s ChatGPT, Microsoft’s Copilot, and now Google’s Gemini interfaces.

Along comes the company Groq, with its language processing units (LPUs). Groq claims to make inference 10-100 times faster. Try this for yourself at the Groq home page. At the time of writing, Groq allows testers on their site to choose between Mixtral 8x7B-32k and Llama 2 70B-4k.

A quicker response time by LLMs greatly enhances their usability. It feels more natural and interactive. This CNN video gives us a quick glimpse.

We will see faster inference in the future, that is for sure. Perhaps we will even see such chips embedded in our own computers. LLMs are very large, though, so we will also need to see bigger storage and more memory.

This post is all about writing better prompts. A prompt is the input text that we write when chatting or otherwise communicating with a generative artificial intelligence model. In many cases, our default position is to be very terse when writing prompts. We have become accustomed to typing very short expressions into a search engine. This is not the case with generative models. They are designed to mimic communication and interaction with a human being. When we want detailed and precise responses from a real person that we are talking to, we are generally more exact in our own words. This is very much the approach we need to take when conversing with a generative model. We would all like generative artificial intelligence models such as ChatGPT to provide us with the perfect response. To do this, we really need to focus on our prompts. Our focus helps the model to return exactly what we need in a response or to set up the chat for a productive conversation.

I am your future ruler, uhm, I mean your friendly generative artificial intelligence model. What can I help you with?

Thinking about and writing proper prompts is now a recognized skill. Prompt engineering is the term that has developed to describe how to communicate effectively with generative models so that they understand our requirements and generate the best possible responses. A quick look at the job market now sees advertisements for prompt engineers. Some of these positions pay quite well.

Courses have been developed to teach prompt engineering and there are many quick tutorials on platforms such as YouTube. This post adds to the long list of help in writing better prompts. The field of generative artificial intelligence is changing at a rapid rate and I will no doubt return to this topic in the future. In this post, I take a quick look at some basic principles to keep in mind when writing a prompt. These principles are mainly used when we first initiate a new chat with a generative model. Subsequent prompts in a chat can indeed be more terse.

In short, a proper prompt should include the components in the list below. Take note that there is some order of importance (from most to least important) to this list and there is, at least to some extent, an overlap between the components.

  • The task that the generative artificial intelligence model should perform
  • The context in which the conversation is taking place
  • One or more examples pertaining to the prompt
  • The persona that the model should take on when responding
  • The format in which the model should respond
  • The tone of voice that the model should write in

We can think of constructing a prompt by writing content for the following placeholders.

[task] + [context] + [exemplar] + [persona] + [format] + [tone]

I would love it if you write your prompt like this. Sincerely, your generative AI model.

It is important to note that not every prompt needs all the information above. It is typical that only the prompt that initiates a chat should include as much information as possible. Below, we take a closer look at each of the components of a proper prompt.

It is usually a good idea to start the task with a verb. Examples include words such as Write ..., Create ..., Generate ..., Complete ... , Analyze ..., and so on. We should be as precise and direct as possible when writing the task. This is, after all, what we want the model to do. The task might refer to a single action, or indeed, many actions. An example might be the following sentence: Write the definition of the measures of central tendency and provide examples of such measures. This task starts with a verb and contains two instructions.

The context is not always easy to construct. In the case of our example we might add: I am a postgraduate student in public health and I am learning about biostatistics. This context can guide the generative model to return a response that can be used as learning material and that should be easier to understand than a more technical response. Additional information such as: I am new to biostatistics or This is the first time I am learning statistics or I am reviewing this content for exam preparation, can be added to the context. The sky is the limit here. This is not to say that we should overload the model with context; just enough to guide the model when generating the response usually works wonders.

The quality and accuracy of responses have been shown to increase decidedly when we include examples in the prompt. Continuing with our reference to measures of central tendency, we might add the following: My textbook includes measures of central tendency such as the arithmetic and geometric mean, the median, and the mode.

The persona helps the model to generate text in a specific framework. We might write the following: You are a University Professor teaching postgraduate level biostatistics. Clearly, the inclusion of this information should guide the model when generating its response. The response might be quite different if we add the following persona: You are a middle school teacher or even You are Jedi Master Yoda.

Describing the format allows us to guide how the result should be generated. We might want a simple paragraph of text explaining the measures of central tendency, a bullet-point list of the names of the measures and their definitions, or a table with columns for measure of central tendency, definition, and example. We have to envision how we want the final result to be formatted. Note that we can also include this information in the examples that we provide in the prompt. The format also ties in with the task. We might want the model to write an essay about the topic or create study notes.

The tone of voice is not always required. We might want to include a specific tone of voice if we plan to use the content generated by the model as our own personal study notes or as formal content for an assignment (given that we have permission to use a model to complete our work or stipulate that we used a model to complete the work). Here we might also mention that we prefer the first- or third-person perspective or even whether the response should be humorous or very formal.
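As a minimal sketch, the snippet below assembles a single prompt from the six components in the order given above, reusing the biostatistics example that runs through this post. The exact wording of each component is of course up to you.

    # A minimal sketch that assembles a prompt from the six components above,
    # reusing the biostatistics example from this post.
    task = ("Write the definition of the measures of central tendency "
            "and provide examples of such measures.")
    context = ("I am a postgraduate student in public health "
               "and I am learning about biostatistics.")
    exemplar = ("My textbook includes measures such as the arithmetic and "
                "geometric mean, the median, and the mode.")
    persona = "You are a University Professor teaching postgraduate level biostatistics."
    output_format = "Return a table with columns for the measure, its definition, and an example."
    tone = "Use a formal, third-person tone."

    prompt = " ".join([task, context, exemplar, persona, output_format, tone])
    print(prompt)  # paste this into your favorite chat interface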

In human communication we can infer from context, voice intonation, facial expressions, verbal interactions, and much more to attain the information we require. In the case of a generative artificial intelligence model, we have to attempt the same thing, but with our words only. We actually have a lot of practice with this, having moved much of our interactions to email and chat applications. Now we are just chatting with a model.