
Statistics. For many students, it may conjure images of complex formulas, abstract concepts, and daunting tables (depending, I suppose, on how old your professor is). Mastering statistics often comes with significant hurdles. Students frequently grapple with visualizing how probability density functions (PDFs) morph as their parameters change. The task of calculating critical values often involves cumbersome calculators or flipping through dense textbook appendices and can feel like a chore. Perhaps most crucially, translating these values into a clear understanding of rejection regions for hypothesis testing remains a persistent challenge.

What if there was a more intuitive way to engage with these concepts? What if creating interactive learning tools wasn't solely the domain of seasoned software developers? This is where the exciting potential of generative artificial intelligence (AI), and a new approach I call vibe coding, comes into play. In this post, I'll share how I used Google Gemini to effortlessly create a web application designed to help students visualize these very statistical challenges, all through the power of a detailed, natural language prompt.

For decades, instructing computers has meant learning their language. These languages include C++, Python, JavaScript, and a myriad of others. Mastering these languages presents a steep learning curve. Some might even argue that the inherent difficulty of learning to code signifies a fundamental limitation in how we've traditionally interacted with machines. Most students are on a path to become domain experts in their chosen fields, not necessarily computer scientists. So, how can they leverage the power of computation without diverting years to mastering complex syntax? Do you really want to learn .controls { background-color: #f8f9fa; padding: 20px; border-radius: 8px; margin-bottom: 20px; box-shadow: 0 2px 4px rgba(0,0,0,0.1); }?

Enter vibe coding, a term aptly coined by Andrej Karpathy. This isn't about learning a new programming language. Instead, it's about leveraging our language to instruct sophisticated AI models like Google Gemini. Google regularly updates its models, which consistently appear at the top of AI leaderboards, and one of their most impressive capabilities is code generation. With vibe coding, you describe the functionality, the appearance, and the behavior of what you want to create, and the AI translates that into functional code. This approach democratizes our ability to instruct computers, empowering anyone to build useful tools, like the statistics web app we're about to explore. It shifts the focus from how to code to what to create.

So, what did this vibe coding approach with Google Gemini actually produce? The result is a simple interactive web application designed to bring statistical distributions to life. The primary goal wasn't just to build an app, but to showcase how a complex, multi-faceted tool could be generated from a single, detailed natural language prompt.

The application consists of a central index.html page providing instructions and links to five distinct visualization pages. Each of these pages focuses on a specific probability distribution crucial for introductory statistics:

  1. The t-distribution (with variations for two-tailed, one-tailed lower, and one-tailed upper hypothesis tests)
  2. The chi-squared distribution
  3. The F-ratio distribution

On each distribution page, students can use intuitive sliders to adjust the parameters of the distribution (like degrees of freedom). A dropdown menu allows them to select the level of significance (alpha). Instantly, the curve of the probability density function updates, critical values are calculated and displayed, and, most importantly, the corresponding region(s) of rejection are visually highlighted on the plot. Each page also includes a simple link to return to the main index.

While the app itself is a valuable learning aid, its true significance in the context of this blog post is how it came to be: as the direct output of instructing Google Gemini. It serves as a tangible testament to the capabilities of modern generative AI in translating detailed human intent into working software.

The magic of vibe coding lies in the quality and detail of the prompt you provide to the AI. For this statistics web app, I didn't write a single line of HTML, CSS, or JavaScript myself. Instead, I crafted one comprehensive prompt for Google Gemini. This wasn't a vague request. It was a detailed blueprint outlining the entire application. A copy of the prompt is printed at the end of this post.

By providing this level of structured detail, I was essentially having a conversation with Gemini, clearly articulating the vibe and precise requirements of the final application.

So, what happened after feeding this meticulously crafted prompt to Google Gemini? The outcome was, in my opinion, great. In a single generation pass, Gemini delivered all six HTML files as requested. There was no iterative back-and-forth, no debugging cryptic error messages, no wrestling with partial outputs. The code was, for all intents and purposes, faultless right out of the gate.

Even more impressively, Gemini made intelligent choices for the underlying technologies. It opted to use Plotly.js for rendering the interactive charts and jStat for the necessary statistical calculations (like probability density functions and inverse distribution functions to find critical values). These are robust, industry-standard libraries, and Gemini integrated them seamlessly to meet the prompt's requirements. Full disclosure: I use Plotly in Python all the time, but I have never used the JavaScript version, and I have never used jStat.
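
To give a sense of how these two libraries work together, here is a minimal sketch of the kind of logic the two-tailed t page needs. To be clear, this is my own illustration rather than Gemini's generated code, and the element id "plot" and the function name are assumptions made for the example.

    // A minimal sketch, not Gemini's generated code: compute t critical values
    // with jStat and shade the rejection regions with Plotly.js.
    function drawTwoTailedT(df, alpha) {
      // Critical values come from the inverse CDF (quantile function) of the t distribution.
      const lower = jStat.studentt.inv(alpha / 2, df);
      const upper = jStat.studentt.inv(1 - alpha / 2, df);

      // Sample the probability density function on a fixed grid from -5 to 5.
      const xs = [], ys = [];
      for (let i = 0; i <= 1000; i++) {
        const x = -5 + i * 0.01;
        xs.push(x);
        ys.push(jStat.studentt.pdf(x, df));
      }

      // Helper: a trace that fills the area under the curve between lo and hi.
      const shade = (lo, hi) => {
        const sx = xs.filter(x => x >= lo && x <= hi);
        return {
          x: sx,
          y: sx.map(x => jStat.studentt.pdf(x, df)),
          mode: 'none',
          fill: 'tozeroy',
          fillcolor: 'rgba(255, 140, 0, 0.4)',
          showlegend: false
        };
      };

      const traces = [
        { x: xs, y: ys, mode: 'lines', name: 't PDF' },
        shade(-5, lower),                 // lower-tail rejection region
        shade(upper, 5),                  // upper-tail rejection region
        { x: [lower, upper], y: [0, 0],   // orange markers at the critical values
          mode: 'markers', marker: { color: 'orange', size: 10 }, name: 'critical values' }
      ];

      Plotly.react('plot', traces, {
        xaxis: { range: [-5, 5] },
        yaxis: { range: [0, 0.45] }
      });

      return { lower, upper };
    }

The inverse CDF is exactly the textbook definition of a critical value for a given level of significance, which is why a single jStat call per tail is all the statistical machinery the page requires.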

The sliders for degrees of freedom and other parameters worked perfectly. The dropdowns for significance levels functioned as specified. The plots dynamically updated, the critical values were accurately calculated and displayed, and the rejection regions were correctly shaded. The web app performed exactly as intended by the prompt.
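
The dynamic behavior described above boils down to listening for changes on the slider and the dropdown and redrawing the plot. Again, this is a hedged sketch with placeholder element ids ("df-slider", "alpha-select", "lower-value", "upper-value"), not the markup Gemini produced.

    // Hypothetical wiring of the controls to the drawing function sketched above.
    function update() {
      const df = Number(document.getElementById('df-slider').value);
      const alpha = Number(document.getElementById('alpha-select').value);
      const { lower, upper } = drawTwoTailedT(df, alpha);
      document.getElementById('lower-value').textContent = lower.toFixed(3);
      document.getElementById('upper-value').textContent = upper.toFixed(3);
    }

    document.getElementById('df-slider').addEventListener('input', update);
    document.getElementById('alpha-select').addEventListener('change', update);
    update();  // draw the initial plot with the default settings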

Seeing a complex set of instructions, spanning multiple interconnected files and involving sophisticated visual and mathematical logic, translated so perfectly into a functional application was a genuine wow moment. It underscored the immense potential of generative AI to act as a highly capable development partner.

This experience wasn't just about getting code. It was about witnessing an AI understand and execute a detailed vision, saving countless hours of traditional development, coding, and debugging. It points towards a significant shift in how we can create and interact with digital tools, especially in education.

Subject matter experts, like statistics instructors, can now envision and bring to life custom learning tools without needing extensive programming knowledge or relying on stretched developer resources. Even students with a clear idea for a helpful utility could potentially prototype it.

Imagine needing a slight variation of a tool, or a new visualization for a different concept. With vibe coding, generating or modifying such tools can be orders of magnitude faster than traditional development cycles. An idea can go from concept to functional prototype in remarkably short order.

For me, this opens the door to highly personalized learning experiences. Tools can be quickly tailored to specific curriculum needs, individual student challenges, or unique datasets, making learning more relevant and effective.

So, I encourage you to explore this yourself. If you have an idea for a tool, a visualization, or an application, try vibe coding it with a generative AI like Google Gemini. You might be surprised at what you can create. This isn't just about finding new ways to make games or simple scripts. It's about fundamentally changing how we interact with technology and empowering a new wave of creators and innovators across all fields.

    The full prompt

    Create the following six standalone webpages.

    1. `index.html`
    2. `tTwoTailed.html`
    3. `tOneTailedLower.html`
    4. `tOneTailedUpper.html`
    5. `chiSquared.html`
    6. `FRatio.html`

    Use the following instructions for each page.

    `index.html` should be have a white background. Add the title "VISUALIZING DISTRIBUTIONS" in black bold text in the right upper corner. Add a box below this with a blue-grey background. Add the subtitle "PDF Curve. Critical Value. Region of Rejection" in bold white text at the top left of the box. Add the text "Instructions" in bold white text below the subtitle in the box. Add the text "To view a distribution with it critical value(s) and region(s) of rejection, choose from the links below. Each link opens a webpage that shows a curve of the appropriate probability density function. Drag the required sliders to set the value of the parameter(s). Choose from 0.01, 0.05, and 0.1 to set the level of significance" below this in the box. Add an orange box at the left bottom of the larger blue-grey box. Fill it with bold white text that reads "Choose Below". Add the following links below the blue-grey box and link to the appropriate pages.

    1. t distribution with a two-tailed alternative hypothesis
    2. t distribution with a one-tailed alternative hypothesis (lower tail)
    3. t distribution with a one-tailed alternative hypothesis (upper tail)
    4. chi-squared distribution
    5. F-ratio distribution

    `tTwoTailed.html` page is a white webpage. Add the title "t distribution". Add the subtitle "Two-tailed alternative hypothesis". Add a slider with the title "Degrees of Freedom". Set the minimum to 10 and the maximum to 100 in steps of 1. Add a drop-down box below the slider with the title "Level of Significance". Make the choices 0.01, 0.05, and 0.1. Let 0.05 be the default. Draw the curve of the probability density function below this. Keep the horizontal axis limits from -5 to 5 and the vertical axis limits from 0 to 0.45 constant at all times. Add small orange markers on the horizontal axis at the two values for the critical values calculated from the degrees of freedom and level of significance. Fill the area under the curve from -5 to the lower-tail critical value and from the upper-tail critical value to +5. Add a textbox below the curve with the title "Lower-Tail Critical Value:" and show the lower-tail critical value to three decimal places. Below this add a textbox with the title "Upper-Tail Critical Value:" and show the upper-tail critical value. Update the curve, the markers, the filled areas, and the critical values dynamically based on the values of the degrees of freedom slider and the choice of level of significance. Add a link back to `index.html` below with the title "...back to the home page".

    `tOneTailedLower.html` page is a white webpage. Add the title "t distribution". Add the subtitle "One-tailed alternative hypothesis (lower)". Add a slider with the title "Degrees of Freedom". Set the minimum to 10 and the maximum to 100 in steps of 1. Add a drop-down box below the slider with the title "Level of Significance". Make the choices 0.01, 0.05, and 0.1. Let 0.05 be the default. Draw the curve of the probability density function below this. Keep the horizontal axis limits from -5 to 5 and the vertical axis limits from 0 to 0.45 constant at all times. Add a small orange marker on the horizontal axis at the value for the critical values calculated from the degrees of freedom and level of significance. Fill the area under the curve from -5 to the lower-tail critical value. Add a textbox below the curve with the title "Lower-Tail Critical Value:" and show the lower-tail critical value to three decimal places. Update the curve, the marker, the filled area, and the critical value dynamically based on the values of the degrees of freedom slider and the choice of level of significance. Add a link back to `index.html` below with the title "...back to the home page".

    `tOneTailedUpper.html` page is a white webpage. Add the title "t distribution". Add the subtitle "One-tailed alternative hypothesis (upper)". Add a slider with the title "Degrees of Freedom". Set the minimum to 10 and the maximum to 100 in steps of 1. Add a drop-down box below the slider with the title "Level of Significance". Make the choices 0.01, 0.05, and 0.1. Let 0.05 be the default. Draw the curve of the probability density function below this. Keep the horizontal axis limits from -5 to 5 and the vertical axis limits from 0 to 0.45 constant at all times. Add a small orange marker on the horizontal axis at the value for the critical value calculated from the degrees of freedom and level of significance. Fill the area under the curve from the upper-tail critical value to +5. Add a textbox below the curve with the title "Upper-Tail Critical Value:" and show the upper-tail critical value to three decimal places. Update the curve, the marker, the filled area, and the critical value dynamically based on the values of the degrees of freedom slider and the choice of level of significance. Add a link back to `index.html` below with the title "...back to the home page".

    `chiSquared.html` page is a white webpage. Add the title "chi-squared distribution". Add a slider with the title "Degrees of Freedom". Set the minimum to 1 and the maximum to 8 in steps of 1. Add a drop-down box below the slider with the title "Level of Significance". Make the choices 0.01, 0.05, and 0.1. Let 0.05 be the default. Draw the curve of the probability density function below this. Keep the horizontal axis limits from 0 to 20 and the vertical axis limits from 0 to 0.35 constant at all times. Add a small orange marker on the horizontal axis at the value for the critical value calculated from the degrees of freedom and level of significance. Fill the area under the curve from the critical value to 20. Add a textbox below the curve with the title "Critical Value:" and show the critical value to three decimal places. Update the curve, the marker, the filled area, and the critical value dynamically based on the values of the degrees of freedom slider and the choice of level of significance. Add a link back to `index.html` below with the title "...back to the home page".

    `FRatio.html` page is a white webpage. Add the title "F-ratio distribution". Add a slider with the title "Numerator Degrees of Freedom". Set the minimum to 1 and the maximum to 5 in steps of 1. Below this add a slider with the title "Denominator Degrees of Freedom". Set the minimum to 15 and the maximum to 148, in steps of 1. Add a drop-down box below the slider with the title "Level of Significance". Make the choices 0.01, 0.05, and 0.1. Let 0.05 be the default. Draw the curve of the probability density function below this. Keep the horizontal axis limits from 0 to 10 and the vertical axis limits from 0 to 0.8 constant at all times. Add a small orange marker on the horizontal axis at the value for the critical value calculated from the degrees of freedom and level of significance. Fill the area under the curve from the critical value to 10. Add a textbox below the curve with the title "Critical Value:" and show the critical value to three decimal places. Update the curve, the marker, the filled area, and the critical value dynamically based on the values of the degrees of freedom slider and the choice of level of significance. Add a link back to `index.html` below with the title "...back to the home page".

    Just wasting some time on a Saturday morning

    stoichiometry

    I am not sure why, but I suddenly thought about stoichiometry (balancing chemical reactions) this morning. Surely it would make for a quick comparison between o1 and Claude 3.5 Sonnet.

    So, that is just what I did. I do have paid access to both ChatGPT and Claude. My simple prompt to both models read: Show me how I can use a matrix and Gauss-Jordan elimination to solve stoichiometry problems.

    I share some screenshots and impressions below.

    We can start with 3.5 Sonnet. The response started well.

    First response from Claude.

    The chemical equation that the 3.5 Sonnet model came up with tried to balance the reaction between iron oxide and carbon monoxide, which produces iron and carbon dioxide, a redox reaction in which the carbon monoxide is oxidized and the iron is reduced.

    3.5 Sonnet added variables and generated a set of three equations in four unknowns. It then proceeded to generate the correct augmented matrix. So far so good.
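
    To make the setup concrete, the system being described looks something like the following, with unknown coefficients a Fe2O3 + b CO → c Fe + d CO2. This is my reconstruction of the standard approach, not 3.5 Sonnet's verbatim output.

        \begin{aligned}
        \text{Fe:} \quad & 2a - c = 0 \\
        \text{O:}  \quad & 3a + b - 2d = 0 \\
        \text{C:}  \quad & b - d = 0
        \end{aligned}
        \qquad
        \left[\begin{array}{cccc|c}
        2 & 0 & -1 & 0  & 0 \\
        3 & 1 & 0  & -2 & 0 \\
        0 & 1 & 0  & -1 & 0
        \end{array}\right]

    Treating d as the free variable gives a = d/3, b = d, and c = 2d/3; choosing d = 3 yields the balanced equation Fe2O3 + 3 CO → 2 Fe + 3 CO2. It is this final step, reading the solution off the reduced matrix, that 3.5 Sonnet stumbled on.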

    Correct elementary row operations followed and the reduced row-echelon form was produced. Unfortunately, that's where the errors started. The 3.5 Sonnet model could not interpret the results from the final matrix as can be seen in the next image.

    Wrong interpretation of the results from the reduced row-echelon form of the matrix.

    I pointed out the error and the model then produced the correct result, although it still made an error with the equations using the least common multiple (bottom-middle of the image below).

    The o1 model from OpenAI fared much better, creating a reaction of propane and oxygen.

    o1 propane example.

    The o1 model set up the correct system of equations and created the augmented matrix. It performed elementary row operations and stopped when the matrix was in row-echelon form. The interpretation was correct as can be seen in the image below.
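
    For comparison, the standard system for the propane reaction, with coefficients a C3H8 + b O2 → c CO2 + d H2O, is the following (again my own reconstruction, not o1's verbatim output):

        \begin{aligned}
        \text{C:} \quad & 3a - c = 0 \\
        \text{H:} \quad & 8a - 2d = 0 \\
        \text{O:} \quad & 2b - 2c - d = 0
        \end{aligned}

    Setting a = 1 gives c = 3, d = 4, and b = 5, that is, C3H8 + 5 O2 → 3 CO2 + 4 H2O, which is the balanced equation the correct interpretation leads to.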

    Correct interpretation.

    This was just a single, simple comparison that added to my bias. I much prefer the o1 model from OpenAI, especially for coding and mathematics.

    An extension to the ChatGPT Desktop App for macOS

    The GitHub Copilot extension in Visual Studio Code (VSC) provides a great pair-programming experience. This is especially true when using Jupyter notebooks and Python or R code blocks to generate short scripts (in code cells). This form of coding is great for research and teaching tasks.

    The GitHub Copilot extension now also allows for the choice between OpenAI and Anthropic models, with Google models to follow soon. The GitHub Copilot extension opens in the right sidebar and allows us to chat with our favorite large language model (LLM). The image of the right sidebar in VSC below shows the GPT-4o model as the currently selected LLM.

    The ChatGPT Desktop App for macOS can now also integrate with VSC (and other programming interfaces such as the Terminal and Xcode). This negates the need to go back and forth between the ChatGPT Desktop App and the coding environment.

    The integration requires the installation of an extension, which can be downloaded from the OpenAI instruction page HERE. Once the required file is downloaded, it must be installed as an extension in VSC. To do this, simply hold down the Command and Shift keys and hit the P key. This opens the Command Palette in VSC. Start typing Extensions: Install from VSIX… and select that command. Navigate to the downloaded file and install it.

    When opening the ChatGPT Desktop App on a Mac, a new selection button will appear at the bottom of the page.

    Connect to IDE

    Clicking on this button allows for the selection of code editors or terminal apps to integrate into your chat.

    Add IDE

    Now the ChatGPT Desktop App will be aware of the code being written in VSC, making for a better experience.

    GENERATIVE AI

    A Public Health Data Science* Perspective and Personal Opinion Statement

    *Data Science used in its broadest sense to include data collection, data management, (statistical) analysis, visualization, presentation, dissemination, and more.

    Human beings have long sought the power of intelligence from machines. From failed initial attempts such as expert decision systems, we have arrived at the current revolution driven by the mathematical function approach of machine learning.

    Machine learning itself has progressed dramatically. Today, we stand at the dawn of the age of generative artificial intelligence (GenAI) models. While virtually unknown just last Christmas, GenAI products such as ChatGPT are now household names. While previous machine learning techniques have had niche successes (and continue to do so - think self-driving cars), GenAI is different. It integrates with our normal day-to-day activities. We can communicate with it in our own language. We get answers in our own language. It assists with our daily tasks, makes our lives easier, and the bar to entry for its use is very low. Simply consider the way that it is revolutionizing our web searches. No longer do we type terse sentences into a search engine text box only to get a million links that we have to sort through. Now we type a real sentence and get a real answer.

    GenAI models such as OpenAI’s ChatGPT and Google’s Bard are generative pre-trained transformer (GPT) models. They are large language models (LLMs), having been trained on enormous sets of data, be it written text on the internet, code, and more. They function by the simple concept of predicting every subsequent word in a sentence. LLMs perform so well that their responses are coherent enough to give the illusion of intelligence.

    Precisely because of this illusion of intelligence, combined with their general-purpose nature, they have found their way into so much of our daily lives. Interacting with web searches is but one example. GPT models can write essays, answer emails, create recipes, and so much more. Pertaining to our own domain, we have to state that they excel at working with data and at teaching. In other words, they excel at the core of our academic enterprises of teaching and research.

    Our first task is to accept our new reality. The proverbial horses have bolted. They will never be put back in the stable. 

    Our second task is to stop fearing artificial intelligence in general. While it is prudent to look to the future and safeguard that future, we should not be overwhelmed by fear. After all, we have to admit that people once feared the first motor car. To be sure, the motor vehicle has killed, and will continue to kill, human beings through road traffic accidents, but we cannot deny what it has meant to our society to be mobile. Humans have a long tradition of fearing the new. We have uncountable examples of how fear is weaponized to influence and control us. It happens to this day.

    As motorized and electrified transportation, modern medicine, communication devices, and so much more have benefitted us, despite the shortcomings of each and every example that can be added to this list, so it must be with GenAI. Instead of ignoring it or trying to ban it, we should embrace it, manage it, and use it to our advantage.

    The pace of evolution and revolution is staggering. It is not long ago that the term Data Science entered our collective awareness. Whereas probability and statistics are mature sciences, they are now the purview of the much wider world of data science. We have only recently introduced data science courses and programs into our academic teaching pursuits, and here we are, having to revisit what we are still busy creating, by having to incorporate GenAI models.

    Modern applied statistics (read biostatistics), such as is used in our School, is taught using software. We use software both in our teaching and actively in our research. The aim of our teaching efforts is to prepare our students for a modern working environment. That data science environment, inclusive of the software used, will without doubt make use of GenAI. With GenAI set to be a full component of the data science pipeline, we are compelled to integrate it into our teaching.

    As a brand new component of data science, and indeed of our daily lives, it would be impossible to lay out a complete plan at this time. What is clear, though, is that GenAI models excel at simplifying connected processes and at generating code, both tasks that are central to data science. As such, they allow us to focus on the tasks at hand instead of the minutiae of how to do the task. As a simple example, we can consider exploratory data analysis. It is today a trivial task to upload tabular data to OpenAI and ask ChatGPT to conduct a full set of exploratory data analyses and data visualizations. Instead of having to learn the intricacies of performing these tasks and writing the code, we can instead concentrate on the information that the model produces. We remove the mundane tasks and replace them with our natural ability to use spoken language (which, for now, we unfortunately have to type). The process extends naturally to statistical testing and modeling, to model interpretation, and to the presentation of results. GenAI models can do all of this, including generating reports and summaries. They are the consummate research assistant.

    If GenAI models can be the consummate research assistant, then they can be the consummate assistant for curriculum design and educational resource design, and serve as a general teaching assistant. It is this last task that is perhaps the most exciting. Assuming the constructivist theory of learning, which postulates that students construct versions of knowledge by building on pre-existing knowledge and experiences, rather than a behaviorist (change in behavior due to stimulus) or a cognitivist (instructional) approach, we can use GenAI to allow students to explore a knowledge space in a natural as opposed to prescribed way. The instantaneous response and always-available nature of GenAI allows a learner to engage with the knowledge space when and where they want and in a way that naturally occurs to them. They can ask follow-up questions, view the results, and repeat this process until their curiosity is satiated. This cannot be mimicked by fixed educational materials and overburdened faculty.

    There are caveats to take cognizance of. First and foremost is the fact that some pre-existing knowledge is required to understand the responses of GenAI in the first place. Code generation stands as an example. GenAI models can produce code which can be copied into a coding application. With no knowledge of coding, a user will not understand the code, how it can be changed to be more efficient, or how to fix problems. Fortunately, GenAI models are excellent at explaining code and are the ideal tool for learning computer languages. In fact, it can be argued that they excel at it. We also have to admit that GenAI models are much more responsive, in fact infinitely more responsive, than the fixed written word of textbooks and other written or pre-recorded material.

    We also have to recognize the problem of hallucinations. They make mistakes. Perhaps we over-emphasize this problem by subconsciously believing that human teachers make no mistakes at all. Even so, the responses of GenAI models are not peer-reviewed and they are not edited by a production team at a large publishing house. The real human-in-the-loop cannot be ignored here. The act of teaching still requires the active presence and involvement of a teacher. GenAI, though, is, as argued before, the ideal assistant in the task of education.

    Another problem that we have to deal with is the use of GenAI models to cheat the system of assessment. Assessment is core to our academic enterprise. At times, we have to look at our own faults first, though. To some extent, much of academia has automated the process of assessment. Most of us are too overwhelmed with the tasks of being an academic to pay full and undivided attention to the level and adequacy of the knowledge gained by our students. Instead, we have designed exams that are divorced from the real world and are mere high-stakes hurdles that a student must navigate to prove the success of our system of education. Now, more so than ever before, we are tasked with improving our understanding of the knowledge level of our students. Many teachers, schools, and universities have done just this, incorporating individual interactions with students for continued assessment, encouraging critical thinking and creativity, and above all, ethical awareness. Such systems, which we have already implemented, must be recognized and applauded.

    Automated policing, while admirable, is not the solution. It can be argued, in fact, to be a zero-sum game. As techniques are developed to identify the output of GenAI models, so systems are developed to overcome this detection. It might very well be a never-ending race. Added to this is the problem of wrongly flagging human work as AI-generated. The repercussions of such false positives must be considered and may be devastating, for students and for teachers, and even for researchers.

    It is only by engaging with GenAI models in our teaching and learning that we will discover their true potential.

    Lorena Barba, Professor of Mechanical and Aerospace Engineering at The George Washington University, has written a tutorial on the use of generative artificial intelligence models in JupyterLab. Please follow the link below to view the instructive tutorial.

    Using Jupyter AI

    Many computer programs (integrated development environments) are available for writing code. Chief among these are the Jupyter environments. JupyterLab has arguably become the de facto standard software to use for your Python code. Now, you can link your account with providers of generative artificial intelligence models such as OpenAI (ChatGPT) or Anthropic (Claude) when using JupyterLab. You will get all the power of these models as chat agents and as coding assistants, right at your fingertips.

    This post is all about writing better prompts. A prompt is the input text that we write when chatting or otherwise communicating with a generative artificial intelligence model. In many cases, our default position is to be very terse when writing prompts. We have become accustomed to typing very short expressions into a search engine. This is not the case with generative models. They are designed to mimic communication and interaction with a human being. When we want detailed and precise responses from a real person that we are talking to, we are generally more exact in our own words. This is very much the approach we need to take when conversing with a generative model. We would all like generative artificial intelligence models such as ChatGPT to provide us with the perfect response. To do this, we really need to focus on our prompts. Our focus helps the model to return exactly what we need in a response or to set up the chat for a productive conversation.

    I am your future ruler, uhm, I mean your friendly generative artificial intelligence model. What can I help you with?

    Thinking about and writing proper prompts is now a recognized skill. Prompt engineering is the term that has developed to describe how to communicate effectively with generative models so that they understand our requirements and generate the best possible responses. A quick look at the job market now shows advertisements for prompt engineers. Some of these positions pay quite well.

    Courses have been developed to teach prompt engineering and there are many quick tutorials on platforms such as YouTube. This post adds to the long list of help in writing better prompts. The field of generative artificial intelligence is changing at a rapid rate and I will no doubt return to this topic in the future. In this post, I take a quick look at some basic principles to keep in mind when writing a prompt. These principles are mainly used when we first initiate a new chat with a generative model. Subsequent prompts in a chat can indeed be more terse.

    In short, a proper prompt should include the components in the list below. Take note that there is some order of importance (from most to least important) to this list and there is, at least to some extent, an overlap between the components.

    • The task that the generative artificial intelligence model should perform
    • The context in which the conversation is taking place
    • One or more examples pertaining to the prompt
    • The persona that the model should take on when responding
    • The format in which the model should respond
    • The tone of voice that the model should write in

    We can think of constructing a prompt by writing content for the following placeholders.

    [task] + [context] + [exemplar] + [persona] + [format] + [tone]

    I would love it if you write your prompt like this. Sincerely, your generative AI model.

    It is important to note that not every prompt needs all the information above. It is typical that only the prompt that initiates a chat should include as much information as possible. Below, we take a closer look at each of the components of a proper prompt.

    It is usually a good idea to start the task with a verb. Examples include words such as Write ..., Create ..., Generate ..., Complete ... , Analyze ..., and so on. We should be as precise and direct as possible when writing the task. This is, after all, what we want the model to do. The task might refer to a single action, or indeed, many actions. An example might be the following sentence: Write the definition of the measures of central tendency and provide examples of such measures. This task starts with a verb and contains two instructions.

    The context is not always easy to construct. In the case of our example we might add: I am a postgraduate student in public health and I am learning about biostatistics. This context can guide the generative model to return a response that can be used as learning material that should be easier to understand than a more technical response. Additional information such as: I am new to biostatistics or This is the first time I am learning statistics or I am reviewing this content for exam preparation, can be added to the context. The sky is the limit here. This is not to say that we should overload the model with context. Just enough to guide the model when generating the response usually works wonders.

    The quality and accuracy of responses have been shown to increase decidedly when we include examples in the prompt. Continuing with our reference to measures of central tendency, we might add the following: My textbook includes measures of central tendency such as the arithmetic and geometric mean, the median, and the mode.

    The persona helps the model to generate text in a specific framework. We might write the following: You are a University Professor teaching postgraduate level biostatistics. Clearly, the inclusion of this information should guide the model when generating its response. The response might be quite different if we add the following persona: You are a middle school teacher or even You are Jedi Master Yoda.

    Describing the format allows us to guide how the result should be generated. We might want a simple paragraph of text explaining the measures of central tendency or a bullet-point list of the names of the measures of central tendency and their definitions or a table with columns for measure of central tendency, definition, and example. We have to envision how we want the final result to be formatted. Note that we can also include this information in the examples that we provide in the prompt. The format also ties in with the task. We might want the model to write an essay about the topic or create study notes.

    The tone of voice is not always required. We might want to include a specific tone of voice if we plan to use the content generated by the model as our own personal study notes or as formal content for an assignment (given that we have permission to use a model to complete our work or stipulate that we used a model to complete the work). Here we might also mention that we prefer the first-person or third-person perspective, or even whether the response should be humorous or very formal.
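
    Putting the components together for the running example, a complete opening prompt might be assembled as in the short sketch below. This is purely illustrative: the persona, task, context, and exemplar sentences are the ones used above, while the format and tone lines are my own additions to complete the pattern.

        // Illustrative only: assemble one prompt from the components discussed above.
        const prompt = [
          'You are a University Professor teaching postgraduate level biostatistics.',                                               // persona
          'Write the definition of the measures of central tendency and provide examples of such measures.',                         // task
          'I am a postgraduate student in public health and I am learning about biostatistics.',                                     // context
          'My textbook includes measures of central tendency such as the arithmetic and geometric mean, the median, and the mode.', // exemplar
          'Present the result as a table with columns for measure, definition, and example.',                                       // format
          'Respond in a formal tone.'                                                                                               // tone
        ].join(' ');

        console.log(prompt);  // the assembled prompt, ready to paste into a chat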

    In human communication we can draw on context, voice intonation, facial expressions, verbal interactions, and much more to obtain the information we require. In the case of a generative artificial intelligence model, we have to attempt the same thing, but with our words only. We actually have a lot of practice with this, having moved much of our interactions to email and chat applications. Now we are just chatting with a model.