This part of our course content covers the intricacies of prompting and prompt engineering for LLMs. Prompting is the technique of crafting precise instructions to elicit specific responses from LLMs, crucial for their effective use.
Prompt engineering is an evolving discipline aimed at optimizing these prompts to enhance model performance across various tasks. The importance of prompting lies in its ability to guide LLMs towards producing contextually appropriate and accurate outputs, leveraging their training and understanding of language patterns.
The content also touches on the challenges, including potential biases and the risk of hallucination, and the need for techniques to detect and mitigate such issues.
Additionally, we briefly go over tools and advanced methods developed for prompt engineering, underscoring the dynamic and collaborative nature of this field in harnessing LLM capabilities.
"prompting" refers to the art and science of formulating precise instructions or queries provided to the model to generate desired outputs. It's the input—typically in the form of text—that users present to the language model to elicit specific responses. The effectiveness of a prompt lies in its ability to guide the model's understanding and generate outputs aligned with user expectations.
Image Source: [zapier.com/blog/prompt-engineering](<https://zapier.com/blog/prompt-engineering/>)
Large language models are trained through a process called unsupervised learning on vast amounts of diverse text data. During training, the model learns to predict the next word in a sentence based on the context provided by the preceding words. This process allows the model to capture grammar, facts, reasoning abilities, and even some aspects of common sense.
Prompting is a crucial aspect of using these models effectively. Here's why prompting LLMs the right way is essential:
To summarize, the training of LLMs involves learning from massive datasets, and prompting is the means by which users guide these models to produce useful, relevant, and policy-compliant responses. It's a collaborative process where users and models work together to achieve the desired outcome. There's also a growing field called adversarial prompting, which involves intentionally crafting prompts to exploit weaknesses or biases in a language model, with the goal of generating responses that are misleading, inappropriate, or that showcase the model's limitations. Safeguarding models from producing harmful responses is an open challenge and an active research area.
The basic principles of prompting involve the inclusion of specific elements tailored to the task at hand. These elements include:
Here's an example prompt for a text classification task:
Prompt:
Classify the text into neutral, negative, or positive
Text: I think the food was okay.
Sentiment:
In this example:
Note that this example doesn't explicitly use context, but context can also be incorporated into the prompt to provide additional information that aids the model in understanding the task better.
It's important to highlight that not all four elements are always necessary for a prompt, and the format can vary based on the specific task. The key is to structure prompts in a way that effectively communicates the user's intent and guides the model to produce relevant and accurate responses.
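As a concrete illustration, here is a minimal sketch that sends the classification prompt above to a chat model through the OpenAI Python SDK. The model name is a placeholder; substitute whichever model and client setup you actually use.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Instruction + input data; the trailing "Sentiment:" acts as the output indicator.
prompt = (
    "Classify the text into neutral, negative, or positive\n"
    "Text: I think the food was okay.\n"
    "Sentiment:"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
    temperature=0,  # near-deterministic output is usually preferable for classification
)

print(response.choices[0].message.content)  # e.g. "Neutral"
```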
OpenAI has published guidelines on best practices for prompt engineering with the OpenAI API. For a detailed understanding, you can explore the guidelines here; the points below give a brief summary:
💡It’s also important to note that crafting effective prompts is an iterative process, and you may need to experiment to find the most suitable approach for your specific use case. Prompt patterns may also be specific to a model and how it was trained (architecture, training datasets, etc.).
Explore these examples of prompts to gain a better understanding of how to craft effective prompts in different use-cases.
Prompting techniques constitute a rapidly evolving area of research, with researchers continually exploring novel methods to prompt models for optimal performance. The simplest forms are zero-shot prompting, where only instructions are provided, and few-shot prompting, where a handful of examples are given for the LLM to imitate. More intricate techniques are described in various research papers. While the list provided here is not exhaustive, existing prompting methods can be tentatively grouped into high-level categories. These categories are derived from current techniques and are neither definitive nor fixed; they will evolve along with the field. It's also worth highlighting that many methods fall into more than one category, combining characteristics to gain the benefits of each.
These methods involve breaking down complex problems into smaller, manageable steps, facilitating a structured approach to problem-solving. These methods guide the LLM through a sequence of intermediate steps, allowing it to focus on solving one step at a time rather than tackling the entire problem in a single step. This approach enhances the reasoning abilities of LLMs and is particularly useful for tasks requiring multi-step thinking.
Examples of methods falling under this category include:
Chain-of-Thought (CoT) Prompting is a technique to enhance complex reasoning capabilities through intermediate reasoning steps. This method involves providing a sequence of reasoning steps that guide a large language model (LLM) through a problem, allowing it to focus on solving one step at a time.
In the provided example below, the prompt involves evaluating whether the sum of odd numbers in a given group is an even number. The LLM is guided to reason through each example step by step, providing intermediate reasoning before arriving at the final answer. The output shows that the model successfully solves the problem by considering the odd numbers and their sums.
Image Source: [Wei et al. (2022)](<https://arxiv.org/abs/2201.11903>)
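The snippet below is a minimal sketch of a few-shot CoT prompt in the spirit of the example above. The `llm()` helper is a hypothetical stand-in for whatever completion call you use.

```python
# Hypothetical helper: wraps whatever LLM completion API you use and returns its text output.
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your own model call here")

# One worked demonstration with explicit intermediate reasoning, followed by the target question.
cot_prompt = """\
Q: The odd numbers in this group add up to an even number: 4, 8, 9, 15, 12, 2, 1.
A: Adding all the odd numbers (9, 15, 1) gives 25. The answer is False.

Q: The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1.
A:"""

print(llm(cot_prompt))
# With the CoT demonstration the model is expected to spell out the sum
# (15 + 5 + 13 + 7 + 1 = 41) before answering "False", rather than guessing the label directly.
```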
1a. Zero-shot/Few-Shot CoT Prompting:
Zero-shot CoT involves appending the phrase "Let's think step by step" to the original question to guide the LLM through a systematic reasoning process. Few-shot CoT instead provides the model with a few worked examples of similar problems to strengthen its reasoning. These CoT prompts significantly improve the model's performance by explicitly instructing it to think through the problem step by step; in contrast, without the special prompt, the model fails to arrive at the correct answer.
Image Source: [Kojima et al. (2022)](<https://arxiv.org/abs/2205.11916>)
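In code, zero-shot CoT is just string concatenation: the trigger phrase is appended to the question before querying the model. A tiny sketch, again assuming the hypothetical `llm()` helper from above:

```python
def zero_shot_cot(question: str) -> str:
    # The only change versus a plain zero-shot prompt is the appended trigger phrase.
    prompt = f"Q: {question}\nA: Let's think step by step."
    return llm(prompt)  # llm() is the same hypothetical completion helper as above
```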
1b. Automatic Chain-of-Thought (Auto-CoT):
Automatic Chain-of-Thought (Auto-CoT) was designed to automate the generation of reasoning chains for demonstrations. Instead of manually crafting examples, Auto-CoT leverages LLMs with a "Let's think step by step" prompt to automatically generate reasoning chains one by one.
Image Source: [Zhang et al. (2022)](<https://arxiv.org/abs/2210.03493>)
The Auto-CoT process involves two main stages:
The goal is to eliminate manual efforts in creating diverse and effective examples. Auto-CoT ensures diversity in demonstrations, and the heuristic-based approach encourages the model to generate simple yet accurate reasoning chains.
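The sketch below captures the overall shape of Auto-CoT under some simplifying assumptions: questions are clustered by their embeddings, one representative question per cluster is answered with zero-shot CoT, and the resulting chains become the few-shot demonstrations. The paper's selection heuristics are simplified away, and the `embed()` and `llm()` helpers are hypothetical placeholders.

```python
import numpy as np
from sklearn.cluster import KMeans

def auto_cot_demos(questions: list[str], n_clusters: int = 4) -> list[str]:
    # Stage 1: cluster questions so the demonstrations cover diverse question types.
    vectors = np.array([embed(q) for q in questions])  # embed() is a hypothetical text encoder
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(vectors)

    demos = []
    for cluster in range(n_clusters):
        # Stage 2: pick one representative question per cluster and let the model
        # generate its own reasoning chain with the zero-shot CoT trigger.
        representative = next(q for q, label in zip(questions, labels) if label == cluster)
        chain = llm(f"Q: {representative}\nA: Let's think step by step.")
        demos.append(f"Q: {representative}\nA: {chain}")
    return demos
```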
Overall, these CoT prompting techniques showcase the effectiveness of guiding LLMs through step-by-step reasoning for improved problem-solving and demonstration generation.
Tree-of-Thoughts (ToT) Prompting is a technique that extends the Chain-of-Thought approach. It allows language models to explore coherent units of text ("thoughts") as intermediate steps towards problem-solving. ToT enables models to make deliberate decisions, consider multiple reasoning paths, and self-evaluate choices. It introduces a structured framework where models can look ahead or backtrack as needed during the reasoning process. ToT Prompting provides a more structured and dynamic approach to reasoning, allowing language models to navigate complex problems with greater flexibility and strategic decision-making. It is particularly beneficial for tasks that require comprehensive and adaptive reasoning capabilities.
Key Characteristics:
Image Source: [Yao et al. (2023)](<https://arxiv.org/abs/2305.10601>)
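A very simplified breadth-first ToT search might look like the sketch below: at each step the model proposes several candidate "thoughts" per state, the candidates are scored, and only the most promising partial solutions are kept. The `llm()` and `score()` helpers are hypothetical placeholders; real implementations add backtracking, lookahead, and task-specific evaluators.

```python
def tree_of_thoughts(question: str, depth: int = 3, breadth: int = 3, keep: int = 2) -> str:
    states = [""]  # each state is the partial chain of thoughts generated so far
    for _ in range(depth):
        candidates = []
        for state in states:
            for _ in range(breadth):
                # Propose one more intermediate thought continuing this partial solution.
                thought = llm(f"{question}\n{state}\nNext thought:")
                candidates.append(state + "\n" + thought)
        # Self-evaluate: score how promising each partial solution is (e.g. by asking the model),
        # then keep only the top `keep` states for the next level of the tree.
        candidates.sort(key=lambda s: score(question, s), reverse=True)
        states = candidates[:keep]
    # Produce a final answer from the best surviving chain of thoughts.
    return llm(f"{question}\n{states[0]}\nTherefore, the answer is:")
```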
This work arises from the fact that human thought processes often follow non-linear patterns, deviating from simple sequential chains. In response, the authors propose Graph-of-Thought (GoT) reasoning, a novel approach that models thoughts not just as chains but as graphs, capturing the intricacies of non-sequential thinking.
This extension introduces a paradigm shift in representing thought units. Nodes in the graph symbolize these thought units, and edges depict connections, presenting a more realistic portrayal of the complexities inherent in human cognition. Unlike traditional trees, GoT employs Directed Acyclic Graphs (DAGs), allowing the modeling of paths that fork and converge. This divergence provides GoT with a significant advantage over conventional linear approaches.
The GoT reasoning model operates in a two-stage framework. Initially, it generates rationales, and subsequently, it produces the final answer. To facilitate this, the model leverages a Graph-of-Thoughts encoder for representation learning. The integration of GoT representations with the original input occurs through a gated fusion mechanism, enabling the model to combine both linear and non-linear aspects of thought processes.
Image Source: [Yao et al. (2023)](<https://arxiv.org/abs/2305.16582>)
Comprehensive Reasoning and Verification methods in prompting entail a more sophisticated approach where reasoning is not confined to providing a final answer but involves generating detailed intermediate steps. The distinctive aspect of these techniques is the integration of a self-verification mechanism within the framework. As the LLM generates intermediate answers or reasoning traces, it autonomously verifies their consistency and correctness. If the internal verification yields a false result, the model iteratively refines its responses, ensuring that the generated reasoning aligns with the expected logical coherence. These checks contribute to a more robust and reliable reasoning process, allowing the model to adapt and refine its outputs based on internal validation.
Automatic Prompt Engineer (APE) is a technique that treats instructions as programmable elements and seeks to optimize them by conducting a search across a pool of instruction candidates proposed by an LLM. Drawing inspiration from classical program synthesis and human prompt engineering, APE employs a scoring function to evaluate the effectiveness of candidate instructions. The selected instruction, determined by the highest score, is then utilized as the prompt for the LLM. This automated approach aims to enhance the efficiency of prompt generation, aligning with classical program synthesis principles and leveraging the knowledge embedded in large language models to improve overall performance in producing desired outputs.
Image Source: [Zhou et al., (2022)](<https://arxiv.org/abs/2211.01910>)
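The core loop of an APE-style search can be summarized as: ask the model to propose instruction candidates from input/output demonstrations, score each candidate on a small labelled set, and keep the best one. The sketch below is a minimal version of that loop, assuming the hypothetical `llm()` helper and an execution-accuracy scoring function.

```python
def ape_search(task_examples: list[tuple[str, str]], n_candidates: int = 10) -> str:
    # Show the model a few input/output pairs and ask it to infer the underlying instruction.
    demo = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in task_examples[:5])
    candidates = [
        llm(f"{demo}\n\nThe instruction that maps these inputs to outputs was:")
        for _ in range(n_candidates)
    ]

    def score(instruction: str) -> float:
        # Execution accuracy: how often the candidate instruction reproduces the expected outputs.
        hits = sum(
            llm(f"{instruction}\nInput: {x}\nOutput:").strip() == y
            for x, y in task_examples
        )
        return hits / len(task_examples)

    # The highest-scoring candidate becomes the prompt used for the task.
    return max(candidates, key=score)
```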
The Chain-of-Verification (CoVe) method addresses the challenge of hallucination in large language models by introducing a systematic verification process. It begins with the model drafting an initial response to a user query, potentially containing inaccuracies. CoVe then plans and poses independent verification questions, aiming to fact-check the initial response without bias. The model answers these questions, and based on the verification outcomes, generates a final response, incorporating corrections and improvements identified through the verification process. CoVe ensures unbiased verification, leading to enhanced factual accuracy in the final response, and contributes to improved overall model performance by mitigating the generation of inaccurate information.
Image Source: [Dhuliawala et al. (2023)](<https://arxiv.org/abs/2309.11495>)
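The CoVe stages map naturally onto a small pipeline like the one sketched below. The `llm()` helper is hypothetical, and the prompt wording is illustrative rather than the paper's exact templates.

```python
def chain_of_verification(query: str) -> str:
    # 1. Draft an initial (possibly inaccurate) response.
    draft = llm(query)

    # 2. Plan verification questions that fact-check individual claims in the draft.
    questions = llm(
        f"Question: {query}\nDraft answer: {draft}\n"
        "List short verification questions that would check the facts in the draft:"
    ).splitlines()

    # 3. Answer each verification question independently, without showing the draft,
    #    so the checks are not biased toward repeating its mistakes.
    checks = [f"{q} -> {llm(q)}" for q in questions if q.strip()]

    # 4. Produce a revised final answer that is consistent with the verification results.
    return llm(
        f"Question: {query}\nDraft answer: {draft}\n"
        "Verification results:\n" + "\n".join(checks) +
        "\nRewrite the answer so it is consistent with the verification results:"
    )
```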
Self Consistency represents a refinement in prompt engineering, specifically targeting the limitations of naive greedy decoding in chain-of-thought prompting. The core concept involves sampling multiple diverse reasoning paths using few-shot CoT and leveraging the generated responses to identify the most consistent answer. This method aims to enhance the performance of CoT prompting, particularly in tasks that demand arithmetic and commonsense reasoning. By introducing diversity in reasoning paths and prioritizing consistency, Self Consistency contributes to more robust and accurate language model responses within the CoT framework.
Image Source: [Wang et al. (2022)](<https://arxiv.org/pdf/2203.11171.pdf>)
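In practice, self-consistency amounts to sampling several CoT completions at a non-zero temperature, extracting the final answer from each, and taking a majority vote, as in the sketch below (hypothetical `llm()` and `extract_answer()` helpers).

```python
from collections import Counter

def self_consistency(question: str, n_samples: int = 10) -> str:
    answers = []
    for _ in range(n_samples):
        # Assume llm() samples at temperature > 0 so the reasoning paths differ across calls.
        reasoning = llm(f"Q: {question}\nA: Let's think step by step.")
        answers.append(extract_answer(reasoning))  # hypothetical parser for the final answer
    # Marginalize over reasoning paths: the most frequent final answer wins.
    return Counter(answers).most_common(1)[0][0]
```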
The ReAct framework combines reasoning and action in LLMs to enhance their capabilities in dynamic tasks. The framework involves generating both verbal reasoning traces and task-specific actions in an interleaved manner. ReAct aims to address the limitations of methods like chain-of-thought prompting, which lack access to the external world and can encounter issues such as fact hallucination and error propagation. Inspired by the synergy between "acting" and "reasoning" in human learning and decision-making, ReAct prompts LLMs to create, maintain, and adjust plans for acting dynamically. The model can interact with external environments, such as knowledge bases, to retrieve additional information, leading to more reliable and factual responses.
Image Source: [Yao et al., 2022](<https://arxiv.org/abs/2210.03629>)
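A bare-bones ReAct loop interleaves model-generated reasoning ("Thought"), tool calls ("Action"), and tool results ("Observation") until the model emits a final answer. The sketch below assumes the hypothetical `llm()` helper and a single hypothetical `search()` tool; the text parsing is deliberately simplistic.

```python
def react(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        # Ask the model for the next Thought/Action given the transcript so far.
        step = llm(transcript + "Thought:")
        transcript += f"Thought:{step}\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:")[-1].strip()
        if "Action: Search[" in step:
            # Execute the requested tool call and feed the result back as an Observation.
            query = step.split("Action: Search[")[-1].split("]")[0]
            transcript += f"Observation: {search(query)}\n"  # search() is a hypothetical tool
    return llm(transcript + "Final Answer:")
```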
How ReAct Works:
This category of prompting methods encompasses techniques that leverage external sources, tools, or aggregated information to enhance the performance of LLMs. These methods recognize the importance of accessing external knowledge or tools for more informed and contextually rich responses. Aggregation techniques harness multiple responses to make the final answer more robust, recognizing that diverse perspectives and reasoning paths can contribute to more reliable and comprehensive answers. Here's an overview:
Active Prompting was designed to enhance the adaptability of LLMs to various tasks by dynamically selecting task-specific example prompts. Chain-of-Thought methods typically rely on a fixed set of human-annotated exemplars, which may not always be the most effective for diverse tasks. Here's how Active Prompting addresses this challenge:
Active Prompting's dynamic adaptation mechanism enables LLMs to actively seek and incorporate task-specific examples that align with the challenges posed by different tasks. By leveraging human-annotated exemplars for uncertain cases, this approach contributes to a more contextually aware and effective performance across diverse tasks.
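The selection step in Active Prompting can be approximated with a simple disagreement-based uncertainty score: sample several answers per question and prioritize the questions whose answers disagree most for human annotation. A minimal sketch, with the hypothetical `llm()` and `extract_answer()` helpers used earlier:

```python
from collections import Counter

def uncertainty(question: str, k: int = 5) -> float:
    # Sample k chain-of-thought answers; more distinct final answers means higher uncertainty.
    answers = [
        extract_answer(llm(f"Q: {question}\nA: Let's think step by step."))
        for _ in range(k)
    ]
    return len(set(answers)) / k  # disagreement score in (0, 1]

def select_for_annotation(questions: list[str], budget: int = 8) -> list[str]:
    # The most uncertain questions are sent to humans for annotated reasoning chains,
    # which then serve as the CoT exemplars for this task.
    return sorted(questions, key=uncertainty, reverse=True)[:budget]
```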