An Introduction to Prompt Engineering

ChatGPT has taken off.

It feels like a product that doesn't belong to this era.

It is remarkably capable: it can fluently answer almost any question you throw at it, and the answers tend to sound convincing (though judging how correct they actually are is still up to the asker); it can polish your emails and essays; it can even summarize the main point of a passage.

In short, it is an extremely powerful tool.

Taking a pragmatic stance: with any new tool, the most important thing is to master how to use it.

Prompt engineering emerged in response.

1. Background

According to Wikipedia,

Prompt engineering or prompting is the process of structuring sentences so that they can be interpreted and understood by a generative AI model in such a way that its output is in accord with the user's intentions.

That is, prompt engineering is the process of structuring sentences so that a generative AI model can interpret and understand them, making its output better match the user's intent.

In other words, if you want accurate results, writing a good prompt matters a great deal.

The Prompt Engineering Guide makes a similar point:

Mastering prompt-engineering skills helps users better understand the capabilities and limitations of large language models.

2. Learning resources

ChatGPT Prompt Engineering for Developers is an introductory course jointly produced by Andrew Ng and OpenAI. It takes only an hour or two to watch, and mirrored copies are available on Bilibili.

For a deeper look at the internals, the open-source Prompt Engineering Guide is a good starting point; it references a great many papers and tools, all worth exploring if you're interested.

Prompt engineering is a strongly hands-on field: getting your hands dirty matters far more than merely watching videos or studying theory.

3. Overview

The course has nine lessons in all. The overall structure is very clear, covering the core principles of prompt engineering, common tasks, and simple techniques.

I used to think prompt engineering just meant describing the problem clearly, handing it to GPT, and waiting for the answer. It turns out there is much more to it: with the right techniques we can guide, and even steer, both what GPT answers and how it answers.

3.1 Principles

There are two core principles:

  • Write clear and specific instructions
  • Give the model time to think

In short: provide clear, specific instructions, and give the model time to "think."

3.1.1 Principle one

How do you write clear, specific instructions? Here are a few simple techniques.

  1. Use delimiters to clearly indicate the distinct parts of the input; the delimiter can be, for example, ```, """, or < >

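    A minimal sketch of the delimiter idea (here we only build the prompt string; the `text` passage is illustrative, and sending the result through the course's `get_completion` helper is assumed):

    ```python
    # Mark the user-supplied text with <> delimiters so the model
    # cannot confuse the data with the instructions around it.
    text = "First boil the water. Then steep the tea for a few minutes."

    prompt = f"""
    Summarize the text delimited by angle brackets \
    into a single sentence.
    <{text}>
    """
    print(prompt)
    ```

    Delimiters also make prompts more robust against prompt injection, since anything inside the input stays clearly marked as data rather than instructions.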
  2. Ask for structured output, e.g. HTML or JSON

    prompt = f"""
    Generate a list of three made-up book titles along \ 
    with their authors and genres. 
    Provide them in JSON format with the following keys: 
    book_id, title, author, genre.
    """
    response = get_completion(prompt)
    print(response)
    
  3. Ask the model to check whether conditions are satisfied (if ... then ...)

    text_1 = f"""
    Making a cup of tea is easy! First, you need to get some \ 
    water boiling. While that's happening, \ 
    grab a cup and put a tea bag in it. Once the water is \ 
    hot enough, just pour it over the tea bag. \ 
    Let it sit for a bit so the tea can steep. After a \ 
    few minutes, take out the tea bag. If you \ 
    like, you can add some sugar or milk to taste. \ 
    And that's it! You've got yourself a delicious \ 
    cup of tea to enjoy.
    """
    prompt = f"""
    You will be provided with text delimited by triple quotes. 
    If it contains a sequence of instructions, \ 
    re-write those instructions in the following format:
    
    Step 1 - ...
    Step 2 - …
    …
    Step N - …
    
    If the text does not contain a sequence of instructions, \ 
    then simply write \"No steps provided.\"
    
    \"\"\"{text_1}\"\"\"
    """
    response = get_completion(prompt)
    print("Completion for Text 1:")
    print(response)
    
  4. Few-shot prompting (provide one or more example exchanges so the model understands your intent more clearly)

    prompt = f"""
    Your task is to answer in a consistent style.
    
    <child>: Teach me about patience.
    
    <grandparent>: The river that carves the deepest \ 
    valley flows from a modest spring; the \ 
    grandest symphony originates from a single note; \ 
    the most intricate tapestry begins with a solitary thread.
    
    <child>: Teach me about resilience.
    """
    response = get_completion(prompt)
    print(response)
    

Overall, the core of principle one is to express the asker's intent clearly and in a structured way, while giving the model enough background: the more useful information you provide, the more accurate the result.

Put another way: if you haven't even figured out what the question is, you can hardly expect a good answer.

So the first step of prompt engineering is to think the problem through; techniques come second.

3.1.2 Principle two

The second principle looks a little puzzling at first.

I originally assumed it meant the model takes a long time to compute, so be patient.

That's not what it means at all.

It refers to chain-of-thought prompting: break a complex task into a few simple steps and guide the model toward the final answer step by step. The paper Chain-of-Thought Prompting Elicits Reasoning in Large Language Models explains this in more detail.

  1. Specify the steps required to complete the task (a bit like a workflow)

    # `text` is assumed to hold the passage to process
    prompt = f"""
    Your task is to perform the following actions: 
    1 - Summarize the following text delimited by 
      <> with 1 sentence.
    2 - Translate the summary into French.
    3 - List each name in the French summary.
    4 - Output a json object that contains the 
      following keys: french_summary, num_names.
    
    Use the following format:
    Text: <text to summarize>
    Summary: <summary>
    Translation: <summary translation>
    Names: <list of names in French summary>
    Output JSON: <json with summary and num_names>
    
    Text: <{text}>
    """
    response = get_completion(prompt)
    print(response)
    
  2. Ask the model to work out its own solution before deciding on an answer

    prompt = f"""
    Your task is to determine if the student's solution \
    is correct or not.
    To solve the problem do the following:
    - First, work out your own solution to the problem. 
    - Then compare your solution to the student's solution \ 
    and evaluate if the student's solution is correct or not. 
    Don't decide if the student's solution is correct until 
    you have done the problem yourself.
    
    Use the following format:
    Question:
    ```
    question here
    ```
    Student's solution:
    ```
    student's solution here
    ```
    Actual solution:
    ```
    steps to work out the solution and your solution here
    ```
    Is the student's solution the same as actual solution \
    just calculated:
    ```
    yes or no
    ```
    Student grade:
    ```
    correct or incorrect
    ```
    
    Question:
    ```
    I'm building a solar power installation and I need help \
    working out the financials. 
    - Land costs $100 / square foot
    - I can buy solar panels for $250 / square foot
    - I negotiated a contract for maintenance that will cost \
    me a flat $100k per year, and an additional $10 / square \
    foot
    What is the total cost for the first year of operations \
    as a function of the number of square feet.
    ``` 
    Student's solution:
    ```
    Let x be the size of the installation in square feet.
    Costs:
    1. Land cost: 100x
    2. Solar panel cost: 250x
    3. Maintenance cost: 100,000 + 100x
    Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000
    ```
    Actual solution:
    """
    response = get_completion(prompt)
    print(response)
    
3.1.3 Hallucinations

This is not one of the core principles, but it explains a limitation of large language models: not every answer they give can be trusted.

Anyone who uses GPT regularly has run into this: sometimes an answer looks right but is in fact nonsense delivered with a straight face.

A personal technique: treat GPT as a tool for divergent thinking, then use a search engine to verify and converge on the useful information, feed that back to GPT as additional context, and home in on the answer step by step.

That is also what the next section is about.

3.2 Iterative Prompt Development

This process feels a lot like agile development, which programmers should find familiar.

It's hard to write a great prompt on the first try, but we can iterate on feedback:

  • Write a prompt and get an answer
  • Check whether the answer is satisfactory; if not, revise the prompt
  • Get a new answer, and repeat

Crude, but very practical.
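The loop can be sketched in a few lines; both `get_completion` (a stub standing in for the course's API helper) and the `satisfied` check (in reality a human judgment) are hypothetical here:

```python
def get_completion(prompt):
    # Hypothetical stub standing in for the course's API helper.
    return f"(model answer to: {prompt!r})"

def satisfied(answer):
    # Hypothetical check; in real use you read the answer and decide yourself.
    return "at most 30 words" in answer

prompt = "Summarize this product review."
answer = get_completion(prompt)
while not satisfied(answer):
    # Fold the missing requirement back into the prompt and retry.
    prompt += " Use at most 30 words."
    answer = get_completion(prompt)

print(prompt)
```

The point is not the code but the habit: every unsatisfactory answer tells you what constraint or context the prompt was missing.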

3.3 Common tasks: summarizing, inferring, transforming, and expanding

ChatGPT has a wide range of use cases; the course picks four typical ones: summarizing, inferring, transforming, and expanding.

3.3.1 Summarizing

Faced with a long piece of text, we can use an LLM's summarization ability to extract the useful information quickly.

prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site. 

Summarize the review below, delimited by triple 
backticks, in at most 30 words. 

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)

3.3.2 Inferring

Inferring is another kind of text analysis, sentiment analysis for example.

Traditional NLP can do this too, but an LLM is undeniably simpler and more direct: just write a prompt.

prompt = f"""
What is the sentiment of the following product review, 
which is delimited with triple backticks?

Review text: ```{lamp_review}```
"""
response = get_completion(prompt)
print(response)

It's easy to imagine this capability finding wide use in data mining.

3.3.3 Transforming

Large language models are also very good at converting input into different forms: translation, spelling correction, even format conversion such as turning HTML into JSON.

Conventional machine learning can handle the above too, but an LLM adds capabilities such as changing the tone of a text.

prompt = f"""
proofread and correct this review. Make it more compelling. 
Ensure it follows APA style guide and targets an advanced reader. 
Output in markdown format.
Text: ```{text}```
"""
response = get_completion(prompt)
display(Markdown(response))

You can even chain several transformations together.
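As a sketch of such chaining (prompt construction only; the snippet and the key names are made up for illustration, and passing the prompt to `get_completion` is assumed as in the other examples), a single prompt can ask for translation, proofreading, and a format change at once:

```python
# One prompt chaining three transformations:
# translate, proofread, and reformat the result as JSON.
text = "<p>Ceci est un test</p>"

prompt = f"""
Translate the following HTML snippet from French to English, \
correct any spelling or grammar mistakes, and output a JSON \
object with the keys `original` and `translation`.
Snippet: <{text}>
"""
print(prompt)
```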

3.3.4 Expanding

Expanding means feeding a short text into a large language model and having it generate a longer one: an email on a given topic, or even a paper. (For now, the journal Science does not allow ChatGPT to be listed as an author.)

prompt = f"""
You are a customer service AI assistant.
Your task is to send an email reply to a valued customer.
Given the customer email delimited by ```, \
Generate a reply to thank the customer for their review.
If the sentiment is positive or neutral, thank them for \
their review.
If the sentiment is negative, apologize and suggest that \
they can reach out to customer service. 
Make sure to use specific details from the review.
Write in a concise and professional tone.
Sign the email as `AI customer agent`.
Customer review: ```{review}```
Review sentiment: {sentiment}
"""
response = get_completion(prompt)
print(response)

One more thing worth mentioning in passing: large language models have a "temperature" parameter that controls how much uncertainty there is in the response.

The parameter is aptly named: as everyone knows, the higher the temperature, the greater the entropy, and the greater the uncertainty.

To get a relatively deterministic result, set this parameter to 0.
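Concretely, temperature rescales the model's next-token probability distribution before sampling. A small pure-Python illustration of the underlying idea (not the actual API):

```python
import math

def softmax_with_temperature(logits, temperature):
    # Divide logits by the temperature, then apply softmax.
    # Lower temperature -> sharper distribution -> less randomness.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
print(softmax_with_temperature(logits, 1.0))  # fairly spread out
print(softmax_with_temperature(logits, 0.1))  # almost all mass on the top token
```

In the OpenAI chat API this corresponds to the `temperature` argument of the completion call; the course's `get_completion` helper fixes it at 0 for reproducible output.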

4. Summary

ChatGPT is a very powerful tool.

Its power lies not only in the near-universal range of tasks it can handle, but also in how it understands questions and processes and organizes information, in ways that in some respects already exceed what we imagined possible.

Perhaps, in the near future, the Turing test will finally be passed.

Of course, behind large language models stand theoretical advances in computer science, the semiconductor industry's "Moore's law", and ever-growing data collection and storage.

What we need right now is to learn a few simple techniques so we can use these tools well and make our lives better.

After all, large language models keep evolving too.

(The end)

References

  1. https://zh.wikipedia.org/zh-hans/%E6%8F%90%E7%A4%BA%E5%B7%A5%E7%A8%8B
  2. https://www.promptingguide.ai/zh
  3. https://zhuanlan.zhihu.com/p/625917566
  4. datawhalechina/prompt-engineering-for-developers
Updated: 10/13/2023, 12:55:21 AM