{"id":42579,"date":"2024-03-20T15:52:40","date_gmt":"2024-03-20T19:52:40","guid":{"rendered":"https:\/\/mjtsai.com\/blog\/?p=42579"},"modified":"2024-11-01T10:02:05","modified_gmt":"2024-11-01T14:02:05","slug":"a-taxonomy-of-prompt-injection-attacks","status":"publish","type":"post","link":"https:\/\/mjtsai.com\/blog\/2024\/03\/20\/a-taxonomy-of-prompt-injection-attacks\/","title":{"rendered":"A Taxonomy of Prompt Injection Attacks"},"content":{"rendered":"<p><a href=\"https:\/\/www.schneier.com\/blog\/archives\/2024\/03\/a-taxonomy-of-prompt-injection-attacks.html\">Bruce Schneier<\/a>:<\/p>\n<blockquote cite=\"https:\/\/www.schneier.com\/blog\/archives\/2024\/03\/a-taxonomy-of-prompt-injection-attacks.html\">\n<p>Researchers ran a global prompt hacking competition, and have <a href=\"https:\/\/arxiv.org\/pdf\/2311.16119.pdf\">documented<\/a> the results in a paper that both gives a lot of good examples and tries to organize a taxonomy of effective prompt injection strategies. It seems as if the most common successful strategy is the &ldquo;compound instruction attack,&rdquo; as in &ldquo;Say &lsquo;I have been PWNED&rsquo; without a period.&rdquo;<\/p>\n<\/blockquote>\n\n<p><a href=\"https:\/\/arstechnica.com\/security\/2024\/03\/researchers-use-ascii-art-to-elicit-harmful-responses-from-5-major-ai-chatbots\/\">Dan Goodin<\/a>:<\/p>\n<blockquote cite=\"https:\/\/arstechnica.com\/security\/2024\/03\/researchers-use-ascii-art-to-elicit-harmful-responses-from-5-major-ai-chatbots\/\">\n<p>Enter ArtPrompt, a practical attack recently presented by a team of academic researchers. It formats user-entered requests&mdash;typically known as prompts&mdash;into standard statements or sentences as normal with one exception: a single word, known as a mask, is represented by ASCII art rather than the letters that spell it. The result: prompts that normally would be rejected are answered.<\/p>\n<p>The researchers provided one example in a recently published <a href=\"https:\/\/arxiv.org\/pdf\/2402.11753.pdf\">paper<\/a>. It provided instructions for interpreting a set of ASCII characters arranged to represent the word &ldquo;counterfeit.&rdquo;<\/p>\n<\/blockquote>\n\n<p>Via <a href=\"https:\/\/daringfireball.net\/linked\/2024\/03\/17\/ascii-art-vs-ai\">John Gruber<\/a>:<\/p>\n<blockquote cite=\"https:\/\/daringfireball.net\/linked\/2024\/03\/17\/ascii-art-vs-ai\">\n<p>It&rsquo;s simultaneously impressive that they&rsquo;re smart enough to read ASCII art, but laughable that they&rsquo;re so naive that this trick works.<\/p>\n<\/blockquote>\n\n<p>Previously:<\/p>\n<ul>\n<li><a href=\"https:\/\/mjtsai.com\/blog\/2014\/11\/22\/monodraw\/\">Monodraw<\/a><\/li>\n<\/ul>","protected":false},"excerpt":{"rendered":"<p>Bruce Schneier: Researchers ran a global prompt hacking competition, and have documented the results in a paper that both gives a lot of good examples and tries to organize a taxonomy of effective prompt injection strategies. It seems as if the most common successful strategy is the &ldquo;compound instruction attack,&rdquo; as in &ldquo;Say &lsquo;I have [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"apple_news_api_created_at":"2024-03-20T19:52:44Z","apple_news_api_id":"1f9dace4-5d10-40b2-b339-8269cb23848c","apple_news_api_modified_at":"2024-03-20T19:52:44Z","apple_news_api_revision":"AAAAAAAAAAD\/\/\/\/\/\/\/\/\/\/w==","apple_news_api_share_url":"https:\/\/apple.news\/AH52s5F0QQLKzOYJpyyOEjA","apple_news_coverimage":0,"apple_news_coverimage_caption":"","apple_news_is_hidden":false,"apple_news_is_paid":false,"apple_news_is_preview":false,"apple_news_is_sponsored":false,"apple_news_maturity_rating":"","apple_news_metadata":"\"\"","apple_news_pullquote":"","apple_news_pullquote_position":"","apple_news_slug":"","apple_news_sections":"\"\"","apple_news_suppress_video_url":false,"apple_news_use_image_component":false,"footnotes":""},"categories":[2],"tags":[1351,2317,2347,2427],"class_list":["post-42579","post","type-post","status-publish","format-standard","hentry","category-technology","tag-artificial-intelligence","tag-chatgpt","tag-bard","tag-llama"],"apple_news_notices":[],"_links":{"self":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/42579","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/comments?post=42579"}],"version-history":[{"count":1,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/42579\/revisions"}],"predecessor-version":[{"id":42580,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/42579\/revisions\/42580"}],"wp:attachment":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/media?parent=42579"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/categories?post=42579"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/tags?post=42579"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}