
Prompt Engineering Cheat Sheet

Quick reference guide for crafting effective prompts.


AUTOMAT Framework

| Component | What to Include | Example |
|---|---|---|
| Audience | Who will use this output? | R&D team, IP legal, management |
| User Persona | AI's role/expertise | "Act as Senior Polymer Chemist" |
| Task | Specific action required | Extract synthesis parameters, format protocol |
| Output | Format and structure | Markdown table, JSON, bullet points |
| Method | Approach/methodology | ISO standards, systematic review |
| Assumptions | Constraints, boundaries | Exclude pre-2020, focus on non-oxide ceramics |
| Tone | Voice and style | Technical, formal, conversational |

Best for: Functional, structured tasks with clear outputs
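Filled in, the components join into a single prompt. A minimal sketch in Python; the component values below are illustrative, not prescribed by the framework:

```python
# Illustrative AUTOMAT components; swap in your own values.
AUTOMAT = {
    "Audience": "R&D team",
    "User Persona": "Act as a Senior Polymer Chemist",
    "Task": "Extract synthesis parameters from the attached paper",
    "Output": "Markdown table with columns: Parameter, Value, Unit",
    "Method": "Systematic review approach",
    "Assumptions": "Exclude pre-2020 work; focus on non-oxide ceramics",
    "Tone": "Technical and formal",
}

def build_automat_prompt(components: dict) -> str:
    """Join labelled components into one prompt, one component per line."""
    return "\n".join(f"{label}: {value}" for label, value in components.items())

prompt = build_automat_prompt(AUTOMAT)
```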


CO-STAR Framework

| Component | What to Include | Example |
|---|---|---|
| Context | Background and situation | Market position, project goals, constraints |
| Objective | Goal to achieve | Secure funding, convince stakeholders |
| Style | Writing approach | Academic, journalistic, executive |
| Tone | Emotional quality | Confident, cautious, inspirational |
| Audience | Who will read this | VC investors, peer reviewers, management |
| Response | Output format/structure | 2-page memo, 5 sections, bullet points |

Best for: Narrative, strategic documents requiring rich context
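As with AUTOMAT, the six components can live in a reusable template. A sketch, with filled-in values drawn loosely from the examples above (all of them illustrative):

```python
# Reusable CO-STAR skeleton; the filled-in values are illustrative only.
CO_STAR_TEMPLATE = """\
Context: {context}
Objective: {objective}
Style: {style}
Tone: {tone}
Audience: {audience}
Response: {response}"""

prompt = CO_STAR_TEMPLATE.format(
    context="Early-stage materials startup preparing a funding round",
    objective="Secure funding for the pilot line",
    style="Executive",
    tone="Confident",
    audience="VC investors",
    response="2-page memo with 5 sections",
)
```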


Quick Decision: Which Framework?

Is your task...

FUNCTIONAL & STRUCTURED?          |  NARRATIVE & STRATEGIC?
(data extraction, formatting,     |  (reports, summaries, pitches,
code generation, classification)  |  strategic analysis, proposals)
           ↓                       |              ↓
     USE AUTOMAT                  |        USE CO-STAR

The Red List – Never Share

🚫 Unpublished Research

  • Novel molecular structures
  • Exact synthesis parameters
  • Experimental results (ongoing)
  • Failed experiments (negative data)
  • Grant applications under review

🚫 Commercial Sensitive

  • Exact formulations and ratios
  • Proprietary process conditions
  • Yield data revealing efficiency
  • Cost breakdowns
  • Customer/partner identities
  • Pricing strategies

🚫 Personal & Confidential

  • Employee information
  • Customer data
  • Internal communications with strategy
  • Financial data
  • Legal documents

🚫 Security Sensitive

  • Access credentials
  • System configurations
  • Security protocols
  • Vulnerability assessments

Solution for sensitive work: Use local sandbox (Ollama + Llama)
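For work that touches Red List material, a local model keeps everything on your machine. A minimal sketch against Ollama's local REST API (default port 11434; the model name `llama3` is an assumption, use whichever model you have pulled):

```python
import json
import urllib.request

def ollama_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build a request for a local Ollama server.

    The endpoint is localhost only, so nothing leaves your machine.
    """
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload.encode(),
        headers={"Content-Type": "application/json"},
    )

req = ollama_request("Summarize these sensitive lab notes: ...")
# To actually send it (requires a running Ollama instance):
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```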


Hallucination Prevention

Techniques

  1. Explicit constraints: "If data missing, mark 'Not reported'—do not estimate"
  2. Citation requirements: "Cite with DOI for every claim"
  3. Range specification: "Focus only on papers 2020-2024"
  4. Verification instruction: "Flag any uncertainty in your response"
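The four techniques can be bolted onto any existing prompt as a standard constraint footer. A sketch (the wording of the rules follows the examples above):

```python
# Standard anti-hallucination footer; wording mirrors the techniques above.
ANTI_HALLUCINATION_RULES = (
    "If data is missing, mark it 'Not reported' -- do not estimate.\n"
    "Cite a DOI for every factual claim.\n"
    "Consider only papers published 2020-2024.\n"
    "Flag any uncertainty explicitly in your response."
)

def harden(prompt: str) -> str:
    """Append the constraint footer to an existing prompt."""
    return f"{prompt}\n\nConstraints:\n{ANTI_HALLUCINATION_RULES}"
```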

Verification Checklist

  • Citations are real (verify DOIs)
  • Numerical values are plausible
  • Claims align with domain knowledge
  • No internal contradictions
  • Sources match claims

Optimization Quick Wins

1. Batch Queries

Bad: 10 separate queries for 10 papers
Good: 1 query processing all 10 papers

Savings: ~70%
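The batching idea can be sketched as a helper that folds N papers into a single request (the instruction wording below is an assumption; adapt it to your task):

```python
def build_batch_prompt(papers: list[str]) -> str:
    """One prompt covering all papers instead of one query per paper."""
    numbered = "\n\n".join(f"PAPER {i}:\n{text}" for i, text in enumerate(papers, 1))
    return (
        "For each paper below, extract the synthesis method as one table row "
        "[Paper #, Method, Key Finding]:\n\n" + numbered
    )
```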


2. Think Before Prompting

Bad: Stream of consciousness, multiple refinement rounds
Good: Plan query with framework, get it right first time

Savings: ~75%


3. Cache & Reuse

Bad: Regenerate monthly literature review from scratch
Good: Cache Month 1, only process new papers in Month 2+

Savings: ~90%


4. Choose Right Model

Bad: GPT-4 for simple keyword extraction
Good: BERT-based NLM for classification/extraction, LLM for reasoning

Savings: ~95% for appropriate tasks


5. Use Templates

Bad: Recreate prompt for each similar task
Good: Template with {VARIABLES}, fill in each time

Savings: ~80% of time, plus more consistent outputs
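A template with {VARIABLES} maps directly onto Python's `str.format`. A sketch, with illustrative template wording:

```python
# Hypothetical reusable template; fill the {VARIABLES} for each task.
EXTRACTION_TEMPLATE = (
    "Act as {ROLE}. Extract {TARGET} from the text below and return "
    "a markdown table with columns {COLUMNS}. If a value is missing, "
    "write 'Not reported'.\n\nTEXT:\n{TEXT}"
)

prompt = EXTRACTION_TEMPLATE.format(
    ROLE="a Senior Polymer Chemist",
    TARGET="synthesis parameters",
    COLUMNS="[Parameter, Value, Unit]",
    TEXT="...",
)
```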


Model Selection Guide

| Task Type | Best Tool | Example |
|---|---|---|
| Keyword extraction | NLM (BERT) | Find all papers mentioning "electrospinning" |
| Document classification | NLM (BERT) | Categorize 200 patents by technology |
| Simple formatting | Small LLM (8B) | Convert lab notes to template |
| Literature synthesis | Medium LLM (70B) | Summarize 20 papers with trend analysis |
| Strategic analysis | Large LLM (GPT-4) | Competitive gap analysis and recommendations |
| Code generation | Medium/Large LLM | Python script for data analysis |
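The guide can be encoded as a small routing function. The task keys and tool names below are placeholders for whatever models you actually run:

```python
# Hypothetical routing table mirroring the guide above; names are placeholders.
ROUTES = {
    "keyword_extraction": "bert-classifier",
    "document_classification": "bert-classifier",
    "simple_formatting": "llm-8b",
    "literature_synthesis": "llm-70b",
    "code_generation": "llm-70b",
    "strategic_analysis": "gpt-4",
}

def pick_model(task_type: str) -> str:
    """Route to the cheapest adequate tool; fall back to the large model."""
    return ROUTES.get(task_type, "gpt-4")
```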

Common Prompt Mistakes

❌ Mistake 1: Vague Task

Bad: "Summarise this paper"
Good: "Extract synthesis methodology with parameter table: temp, pressure, yield"


❌ Mistake 2: No Output Format

Bad: "Extract data from these papers"
Good: "Markdown table with columns: [Author, Year, Method, Key Finding]"


❌ Mistake 3: Missing Constraints

Bad: "What's the melting point of PLA?"
Good: "If melting point reported in paper, extract with page #. If not reported, state 'Not reported'—do not use external data"


❌ Mistake 4: Context Overload

Bad: [500 words of company history for simple task]
Good: [Only task-relevant context, <100 words]


❌ Mistake 5: Wrong Tool

Bad: Using GPT-4 for keyword filtering
Good: Use BERT for extraction, GPT-4 for reasoning


Environmental Impact Reference

| Action | Tokens | CO₂ (g) | Water (mL) |
|---|---|---|---|
| Simple query | ~100 | 0.5 | 10 |
| Complex prompt | ~500 | 2.5 | 50 |
| Document processing | ~2000 | 10 | 200 |
| Inefficient workflow (10+ queries) | ~10,000 | 50 | 1000 |
| Optimized workflow (1-2 queries) | ~1,000 | 5 | 100 |

Your goal: >70% reduction through optimization
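The table is roughly linear (~0.005 g CO₂ and ~0.1 mL water per token), so a back-of-envelope estimator follows directly. Treat the per-token figures as illustrative; real footprints vary by model and datacenter:

```python
# Linear estimate derived from the reference table above
# (~0.005 g CO2 and ~0.1 mL water per token). Illustrative figures only.
def footprint(tokens: int) -> tuple[float, float]:
    """Return (grams CO2, mL water) for a given token count."""
    return tokens * 0.005, tokens * 0.1

co2, water = footprint(1_000)  # an optimized workflow
```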


Emergency Contacts

For technical issues with the sandbox:

  • Check Docker Desktop is running
  • Restart containers: docker-compose restart
  • View logs: docker-compose logs

For content questions:

  • Review the relevant section in the course materials
  • Ask in the workshop Slack channel
  • Contact Avgi Stavrou

For data security concerns:

  • Stop immediately
  • Report to your supervisor
  • Follow the Red List protocol



💾 Download: PDF version of cheat sheet (by Maximilian Vogel)