Optimisation Strategies
Practical techniques to reduce token usage and minimise environmental impact whilst maintaining or improving output quality.
The Optimisation Hierarchy
Level 1: Query Planning
├─ Think before prompting
├─ Define clear objectives
└─ Batch related questions
Level 2: Prompt Engineering
├─ Use frameworks (AUTOMAT/CO-STAR)
├─ Specify output format upfront
└─ Add constraints to reduce iterations
Level 3: Caching & Reuse
├─ Save successful prompts as templates
├─ Cache outputs for incremental updates
└─ Build knowledge base
Level 4: Tool Selection
├─ Use NLMs for simple tasks
├─ Use smallest LLM that meets quality threshold
└─ Implement local models for repetitive work
Level 5: Advanced Techniques
├─ Fine-tuning for domain-specific tasks
├─ Quantisation for faster inference
└─ RAG (Retrieval Augmented Generation)
Strategy 1: The One-Shot Principle
Goal: Get the right output on the first attempt.
The Problem: Iterative Refinement
Typical inefficient workflow:
[Attempt 1] "Summarise this paper"
→ Too generic
[Attempt 2] "Make it more technical"
→ Wrong focus
[Attempt 3] "Focus on synthesis methods"
→ Missing quantitative data
[Attempt 4] "Include temperatures and pressures"
→ Finally acceptable
Cost: 4× the tokens, time, and environmental impact
The Solution: Comprehensive Upfront Specification
[Single Attempt] Using AUTOMAT:
Act as a Materials Science Technical Reviewer.
Task: Summarise synthesis methodology from the attached paper.
Output: Structured summary with sections:
1. Synthesis method (brief description)
2. Key parameters (table: parameter, value, units)
3. Novel aspects (bullet points)
4. Reproducibility assessment (high/medium/low + justification)
Constraints:
- Extract only explicitly stated values (no estimation)
- Focus on sections 2-3 of paper (synthesis and characterisation)
- Ignore theoretical calculations in section 4
- Flag any ambiguous or missing critical parameters
Tone: Technical, objective, suitable for replication assessment
Result: First attempt yields production-ready output
Savings: ~75% fewer tokens and far less time (one attempt instead of four)
Strategy 2: Strategic Context Pruning
Goal: Include only relevant context, not everything.
The Problem: Context Overload
[Bloated Context]
"I'm a materials scientist at AmaDema, a company founded in 2018
by Dr. Smith and Dr. Jones, specialising in advanced ceramics and
polymer nanocomposites. We have 15 employees across R&D, production,
and sales. Our main products are electrospun nanofibers for
aerospace and biomedical applications. We're currently working on a
project funded by an EPSRC grant (£450k over 3 years) to develop
sustainable alternatives to PLA. My role is Senior Researcher
focusing on synthesis optimisation. I've been here for 2 years after
completing my PhD at Bristol on graphene functionalisation.
Currently, we're in month 14 of the project and need to demonstrate
proof-of-concept by month 18 for the next review.
Could you help me format this synthesis protocol?"
Problems:
- 90% of context irrelevant to task
- Increased token cost
- Potential information leakage (company details)
The Solution: Task-Relevant Context Only
[Optimised Context]
"Act as a Laboratory Documentation Specialist.
Context: Converting informal lab notes to standardised synthesis
protocol format for materials science publication.
Task: Format the following lab notes according to standard
structure (Materials, Equipment, Procedure, Characterisation).
[LAB NOTES]"
Savings: ~90% fewer context tokens, IP leakage risk eliminated
Strategy 3: Structured Output Specification
Goal: Reduce back-and-forth by defining exact format upfront.
The Problem: Format Mismatches
[Vague Request]
"Extract synthesis parameters from these 10 papers"
[Output]
Long prose paragraphs mixing papers, inconsistent parameter mentions,
difficult to compare...
[Follow-up]
"Can you put that in a table?"
[Follow-up 2]
"Can you add the DOI column?"
[Follow-up 3]
"Can you sort by temperature?"
Total: 4 queries to get usable output
The Solution: Specify Format Exactly
[Structured Request]
"Extract synthesis parameters from these 10 papers.
Output: Markdown table with columns:
- Paper (Author Year)
- DOI
- Polymer
- Nanofiller
- Nanofiller Loading (wt%)
- Synthesis Method
- Key Temperature (°C)
- Solvent System
Sort by: Temperature (ascending)
If parameter not reported: mark 'NR'
If ambiguous: mark 'Ambiguous [brief note]'"
Result: Single query yields ready-to-use table
Savings: Three fewer iterations (one query instead of four)
Strategy 4: Constraint Engineering
Goal: Prevent hallucinations and irrelevant content through explicit constraints.
Effective Constraints
Whitelisting (What to Include)
Constraints:
- Focus exclusively on peer-reviewed journal articles
- Only papers published 2020-2024
- Only electrospinning-based synthesis methods
- Must explicitly report mechanical properties
Blacklisting (What to Exclude)
Constraints:
- Exclude review papers and book chapters
- Exclude theoretical/computational studies
- Exclude patents and conference proceedings
- Ignore papers on melt-spinning or solution casting
Handling Uncertainty
Constraints:
- If data is missing, mark as "Not reported"—do not estimate
- If multiple values given, report as range
- If conflicting values across papers, flag discrepancy
- Do not extrapolate from related materials
Strategy 5: Prompt Chaining with Checkpoints
Goal: Break complex tasks into steps with validation.
When to Use
✅ Complex multi-step analysis
✅ Tasks requiring intermediate validation
✅ High-stakes outputs needing verification
Example: Patent Prior Art Search
Step 1: Broad Identification
Prompt 1: "Review these 50 patents and identify those claiming
electrospinning-based synthesis of polymer/graphene composites.
Output: List of patent numbers that meet criteria."
→ Human checkpoint: Verify list before proceeding
Step 2: Detailed Extraction
Prompt 2: "For the following 12 patents [list from Step 1], extract:
- Claims related to synthesis method
- Graphene content ranges
- Processing temperatures
- Novelty statements
Output: Comparative table"
→ Human checkpoint: Verify accuracy of extracted data
Step 3: Overlap Analysis
Prompt 3: "Compare these 12 patents against our method [description].
For each patent, assess overlap: High/Medium/Low/None. Justify
assessment."
→ Human checkpoint: Review before legal submission
Why this is better than a single query:
- Intermediate validation catches errors early
- Each step can be cached and reused
- Easier to debug if issues arise
- Human expertise applied at critical junctions
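As a minimal sketch of this pattern, the script below pauses for human approval between steps; `ask_llm` wraps a local Ollama model (an assumption, substitute whichever model API you use), and the prompts are abbreviated versions of the ones above.

import ollama  # assumes a local Ollama server, e.g. after `ollama pull llama3.1`

def ask_llm(prompt: str) -> str:
    """Send a prompt to a local model and return its text response."""
    response = ollama.chat(model="llama3.1",
                           messages=[{"role": "user", "content": prompt}])
    return response["message"]["content"]

def checkpoint(label: str, output: str) -> str:
    """Show intermediate output and wait for human approval before continuing."""
    print(f"\n--- Checkpoint: {label} ---\n{output}\n")
    if input("Approve and continue? [y/N] ").strip().lower() != "y":
        raise SystemExit(f"Stopped at checkpoint: {label}")
    return output

# Step 1: broad identification, validated before Step 2 runs
patent_list = checkpoint("Verify patent list", ask_llm(
    "Review these 50 patents and identify those claiming electrospinning-based "
    "synthesis of polymer/graphene composites. Output: list of patent numbers."))

# Step 2: detailed extraction, reusing only the approved output from Step 1
table = checkpoint("Verify extracted data", ask_llm(
    f"For the following patents:\n{patent_list}\nExtract synthesis-method claims, "
    "graphene content ranges, processing temperatures, and novelty statements. "
    "Output: comparative table."))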
Strategy 6: Templating for Recurring Tasks
Goal: Create reusable, proven prompts for common tasks.
Template Library
Template 1: Literature Summary
Act as a {DOMAIN} Specialist reviewing literature for {PURPOSE}.
Task: Summarise key findings from the following paper relevant to
{SPECIFIC_FOCUS}.
Paper: {TITLE}, {DOI}
Output format:
1. Main finding (1 sentence)
2. Methodology (brief, 2-3 sentences)
3. Key quantitative results (table if >3 values, else bullets)
4. Limitations/caveats (bullets)
5. Relevance to {SPECIFIC_FOCUS} (1-2 sentences)
Constraints:
- Extract only explicitly stated results
- If methodology unclear, note as limitation
- Do not compare to other papers
Tone: {TONE}
Usage: Replace the {VARIABLES} each time for consistent quality
Template 2: Experimental Design Brainstorming
Act as a Senior {DOMAIN} Researcher with expertise in {SUBFIELD}.
Context: We are experiencing {PROBLEM_DESCRIPTION} in our current
{PROCESS}. Existing approaches have limitations: {LIMITATIONS}.
Task: Propose 5 alternative experimental approaches to address this
problem.
For each approach, provide:
1. Brief description (2-3 sentences)
2. Expected advantages (bullets)
3. Potential challenges (bullets)
4. Resource requirements (equipment, materials, time)
5. Risk level (High/Medium/Low)
Constraints:
- Focus on approaches feasible with {AVAILABLE_EQUIPMENT}
- Stay within {BUDGET_CONSTRAINT}
- Exclude approaches requiring >6 months development
- Prioritise approaches with literature precedent
Output: Numbered list, sorted by risk level (Low to High)
Building Your Template Library
- Identify recurring tasks in your workflow
- Perfect the prompt through iteration (once)
- Generalise by adding {PLACEHOLDER} variables (see the sketch after this list)
- Document the template with usage notes
- Share with team for collective efficiency
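A minimal sketch of programmatic templating, assuming templates are kept as plain strings or files; the variable names match the {PLACEHOLDER} style above and the filled-in values are purely illustrative.

# The template text uses the same {PLACEHOLDER} style as the examples above.
LIT_SUMMARY = """\
Act as a {DOMAIN} Specialist reviewing literature for {PURPOSE}.
Task: Summarise key findings from the following paper relevant to {SPECIFIC_FOCUS}.
Paper: {TITLE}, {DOI}
"""

prompt = LIT_SUMMARY.format(
    DOMAIN="Materials Science",
    PURPOSE="a nanocomposite literature review",
    SPECIFIC_FOCUS="electrospinning parameters",
    TITLE="<paper title>",
    DOI="<doi>",
)
print(prompt)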
Strategy 7: Caching and Incremental Updates
Goal: Never regenerate what you've already processed.
Implementation
Month 1: Initial Build
Task: Comprehensive literature review of PLA/graphene electrospinning
(100 papers, 2015-2024)
Output: Detailed database with synthesis parameters, properties,
methods
[Save output as: PLA_Graphene_Database_2024-01.md]
Month 2: Incremental Update
Task: Update PLA/graphene database with papers published since
2024-01-31
New papers: [5 DOIs]
Instructions:
1. Process only the 5 new papers
2. Use same format as existing database
3. Flag any contradictions with existing entries
4. Output only the new entries (I will append manually)
Savings: ~95% reduction in computational cost (processing 5 new papers instead of all 105)
Caching Best Practices
✅ Use consistent formatting (easy to append)
✅ Include timestamps (track what's current)
✅ Version control (track changes over time)
✅ Tag by topic (reuse across projects)
✅ Document sources (verify when needed)
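As a minimal sketch of the caching idea, the helper below stores each response keyed by a hash of the exact prompt, so repeated queries never hit the model twice; the cache file name and the `ask` callable are assumptions.

import hashlib
import json
from pathlib import Path

CACHE = Path("llm_cache.json")  # illustrative cache file

def cached_ask(prompt: str, ask) -> str:
    """Return the cached response for this exact prompt; query the model only on a miss."""
    cache = json.loads(CACHE.read_text()) if CACHE.exists() else {}
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key not in cache:
        cache[key] = ask(prompt)              # `ask` wraps whichever model API you use
        CACHE.write_text(json.dumps(cache, indent=2))
    return cache[key]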
Strategy 8: Query Batching
Goal: Process multiple items in single query to reduce overhead.
Example: SEM Image Analysis
Inefficient (Individual Processing)
[Query 1] "Analyse SEM image 1: measure average fiber diameter"
[Query 2] "Analyse SEM image 2: measure average fiber diameter"
[Query 3] "Analyse SEM image 3: measure average fiber diameter"
...
[Query 10] "Analyse SEM image 10: measure average fiber diameter"
Cost: 10 queries, high overhead per query
Efficient (Batched Processing)
[Single Query] "Analyse the following 10 SEM images of electrospun
fibers. For each:
- Measure average fiber diameter (from 20 random fibers)
- Calculate standard deviation
- Assess fiber uniformity (CV%)
- Note any defects (beads, fusion points)
Output: Table with columns [Image ID, Avg Diameter (nm), SD, CV%,
Defects (Y/N), Notes]
Images: [Attach all 10]"
Cost: 1 query, 70% less overhead
Savings: ~40% total reduction in tokens
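One way to batch programmatically, sketched under the assumption that items arrive as a Python list: chunk the list so each prompt stays within the context window, then send one query per chunk.

def batches(items, size):
    """Split a long list into query-sized chunks."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

image_ids = [f"image_{i:02d}" for i in range(1, 11)]   # illustrative IDs
for chunk in batches(image_ids, 10):
    prompt = ("Analyse the following SEM images of electrospun fibers. "
              "For each, report average fiber diameter, SD, CV%, and defects.\n"
              "Images: " + ", ".join(chunk))
    # send `prompt` to your model here: one query per chunk, not one per image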
Strategy 9: Local Model Fine-Tuning
Goal: Train domain-specific models for highly repetitive tasks.
When to Consider
✅ Task performed >1000 times/year
✅ Task is highly standardised
✅ Quality threshold well-defined
✅ Training data available or easy to generate
Process
Step 1: Generate Training Data
Use a large model (e.g. GPT-4) to create 500-1000 example input/output pairs:
Prompt: "Generate 50 diverse examples of synthesis protocol
formatting tasks. For each:
- Input: Informal lab notes (varied styles)
- Output: Standardised ISO format protocol
Ensure diversity in: polymer types, synthesis methods, level of
detail in notes"
Cost: High upfront (500-1000 examples)
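A sketch of what training_data.json might contain, assuming a JSON-lines layout with input/output fields; the field names and exact format vary by fine-tuning toolkit.

import json

examples = [
    {
        "input": "mixed 2g PLA in 10ml chloroform, stirred overnight, spun at 15kV...",
        "output": "Materials: PLA (2 g), chloroform (10 mL)\nProcedure: ...",
    },
    # ...500-1000 such pairs, generated as described above
]

with open("training_data.json", "w") as f:     # the file referenced by fine_tune.py below
    for example in examples:
        f.write(json.dumps(example) + "\n")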
Step 2: Fine-Tune Small Model
Use the examples to fine-tune a Llama 8B model on this specific task:
# Technical implementation (simplified)
python fine_tune.py \
--model llama-3.1-8b \
--task protocol_formatting \
--examples training_data.json \
--epochs 3
Time: ~2-4 hours on GPU
Step 3: Deploy
Use fine-tuned model for all future protocol formatting:
Per-query cost:
- Original (GPT-4): ~$0.03, ~50 g CO₂
- Fine-tuned (Llama 8B, local): ~$0.001, ~2 g CO₂
Breakeven: After ~1000 uses (typical: 3-6 months)
Annual savings (2000 uses):
- ~$58 in cost savings
- ~96 kg CO₂ reduction
Strategy 10: Retrieval-Augmented Generation (RAG)
Goal: Give AI access to your specific documents without retraining.
What is RAG?
Traditional approach: AI relies only on training data
RAG approach: AI retrieves relevant chunks from your documents, then generates response
Materials Science Use Case
Scenario: You have 500 internal synthesis protocols and want AI to answer questions about them
Without RAG (Inefficient)
Option A: Include all 500 protocols in prompt
Problem: Exceeds context window, massive token cost
Option B: Manually find relevant protocols, include in prompt
Problem: Time-consuming, error-prone
With RAG (Efficient)
[Setup once]
1. Upload 500 protocols to vector database
2. Each protocol is embedded (numerical representation)
[Each query]
1. Your question is embedded
2. System finds 3-5 most relevant protocols automatically
3. Only those protocols included in prompt to AI
4. AI generates response using only relevant context
Benefits:
- Automatic retrieval (no manual searching)
- Token-efficient (only relevant docs included)
- Always up-to-date (add new protocols anytime)
- Works with unlimited document libraries
Simple RAG Implementation
Tools:
- LlamaIndex or LangChain (Python frameworks)
- ChromaDB or FAISS (vector databases)
- Ollama (local LLM)
Implementation time: 2-4 hours initial setup
Payoff: Ongoing efficiency for document-heavy workflows
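A minimal sketch of the retrieval step using ChromaDB and Ollama from the tool list above; the collection name, protocol texts, and model are illustrative, and ChromaDB's built-in default embedder handles the embedding step.

import chromadb
import ollama  # assumes a local Ollama server with the model pulled

client = chromadb.PersistentClient(path="protocol_db")       # illustrative path
collection = client.get_or_create_collection("protocols")

# Setup (once): store each protocol; Chroma embeds them with its default embedder.
collection.add(
    ids=["protocol_001", "protocol_002"],
    documents=["Electrospinning of PLA: dissolve 10 wt% PLA in ...",
               "Solution casting of PLA/graphene films: ..."],
)

# Each query: retrieve only the most relevant protocols, then generate.
question = "What applied voltage do our PLA electrospinning protocols specify?"
hits = collection.query(query_texts=[question], n_results=3)
context = "\n\n".join(hits["documents"][0])

answer = ollama.chat(model="llama3.1", messages=[{
    "role": "user",
    "content": f"Answer using only these protocols:\n{context}\n\nQuestion: {question}",
}])
print(answer["message"]["content"])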
Real-World Optimisation: Before & After
Case Study: Quarterly Competitive Analysis
Before Training
Task: Analyse 50 competitor patents for quarterly R&D strategy meeting
Approach:
1. Read each patent manually
2. Ask AI for clarification on unclear sections (ad hoc queries)
3. Summarise findings in Word document
4. Ask AI to polish the document
Time: 20 hours
Queries: ~100
Tokens: ~50,000
Environmental cost: 2.5 L water, 250g CO₂
After Training
Approach:
Step 1: Batch classification (NLM)
Use BERT to classify 50 patents by technology area
→ Result: 12 relevant to our focus areas
→ 5 minutes, negligible environmental cost
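An illustrative sketch of Step 1 using Hugging Face's zero-shot classification pipeline (a BART-MNLI model stands in here for the BERT classifier mentioned above, since zero-shot needs no task-specific training); the labels and patent data are placeholders.

from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

labels = ["electrospinning", "melt-spinning", "solution casting"]  # illustrative areas
abstracts = {"PATENT-001": "<patent abstract text>"}               # placeholder data

relevant = []
for patent_id, text in abstracts.items():
    result = classifier(text, candidate_labels=labels)
    if result["labels"][0] == "electrospinning":   # labels come back sorted by score
        relevant.append(patent_id)
print(relevant)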
Step 2: Structured extraction (AUTOMAT + LLM)
Single batched query extracting synthesis methods, claims, novelty
from 12 relevant patents
→ Result: Structured comparison table
→ 1 query, ~15 minutes
Step 3: Strategic analysis (CO-STAR + LLM)
Single query generating competitive gap analysis and recommendations
based on extracted data
→ Result: Executive summary with strategic insights
→ 1 query, ~10 minutes
Step 4: Cache outputs for next quarter's incremental update
Time: 4 hours (80% reduction)
Queries: ~5 (95% reduction)
Tokens: ~5,000 (90% reduction)
Environmental cost: 0.25 L water, 25g CO₂ (90% reduction)
Quality: Higher (more structured, comprehensive)
Optimisation Checklist
Before submitting any AI query, verify:
- Objective clear? (What exactly do I need?)
- Output format specified? (Table, bullets, code, etc.)
- Constraints defined? (What to include/exclude?)
- Could I batch this? (Process multiple items together?)
- Can I reuse existing output? (Check cache first?)
- Right model size? (Smallest that meets quality threshold?)
- Could NLM handle this? (Simple extraction/classification?)
- Have I used a template? (For recurring tasks?)
Goal: Answer "yes" to ≥6 of these before hitting enter
Exercise: Optimise Your Own Workflow
Challenge
Step 1: Identify one recurring AI task in your work
Step 2: Document your current approach:
- How many queries per instance?
- How long does it take?
- What's the output quality?
Step 3: Apply 3 optimisation strategies from this page
Step 4: Test optimised approach in sandbox
Step 5: Calculate improvements:
- % reduction in queries
- % reduction in time
- % reduction in tokens
- Change in quality (better/same/worse)
Share your results with workshop group
Next: Advanced Conversational Learning →