PML: Enhancing Python with Markup for LLM Integration

Python’s simplicity and readability have made it the language of choice for AI and data science. But as large language models (LLMs) become integral to application development, we’re finding that traditional Python syntax lacks built-in constructs for LLM interactions. At knify, we’re addressing this with PML (Python Markup Language) – an extension to Python that provides first-class support for working with LLMs.

The Need for Language Evolution

Programming languages evolve to meet new computational paradigms. Just as object-oriented programming introduced classes and inheritance, and async programming added coroutines and awaitable objects, the LLM era requires new language constructs to:

  • Express prompt engineering patterns directly in code
  • Manage context and conversation flow
  • Integrate deterministic logic with probabilistic LLM outputs
  • Provide type safety and schema validation for LLM responses

PML addresses these needs by extending Python with markup specifically designed for LLM interactions.

What is PML?

PML is a superset of Python that adds specialized syntax and semantics for LLM operations. It introduces markup elements that are processed by our compiler and highlighting system, making LLM interactions more natural, readable, and maintainable.

Let’s look at a simple example of traditional Python code that uses an LLM:

# Traditional Python with an LLM library
from llm_library import OpenAI

model = OpenAI(model="gpt-4")
system_prompt = """You are a helpful assistant that answers
questions about Python programming."""
user_query = "How do I use list comprehensions?"

response = model.generate(
    system=system_prompt,
    user=user_query,
    temperature=0.7
)

print(response)

Here’s the same functionality using PML:

# Python with PML extensions
import knify.pml

#<prompt name="python_assistant">
#| model: gpt-4
#| temperature: 0.7
You are a helpful assistant that answers
questions about Python programming.
#</prompt>

response = #<ask python_assistant>
How do I use list comprehensions?
#</ask>

print(response)

Notice how PML introduces dedicated syntax for prompts (#<prompt>) and queries (#<ask>), making the code more readable and the intent clearer.

Key Features of PML

1. Native Prompt Definitions

PML allows you to define prompts directly in your code with syntax highlighting and dedicated semantics:

#<prompt name="code_reviewer">
#| model: gpt-4
#| temperature: 0.2
You are a code reviewer who examines Python code and provides constructive feedback.
Focus on:
1. Performance improvements
2. Security considerations
3. Readability enhancements
#</prompt>

These prompts can be:

  • Reused throughout your codebase
  • Imported from other modules
  • Parameterized with variables
  • Versioned and tested like normal code
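
The PML prompt machinery is knify-internal, but the reuse and parameterization described above can be pictured as a plain-Python prompt registry. The following is a minimal sketch under that assumption; `Prompt`, `register_prompt`, and `get_prompt` are illustrative names, not the actual knify API:

```python
from dataclasses import dataclass

@dataclass
class Prompt:
    """A stand-in for what a compiled #<prompt> block might become."""
    name: str
    body: str
    model: str = "gpt-4"
    temperature: float = 0.7

_REGISTRY: dict[str, Prompt] = {}

def register_prompt(prompt: Prompt) -> None:
    _REGISTRY[prompt.name] = prompt

def get_prompt(name: str, **params) -> Prompt:
    """Look up a prompt by name and substitute any {placeholder} parameters."""
    base = _REGISTRY[name]
    body = base.body.format(**params) if params else base.body
    return Prompt(base.name, body, base.model, base.temperature)

register_prompt(Prompt(
    name="code_reviewer",
    body="You are a code reviewer focusing on {focus}.",
    temperature=0.2,
))

reviewer = get_prompt("code_reviewer", focus="security")
```

Because prompts are ordinary named objects in this model, importing, versioning, and unit-testing them works exactly like it does for any other Python value.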

2. Schema-Aware LLM Calls

PML introduces syntax for structured LLM outputs with schema validation:

#<schema name="CodeReview">
class CodeReview:
    issues: list[str]
    improvements: list[str]
    overall_rating: int  # 1-10
#</schema>

review = #<ask code_reviewer returns="CodeReview">
def process_data(data):
    result = []
    for item in data:
        if item > 0:
            result.append(item * 2)
    return result
#</ask>

print(f"Overall rating: {review.overall_rating}/10")
for issue in review.issues:
    print(f"- {issue}")

The compiler ensures the LLM output is parsed and validated against the schema, providing type safety for otherwise unstructured text.
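
As a rough mental model of that validation step (not the actual knify implementation), the runtime can be pictured as parsing the model's JSON reply and checking each field against the schema's annotations:

```python
import json
from dataclasses import dataclass, fields

@dataclass
class CodeReview:
    issues: list
    improvements: list
    overall_rating: int  # 1-10

def parse_validated(raw, schema):
    """Parse an LLM's JSON reply and check each field against the schema.

    Note: this relies on evaluated annotations (no
    `from __future__ import annotations`), so `f.type` is a real type.
    """
    data = json.loads(raw)
    for f in fields(schema):
        if f.name not in data:
            raise ValueError(f"missing field: {f.name}")
        if not isinstance(data[f.name], f.type):
            raise TypeError(f"{f.name} should be {f.type.__name__}")
    return schema(**data)

reply = ('{"issues": ["no input validation"], '
         '"improvements": ["use a list comprehension"], '
         '"overall_rating": 7}')
review = parse_validated(reply, CodeReview)
```

A malformed or incomplete reply fails loudly at the boundary instead of propagating untyped text into the rest of the program.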

3. Context Management Directives

PML provides syntax for explicit context management:

#<context name="debugging_session">
# This context will be preserved across multiple LLM calls

with open("error_log.txt") as f:
    error_log = f.read()

#<include in="debugging_session">
Error log:
{error_log}
#</include>

solution = #<ask code_debugger with="debugging_session">
How do I fix this error?
#</ask>

follow_up = #<ask code_debugger with="debugging_session">
What caused this issue in the first place?
#</ask>
#</context>

This makes context management explicit and visible directly in the code, rather than hidden in library calls.
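
One way to picture what such a named context does, with the actual LLM backend stubbed out (this is a sketch of the idea, not the knify runtime):

```python
class Context:
    """Sketch of a named context that persists across LLM calls."""

    def __init__(self, name):
        self.name = name
        self.messages = []

    def include(self, text):
        # Material added via #<include> becomes part of every later call.
        self.messages.append({"role": "system", "content": text})

    def ask(self, question, llm=None):
        self.messages.append({"role": "user", "content": question})
        # `llm` stands in for a real model backend; the default is a stub
        # that just reports how much context it was handed.
        llm = llm or (lambda msgs: f"[stub reply to {len(msgs)} messages]")
        answer = llm(self.messages)
        self.messages.append({"role": "assistant", "content": answer})
        return answer

ctx = Context("debugging_session")
ctx.include("Error log: NameError: name 'config' is not defined")
solution = ctx.ask("How do I fix this error?")
follow_up = ctx.ask("What caused this issue in the first place?")
```

The key property is that the follow-up question sees the error log and the first exchange without the caller re-threading anything by hand.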

4. Multi-Turn Conversations

PML makes it natural to express multi-turn conversations with LLMs:

#<conversation name="math_tutor">
#| model: gpt-4
#| system: You are a helpful math tutor assisting a student.

#<user>
Can you explain how to solve quadratic equations?
#</user>

#<assistant>
Certainly! Quadratic equations are in the form ax² + bx + c = 0.
You can solve them using the quadratic formula:
x = (-b ± √(b² - 4ac)) / 2a

Would you like me to walk through an example?
#</assistant>

#<user>
Yes, please solve x² - 5x + 6 = 0
#</user>

#<assistant>
Let's identify the coefficients:
a = 1, b = -5, c = 6

Using the quadratic formula:
x = (5 ± √(25 - 24)) / 2
x = (5 ± √1) / 2
x = (5 ± 1) / 2

So x = 3 or x = 2

You can verify this works by substituting back:
3² - 5(3) + 6 = 9 - 15 + 6 = 0
2² - 5(2) + 6 = 4 - 10 + 6 = 0
#</assistant>
#</conversation>

This conversation can be used as a template, executed as-is, or extended with additional turns.
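
In plain-Python terms, such a conversation is just structured data: a system prompt plus an ordered list of turns, close to what most chat APIs accept. A small sketch of treating one as an extensible template (the dict layout and `extend` helper are illustrative, not knify's representation):

```python
# A conversation template as plain data: model, system prompt, and turns.
math_tutor = {
    "model": "gpt-4",
    "system": "You are a helpful math tutor assisting a student.",
    "turns": [
        {"role": "user",
         "content": "Can you explain how to solve quadratic equations?"},
        {"role": "assistant",
         "content": "Certainly! Quadratic equations have the form ax² + bx + c = 0."},
    ],
}

def extend(conversation, role, content):
    """Return a copy of the conversation with one extra turn appended."""
    extended = dict(conversation)
    extended["turns"] = conversation["turns"] + [{"role": role, "content": content}]
    return extended

longer = extend(math_tutor, "user", "Yes, please solve x² - 5x + 6 = 0")
```

Extending returns a new conversation rather than mutating the template, so the same template can seed many independent sessions.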

How PML Works: Compiler and Highlighting Support

PML isn't just syntactic sugar; each construct is processed by a dedicated compiler and tooling pipeline:

The PML Compiler

Our compiler:

  1. Parses Python code and identifies PML markup
  2. Translates PML constructs into standard Python code
  3. Generates optimized LLM interaction code
  4. Adds runtime validation and error handling
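
Steps 1 and 2 amount to a source-to-source rewrite. The real compiler is considerably more involved, but a toy version for the bare #<ask> form conveys the idea (the regex and the emitted call are illustrative):

```python
import re

# Matches the simplest form: #<ask name> ... #</ask> with no attributes.
ASK_BLOCK = re.compile(
    r"#<ask (?P<name>\w+)>\n(?P<query>.*?)\n#</ask>",
    re.DOTALL,
)

def compile_ask_blocks(source):
    """Rewrite each #<ask name> ... #</ask> block into a runtime call."""
    def to_call(match):
        query = match.group("query").strip().replace('"', '\\"')
        return f'llm_query(get_prompt("{match.group("name")}"), "{query}")'
    return ASK_BLOCK.sub(to_call, source)

pml_source = 'result = #<ask math_solver>\nSolve: 3x + 5 = 14\n#</ask>'
compiled = compile_ask_blocks(pml_source)
```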

For example, this PML:

result = #<ask math_solver>
Solve: 3x + 5 = 14
#</ask>

Compiles to something like:

from knify.pml.runtime import get_prompt, llm_query

_prompt = get_prompt("math_solver")
result = llm_query(
    prompt=_prompt,
    query="Solve: 3x + 5 = 14",
    model=_prompt.model,
    temperature=_prompt.temperature
)
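
The generated `get_prompt` and `llm_query` calls are knify runtime internals. As a hedged sketch, such a shim might assemble a chat-style request from the compiled prompt and hand it to a pluggable backend (here stubbed out; the request shape and parameter names are assumptions):

```python
def llm_query(prompt, query, model=None, temperature=None, client=None):
    """Assemble a chat request from a compiled prompt and dispatch it."""
    request = {
        "model": model or prompt["model"],
        "temperature": temperature if temperature is not None else prompt["temperature"],
        "messages": [
            {"role": "system", "content": prompt["body"]},
            {"role": "user", "content": query},
        ],
    }
    # `client` stands in for a real LLM backend; the default stub just
    # echoes which model the request targeted.
    client = client or (lambda req: f"stub answer from {req['model']}")
    return client(request)

math_solver = {
    "model": "gpt-4",
    "temperature": 0.7,
    "body": "You solve algebra problems step by step.",
}
result = llm_query(math_solver, "Solve: 3x + 5 = 14",
                   model=math_solver["model"],
                   temperature=math_solver["temperature"])
```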

IDE Integration and Syntax Highlighting

PML comes with IDE plugins that provide:

  • Syntax highlighting for PML markup
  • Auto-completion for prompt names and parameters
  • Documentation on hover
  • Error checking and linting
  • Navigation between prompt definitions and usages

Real-World Applications

PML is already being used in production environments for:

1. Sophisticated AI Assistants

#<prompt name="research_assistant">
#| model: gpt-4
#| temperature: 0.2
You are a research assistant helping with literature review.
#</prompt>

for paper in papers:
    #<context name="paper_analysis">
    #<include>
    Title: {paper.title}
    Authors: {paper.authors}
    Abstract: {paper.abstract}
    #</include>

    summary = #<ask research_assistant with="paper_analysis">
    Summarize the key findings and methodology of this paper.
    #</ask>

    paper.summary = summary
    #</context>

2. Code Generation and Transformation

#<prompt name="code_generator">
#| model: gpt-4
#| temperature: 0.1
You generate Python code based on specifications.
#</prompt>

class_definition = #<ask code_generator>
Create a Python class named 'DataProcessor' that:
1. Takes a file path in __init__
2. Has a method to read CSV or JSON data based on file extension
3. Has a method to filter rows by a column value
4. Has a method to compute summary statistics
#</ask>

# Write to file
with open("data_processor.py", "w") as f:
    f.write(class_definition)

3. Interactive Documentation

#<prompt name="doc_generator">
#| model: gpt-4
#| temperature: 0.3
You generate clear documentation with examples.
#</prompt>

def complex_function(x, y, z=None, **kwargs):
    """
    #<doc auto="doc_generator">
    This function performs a complex operation on inputs.

    Parameters:
    x (int): The primary input value
    y (str): A configuration string
    z (float, optional): An optional scaling factor
    **kwargs: Additional parameters

    Returns:
    dict: Result containing processed values
    #</doc>
    """
    # Implementation...

Getting Started with PML

PML is available as part of the knify framework. To start using it:

  1. Install the knify package: pip install knify
  2. Add IDE support with our VS Code or PyCharm extensions
  3. Import the PML module: import knify.pml
  4. Start using PML markup in your Python code

The compiler will automatically process your PML code, and you can seamlessly mix it with standard Python.

Future Directions

We’re continuously evolving PML based on real-world usage. Upcoming features include:

  • Function calling syntax for tool use
  • Fine-tuning directives
  • Cost and performance optimization hints
  • Distributed computation annotations

In our next article, we’ll explore how knify is revolutionizing database management with automatic entity generation and management, making database operations more intuitive than ever before.

Stay tuned for more insights into how knify is reshaping the modern web development landscape!