About This Monorepo

This monorepo contains a set of dev-friendly, framework-agnostic components offering the following core capabilities:


Overview of Capabilities

The library offers a set of small, focused building blocks.

Structured output

Purpose: turn messy model output into typed PHP data. Benefit: you stop hand-parsing JSON or text before using LLM results.

use Cognesy\Instructor\StructuredOutput;

final class Person {
    public string $name;
    public int $age;
}

$person = StructuredOutput::using('openai')
    ->with(messages: 'Jason is 28 years old.', responseModel: Person::class)
    ->get();

Detailed docs: packages/instructor/docs/

Unified inference

Purpose: call different LLM providers through one API. Benefit: switch providers without rewriting request code.

use Cognesy\Polyglot\Inference\Inference;

$text = Inference::using('openai')
    ->withMessages('Say hello in one sentence.')
    ->get();

Detailed docs: packages/polyglot/docs/

Embeddings

Purpose: generate vectors through the same provider layer. Benefit: keep retrieval and inference in one stack.

use Cognesy\Polyglot\Embeddings\Embeddings;

$vectors = Embeddings::using('openai')
    ->withInputs(['hello world'])
    ->vectors();

Detailed docs: packages/polyglot/docs/

Agents SDK

Purpose: build tool-using agents as a simple loop over state. Benefit: add tools and control flow without inventing your own agent runtime first.

use Cognesy\Agents\AgentLoop;
use Cognesy\Agents\Data\AgentState;

$result = AgentLoop::default()->execute(
    AgentState::empty()->withUserMessage('What is 2+2?')
);

Detailed docs: packages/agents/docs/

Code agent bridges

Purpose: drive external coding agents like Codex, Claude Code, and OpenCode from PHP. Benefit: automate reviews, summaries, and coding workflows through one interface.

use Cognesy\AgentCtrl\AgentCtrl;

$response = AgentCtrl::codex()->execute('Summarize this repository.');

Detailed docs: packages/agent-ctrl/docs/

What is Instructor?

Instructor is a library that allows you to extract structured, validated data from multiple types of input: text, images, or OpenAI-style chat message arrays. It is powered by Large Language Models (LLMs).

Instructor simplifies LLM integration in PHP projects. It handles the complexity of extracting structured data from LLM outputs, so you can focus on building your application logic and iterate faster.

Instructor for PHP is inspired by the Instructor library for Python created by Jason Liu.


Here's a simple CLI demo app using Instructor to extract structured data from text:


How Instructor Enhances Your Workflow

Instructor introduces three key enhancements compared to direct API usage.

Response Model

Specify a PHP class to extract data into via the 'magic' of LLM chat completion, and that's it.

Instructor reduces the brittleness of code that extracts information from textual data by leveraging structured LLM responses.

Instructor helps you write simpler, easier-to-understand code: you no longer have to define lengthy function call definitions or write code that assigns returned JSON to target data objects.

Validation

The response model generated by the LLM can be automatically validated against a set of rules. Currently, Instructor supports only Symfony validation.

You can also provide a context object to use enhanced validator capabilities.

Max Retries

You can set the number of retry attempts for a request.

Instructor will repeat the request in case of validation or deserialization errors, up to the specified number of times, trying to get a valid response from the LLM.

Support for LLM Providers

Instructor offers out-of-the-box support for the following LLM providers:

For usage examples, check the Hub section or the examples directory in the code repository.

Usage

Installation

You can install Instructor via Composer:

composer require cognesy/instructor-php

Basic Example

This is a simple example demonstrating how Instructor retrieves structured information from the provided text (or chat message sequence).

The response model class is a plain PHP class with type hints specifying the types of the object's fields.

use Cognesy\Instructor\StructuredOutput;

// Step 0: Create .env file in your project root:
// OPENAI_API_KEY=your_api_key

// Step 1: Define target data structure(s)
class Person {
    public string $name;
    public int $age;
}

// Step 2: Provide content to process
$text = "His name is Jason and he is 28 years old.";

// Step 3: Use Instructor to run LLM inference
$person = (new StructuredOutput)
    ->withResponseClass(Person::class)
    ->withMessages($text)
    ->get();

// Step 4: Work with structured response data
assert($person instanceof Person); // true
assert($person->name === 'Jason'); // true
assert($person->age === 28); // true

echo $person->name; // Jason
echo $person->age; // 28

var_dump($person);
// Person {
//     name: "Jason",
//     age: 28
// }    

NOTE: Instructor supports classes / objects as response models. If you want to extract simple types or enums, you need to wrap them in a Scalar adapter: see the Extracting Scalar Values section below.
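As a minimal sketch of what wrapping a simple type looks like (assuming the Scalar adapter lives under Cognesy\Instructor\Extras\Scalar; check your installed version for the exact namespace):

```php
<?php
use Cognesy\Instructor\Extras\Scalar\Scalar;
use Cognesy\Instructor\StructuredOutput;

// Extract a single integer instead of a full object by wrapping
// the target type in the Scalar adapter as the response model.
$age = (new StructuredOutput)->with(
    messages: 'Jason is 28 years old.',
    responseModel: Scalar::integer('age'),
)->get();

// $age is a plain PHP int, e.g. 28
```

The same pattern applies to other simple types (strings, booleans, enums) via the corresponding Scalar factory methods.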

Validation

Instructor validates the results of the LLM response against the validation rules specified in your data model.

For further details on available validation rules, check the Symfony Validation constraints documentation.

use Cognesy\Instructor\StructuredOutput;
use Symfony\Component\Validator\Constraints as Assert;

class Person {
    public string $name;
    #[Assert\PositiveOrZero]
    public int $age;
}

$text = "His name is Jason, he is -28 years old.";
$person = (new StructuredOutput)
    ->withResponseClass(Person::class)
    ->with(
        messages: [['role' => 'user', 'content' => $text]],
    )
    ->get();

// if the resulting object does not validate, Instructor throws an exception

Max Retries

If the maxRetries parameter is provided and the LLM response does not meet the validation criteria, Instructor will make subsequent inference attempts until the results meet the requirements or maxRetries is reached.

Instructor uses validation errors to inform the LLM about the problems identified in the response, so that the LLM can try to self-correct in the next attempt.

use Cognesy\Instructor\StructuredOutput;
use Cognesy\Instructor\StructuredOutputRuntime;
use Cognesy\Polyglot\Inference\LLMProvider;
use Symfony\Component\Validator\Constraints as Assert;

class Person {
    #[Assert\Length(min: 3)]
    public string $name;
    #[Assert\PositiveOrZero]
    public int $age;
}

$text = "His name is JX, aka Jason, he is -28 years old.";
$runtime = StructuredOutputRuntime::fromProvider(LLMProvider::new())
    ->withMaxRetries(3);

$person = (new StructuredOutput($runtime))
    ->with(
        messages: [['role' => 'user', 'content' => $text]],
        responseModel: Person::class,
    )
    ->get();

// if all LLM's attempts to self-correct the results fail, Instructor throws an exception

Output Modes

Instructor supports multiple output modes via Cognesy\Instructor\Enums\OutputMode, so you can work with various models depending on their capabilities.

Additionally, you can use OutputMode::Text to have the LLM generate plain text output without any structured data extraction.
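As a sketch, selecting an output mode might look like the following (assuming a fluent withOutputMode() setter; depending on your version, the mode may instead be passed via with(mode: ...)):

```php
<?php
use Cognesy\Instructor\Enums\OutputMode;
use Cognesy\Instructor\StructuredOutput;

class Person {
    public string $name;
    public int $age;
}

// Request JSON output instead of the default tool-calling mode;
// useful for providers or models without tool call support.
$person = (new StructuredOutput)
    ->withResponseClass(Person::class)
    ->withMessages('Jason is 28 years old.')
    ->withOutputMode(OutputMode::Json)
    ->get();
```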

Unified LLM API

The Instructor ecosystem uses Polyglot as a unified inference API layer supporting 20+ LLM providers.

Polyglot translates the familiar OpenAI chat completion API conventions into LLM provider-specific idioms and APIs, so you can easily switch between LLM providers without rewriting your LLM connectivity code.

Example (using sync API)

use Cognesy\Polyglot\Inference\Inference;

$answer = Inference::using('openai') // specify LLM connection preset (defined in config)
    ->with(messages: 'What is the capital of Germany?')
    ->get();

echo $answer;

Example (using streaming API)

use Cognesy\Polyglot\Inference\Inference;

$stream = Inference::using('anthropic') // specify LLM connection preset (defined in config)
    ->withMessages([['role' => 'user', 'content' => 'Describe the capital of Brazil']])
    ->withOptions(['max_tokens' => 256])
    ->withStreaming()
    ->stream()
    ->deltas();

foreach ($stream as $delta) {
    echo $delta->contentDelta;
}

Example (customize LLM connection)

use Cognesy\Polyglot\Inference\Config\LLMConfig;
use Cognesy\Polyglot\Inference\Inference;

$answer = Inference::fromConfig(LLMConfig::fromArray([
    'driver' => 'deepseek',
    'apiUrl' => 'https://api.deepseek.com',
    'endpoint' => '/chat/completions',
    'model' => 'deepseek-chat',
]))
    ->withMessages([['role' => 'user', 'content' => 'What is the capital of France?']])
    ->withOptions(['max_tokens' => 64])
    ->get();

echo $answer;

Documentation

Check out the documentation website for more details and examples of how to use Instructor for PHP.

Feature Highlights

Core features

Various extraction modes

Flexible inputs

Customization

Sync and streaming support

Observability

Support for multiple LLMs / API providers

Other capabilities

Documentation and examples

Instructor in Other Languages

Check out implementations in other languages below:

If you want to port Instructor to another language, please reach out to us on Twitter; we'd love to help you get started!

Instructor Packages

This repository is a monorepo containing all of Instructor's components (required and optional). It hosts everything you need to work with LLMs via Instructor.

Individual components are also distributed as standalone packages that can be used independently.


Links to read-only repositories of the standalone package distributions:

NOTE: If you are just starting to use Instructor, I recommend using the instructor-php package. It contains all the required components and is the easiest way to get started with the library.

License

This project is licensed under the terms of the MIT License.

Support

If you have any questions or need help, please reach out to me on Twitter or GitHub.

Contributing

If you want to help, check out some of the issues. All contributions are welcome - code improvements, documentation, bug reports, blog posts / articles, or new cookbooks and application examples.

Contributors
