Image Pipeline

In this part of the course, we’ll take the image pipeline from the Image Generation - Pipelines course, turn it into a function, and save it as a separate file that we can import into our existing app.py. It is good practice to keep useful structures like this in separate files so that your code stays modular. Also, if you get paid by the number of files you create, it’ll really impress your employer.

Create `image_pipeline.py`

In your application folder, create a new file called image_pipeline.py.
Navigate to the Image Generation Pipelines course, go to the Displaying Images section, and scroll down to Code Review.
Copy the code from this section and paste it into image_pipeline.py.

Wrap the code in a function

You can keep your code the same, but wrap it in a function. After your import statements, define the create_image_pipeline function like this:

image_pipeline.py

# ...

def create_image_pipeline() -> Pipeline:
    # Variables
    output_dir = "images"

    # ... rest of your code

This defines the function, and the -> Pipeline annotation indicates that it returns a Pipeline structure.

Remove the Run statement

In your existing code you’re still running the pipeline, but we no longer need to do that: the Agent will run it when it gets called.

Delete the following lines from the bottom of the file:

    # Run the pipeline
    pipeline.run("a cow")

Add a return

Finally, we need to return the pipeline structure. Where the "Run the pipeline" code used to be, add a return statement.

image_pipeline.py

    # ...

    # Return the pipeline
    return pipeline
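The build-then-return pattern from the steps above can be sketched without Griptape at all. This is only an illustration under stated assumptions: `StubPipeline` and `build_stub_pipeline` are hypothetical names invented here, and the stand-in class does not reflect how Griptape's `Pipeline` works internally. It only shows that the function configures the structure and hands it back, leaving the caller to decide when to run it.

```python
from dataclasses import dataclass, field


@dataclass
class StubPipeline:
    """Hypothetical stand-in for a pipeline structure (not Griptape's Pipeline)."""

    tasks: list[str] = field(default_factory=list)
    runs: list[str] = field(default_factory=list)

    def add_tasks(self, *tasks: str) -> None:
        self.tasks.extend(tasks)

    def run(self, prompt: str) -> str:
        # Record the call so we can see when the pipeline was actually run
        self.runs.append(prompt)
        return f"ran {len(self.tasks)} tasks for: {prompt}"


def build_stub_pipeline() -> StubPipeline:
    # Build and configure the structure, but do not run it here
    pipeline = StubPipeline()
    pipeline.add_tasks("create prompt", "generate image", "display image")

    # Return it so the caller decides when (and with what input) to run it
    return pipeline


# The caller, not the module that defines the function, triggers the run
stub = build_stub_pipeline()
result = stub.run("a cow")
```

Because nothing runs inside the function, importing the file that defines it has no side effects, which is the reason the run statement was removed from image_pipeline.py.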

Code Review

That’s it for setting up your image pipeline script! Let’s review the code:

image_pipeline.py

# Griptape
from griptape.artifacts import TextArtifact
from griptape.drivers import OpenAiImageGenerationDriver
from griptape.structures import Pipeline
from griptape.tasks import (
    CodeExecutionTask,
    PromptImageGenerationTask,
    PromptTask,
)


def create_image_pipeline() -> Pipeline:
    # Variables
    output_dir = "images"

    # Create the driver
    image_driver = OpenAiImageGenerationDriver(
        model="dall-e-3",
        api_type="open_ai",
        image_size="1024x1024",
    )

    # Create a function to display an image
    def display_image(task: CodeExecutionTask) -> TextArtifact:
        import os
        import subprocess
        import sys

        # Get the filename
        filename = task.input.value

        # Get the output_dir
        output_dir = task.context["output_dir"]

        # Get the path of the image
        image_path = os.path.join(output_dir, filename)

        # Open the image
        if sys.platform == "win32":
            os.startfile(image_path)
        elif sys.platform == "darwin":  # macOS
            subprocess.run(["open", image_path])
        else:  # linux variants
            subprocess.run(["xdg-open", image_path])

        return TextArtifact(image_path)

    # Create the pipeline object
    pipeline = Pipeline()

    # Create tasks
    create_prompt_task = PromptTask(
        """
        Create a prompt for an Image Generation pipeline for the following topic: 
        {{ args[0] }}
        in the style of {{ style }}.
        """,
        context={"style": "a polaroid photograph from the 1970s"},
        id="Create Prompt Task",
    )

    generate_image_task = PromptImageGenerationTask(
        "{{ parent_output }}",
        image_generation_driver=image_driver,
        output_dir=output_dir,
        id="Generate Image Task",
    )

    display_image_task = CodeExecutionTask(
        "{{ parent.output.name }}",
        context={"output_dir": output_dir},
        on_run=display_image,
        id="Display Image Task",
    )

    # Add tasks to pipeline
    pipeline.add_tasks(create_prompt_task, generate_image_task, display_image_task)

    # Return the pipeline
    return pipeline
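The platform check inside display_image can be looked at in isolation. The sketch below is a hypothetical helper, `opener_command`, that is not part of the course code or of Griptape. It returns the command each branch would pass to subprocess.run, and None for Windows, where the course code calls os.startfile instead of running a command.

```python
import sys
from typing import Optional


def opener_command(image_path: str, platform: str = sys.platform) -> Optional[list[str]]:
    """Return the command that would open image_path, or None on Windows."""
    if platform == "win32":
        # Windows opens the file with os.startfile(image_path), not a command
        return None
    if platform == "darwin":  # macOS
        return ["open", image_path]
    # Linux variants
    return ["xdg-open", image_path]


# Passing the platform explicitly makes each branch easy to check
mac_command = opener_command("images/cow.png", "darwin")
linux_command = opener_command("images/cow.png", "linux")
windows_command = opener_command("images/cow.png", "win32")
```

Taking the platform as a parameter is a design choice made for this sketch so that every branch can be exercised on a single machine; the course code reads sys.platform directly.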

Next Steps

In the next section, we’ll update our app to have an agent, import the image pipeline, create the appropriate driver and client, and give that to the agent.