GoGrow | Nested GraphQL Queries in Python Using Strawberry

Since its public release in 2015, GraphQL has revolutionized how applications fetch and handle data, it has become a powerful alternative to traditional REST APIs by allowing clients to query the specific data they need. Its simplicity and ease of readability are values that align closely with Python, a versatile language with such extensive libraries that it’s hard not to see it as an ideal ground for implementing GraphQL frameworks, with Graphene being the most popular among them.

…But you’re not here to read about Python or GraphQL, are you?

No, you’re here because you’re stuck on a problem, likely to be the same one I encountered a couple of months ago. But bear with me just for a couple of paragraphs, let me give you some context.

In later years, new API frameworks like FastAPI and LiteStar have become popular among Python developers, gradually replacing older industry standards such as Django and Flask. Just recently I found myself migrating a 3 year old app from Flask to FastAPI, and as my research suggested, migrating all resources including the Graphene-based GraphQL endpoint would be a comfortable walk in the park.

Imagine my surprise, then, when I found out that Starlette (the toolkit FastAPI is built on) no longer supports the Graphene framework. Now in addition to migrating the app’s REST API framework I also had to replace the graphQL framework. That “walk in the park” from before had suddenly turned into a sprint in the middle of a thunderstorm.

Yay.

But, as my special other says, the sin lies not in being clumsy but in acting as if you were not. I may not be the brightest but I wasn’t foolish enough to try to migrate two frameworks simultaneously. So before I even started getting rid of those godforsaken Marshmallow schemas used in Flask, I needed to replace Graphene with an equally capable framework.

Enter Strawberry! As the second most popular GraphQL framework for Python, Strawberry offers similar simplicity and elegance, letting you write readable and maintainable code. The best part? It comes with built-in FastAPI support. So all should be good and dandy right? Not quite.

Now, on to the issue at hand: The Strawberry documentation, although immensely helpful through the initial steps, is missing a fundamental scenario that I need to address: Nested queries, which means being able to query a field from within another one, something like this:


{
  field_1 {
    some_field_1_property
    field_2 {
      some_field_2_property
    }
}

At a first glance it seems simple enough. However, I haven’t told you a tiny detail that might complicate things: The data I’ll be querying in GraphQL is stored in a PostgreSQL database with SQLAlchemy as the ORM.

So, to illustrate how you can handle nested queries in your GraphQL resources, let’s assume a simple scenario: Suppose we have a small blogging site that tracks how often blogs are being visited and which registered users, including blog authors, leave comments. This translates into two tables in our database, a users table and a blogs table. The corresponding SQLAlchemy models look like this:


from sqlalchemy.ext.declarative import declared_attr
from sqlalchemy.orm import as_declarative
from sqlalchemy import Boolean, Column, Integer, String, JSON, ForeignKey
from datetime import datetime
import uuid

@as_declarative()
class Base:
    __name__: str

    # To generate tablename from classname
    # And yes, if the model is named 'Fish' this would name the table 'fishs'
    # But why would you have a model named Fish?
    # No, seriously, I'm curious
    @declared_attr
    def __tablename__(cls) -> str:
        return f"{cls.__name__.lower()}s"

class User(Base):
    id = Column(Integer, primary_key=True, default=lambda: str(uuid.uuid4()))
    email = Column(String, nullable=False, unique=True, index=True)
    # You should never store passwords as plain text, this is just an example
    password = Column(String, nullable=False)
    created_at = Column(DateTime, default=datetime.now)
    last_login_date = Column(DateTime, default=datetime.now)
    blogs = relationship('Blog', cascade='all, delete',
                         back_populates='author', uselist=True)

class Blog(Base):
    id = Column(Integer, primary_key=True, default=lambda: str(uuid.uuid4()))
    title = Column(String, nullable=False)
    author_id = Column(String(36), ForeignKey('users.id'), nullable=False)
    author = relationship("User", back_populates='blogs')
    created_at = Column(DateTime, default=datetime.now)
    updated_at = Column(DateTime, default=datetime.now)
    meta_data = Column(JSON, nullable=True, none_as_null=True)

To test this independently, you’d need to create the corresponding tables via a library like Alembic, populate them with a few rows, and write a session maker for the database. I’ll leave a link at the end where you can find all the stuff you’d need to play with the code yourself.

Now, notice how each user can potentially have any number of blogs to their name, and every blog must have an author. The GraphQL schema should reflect this structure. But let’s not run before we can walk and set the basics first, let’s define the user and blog GraphQL models :


from typing import Optional, List
from datetime import datetime
from sqlalchemy.orm import Session
from strawberry import (
    type as strawberry_type, 
    field as strawberry_field,
    UNSET
)
from db.models import User as UserModel
from db.models import Blog as BlogModel
from db.session import get_db

@strawberry_type
class User():
    id: str
    email: str
    last_login_date: datetime

    @classmethod
    def marshal(cls, user: UserModel) -> "User":
        return cls(id=str(user.id),
                   email=user.email,
                   last_login_date=user.last_login_date)

@strawberry_type
class Blog:
    id: Optional[str] = UNSET
    title: Optional[str] = UNSET
    created_at: datetime
    updated_at: datetime

    @classmethod
    def marshal(cls, blog: BlogModel) -> "Blog":
        return cls(id=str(blog.id),
                   title=blog.title,
                   created_at=blog.created_at,
                   updated_at=blog.updated_at)

For now we’re just querying simple attributes from each table, with the marshal class method serving as the serializer from the GraphQL model to the query. We’ll need this method on every model we add to serialize a given table or field from a table. The models on their own won’t do much, though. We’ll need to compile them in a main query that can be mounted to an endpoint in our API (we’ll see how that’s done later). For simplicity, we only want to query for a single row from each table, given a row ID. We’d have something like this:


from typing import Optional
from sqlalchemy.orm import Session
from db.session import get_db
from db.models import User as UserModel, Blog as BlogModel
from graphql_app.models import User, Blog
from strawberry import Schema, type as strawberry_type, UNSET

@strawberry_type
class Query:
    @field
    def user(self,
             user_id: Optional[str] = UNSET) -> Optional["User"]:
        session: Session = next(get_db())
        query = session.query(UserModel)
        if user := query.filter(UserModel.id == user_id).first():
            return User.marshal(user)

    @field
    def blog(self,
             blog_id: Optional[str] = UNSET) -> Optional["Blog"]:
        session: Session = next(get_db())
        query = session.query(BlogModel)
        if blog := query.filter(BlogModel.id == blog_id).first():
            return Blog.marshal(blog)

query = Schema(query=Query, types=[User, Blog])

So far so good. We can query single users and blogs separately without issue, and for what we want to develop here no further changes to the main query are necessary. We will, however, need to modify each of the individual models. What if we want to query a blog’s author? That one’s easy. Since Strawberry forces us to add a type hint to every field in a model, we can also assign a model as a data type! We can simply add the attribute to the blog model and resolve it using the resolver from the user model, like so:


@strawberry_type
class Blog:
    id: Optional[str] = UNSET
    title: Optional[str] = UNSET
    created_at: datetime
    updated_at: datetime
    author: User

    @classmethod
    def marshal(cls, blog: BlogModel) -> "Blog":
        return cls(id=str(blog.id),
                   title=blog.title,
                   created_at=blog.created_at,
                   updated_at=blog.updated_at,
                   author=User.marshal(blog.author))

Things get more complicated if we want to query all blogs authored by a given user though. We’ll hit a NameError (because one class is defined after the other) or run into a circular import if we put both classes in separate files. To get around this, let’s re-organize our models a bit:


@strawberry_type
class BaseUser:
    id: str
    email: str
    last_login_date: datetime

    @classmethod
    def marshal(cls, user: UserModel) -> "User":
        return cls(id=str(user.id),
                   email=user.email,
                   last_login_date=user.last_login_date)

@strawberry_type
class Blog:
    id: Optional[str] = UNSET
    title: Optional[str] = UNSET
    created_at: datetime
    updated_at: datetime
    author: BaseUser

    @classmethod
    def marshal(cls, blog: BlogModel) -> "Blog":
        return cls(id=str(blog.id),
                   title=blog.title,
                   created_at=blog.created_at,
                   updated_at=blog.updated_at,
                   author=BaseUser.marshal(blog.author))

@strawberry_type
class User(BaseUser): ...

Nothing has changed in terms of functionality, but now we have a separate class with the basic attributes of the User model that can be used to fulfill the need to serialize a user row in the Blog model. Furthermore, can add any fields we need to the User model. You might be tempted to follow the previous strategy and add the attribute and resolve in the class’s marshal method, but that won’t work. Instead, you’ll need to add a new field to the model and query the rows you wish to resolve directly from the database:


@strawberry_type
 class User(BaseUser):
    @field
    def blogs(self) -> List[Blog]:
        session: Session = next(get_db())
        blogs = session.query(BlogModel) \
                       .filter(BlogModel.author_id == self.id) \
                       .all()
        return [Blog.marshal(blog) for blog in blogs]

Let’s kick it up a notch. What if we want to query a JSON field? Suppose we have a column named meta_data in the blogs table where we store information related to the number of visits a blog has had, the date of the last visit and a list of registered users who have left comments, something like this:


{
  "visitCount": 10,
  "lastVisitTimeStamp": "2024-02-17T16:34:56.12",
  "commentingUsers": 
    [
      "user_1",
      "user_2"
    ]
}

Strawberry provides a scalar function that can be used to define any custom data you need from base64 to JSON fields. Luckily Stawberry comes with a built-in JSON scalar, which gets us something like this in the blog model:


from strawberry.scalars import JSON

@strawberry_type
class Blog:
    id: Optional[str] = UNSET
    title: Optional[str] = UNSET
    created_at: datetime
    updated_at: datetime
    author: BaseUser
    meta_data: JSON

    @classmethod
    def marshal(cls, blog: BlogModel) -> "Blog":
        return cls(id=str(blog.id),
                   title=blog.title,
                   created_at=blog.created_at,
                   updated_at=blog.updated_at,
                   author=BaseUser.marshal(blog.author),
                   meta_data=blog.meta_data)

While this method will work for any JSON field you might need to serialize, there’s a drawback. Let’s see what we get when we try to query it:

Can you spot it? Yes, we get all of the fields within that JSON, but we always get all of the field’s content whenever we include it in our query. So, begs the cynical question, what if we want to query the specific fields contained in that JSON? The philosophy behind GraphQL is all about clients being able to query the specific data they need after all. The beauty of the solution lies in its simplicity: let’s make another model for that field!


@strawberry_type
class MetaData:
    visit_count: int
    last_visit: datetime
    commenting_users: List[str]

    @classmethod
    def marshal(cls, meta_data: Dict[str, Any]) -> "MetaData":
        return cls(visit_count=meta_data.get("visitCount"),
                   last_visit=datetime.fromisoformat(
                       meta_data.get("lastVisitTimeStamp")),
                   commenting_users=meta_data.get("commentingUsers"))

@strawberry_type
class Blog:
    id: Optional[str] = UNSET
    title: Optional[str] = UNSET
    created_at: datetime
    updated_at: datetime
    author: BaseUser
    meta_data: MetaData

    @classmethod
    def marshal(cls, blog: BlogModel) -> "Blog":
        return cls(id=str(blog.id),
                   title=blog.title,
                   created_at=blog.created_at,
                   updated_at=blog.updated_at,
                   author=BaseUser.marshal(blog.author),
                   meta_data=MetaData.marshal(blog.meta_data))

And that’s it! Since the JSON field is now serialized by a separate Strawberry type, it behaves just like the other models and we can query each of its fields separately. If you’re working with FastAPI, here’s how to mount the GraphQL app to an endpoint:


from fastapi import FastAPI
from core.config import settings
from db.session import engine 
from db.models import Base
from strawberry.fastapi import GraphQLRouter
from graphql_app.schema import query

graphql_app = GraphQLRouter(query)


def start_application():
    app = FastAPI(title=settings.PROJECT_NAME,
                  version=settings.PROJECT_VERSION)
    Base.metadata.create_all(bind=engine)
    app.include_router(graphql_app, prefix='/graphql-query')

    return app

To end on a high note, let’s see what these nested queries look like in action:

While we’ve worked with a simple scenario of low complexity, this can go deep. You can nest as many queries as you need following this technique while always keeping your code readable and easy to troubleshoot.

Here’s a link with all of the files needed to run the models in this blog.

Unlock forbidden knowledge