Danel Kurka·May 29, 2026·9 min read

LangGraph in production: when it beats the vanilla SDK

Most LangGraph tutorials skip the honest part, so let me start there. LangGraph is an orchestration library from the LangChain team that lets you describe an agent as a state graph: nodes do the work, edges decide where to go next, and state flows between them. You reach for it when a single "question → answer" pass is not enough and you start needing loops, branching, and memory.

What LangGraph actually is

Under the hood, LangGraph is a plain directed graph sitting on top of any LLM. Instead of writing an endless while loop stuffed withif statements, you describe three things:

State — the object that flows through the whole process. Usually messages, intermediate results, counters, flags.
Nodes — functions that take state and return an update to it. A model call, a tool call, a parse step — each is a node.
Edges — the transitions. They are either direct (A always goes to B) or conditional (A goes to B or C depending on state). Conditional edges are what give you branches and loops.

The core value: the graph is deterministic and transparent. You can see every possible path on a diagram, checkpoint at any point, rewind, and replay a run. For debugging an agent in production that is the difference between "I will sort this out in an hour" and "I will sort this out in three days".

When the vanilla SDK is enough

Bluntly and up front: most agents do not need LangGraph. If you have one pass, a handful of tools, and linear logic, the vanilla OpenAI SDK (or Anthropic SDK) with tool-calling in a simple loop does the job and drags in zero extra dependencies.

Stay on the vanilla SDK when:

The flow is linear: request → call 1-2 tools → answer. No branches that loop back on themselves.
State fits in the message history and you do not need to persist it across sessions for days.
There is no human-in-the-loop: the agent never pauses mid-process to wait for a person to approve something.
One agent, one domain, up to 8 tools. Here a graph is ceremony for the sake of ceremony.

I say this straight to clients: if your case closes in 80 lines on the vanilla SDK, adding LangGraph is added complexity with no payoff. A framework should show up once the pain is real, not "just in case".

When LangGraph wins

Here is where the vanilla SDK turns into a web of ifstatements and LangGraph genuinely pays off:

Loops. The agent tries, checks the result, retries until it passes validation. On the SDK that is a hand-rolled loop with counters; in LangGraph it is an edge that points a node back at itself.
Branching. "If the request is about billing, take the billing branch; if technical, take the support branch." A conditional edge instead of a wall of nested if/else.
Human-in-the-loop. The graph pauses on a node, waits for a human to approve, then resumes from the same state. This is a built-in capability, not a hack.
Persistence. A checkpointer writes state to Postgres or Redis. The process can live for hours or days, survive a server restart, and continue from where it stopped.
Multi-agent handoff. Several agents pass work to each other through shared state. An orchestrator node decides who runs next. Details in the multi-agent systems article →

LangGraph example: a minimal StateGraph

The shortest tutorial that actually explains something. A two-node graph: one calls the model, the other decides whether another pass is needed. Pseudo-Python close to the real LangGraph API:

from langgraph.graph import StateGraph, START, END
from typing import TypedDict

class State(TypedDict):
    question: str
    answer: str
    tries: int

def call_model(state: State) -> dict:
    # a real LLM call goes here; simplified
    reply = llm.invoke(state["question"])
    return {"answer": reply, "tries": state["tries"] + 1}

def is_good(state: State) -> str:
    # conditional edge: retry or finish
    if "i don't know" in state["answer"] and state["tries"] < 3:
        return "retry"
    return "done"

graph = StateGraph(State)
graph.add_node("call_model", call_model)
graph.add_edge(START, "call_model")
graph.add_conditional_edges(
    "call_model", is_good, {"retry": "call_model", "done": END}
)

app = graph.compile()
result = app.invoke({"question": "What is LangGraph?", "tries": 0})

Forty lines and you already have a retry loop with an exit condition. On the vanilla SDK that is the same thing, but the transition logic is smeared through the loop body. Here it is lifted into the graph: retry points the node at itself, done goes to END. Add a checkpointer and this same graph survives a restart.

LangGraph vs OpenAI SDK: the short table

Vanilla OpenAI SDK wins

One pass, linear logic
Up to 8 tools, one domain
State lives in the message history
Prototype, MVP, quick idea check
Team does not want a new dependency

LangGraph wins

Loops, branches, retries
Human-in-the-loop with pauses
State persisted for days
Multi-agent work handoff
You need replay and transparent debug

Note: LangGraph does not replace the SDK, it sits on top of it. Inside the nodes you still call the same model with the same client. LangGraph only takes over orchestration of transitions and state.

LangGraph vs n8n: not competitors

The question almost everyone asks: "why LangGraph if I have n8n?" They are different layers. n8n is no-code integration orchestration: drag nodes, wire a webhook to an API, move data from point A to point B. LangGraph is code-level agent reasoning: loops, state, model decisions inside the process.

In practice they compose nicely. A typical setup: n8n catches an event (a new lead, an email, a message), posts it to an HTTP endpoint backed by a LangGraph agent, the agent reasons and returns a result, and n8n fans it back out across CRM, Slack, and email. n8n is the "where and when"; LangGraph is the "how to think". Not a choice between them, but the right place for each.

What this looks like in real production

A classic case from my own work: a support agent for an e-commerce shop handling requests in Telegram. It started as a single pass on the vanilla SDK, and it worked. The pain began once we added returns: the agent had to gather order data, check the returns policy, wait for a manager's approval on amounts above a threshold, and only then process the refund.

That is a textbook human-in-the-loop with an hour or two of pause. On a loop full of if statements it turned into a nightmare with state in global variables. We rewrote it on LangGraph: a "gather data" node, a "check policy" node, an approval pause via a Postgres checkpointer, then a "process refund" node. State survives a restart, every run is visible in the trace, and the regressions are gone.

The stack I build for this kind of work: LangGraph for orchestration, Claude or GPT as the reasoning layer, Postgres as the checkpointer and memory, Telegram Bot API as the frontend. If you want to know which model to put behind which node, I broke that down in the model decision matrix →

I build these agents end to end — from spec to deploy and monitoring — as part of AI agent development →

Common LangGraph mistakes

Reaching for LangGraph on a linear case. No loops, branches, or pauses means it is needless complexity. Vanilla SDK first; the graph comes when the pain shows up.
Cramming everything into one giant state. State should be minimal and explicit. A bloated state makes the graph as unreadable as a bloated prompt does.
Forgetting the checkpointer. Without it you lose the main advantage — persistence and replay. At that point the SDK really would have been simpler.
Confusing LangGraph with LangChain. They are different things. LangChain is abstractions over prompts and chains. LangGraph is the state graph. You can use LangGraph and never touch the rest of LangChain.

Should you use LangGraph?

An honest one-minute test. If your agent does a single pass, never pauses for a human, does not persist state for days, and has up to 8 tools — use the vanilla OpenAI SDK and do not overcomplicate it. If loops, branches, approval pauses, or several agents handing off work have shown up, then LangGraph pays for itself on the very first serious debug.

I build both, and I am quick to recommend the simpler one. If you are not sure which side of the line you are on, message me — 30 minutes on a call and I will tell you honestly whether you need a graph at all.

Message @tribeofdanel →