Graph-sitter

Graph-sitter is a Python library for manipulating codebases.

It provides a scriptable interface to a powerful, multi-lingual language server built on top of Tree-sitter.

from graph_sitter import Codebase
 
# Graph-sitter builds a complete graph connecting
# functions, classes, imports and their relationships
codebase = Codebase("./")
 
# Work with code without dealing with syntax trees or parsing
for function in codebase.functions:
    # Comprehensive static analysis for references, dependencies, etc.
    if not function.usages:
        # Updates references and imports through graph-aware edit APIs
        function.remove()
 
# Fast, in-memory code index
codebase.commit()

Graph-sitter is designed for graph-aware refactors and codebase analysis. See correctness and parity for the current tested scope and known limits.

Graph-sitter works with both Python and TypeScript/JSX codebases. Learn more about language support here.

Quick Start

Graph-sitter requires Python 3.12 or 3.13 (recommended: Python 3.13).

uv tool install graph-sitter --python 3.13

Using Pipx

Pipx is not officially supported by Codegen, but it should still work.

pipx install graph-sitter

For more in-depth installation instructions, see the installation guide.

What can I do with Graph-sitter?

Graph-sitter's simple yet powerful APIs enable a range of applications.

See below for an example call graph visualization generated with Graph-sitter.

Call graph visualization for modal/modal-client/_Client

View source code on modal/modal-client. View codemod on codegen.sh
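For intuition about what a call graph contains, here is a minimal sketch built only on Python's standard `ast` module. It does not use Graph-sitter's API and is not how Graph-sitter builds its graph; the `SOURCE` snippet and the `build_call_graph` helper are hypothetical names chosen for illustration. It handles a single file and only plain-name calls, which is the kind of bookkeeping a graph-aware library is meant to take off your hands.

```python
import ast

# Hypothetical sample module used only for this illustration.
SOURCE = '''
def load():
    return [1, 2, 3]

def total(values):
    return sum(values)

def main():
    print(total(load()))

def unused():
    pass
'''


def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each top-level function name to the names it calls directly."""
    tree = ast.parse(source)
    graph: dict[str, set[str]] = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            calls: set[str] = set()
            for child in ast.walk(node):
                # Only plain-name calls such as foo(); method calls are ignored.
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                    calls.add(child.func.id)
            graph[node.name] = calls
    return graph


call_graph = build_call_graph(SOURCE)

# Functions defined in the module that no other function in it calls.
defined = set(call_graph)
called = set().union(*call_graph.values())
uncalled = sorted(defined - called)
```

In this sketch `uncalled` lists both `main` and `unused`, because nothing in the file calls either of them. Telling a real entry point apart from dead code takes information from across the whole codebase, which is what a cross-file graph is for.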

Get Started

Follow our step-by-step tutorial to start manipulating code with Graph-sitter.

Tutorials

Learn how to use Graph-sitter for common code transformation tasks.

View on GitHub

Star us on GitHub and contribute to the project.

Join our Slack

Get help and connect with the Graph-sitter community.

Why Graph-sitter?

Many software engineering tasks (refactors, enforcing patterns, analyzing control flow, etc.) are fundamentally programmatic operations. Yet the tools we use to express these transformations often feel disconnected from how we think about code.

Graph-sitter was engineered backwards from real-world refactors we performed for enterprises at Codegen, Inc. Instead of starting with theoretical abstractions, we built a set of APIs that map directly to how humans and AI think about code changes:

  • Natural Mental Model: Express transformations through high-level operations that match how you reason about code changes, not low-level text or AST manipulation.
  • Clean Business Logic: Let the engine handle the complexities of imports, references, and cross-file dependencies.
  • Scale with Evidence: Make sweeping changes across large codebases using tested Python, TypeScript, JavaScript, and React workflows. See the large-repo benchmarks for current Airflow and Next.js results.

As AI becomes increasingly sophisticated, we're seeing a fascinating shift: AI agents aren't bottlenecked by their ability to understand code or generate solutions. Instead, they're limited by their ability to efficiently manipulate codebases. The challenge isn't the "brain"; it's the "hands."

We built Graph-sitter with a key insight: future AI agents will need to "act via code," building their own sophisticated tools for code manipulation. Rather than generating diffs or making direct text changes, these agents will:

  1. Express transformations as composable programs
  2. Build higher-level tools by combining primitive operations
  3. Create and maintain their own abstractions for common patterns

This creates a shared language that both humans and AI can reason about effectively, making code changes more predictable, reviewable, and maintainable. Whether you're a developer writing a complex refactoring script or an AI agent building transformation tools, Graph-sitter provides the foundation for expressing code changes as they should be: through code itself.
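The three steps listed above (composable programs, higher-level tools built from primitives, reusable abstractions) can be sketched in plain Python. This is an illustration of the pattern only, not Graph-sitter's API: `Files`, `Transform`, `rename_symbol`, `remove_file`, `compose`, and `retire_module` are all hypothetical names, and the primitives are deliberately naive text operations standing in for graph-aware edits.

```python
import re
from functools import reduce
from typing import Callable

# Hypothetical model: a codebase is a mapping of file path to source text,
# and a transformation is a function from one such mapping to another.
Files = dict[str, str]
Transform = Callable[[Files], Files]


def rename_symbol(old: str, new: str) -> Transform:
    """Primitive: rename whole-word occurrences of a symbol in every file."""
    pattern = re.compile(rf"\b{re.escape(old)}\b")

    def apply(files: Files) -> Files:
        return {path: pattern.sub(new, src) for path, src in files.items()}

    return apply


def remove_file(path: str) -> Transform:
    """Primitive: drop one file from the codebase."""

    def apply(files: Files) -> Files:
        return {p: s for p, s in files.items() if p != path}

    return apply


def compose(*steps: Transform) -> Transform:
    """Run transformations left to right as a single transformation."""

    def apply(files: Files) -> Files:
        return reduce(lambda acc, step: step(acc), steps, files)

    return apply


def retire_module(path: str, old: str, new: str) -> Transform:
    """Higher-level tool: delete a legacy module, then repoint its callers."""
    return compose(remove_file(path), rename_symbol(old, new))


files = {
    "legacy.py": "def fetch_v1():\n    return 1\n",
    "app.py": "from api import fetch_v1\n\nvalue = fetch_v1()\n",
}
migrate = retire_module("legacy.py", old="fetch_v1", new="fetch_v2")
result = migrate(files)
```

Because each step returns a new mapping rather than mutating its input, `files` is left untouched and `result` can be inspected or diffed before anything is written, which is one way to keep such changes reviewable.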