Ways to Pull Repos into LLM Context

LLMs

Agents

SolveIt

Ways to let LLMs know about a GitHub repo or a local repo.

Author

Salman Naqvi

Published

Sunday, 25 January 2026

Good context is key when working with LLMs, whether you’re coding, writing, planning, or problem solving. The LLM needs to know your context to help you effectively.

This guide explores different ways to pull a GitHub repo or a local repo into a format LLMs can use.

GitHubq

GitHubq is a web service from Answer.AI that makes GitHub repos available as context for LLMs. The basic way to use the service is to change github.com to githubq.com in a repo’s URL.

Viewing the Shell Sage GitHub repo as LLM context

You can change what you get back with various parameters. For instance,

exts=py returns only files ending with .py
sigs_only=True returns only function signatures and their docstrings.

Only Python files with function signatures are visible
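The URL change and the two parameters above can be combined in a small helper. This is only a sketch: githubq_url is a hypothetical name of my own, and it assumes the service accepts exts and sigs_only as ordinary query-string parameters, as the examples above suggest.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

def githubq_url(repo_url: str, **params) -> str:
    """Swap github.com for githubq.com and append the given query parameters.

    Hypothetical helper: it only rewrites the URL, it does not contact the service.
    """
    parts = urlsplit(repo_url)
    if parts.netloc != "github.com":
        raise ValueError(f"expected a github.com URL, got {repo_url!r}")
    # urlencode turns each value into a string, so True becomes "True"
    query = urlencode(params)
    return urlunsplit((parts.scheme, "githubq.com", parts.path, query, ""))

url = githubq_url("https://github.com/AnswerDotAI/shell_sage", exts="py", sigs_only=True)
print(url)
```

The printed link can then be pasted into a browser or handed to an LLM that is able to fetch URLs.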

`toolslm.xml`

See the full documentation of Answer.AI’s toolslm library here.

`repo2ctx`

GitHubq is great if you want to paste a link into an LLM that will then retrieve what’s at the link, or if you want to save one-off files here and there. But what if you want to work with the context programmatically? That’s where toolslm, and more specifically its toolslm.xml module, comes in.

from toolslm.xml import repo2ctx
print(repo2ctx('https://github.com/AnswerDotAI/shell_sage', exts='py')[:100])

<documents title="GitHub repository contents from AnswerDotAI/shell_sage/main (filters applied -- ex

Behind the scenes, GitHubq uses repo2ctx. So parameters from repo2ctx should work as query parameters when accessing GitHubq.

?repo2ctx

def repo2ctx(
    owner:str, # GitHub repo owner or "owner/repo" or a full github URL
    repo:str=None, # GitHub repo name (leave empty if using "owner/repo" or URL format for owner param)
    ref:str=None, # Git ref (branch/tag/sha) (get from URL not provided); defaults to repo's default branch
    folder:str=None, # Only include files under this path (get from URL not provided)
    show_filters:bool=True, # Include filter info in title?
    token:str=None, # GitHub token (uses GITHUB_TOKEN env var if None)
    prefix:bool=False, # Include Anthropic's suggested prose intro?
    out:bool=True, # Include notebook cell outputs?
    include_base:bool=True, # Include full path in src?
    title:str=None, # Optional title attr for Documents element
    max_size:int=100000, # Skip files larger than this (bytes)
    max_total:int=10000000, # Max total output size in bytes
    readme_first:bool=False, # Prioritize README files at start of context?
    files_only:bool=False, # Return dict of {filename: size} instead of context?
    sigs_only:bool=False, # Return signatures instead of full text for python files?
    ids:bool=True, # Include cell ids in notebooks?
    recursive:bool=True, # search subfolders
    symlinks:bool=True, # follow symlinks?
    file_glob:str=None, # Only include files matching glob
    file_re:str=None, # Only include files matching regex
    folder_re:str=None, # Only enter folders matching regex
    skip_file_glob:str=None, # Skip files matching glob
    skip_file_re:str=None, # Skip files matching regex
    skip_folder_re:str=None, # Skip folders matching regex,
    ret_folders:bool=False, # return folders, not just files
    sort:bool=True, # sort files by name within each folder
    types:str | list=None, # list or comma-separated str of ext types from: py, js, java, c, cpp, rb, r, ex, sh, web, doc, cfg
    exts:str | list=None, # list or comma-separated str of exts to include
)->Union: # XML for LM context, or dict of file sizes

Convert GitHub repo to XML context without cloning

File: ~/.local/lib/python3.12/site-packages/toolslm/xml.py

Type: function
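According to the signature above, files_only=True returns a dict of {filename: size} instead of the context itself, which makes it possible to preview how large a pull would be. The sketch below works on a stand-in dict with made-up sizes rather than calling repo2ctx. summarize_sizes is a hypothetical helper, and the way it applies max_size and max_total is my reading of the parameter comments, not the library's actual logic.

```python
def summarize_sizes(sizes: dict[str, int], max_size: int = 100_000,
                    max_total: int = 10_000_000) -> dict:
    """Summarize a {filename: size_in_bytes} dict against the two size limits.

    Assumption: a file is skipped when it is larger than max_size, and the
    remaining files fit when their combined size is at most max_total.
    """
    kept = {name: n for name, n in sizes.items() if n <= max_size}
    skipped = sorted(set(sizes) - set(kept))
    total = sum(kept.values())
    return {"kept": len(kept), "skipped": skipped,
            "total_bytes": total, "fits": total <= max_total}

# Stand-in for the dict a files_only=True call would return (sizes are invented)
fake_sizes = {"shell_sage/core.py": 24_000, "uv.lock": 350_000, "README.md": 6_000}
report = summarize_sizes(fake_sizes)
print(report)
```

A check like this is a cheap way to decide whether to add filters such as exts or skip_file_glob before pulling the full context.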

?folder2ctx

def folder2ctx(
    path:Union, # Folder to read
    prefix:bool=False, # Include Anthropic's suggested prose intro?
    out:bool=True, # Include notebook cell outputs?
    include_base:bool=True, # Include full path in src?
    title:str=None, # Optional title attr for Documents element
    max_size:int=100000, # Skip files larger than this (bytes)
    max_total:int=10000000, # Max total output size in bytes
    readme_first:bool=False, # Prioritize README files at start of context?
    files_only:bool=False, # Return dict of {filename: size} instead of context?
    sigs_only:bool=False, # Return signatures instead of full text for python files?
    ids:bool=True, # Include cell ids in notebooks?
    recursive:bool=True, # search subfolders
    symlinks:bool=True, # follow symlinks?
    file_glob:str=None, # Only include files matching glob
    file_re:str=None, # Only include files matching regex
    folder_re:str=None, # Only enter folders matching regex
    skip_file_glob:str=None, # Skip files matching glob
    skip_file_re:str=None, # Skip files matching regex
    skip_folder_re:str=None, # Skip folders matching regex,
    ret_folders:bool=False, # return folders, not just files
    sort:bool=True, # sort files by name within each folder
    types:str | list=None, # list or comma-separated str of ext types from: py, js, java, c, cpp, rb, r, ex, sh, web, doc, cfg
    exts:str | list=None, # list or comma-separated str of exts to include
)->Union:

Convert folder contents to XML context, handling notebooks

File: ~/.local/lib/python3.12/site-packages/toolslm/xml.py

Type: function

`folder2ctx`

repo2ctx is great for remote repositories. If you have a local folder, you can turn it into context with folder2ctx.

!ls

'00-09 系统'   70ec03a05ceeffe770a8d00297fa241a   solveit_settings.json
'20-29 大学'   jobs.csv                 test
'30–39 学习'   shell_sage           暂时对话

from toolslm.xml import folder2ctx
print(folder2ctx('shell_sage', exts='py')[:250])

<documents><document index="1"><src>
shell_sage/setup.py
</src><document-content>
from pkg_resources import parse_version
from configparser import ConfigParser
import setuptools, shlex
assert parse_version(setuptools.__version__)>=parse_version('36.2
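To make the output shape above less opaque, here is a rough standard-library imitation of it. This is not how toolslm implements folder2ctx. folder_to_docs is a hypothetical function that only mirrors the documents/document/src/document-content layout visible in the printed output, and it skips everything else the real function handles, such as notebooks, size limits, and filters.

```python
from pathlib import Path
import tempfile

def folder_to_docs(folder, ext: str = ".py") -> str:
    """Wrap every file with the given extension in document tags (illustrative only)."""
    root = Path(folder)
    files = sorted(p for p in root.rglob(f"*{ext}") if p.is_file())
    docs = []
    for i, path in enumerate(files, start=1):
        # Keep the folder name in the src path, as in the output shown above
        src = path.relative_to(root.parent).as_posix()
        body = path.read_text(encoding="utf-8")
        docs.append(f'<document index="{i}"><src>\n{src}\n</src>'
                    f'<document-content>\n{body}\n</document-content></document>')
    return "<documents>" + "".join(docs) + "</documents>"

# Build a tiny throwaway folder to run the sketch on
with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "demo_pkg"
    pkg.mkdir()
    (pkg / "hello.py").write_text("print('hi')\n", encoding="utf-8")
    (pkg / "notes.txt").write_text("ignored\n", encoding="utf-8")
    ctx = folder_to_docs(pkg)

print(ctx)
```

The point of the wrapper tags is that the LLM can tell where each file starts and ends and which path it came from.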

`py2sigs`

py2sigs extracts the function signatures (and their docstrings) directly from a given Python file.

from toolslm.xml import py2sigs
print(py2sigs('shell_sage/shell_sage/core.py')[:250])

def __rich_console__(self:CodeBlock, console, options):

def Chat(*arg, **kw):
    "Lazy load lisette to make ssage more responsive"

def _aliases(shell):

def _sys_info():

def get_pane(n, pid=None):
    "Get output from a tmux pane"

def get_panes(
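If you are curious how signatures can be pulled out of a file without running it, the standard library's ast module is enough for a rough version. The sketch below is an approximation for illustration, not py2sigs itself. The signatures function and the demo source are both made up for this example.

```python
import ast

def signatures(source: str) -> str:
    """Return 'def name(args):' lines, plus docstrings, for every function in source."""
    tree = ast.parse(source)
    funcs = [n for n in ast.walk(tree)
             if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))]
    out = []
    # ast.walk is breadth-first, so sort by line number to restore source order
    for node in sorted(funcs, key=lambda n: n.lineno):
        keyword = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
        returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
        sig = f"{keyword} {node.name}({ast.unparse(node.args)}){returns}:"
        doc = ast.get_docstring(node)
        if doc:
            sig += f'\n    "{doc}"'
        out.append(sig)
    return "\n\n".join(out)

# Made-up source used only to exercise the sketch
demo_source = '''
def greet(name, punctuation="!"):
    "Build a greeting"
    return f"Hello {name}{punctuation}"

def _helper():
    return 42
'''
sigs = signatures(demo_source)
print(sigs)
```

Because the file is parsed rather than imported, nothing in it is executed, which is useful when the code has heavy or missing dependencies.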

`sym2pkgctx`

If you have a package already installed, you can use toolslm’s sym2pkgctx to pull that package in as context. The entire root package is returned.

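from toolslm.xml import sym2pkgctx
import shell_sage.config as sc  # assumption: sc is the shell_sage.config module, where ShellSageConfig lives
print(sym2pkgctx(sc.ShellSageConfig)[:250])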

<documents><document index="1"><src>
/usr/local/lib/python3.12/site-packages/shell_sage/__init__.py
</src><document-content>
__version__ = "1.0.4"
</document-content></document><document index="2"><src>
/usr/local/lib/python3.12/site-packages/shell_s

`sym2folderctx`

sym2folderctx is similar to sym2pkgctx. The difference is that sym2folderctx returns the context of only the folder where the symbol is defined, rather than the entire root package.

from toolslm.xml import sym2folderctx
print(sym2folderctx(sc.ShellSageConfig)[:250])

<documents><document index="1"><src>
/usr/local/lib/python3.12/site-packages/shell_sage/__init__.py
</src><document-content>
__version__ = "1.0.4"
</document-content></document><document index="2"><src>
/usr/local/lib/python3.12/site-packages/shell_s
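The difference between "root package" and "folder where the symbol is defined" is easier to see with a nested package. The sketch below uses a standard-library class so it runs anywhere. Both helper functions are hypothetical and only approximate how the two scopes might be resolved, not how toolslm does it.

```python
import importlib
import inspect
from pathlib import Path
from xml.dom.minidom import Document  # a class defined in a nested subpackage

def defining_folder(sym) -> Path:
    """Folder of the source file where the symbol is defined."""
    return Path(inspect.getsourcefile(sym)).parent

def root_package_folder(sym) -> Path:
    """Folder of the top-level package the symbol belongs to."""
    root_name = sym.__module__.split(".")[0]
    root = importlib.import_module(root_name)
    return Path(root.__file__).parent

folder = defining_folder(Document)      # ends in xml/dom
root = root_package_folder(Document)    # ends in xml
print(folder)
print(root)
```

For a flat package such as shell_sage, where every module sits directly in the package folder, both scopes point at the same directory, which would explain why the two outputs above start identically.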

?sym2pkgctx

def sym2pkgctx(
    sym,
    types:str | list='py', # list or comma-separated str of ext types from: py, js, java, c, cpp, rb, r, ex, sh, web, doc, cfg
    skip_file_re:str='^_mod',
    prefix:bool=False, # Include Anthropic's suggested prose intro?
    out:bool=True, # Include notebook cell outputs?
    include_base:bool=True, # Include full path in src?
    title:str=None, # Optional title attr for Documents element
    max_size:int=100000, # Skip files larger than this (bytes)
    max_total:int=10000000, # Max total output size in bytes
    readme_first:bool=False, # Prioritize README files at start of context?
    files_only:bool=False, # Return dict of {filename: size} instead of context?
    sigs_only:bool=False, # Return signatures instead of full text for python files?
    ids:bool=True, # Include cell ids in notebooks?
    recursive:bool=True, # search subfolders
    symlinks:bool=True, # follow symlinks?
    file_glob:str=None, # Only include files matching glob
    file_re:str=None, # Only include files matching regex
    folder_re:str=None, # Only enter folders matching regex
    skip_file_glob:str=None, # Skip files matching glob
    skip_folder_re:str=None, # Skip folders matching regex,
    ret_folders:bool=False, # return folders, not just files
    sort:bool=True, # sort files by name within each folder
    exts:str | list=None, # list or comma-separated str of exts to include
):

Return repo context for a symbol's root package

File: ~/.local/lib/python3.12/site-packages/toolslm/xml.py

Type: function

`contextkit`

Next up is contextkit’s read module. This approach is more hands-on, as the module’s functions do not return XML-formatted content, though that leaves you free to do your own processing. You will need an SSH key set up between your machine and GitHub, because the functions in read create temporary clones of the GitHub repo.

contextkit.read’s read_gh_repo returns a dictionary in the format {'path': 'contents'}.

import contextkit.read as rd
rd.read_gh_repo('https://github.com/AnswerDotAI/shell_sage').keys()

dict_keys(['/app/data/.cache/contextkit_git_clones/shell_sage/settings.ini', '/app/data/.cache/contextkit_git_clones/shell_sage/CONTRIBUTING.md', '/app/data/.cache/contextkit_git_clones/shell_sage/LICENSE', '/app/data/.cache/contextkit_git_clones/shell_sage/README.md', '/app/data/.cache/contextkit_git_clones/shell_sage/pyproject.toml', '/app/data/.cache/contextkit_git_clones/shell_sage/uv.lock', '/app/data/.cache/contextkit_git_clones/shell_sage/MANIFEST.in', '/app/data/.cache/contextkit_git_clones/shell_sage/setup.py', '/app/data/.cache/contextkit_git_clones/shell_sage/CHANGELOG.md', '/app/data/.cache/contextkit_git_clones/shell_sage/ipython/00-init.py', '/app/data/.cache/contextkit_git_clones/shell_sage/nbs/_quarto.yml', '/app/data/.cache/contextkit_git_clones/shell_sage/nbs/CNAME', '/app/data/.cache/contextkit_git_clones/shell_sage/nbs/sidebar.yml', '/app/data/.cache/contextkit_git_clones/shell_sage/nbs/tmux.conf', '/app/data/.cache/contextkit_git_clones/shell_sage/nbs/01_config.ipynb', '/app/data/.cache/contextkit_git_clones/shell_sage/nbs/00_core.ipynb', '/app/data/.cache/contextkit_git_clones/shell_sage/nbs/styles.css', '/app/data/.cache/contextkit_git_clones/shell_sage/nbs/index.ipynb', '/app/data/.cache/contextkit_git_clones/shell_sage/nbs/nbdev.yml', '/app/data/.cache/contextkit_git_clones/shell_sage/shell_sage/core.py', '/app/data/.cache/contextkit_git_clones/shell_sage/shell_sage/__init__.py', '/app/data/.cache/contextkit_git_clones/shell_sage/shell_sage/_modidx.py', '/app/data/.cache/contextkit_git_clones/shell_sage/shell_sage/config.py', '/app/data/.cache/contextkit_git_clones/shell_sage/debug/aliases.py', '/app/data/.cache/contextkit_git_clones/shell_sage/debug/safe_aliases.py', '/app/data/.cache/contextkit_git_clones/shell_sage/debug/probe_tty.py'])
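Since the keys are absolute paths inside a temporary clone, a bit of post-processing usually helps before handing the contents to an LLM. The sketch below runs on a stand-in dict with placeholder contents instead of calling read_gh_repo. tidy_repo_dict is a hypothetical helper that strips the clone prefix and keeps only chosen extensions.

```python
from pathlib import PurePosixPath

def tidy_repo_dict(files: dict[str, str], repo_name: str,
                   exts: tuple[str, ...] = (".py",)) -> dict[str, str]:
    """Make keys relative to the repo root and keep only files with the given extensions."""
    out = {}
    for path, contents in files.items():
        parts = PurePosixPath(path).parts
        if repo_name not in parts:
            continue
        # Everything after the first occurrence of the repo name is the in-repo path
        rel = PurePosixPath(*parts[parts.index(repo_name) + 1:])
        if rel.suffix in exts:
            out[rel.as_posix()] = contents
    return out

# Stand-in for the dict read_gh_repo returns (contents are placeholders)
prefix = "/app/data/.cache/contextkit_git_clones/shell_sage"
fake_repo = {
    f"{prefix}/README.md": "# placeholder readme",
    f"{prefix}/setup.py": "# placeholder setup",
    f"{prefix}/shell_sage/core.py": "# placeholder core",
}
tidy = tidy_repo_dict(fake_repo, "shell_sage")
print(list(tidy))
```

From here you could join the values into one prompt, or wrap each one in tags of your own choosing.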

read_gh_file returns the contents of the file as a string.

rd.read_gh_file('https://github.com/AnswerDotAI/shell_sage/blob/main/shell_sage/core.py')[:250]

"# AUTOGENERATED! DO NOT EDIT! File to edit: ../nbs/00_core.ipynb.\n\n# %% auto 0\n__all__ = ['console', 'print', 'sp', 'ssp', 'default_cfg', 'tools', 'sps', 'log_path', 'Chat', 'get_pane', 'get_panes',\n           'tmux_history_lim', 'get_history', 'get_"

print(rd.read_gh_file('https://github.com/AnswerDotAI/shell_sage/blob/main/shell_sage/core.py')[:250])

# AUTOGENERATED! DO NOT EDIT! File to edit: ../nbs/00_core.ipynb.

# %% auto 0
__all__ = ['console', 'print', 'sp', 'ssp', 'default_cfg', 'tools', 'sps', 'log_path', 'Chat', 'get_pane', 'get_panes',
           'tmux_history_lim', 'get_history', 'get_

See the full documentation of Answer.AI’s contextkit library here.

SolveIt

If you’re in SolveIt, you can also change directory and ask the LLM to read through a particular directory for you with its built-in tools. Note that you should use %cd rather than !cd so that the directory change persists beyond that single command.

%cd shell_sage

/app/data/shell_sage
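The reason % matters here is that a ! command runs in a separate shell process, so a cd there ends when that process exits, whereas %cd changes the working directory of the Python session itself. The sketch below shows the same effect with plain Python, using subprocess as a stand-in for ! and os.chdir as a stand-in for %cd. It assumes a POSIX-style shell is available.

```python
import os
import shlex
import subprocess
import tempfile

start = os.getcwd()
target = tempfile.gettempdir()

# Like `!cd target`: the cd happens in a child shell and is lost when it exits
subprocess.run(f"cd {shlex.quote(target)}", shell=True, check=True)
after_shell = os.getcwd()

# Like `%cd target`: the current process itself changes directory
os.chdir(target)
after_chdir = os.getcwd()

os.chdir(start)  # restore the original directory so the sketch has no side effects

print("after shell cd:", after_shell)
print("after os.chdir:", after_chdir)
```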

Can you look through the directory and tell me what are the key functions of the shell sage library?

🤖Reply🤖


Let me take a look at the structure of this directory:

view(path=.)

{
  "id": "toolu_01TcJme7ugD8Xzq3rBZLjinR",
  "call": {
    "function": "view",
    "arguments": {
      "path": "."
    }
  },
  "result": "'Directory contents of /app/<TRUNCATED>"
}

The main code is in the shell_sage/ folder. Let me look at the signatures in core.py:

view(path=shell_sage/core.py, view_range=[1, 50])

{
  "id": "toolu_01N9i48X1GWBFrBj9awo8qzk",
  "call": {
    "function": "view",
    "arguments": {
      "path": "shell_sage/core.py",
      "view_range": "[1, 50]"
    }
  },
  "result": "'# AUTOGENERATED! DO NOT EDIT! File to edit: ../nbs/00_core.ipynb.\\n\\n# %% auto 0\\n__all__ = [\\'console\\', \\'print\\', \\'sp\\', \\'ssp\\', \\'default_cfg\\', \\'tools\\', \\'sps\\', \\'log_path\\', \\'Chat\\', \\'get_pane\\', \\'get_panes\\',\\n           \\'tmux_history_lim\\', \\'get_history\\', \\'get_opts\\', \\'with_permission\\', \\'get_sage\\', \\'get_res\\', \\'Log\\', \\'mk_db\\',\\n           \\'main\\', \\'extract_cf\\', \\'extract\\']\\n\\n# %% ../nbs/00_core.ipynb 3\\nfrom datetime import datetime\\nfrom itertools import accumulate\\nfrom fastcore.script import *\\nfrom fastcore.tools import *\\nfrom fastcore.utils import *\\nfro<TRUNCATED>"
}

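Main features of Shell Sage:

Based on the __all__ export list, the key functions include:

Function	Purpose
`main`	Main entry point (the `ssage` CLI command)
`get_pane`, `get_panes`	Get tmux pane information
`get_history`	Get tmux history
`get_sage`	Get the AI response
`Chat`	Lazy-loads the LLM chat functionality
`Log`, `mk_db`	Logging and database functionality
`extract`, `extract_cf`	Extract code from responses

In short: Shell Sage is an AI command-line assistant that reads your tmux terminal history and then uses an LLM to help you solve shell problems.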

You can of course also directly paste in a URL and the LLM will look through it.

What’s this GitHub repo about? https://github.com/AnswerDotAI/shell_sage

🤖Reply🤖


read_url(url=https://github.com/AnswerDotAI/shell_sage)

{
  "id": "toolu_011DD4XT7Yge4mwqi49s3u6Z",
  "call": {
    "function": "read_url",
    "arguments": {
      "url": "https://github.com/AnswerDotAI/shell_sage"
    }
  },
  "result": "'# AnswerDotAI/shell_sage\\nShellSage saves sysadmins\u2019 sanity by solving shell script snafus super swiftly\\n\\n## / Files\\n- \ud83d\udcc1 .github\\n- .gitignore\\n- CHANGELOG.md\\n- CONTRIBUTING.md\\n- LICENSE\\n- MANIFEST.in\\n- README.md\\n- \ud83d\udcc1 debug\\n- \ud83d\udcc1 ipython\\n- \ud83d\udcc1 nbs\\n- pyproject.toml\\n- \ud83d\udcc1 screenshots\\n- settings.ini\\n- setup.py\\n- \ud83d\udcc1 shell_sage\\n- uv.lock\\n\\n## README\\n# ShellSage\\n\\n\\n<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->\\n\\nShellSage is an AI-powered command-line assistant that integrates\\nseamlessly with your terminal workflow through tmux. It provides\\ncontextual help for shell ope<TRUNCATED>"
}

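Shell Sage is an AI-powered command-line assistant that integrates into your terminal workflow through tmux.

Key features:
- Ask questions in the terminal with the ssage command
- Automatically reads tmux history to provide contextual help
- Supports multiple LLM providers (Claude, GPT, Ollama, and others)
- Can analyze piped input
- Has a fun “sassy mode” (GLaDOS-style replies)

Installation: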

pip install shell-sage

At an Overview

Method	Input	Best For
`githubq`	Change URL to githubq.com	Quick browser-based access
`repo2ctx`	GitHub URL	Programmatic access to remote repos
`folder2ctx`	Local path	Processing local folders
`py2sigs`	Python file path	Single file signatures
`contextkit.read_gh_repo`	GitHub URL	Raw dict of full repo contents
`contextkit.read_gh_file`	File URL	Single file contents
`sym2pkgctx`	Class/function object	Installed packages
`sym2folderctx`	Class/function object	Folder containing a symbol
`read_url` (SolveIt)	GitHub URL	Quick repo overview + README

Other `toolslm.xml` Goodies

Beyond the functions we’ve covered, toolslm.xml has other interesting functions:

Additional Functions:
- files2ctx: Pass in a specific list of files rather than an entire folder
- nb2xml: Convert Jupyter notebooks to XML format
- json_to_xml: Convert JSON or Python dicts to XML
- sym2file: Get the source file contents for any class or function

Useful Parameters:
- prefix=True: Prepends Anthropic’s recommended preamble for document context
- nums=True: Adds line numbers to code (handy for referencing specific lines)

Command Line Support: Both folder2ctx and repo2ctx are available as CLI tools, so you can use them directly from your terminal without writing Python.

There’s plenty more to discover. Explore the full module in the docs, or pull it into an LLM with the techniques covered here!

Conclusion

If you have any comments, questions, suggestions, feedback, criticisms, or corrections, please do post them down in the comment section below!
