@prescod
Created February 16, 2026 17:15
Snowfakery Recipe for Git Data Model
# Snowfakery recipe for Git's core data model
# Generates: Objects (commits, trees, blobs, tags), References, Index entries, and Reflogs
# ============================================
# OBJECTS
# ============================================
# Blob - stores file contents
- object: Blob
  count: 20
  fields:
    object_id:
      fake: sha1
    type: blob
    filename:
      fake: file_name
    content:
      fake: text
    size:
      random_number:
        min: 100
        max: 50000
# Tree - represents a directory
- object: Tree
  count: 10
  fields:
    object_id:
      fake: sha1
    type: tree
# TreeEntry - entries within a tree (separate table for proper relationships)
- object: TreeEntry
  count: 30
  fields:
    tree_id:
      reference: Tree
    file_mode:
      random_choice:
        - "100644"  # regular file
        - "100755"  # executable file
        - "120000"  # symbolic link
        - "040000"  # directory
    object_type:
      random_choice:
        - blob
        - tree
    filename:
      fake: file_name
    referenced_object_id:
      fake: sha1
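# Optional variant (a sketch, not part of the original recipe).
# Assumption: a top-level `reference: Tree` resolves to the most recently
# generated Tree row, so the TreeEntry rows above may all share one tree, and
# their `referenced_object_id` is a free-standing fake hash rather than the ID
# of a generated Blob. `random_reference` picks a random existing row instead.
# `LinkedTreeEntry` and `blob_id` are hypothetical names used for illustration;
# this variant covers blob entries only, so the directory mode is left out.
- object: LinkedTreeEntry
  count: 30
  fields:
    tree_id:
      random_reference: Tree  # any generated Tree, not just the last one
    file_mode:
      random_choice:
        - "100644"  # regular file
        - "100755"  # executable file
    object_type: blob
    filename:
      fake: file_name
    blob_id:
      random_reference: Blob  # points at a generated Blob row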
# Commit - a snapshot of the repository
- object: Commit
  count: 15
  fields:
    object_id:
      fake: sha1
    type: commit
    tree_id:
      reference: Tree
    parent_commit_id:
      random_choice:
        - choice:
            probability: 10
            pick: null  # Root commits have no parent
        - choice:
            probability: 90
            pick:
              fake: sha1
    author_name:
      fake: name
    author_email:
      fake: email
    author_timestamp:
      fake: unix_time
    author_timezone: "-0400"
    committer_name:
      fake: name
    committer_email:
      fake: email
    committer_timestamp:
      fake: unix_time
    committer_timezone: "-0400"
    message:
      fake: sentence
# Tag Object - annotated tag metadata
- object: TagObject
  count: 5
  fields:
    object_id:
      fake: sha1
    type: tag
    target_id:
      reference: Commit
    target_type: commit
    tag_name:
      fake.text:
        max_nb_chars: 10
    tagger_name:
      fake: name
    tagger_email:
      fake: email
    tagger_timestamp:
      fake: unix_time
    tagger_timezone: "-0400"
    message:
      fake: sentence
# ============================================
# REFERENCES
# ============================================
# Branch reference - points to a commit
- object: Branch
  count: 5
  fields:
    name:
      random_choice:
        - main
        - develop
        - feature/login
        - feature/signup
        - bugfix/issue-123
    ref_path: refs/heads/${{name}}
    commit_id:
      reference: Commit
# Tag reference - points to a commit or tag object
- object: Tag
  count: 5
  fields:
    name: v${{fake.random_int(min=1, max=5)}}.${{fake.random_int(min=0, max=9)}}.${{fake.random_int(min=0, max=9)}}
    ref_path: refs/tags/${{name}}
    is_annotated:
      random_choice:
        - true
        - false
    target_id:
      reference: Commit
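# Optional variant (a sketch, not part of the original recipe).
# In the Tag object above, `is_annotated` is chosen independently of
# `target_id`, which always points at a Commit. In Git, an annotated tag ref
# points at a tag object instead. `AnnotatedTagRef` is a hypothetical object
# name; it assumes the TagObject rows defined earlier in this recipe exist.
- object: AnnotatedTagRef
  count: 3
  fields:
    name: v${{fake.random_int(min=1, max=5)}}.${{fake.random_int(min=0, max=9)}}.${{fake.random_int(min=0, max=9)}}
    ref_path: refs/tags/${{name}}
    is_annotated: true
    target_id:
      random_reference: TagObject  # annotated refs target a tag object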
# HEAD reference - current branch or detached commit
- object: HEAD
  count: 1
  fields:
    is_detached:
      random_choice:
        - choice:
            probability: 20
            pick: true
        - choice:
            probability: 80
            pick: false
    symbolic_ref:
      reference: Branch
    commit_id:
      reference: Commit
# Remote-tracking branch
- object: RemoteTrackingBranch
  count: 5
  fields:
    remote: origin
    branch_name:
      random_choice:
        - main
        - develop
        - feature/api
        - feature/ui
        - release/v2
    ref_path: refs/remotes/${{remote}}/${{branch_name}}
    commit_id:
      reference: Commit
# ============================================
# INDEX (Staging Area)
# ============================================
- object: IndexEntry
  count: 10
  fields:
    file_mode:
      random_choice:
        - "100644"  # regular file
        - "100755"  # executable file
        - "120000"  # symbolic link
    blob_id:
      reference: Blob
    stage_number:
      random_choice:
        - choice:
            probability: 90
            pick: 0  # Normal state
        - choice:
            probability: 10
            pick:
              random_number:
                min: 1
                max: 3  # Merge conflict states
    directory:
      random_choice:
        - src
        - lib
        - tests
        - docs
    file_name:
      fake: file_name
    file_path: ${{directory}}/${{file_name}}
# ============================================
# REFLOGS
# ============================================
- object: ReflogEntry
  count: 20
  fields:
    ref_name:
      random_choice:
        - HEAD
        - refs/heads/main
        - refs/heads/develop
    commit_id:
      reference: Commit
    previous_commit_id:
      reference: Commit
    timestamp:
      fake: unix_time
    action:
      random_choice:
        - "commit: Initial commit"
        - "commit: Add new feature"
        - "commit (amend): Fix typo"
        - "pull: Fast-forward"
        - "checkout: moving from main to develop"
        - "merge develop: Fast-forward"
        - "reset: moving to HEAD~1"
        - "rebase (finish): refs/heads/feature onto main"