Some best practices for GitHub Copilot

After using GitHub Copilot across various personal projects, I’ve compiled practical insights on what works, what doesn’t, and how to maximize productivity while maintaining code quality.

Project Setup & Dependencies

| Challenge | Best Practice | Notes |
| --- | --- | --- |
| LLM-generated configs broke builds | Use official CLI tools | Letting Copilot generate `requirements.txt` or `pyproject.toml` caused dependency issues. Using `poetry new` or `uv init` avoided this entirely. |
| Version mismatches | Install latest versions manually | LLMs are trained on older versions. Installing the latest frameworks first and then asking Copilot to code on top prevented deprecated API usage (see the sketch below). |
| Starting from scratch caused fragility | Begin with a working baseline | Always ran the app once manually before involving Copilot. This made it clear whether later failures were Copilot-induced. |
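
One small habit that supports the version-mismatch point: check what is actually installed and state it in the prompt, rather than letting Copilot assume whatever version it saw in training. The sketch below is only an illustration; the package names are examples, not a fixed list.

```python
"""Print the versions actually installed in this environment so prompts can
name them explicitly (e.g. "we are on FastAPI X.Y"). Package names are
illustrative only."""
from importlib.metadata import PackageNotFoundError, version

for package in ("fastapi", "pydantic", "httpx"):
    try:
        print(f"{package}=={version(package)}")
    except PackageNotFoundError:
        print(f"{package} is not installed")
```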

Scope Management

| Challenge | Best Practice | Notes |
| --- | --- | --- |
| Unintended edits across files | Explicitly constrain scope | Without constraints, Copilot modified unrelated files. Prompt patterns that helped: “Only modify `user_service.py`. Do not change imports or behavior elsewhere.” or “Make the minimum number of changes required to accomplish this.” |
| Excess debug code left behind | Clean up after fixes | Models often added print statements or logs during debugging. Running formatters and doing a final diff review helped catch leftover debug statements. |
| Unused code accumulated | Run linters and formatters after big changes | Linters caught unused imports, variables, and helper functions that Copilot created but never removed. Example: `ruff`, `black`, and `mypy` after refactors (see the sketch below). |
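
To make the cleanup rows concrete, here is a hypothetical example of the leftovers this pass catches: `ruff` flags the unused import (rule F401), and the stray debug print shows up in the final diff review.

```python
"""Hypothetical leftovers from a Copilot debugging session."""
import logging  # unused import from an abandoned logging approach; ruff flags it (F401)


def process_order(order: dict) -> float:
    print("DEBUG order:", order)  # stray debug print; caught in the final diff review
    return sum(item["price"] * item["qty"] for item in order["items"])
```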

Technology Selection

| Challenge | Best Practice | Notes |
| --- | --- | --- |
| Hard to debug unfamiliar stacks | Use languages/frameworks you know | Vibe coding only worked smoothly in ecosystems I already understood. In unfamiliar stacks, debugging Copilot’s mistakes took longer than writing code manually. |
| Hallucinated APIs | Prefer popular frameworks | FastAPI and Flask consistently produced better suggestions than niche frameworks due to larger training data. |
| Large diffs and boilerplate | Prefer less verbose technologies | Python worked better than Java; FastAPI better than Django REST Framework. Less boilerplate meant fewer opportunities for the model to mess up. |
| Reinvented common functionality | Import well-used libraries | Example: using `pydantic` for validation instead of custom validators, or `httpx` instead of raw `urllib`. Lightweight, popular libraries were handled more reliably by Copilot (see the sketch below). |
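
The last row is easiest to show in code. The sketch below is a minimal, made-up example (the `User` fields and the URL are hypothetical): `pydantic` handles the validation Copilot would otherwise reinvent, and `httpx` replaces raw `urllib` plumbing.

```python
"""Minimal sketch of leaning on well-used libraries (pydantic v2 + httpx).
The User model and the URL are hypothetical."""
import httpx
from pydantic import BaseModel, Field, ValidationError


class User(BaseModel):
    # Type and constraint checks replace hand-written validators.
    name: str = Field(min_length=1)
    age: int = Field(ge=0, le=150)


def fetch_user(user_id: int) -> User:
    # httpx offers a requests-like API instead of raw urllib plumbing.
    response = httpx.get(f"https://api.example.com/users/{user_id}", timeout=10)
    response.raise_for_status()
    return User.model_validate(response.json())


if __name__ == "__main__":
    try:
        print(User(name="Ada", age=36))
    except ValidationError as exc:
        print(exc)
```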

Development Workflow

| Challenge | Best Practice | Notes |
| --- | --- | --- |
| Large diffs were risky | Keep changes small and iterative | Smaller prompts like “Refactor only this function” worked better than “Clean up this module.” |
| Code worked but looked wrong | Review diffs, not just behavior | Even when tests passed, diff reviews caught duplicated logic and unnecessary abstractions introduced by Copilot. |
| Errors swallowed | Keep errors explicit | Copilot often wrapped logic in broad try/except blocks. Manual review ensured failures remained visible and actionable (see the sketch below). |
| Risk of leaking secrets | Audit logs carefully | Debug-heavy iterations sometimes logged request payloads or headers. Extra care was needed to remove sensitive logging before merge. |
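
For the “keep errors explicit” row, a before/after sketch (the `load_config` function and its error messages are hypothetical): the first version mirrors the blanket try/except Copilot tended to produce, the second catches only the failures it expects and lets everything else surface.

```python
"""Hypothetical before/after for keeping errors explicit."""
import json
from pathlib import Path


def load_config_swallowed(path: str) -> dict:
    # Copilot-style blanket handling: every failure becomes an empty config.
    try:
        return json.loads(Path(path).read_text())
    except Exception:
        return {}


def load_config(path: str) -> dict:
    # Reviewed version: only expected failures are translated, the rest propagate.
    try:
        return json.loads(Path(path).read_text())
    except FileNotFoundError as exc:
        raise RuntimeError(f"Config file missing: {path}") from exc
    except json.JSONDecodeError as exc:
        raise RuntimeError(f"Config file is not valid JSON: {path}") from exc
```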

Testing & Documentation

| Challenge | Best Practice | Notes |
| --- | --- | --- |
| Silent regressions | Re-run tests frequently | Copilot changes sometimes broke unrelated code paths. Running tests after every non-trivial change caught this early (see the sketch below). |
| Tests locked in wrong behavior | Write tests after logic stabilizes | Generating tests too early made refactors harder. Waiting until behavior settled worked better. |
| Docs drifted quickly | Update docs last | Asking Copilot to update docs after the code was final reduced stale or redundant documentation. |
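
As a tiny illustration of re-running tests after each change, a pytest-style regression test; `slugify` and its expected output are hypothetical stand-ins for whatever behavior a Copilot edit might silently alter.

```python
"""Hypothetical regression test to re-run (e.g. `pytest -q`) after each Copilot change."""
import re


def slugify(title: str) -> str:
    # Example function a later Copilot change might quietly regress.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def test_slugify_keeps_existing_behaviour():
    assert slugify("Some best practices for GitHub Copilot") == (
        "some-best-practices-for-github-copilot"
    )
```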

Maintainability & Best Practices

| Challenge | Best Practice | Notes |
| --- | --- | --- |
| Unclear generated code | Never merge what you don’t understand | If I couldn’t explain the code in my own words, I didn’t merge it. This rule prevented long-term technical debt. |
| Over-reliance on generation | Use Copilot as a reviewer | Asking Copilot “Review this diff and point out risks or unnecessary abstractions” often produced better results than generation alone. |
| Faster mistakes | Treat speed as a risk | Copilot increased speed significantly, but deliberate slowdowns at review time were essential to maintain architectural control. |

This post is licensed under CC BY 4.0 by the author.