The Size System
Every t-shirt size maps to a specific time range:

| Size | Time Range | Average | When to Use |
|---|---|---|---|
| XS | 0-30 mins | 15 mins | Trivial changes: fixing typos, updating text, tiny CSS tweaks |
| S | 1-2 hours | 1.5 hours | Small work: simple component updates, straightforward bug fixes |
| M | 3-6 hours | 4.5 hours | Standard features: moderate complexity, typical task size |
| L | 8-16 hours | 12 hours | Complex features: multiple components, significant logic |
| XL | 24-40 hours | 32 hours | Major features: substantial work spanning multiple days |
| XXL | 40-120 hours | 80 hours | Very large deliverables: multi-week projects, full modules |
Averages are rounded to the nearest 15 minutes for clean scheduling. This ensures time allocations work smoothly with standard calendar increments.
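
As a rough illustration, the table can be encoded in minutes and checked for the 15-minute property described above. This is a minimal sketch: the `Size` class, the `SIZES` mapping, and `is_calendar_friendly` are hypothetical names chosen for this example, not CharleOS internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Size:
    label: str
    min_minutes: int
    avg_minutes: int
    max_minutes: int


# Values copied from the table above, converted to minutes.
SIZES = {
    s.label: s
    for s in [
        Size("XS", 0, 15, 30),
        Size("S", 60, 90, 120),
        Size("M", 180, 270, 360),
        Size("L", 480, 720, 960),
        Size("XL", 1440, 1920, 2400),
        Size("XXL", 2400, 4800, 7200),
    ]
}


def is_calendar_friendly(size: Size, increment: int = 15) -> bool:
    """True when the average lands on a whole calendar increment."""
    return size.avg_minutes % increment == 0


for size in SIZES.values():
    print(size.label, size.avg_minutes, is_calendar_friendly(size))
```

In this table every average is also the midpoint of its range, which is an observation about the listed numbers rather than a documented rule.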
Not Required
There’s also a Not Required option for requirements that don’t apply to a particular scope (e.g., design not needed for a backend-only task).
Why Ranges Instead of Fixed Hours?
Ranges acknowledge the inherent uncertainty in software estimation. Each range has three reference points:
- Minimum (Best Case)
- Average (Expected Case)
- Maximum (Worst Case)
The minimum (best case) assumes everything goes smoothly:
- No unexpected complications
- Clear requirements
- Existing patterns to follow
- Minimal QA issues
What’s Included in the Size?
T-shirt sizes represent the total time needed to complete the deliverable, including:
Core Work (80%)
For Development:
- Writing code
- Implementing features
- Building functionality
For Design:
- Creating mockups
- Designing interfaces
- Visual work
QA & Feedback (20%)
For Development:
- Internal QA review time
- Fixing bugs found in QA
- External QA feedback fixes
For Design:
- Client feedback rounds
- Revisions based on feedback
- Design iteration cycles
The 80/20 Split in Practice
Internally, CharleOS splits the average time for scheduling purposes.
Example: M size development (average = 270 minutes)
- QA budget: 60 minutes (20% = 54 mins, rounded UP to 60)
- Development: 210 minutes (270 - 60 = 210)
Who does the work:
- Core work → Assigned developer
- QA fixes → Same developer, but scheduled separately
How the QA budget is used:
- Internal QA finds 3 bugs → Uses part of the 20%
- Client UAT finds 2 more → Uses more of the 20%
- Once the 20% is exhausted, additional time is overage (non-billable)
The 80/20 split is for internal scheduling only. Clients see the total t-shirt size, not the breakdown.
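
The arithmetic in the M example can be sketched as follows. The source only confirms one data point (54 minutes rounded up to 60), so the assumption that the QA budget is always rounded up to the next 15-minute increment is a guess, as is the function name `split_average`.

```python
def split_average(avg_minutes: int, qa_percent: int = 20, increment: int = 15) -> tuple[int, int]:
    """Return (core_minutes, qa_minutes) for a size's average time.

    Assumption: the QA share is rounded UP to the next `increment`,
    and core work receives whatever remains of the average.
    """
    # Integer ceiling division avoids floating-point surprises.
    raw_qa_scaled = avg_minutes * qa_percent          # QA minutes * 100
    steps = -(-raw_qa_scaled // (100 * increment))    # ceil(raw_qa / increment)
    qa_minutes = steps * increment
    return avg_minutes - qa_minutes, qa_minutes


core, qa = split_average(270)  # M size average
print(core, qa)  # 210 60
```

Only the M result (210 and 60 minutes) is confirmed by the example above; how other sizes round in practice may differ.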
How to Choose a Size
Selecting the right t-shirt size requires considering multiple factors:
1. Complexity
Low Complexity → Smaller Sizes
- Work is straightforward
- Clear requirements
- Existing patterns to follow
- Minimal logic or edge cases
Examples:
- Update footer links (XS)
- Add a new field to a form (S)
Moderate Complexity → Medium Sizes
- Some technical challenge
- Multiple components involved
- Standard business logic
- Normal testing needs
Examples:
- Add product filtering (M)
- Build contact form with validation (M)
High Complexity → Larger Sizes
- Significant technical challenge
- Multiple interconnected components
- Complex business rules
- Extensive testing required
Examples:
- Multi-step checkout flow (L or XL)
- Real-time collaboration features (XL or XXL)
2. Unknowns and Discovery
The more uncertainty, the larger the size:

| Unknown Factor | Impact | Example |
|---|---|---|
| Clear requirements | No change | “Make button blue” |
| Some ambiguity | Size up by 1 | “Improve navigation UX” |
| Significant unknowns | Size up by 2, or do discovery first | “Integrate with third-party API (unclear docs)” |
| Complete unknown | Discovery task first | “Research best approach for real-time sync” |
When unknowns are high, consider a separate discovery task (usually S or M) before estimating the implementation.
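
One way to read the table is as a lookup from uncertainty level to a number of size steps. The sketch below assumes hypothetical level names (`clear`, `some_ambiguity`, `significant`, `complete`) and assumes that sizing up stops at XXL; neither is specified by CharleOS.

```python
from typing import Optional

SIZE_ORDER = ["XS", "S", "M", "L", "XL", "XXL"]

# Steps to size up per uncertainty level (from the table above).
UNKNOWN_STEPS = {"clear": 0, "some_ambiguity": 1, "significant": 2}


def adjust_for_unknowns(base_size: str, level: str) -> Optional[str]:
    """Return the adjusted size, or None when discovery should come first."""
    if level == "complete":
        return None  # Complete unknown: run a discovery task before estimating.
    steps = UNKNOWN_STEPS[level]
    index = SIZE_ORDER.index(base_size) + steps
    # Assumption: cap at the largest size rather than going past XXL.
    return SIZE_ORDER[min(index, len(SIZE_ORDER) - 1)]


print(adjust_for_unknowns("S", "some_ambiguity"))  # M
print(adjust_for_unknowns("M", "significant"))     # XL
print(adjust_for_unknowns("M", "complete"))        # None
```

For significant unknowns the table also allows doing discovery first instead of sizing up by 2; the sketch only models the sizing-up branch.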
3. Dependencies
Dependencies increase risk and coordination overhead. Work falls into three broad groups:
- No Dependencies
- Few Dependencies
- Many Dependencies
No dependencies means self-contained work:
- No reliance on other work
- Can be completed independently
- Minimal coordination needed
4. Similar Past Work
Compare to completed tasks with similar scope:
1. Find Similar Tasks: search for tasks with comparable requirements, complexity, and tech stack.
2. Review Actual Time: check how long they actually took (not just the estimate). Look at the billable time and any overage.
3. Identify Patterns:
   - Did they consistently overrun? → Size up
   - Were they efficient? → Similar size is safe
   - Did specific clients or codebases cause issues? → Account for that
4. Adjust for Differences: is this task simpler or more complex than the reference? Adjust size accordingly.
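
The "consistently overrun" check in step 3 could be approximated with a simple ratio over reference tasks. This is only a sketch: the input shape, the function name, and the 50% threshold are illustrative assumptions, not a CharleOS rule.

```python
def consistently_overran(history: list[tuple[int, int]], threshold: float = 0.5) -> bool:
    """history holds (actual_minutes, max_minutes) pairs for similar past tasks.

    Assumption: "consistently" means more than `threshold` of the reference
    tasks ran past their size maximum, i.e. produced overage.
    """
    if not history:
        return False  # No reference data, so no evidence of a pattern.
    overruns = sum(1 for actual, cap in history if actual > cap)
    return overruns / len(history) > threshold


# Three past M-size tasks (maximum = 360 minutes); two ran over.
past_tasks = [(400, 360), (380, 360), (300, 360)]
print(consistently_overran(past_tasks))  # True
```

A `True` result suggests sizing up; it does not replace the judgment call in step 4 about how the new task differs from the references.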
Sizing Guidelines by Type
Design Work
| Size | Typical Work | Includes |
|---|---|---|
| XS | Icon design, minor style tweaks | Quick mockup, minimal feedback |
| S | Single page mockup | 1-2 feedback rounds |
| M | Multi-page flow (3-5 screens) | Full design + client revisions |
| L | Complete section redesign (10+ screens) | Multiple iterations, detailed feedback |
| XL | Full site redesign | Extensive revisions, stakeholder alignment |
Development Work
| Size | Typical Work | Includes |
|---|---|---|
| XS | Typo fix, config change | Quick test, deploy |
| S | Simple CRUD endpoint, basic component | Unit tests, QA review |
| M | Standard feature with UI + logic | Full testing, bug fixes |
| L | Complex feature with multiple components | Integration testing, multiple QA rounds |
| XL | Major feature or module | Comprehensive testing, extensive QA |
Common Sizing Mistakes
Mistake 1: Not Accounting for QA/Feedback
❌ Wrong
Thinking: “The coding will take 3 hours, so I’ll size it S (1-2 hours)”
Problem: Forgot to account for QA fixes and testing time.
Mistake 2: Ignoring Client or Codebase Factors
❌ Wrong
Thinking: “This is a standard feature, always size it M”
Problem: Didn’t account for this client’s legacy Vue 2 codebase being painful to work in.
Mistake 3: Using XXL as a Catch-All
❌ Wrong
Thinking: “This is big and complex, so XXL”
Problem: XXL becomes a black box: hard to schedule, hard to track progress, high risk.
Tips for Better Sizing
When in Doubt, Size Up
If you’re torn between S and M, choose M. It’s better to finish early (creating banked time) than to consistently overrun. Clients prefer positive surprises.
XS Should Be Rare
Reserve XS for truly trivial changes. If it requires any thought or testing, it’s probably S. Most work is S or larger.
Break Up XXL
XXL should be rare. If a requirement is XXL, consider splitting it into multiple smaller deliverables. This reduces risk and improves scheduling flexibility.
Use Historical Data
Review past tasks regularly to calibrate your estimates. Which sizes were accurate? Which consistently overran? Learn from patterns.
Communicate Assumptions
When estimating, note your assumptions:
- “Assumes API docs are accurate”
- “Includes 2 design feedback rounds”
- “Based on similar product filter implementation”
How Sizes Affect Billing
T-shirt sizes feed directly into the value-based billing formula. The scenarios below use figures that match an M size (3-6 hours, average 4.5 hours):

| Scenario | Actual Time | Billable Time | Outcome |
|---|---|---|---|
| Efficient | 3 hours | 4.5 hours (average) | 1.5 hrs banked (profit) |
| On Target | 5 hours | 5 hours (actual) | No margin, fair |
| Overrun | 7 hours | 6 hours (maximum) | 1 hr overage (absorbed) |

The rules behind these outcomes:
- Minimum is never billed (you always bill at least the average if you finish early)
- Average is billed when you finish faster than average
- Maximum is the cap: the client never pays more, even if you overrun
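
Taken together, these rules amount to clamping actual time between the average and the maximum. The sketch below is one way to express that; the function names are hypothetical, and the figures are the M-size values in minutes.

```python
def billable_minutes(actual: int, average: int, maximum: int) -> int:
    """Bill at least the average and at most the maximum."""
    return min(max(actual, average), maximum)


def outcome(actual: int, average: int, maximum: int) -> tuple[int, int, int]:
    """Return (billable, banked, overage) in minutes."""
    billable = billable_minutes(actual, average, maximum)
    banked = max(billable - actual, 0)    # Billed but not worked.
    overage = max(actual - billable, 0)   # Worked but not billed (absorbed).
    return billable, banked, overage


M_AVG, M_MAX = 270, 360  # 4.5 hours and 6 hours

print(outcome(180, M_AVG, M_MAX))  # (270, 90, 0)  efficient: 1.5 hrs banked
print(outcome(300, M_AVG, M_MAX))  # (300, 0, 0)   on target: billed at actual
print(outcome(420, M_AVG, M_MAX))  # (360, 0, 60)  overrun: 1 hr absorbed
```

The three calls reproduce the Efficient, On Target, and Overrun rows of the table.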
Using T-shirt Sizes in CharleOS
In Quotes
When creating quotes, each requirement block is assigned a t-shirt size:
1. Open the quote
2. Navigate to the requirement
3. Select a size from the dropdown
4. Optionally add sizing notes (e.g., “Assumes API is RESTful”)
5. Save
The quote then reports three totals:
- Minimum total time
- Average total time (what client is quoted)
- Maximum total time (the cap)
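
As an illustration of how the three totals might be derived, the sketch below sums the minimum, average, and maximum of each requirement's size. The `RANGES` mapping repeats the size table in minutes; treating a Not Required entry (`None` here) as zero time is an assumption, as is the function name.

```python
from typing import Optional

# (min, avg, max) in minutes, taken from the size table.
RANGES = {
    "XS": (0, 15, 30),
    "S": (60, 90, 120),
    "M": (180, 270, 360),
    "L": (480, 720, 960),
    "XL": (1440, 1920, 2400),
    "XXL": (2400, 4800, 7200),
}


def quote_totals(sizes: list[Optional[str]]) -> tuple[int, int, int]:
    """Sum min/avg/max minutes across requirements; None means Not Required."""
    selected = [RANGES[s] for s in sizes if s is not None]
    total_min = sum(r[0] for r in selected)
    total_avg = sum(r[1] for r in selected)  # What the client is quoted.
    total_max = sum(r[2] for r in selected)  # The billing cap.
    return total_min, total_avg, total_max


# A quote with S design, M development, and one Not Required block.
print(quote_totals(["S", "M", None]))  # (240, 360, 480)
```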
In Tasks
When quotes convert to tasks, the t-shirt sizes carry over:
- Task inherits the estimate
- Subtasks are created with the 80/20 split applied
- Billing cap is set to the maximum
- Actual time is tracked against the estimate
Viewing Size Info
Hover over any t-shirt size badge in the UI to see:
- Time range (e.g., “3-6 hours”)
- Average (e.g., “4.5 hours”)
- What’s included (core work + QA/feedback)