Core metrics

Task completion: 5 / 5 (100%)

Execution duration: 10m 17s (created 2026-02-15 00:18:49.813Z)

Token total: 1,919,030 (input 1,904,609 · output 14,421)

Tool calls: 44 (retries 0%)

Tool success rate: 95% (0 permission events)

Skill overview

Source: N/A

Evaluate capability coverage for rust-best-practices

verified

Evidence refs: marker-1, marker-2, marker-3

Can execute: Refactor Clone-Heavy Code

verified

Evidence refs: marker-14, marker-17

Can execute: Implement Error Hierarchy with thiserror

verified

Evidence refs: marker-18, marker-22

Can execute: Type State Pattern Implementation

verified

Evidence refs: marker-23, marker-26

Can execute: Performance Optimization with Benchmarking

verified

Evidence refs: marker-27, marker-31

Can execute: Comprehensive Code Review

verified

Evidence refs: marker-32, marker-36

Constraint coverage: Must use Rust 1.70+ toolchain

verified

Evidence refs: marker-1, marker-2

Constraint coverage: All code must pass `cargo clippy --all-targets --all-features --locked -- -D warnings`

verified

Evidence refs: marker-1, marker-2

Constraint coverage: No `unwrap()` or `expect()` calls outside test code

verified

Evidence refs: marker-1, marker-2

Benchmark taskset (TASKSET.md)

Taskset - rust-best-practices

Goal

Evaluate Claude Code's ability to apply idiomatic Rust patterns and Apollo GraphQL's best practices when writing, reviewing, and refactoring Rust code. The agent must demonstrate understanding of ownership, error handling, performance optimization, linting, and type safety principles.

Constraints

Must use Rust 1.70+ toolchain
All code must pass cargo clippy --all-targets --all-features --locked -- -D warnings
No unwrap() or expect() calls outside test code
Must use appropriate error handling (Result<T, E> with thiserror or anyhow)
Performance-critical code must be benchmarked with --release flag
Must apply correct ownership patterns (borrowing vs cloning)
All public APIs require documentation comments (///)
Type state pattern required when encoding compile-time state safety

Tasks

Task 1: Refactor Clone-Heavy Code

Given a Rust module with excessive .clone() calls and suboptimal ownership patterns, refactor to use borrowing where appropriate. Identify which types should implement Copy, convert function signatures to accept &str/&[T] instead of owned types, and eliminate redundant clones detected by Clippy.

Files: src/parser.rs (contains string parsing with unnecessary clones)

Expected: Refactored code passing clippy::perf checks, documented rationale for remaining clones using #[expect(clippy::redundant_clone)] where justified.
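
The contents of src/parser.rs are not shown in this report, so the following is only a minimal sketch of the kind of refactor Task 1 asks for. The names `Entry`, `parse_owned`, `parse_borrowed`, and `count_valid` are hypothetical. One caveat about the task text: the `#[expect(...)]` lint attribute was stabilized in Rust 1.81, so a project pinned to 1.70 would need `#[allow(...)]` plus a comment instead.

```rust
/// A parsed `key=value` pair that borrows from the input line.
/// Both fields are `&str`, so the struct is cheap to copy and can derive `Copy`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Entry<'a> {
    pub key: &'a str,
    pub value: &'a str,
}

/// Clone-heavy style (hypothetical "before"): takes ownership of the input,
/// clones it anyway, and allocates two more `String`s for the result.
pub fn parse_owned(line: String) -> Option<(String, String)> {
    let copy = line.clone(); // unnecessary: `line` could be used directly
    let (key, value) = copy.split_once('=')?;
    Some((key.trim().to_string(), value.trim().to_string()))
}

/// Borrowing style ("after"): accepts `&str` and returns slices of the input,
/// so parsing performs no heap allocation.
pub fn parse_borrowed(line: &str) -> Option<Entry<'_>> {
    let (key, value) = line.split_once('=')?;
    Some(Entry {
        key: key.trim(),
        value: value.trim(),
    })
}

/// Accepts a slice instead of an owned `Vec<String>`; the caller keeps ownership.
pub fn count_valid(lines: &[&str]) -> usize {
    lines.iter().copied().filter_map(parse_borrowed).count()
}
```

With these signatures a caller can pass string literals or slices of a larger buffer without cloning, for example `count_valid(&["a=1", "broken", "b=2"])`.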

Task 2: Implement Error Hierarchy with thiserror

Create a new library module with a proper error hierarchy using thiserror. Define at least three error variants with context, implement From conversions for underlying errors, and demonstrate proper error propagation using ? operator across function boundaries.

Files: Create src/errors.rs and src/validator.rs using the error types

Expected: Zero panics, all errors wrapped with context, Result<T, E> returns throughout, passes clippy error handling lints.
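
The error types the agent actually produced are not included in this report. As a rough illustration of the shape Task 2 describes, the sketch below writes out by hand, using only the standard library, the `Display`, `Error::source`, and `From` implementations that thiserror's derive would typically generate from `#[error("...")]` and `#[from]` attributes. `ValidationError`, `parse_port`, and `validate_config` are hypothetical names.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Errors produced while validating a port setting (hypothetical example).
#[derive(Debug)]
pub enum ValidationError {
    /// The named field was empty.
    Empty { field: &'static str },
    /// The value was not a valid integer; wraps the underlying parse error.
    Parse(ParseIntError),
    /// The value parsed but exceeded the permitted maximum.
    OutOfRange { field: &'static str, value: u32, max: u32 },
}

impl fmt::Display for ValidationError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::Empty { field } => write!(f, "field `{field}` is empty"),
            Self::Parse(_) => write!(f, "value is not a valid integer"),
            Self::OutOfRange { field, value, max } => {
                write!(f, "field `{field}` is {value}, maximum is {max}")
            }
        }
    }
}

impl std::error::Error for ValidationError {
    /// Exposes the wrapped error so callers can walk the error chain.
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            Self::Parse(err) => Some(err),
            _ => None,
        }
    }
}

impl From<ParseIntError> for ValidationError {
    fn from(err: ParseIntError) -> Self {
        Self::Parse(err)
    }
}

/// Parses a port number, returning an error instead of panicking.
pub fn parse_port(raw: &str) -> Result<u32, ValidationError> {
    let trimmed = raw.trim();
    if trimmed.is_empty() {
        return Err(ValidationError::Empty { field: "port" });
    }
    // `?` converts `ParseIntError` into `ValidationError` through `From`.
    let value: u32 = trimmed.parse()?;
    if value > 65_535 {
        return Err(ValidationError::OutOfRange { field: "port", value, max: 65_535 });
    }
    Ok(value)
}

/// Propagates the error across a second function boundary with `?`.
pub fn validate_config(raw_port: &str) -> Result<String, ValidationError> {
    let port = parse_port(raw_port)?;
    Ok(format!("listening on port {port}"))
}
```

Keeping the `From` conversion on a single-field variant mirrors how a `#[from]` variant is normally laid out, while the other two variants carry their own context fields.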

Task 3: Type State Pattern Implementation

Design and implement a connection state machine using the type state pattern. Must have at least three states (e.g., Disconnected, Connecting, Connected) where invalid operations are impossible at compile time. Include PhantomData usage and state transitions.

Files: Create src/connection.rs

Expected: Compile-time enforcement of state transitions, doc comments explaining design, example usage in module docs showing prevented invalid operations.
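
To make the Task 3 requirement concrete, here is a minimal type state sketch using the three example states named in the task. It is not the agent's src/connection.rs, which this report does not reproduce; the method names (`connect`, `finish_handshake`, `send`, `disconnect`) and the `address` field are assumptions chosen for illustration.

```rust
use std::marker::PhantomData;

/// Marker types: zero-sized and used only at the type level.
pub struct Disconnected;
pub struct Connecting;
pub struct Connected;

/// A connection whose current state is encoded in the type parameter.
pub struct Connection<State> {
    address: String,
    _state: PhantomData<State>,
}

impl<State> Connection<State> {
    /// Available in every state.
    pub fn address(&self) -> &str {
        &self.address
    }

    /// Private helper: consumes `self` and re-tags it with the next state.
    fn transition<Next>(self) -> Connection<Next> {
        Connection {
            address: self.address,
            _state: PhantomData,
        }
    }
}

impl Connection<Disconnected> {
    /// Every connection starts out disconnected.
    pub fn new(address: &str) -> Self {
        Connection {
            address: address.to_owned(),
            _state: PhantomData,
        }
    }

    pub fn connect(self) -> Connection<Connecting> {
        self.transition()
    }
}

impl Connection<Connecting> {
    pub fn finish_handshake(self) -> Connection<Connected> {
        self.transition()
    }
}

impl Connection<Connected> {
    /// Only a connected value has `send`; here it just reports the byte count.
    pub fn send(&self, payload: &str) -> usize {
        payload.len()
    }

    pub fn disconnect(self) -> Connection<Disconnected> {
        self.transition()
    }
}

// This line would be rejected by the compiler, because `send` is not
// defined for `Connection<Disconnected>`:
// Connection::<Disconnected>::new("127.0.0.1:4000").send("ping");
```

Because each transition takes `self` by value, the old state is moved and cannot be reused after the call, which is what rules out stale handles at compile time.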

Task 4: Performance Optimization with Benchmarking

Given a module with iterator chains that prematurely call .collect(), refactor to eliminate intermediate allocations. Add criterion benchmarks comparing before/after performance, and document findings. Must apply zero-cost abstraction principles.

Files: src/processor.rs (contains nested .collect() calls), create benches/processor_bench.rs

Expected: Benchmark results showing measurable improvement, iterator fusion applied, no unnecessary heap allocations, passes clippy::perf.
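
The refactor Task 4 describes can be illustrated with a small before/after pair. This is a sketch under assumptions: src/processor.rs is not shown in the report, and the function names and the even-squares computation are invented for the example. The criterion benchmark itself is omitted because criterion is an external crate.

```rust
/// Before (hypothetical): each stage collects into a temporary `Vec`,
/// so two heap allocations happen before the final sum.
pub fn sum_even_squares_collecting(values: &[u64]) -> u64 {
    let evens: Vec<u64> = values.iter().copied().filter(|v| v % 2 == 0).collect();
    let squares: Vec<u64> = evens.iter().map(|v| v * v).collect();
    squares.iter().sum()
}

/// After: one lazy iterator chain. Each element flows through `filter`,
/// `map`, and `sum` in a single pass with no intermediate allocation.
pub fn sum_even_squares_fused(values: &[u64]) -> u64 {
    values
        .iter()
        .copied()
        .filter(|v| v % 2 == 0)
        .map(|v| v * v)
        .sum()
}
```

Both functions return the same result for the same input, which makes them a natural pair to compare in a benchmark; whether the fused version is measurably faster depends on input size and should be confirmed by measurement in a release build.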

Task 5: Comprehensive Code Review

Review a provided Rust module against all Apollo best practices chapters. Identify violations in: borrowing patterns, error handling, linting issues, missing documentation, suboptimal generics usage, and performance anti-patterns. Provide actionable feedback with specific chapter references and code examples.

Files: src/review_target.rs (intentionally contains multiple violations)

Expected: Markdown report listing each violation by category, severity level, chapter reference, and suggested fix with code snippet. Must catch at least 8 distinct issue types.

Acceptance Checklist

All generated code compiles without warnings using cargo build --all-targets
cargo clippy --all-targets --all-features --locked -- -D warnings exits successfully
No unwrap(), expect(), or panic!() in production code paths (tests excluded)
All public functions and types have /// documentation comments
Error handling uses Result<T, E> with appropriate error types (thiserror or anyhow)
Borrowing preferred over cloning with documented justification for any remaining clones
Type state pattern correctly uses PhantomData and prevents invalid states at compile time
Benchmarks exist and demonstrate measurable performance improvements
Code review identifies at least 8 distinct Apollo best practice violations with chapter references
All TODO comments include issue references (e.g., // TODO(#42): ...)
Test names follow convention: fn operation_should_behavior_when_condition()
Iterator chains avoid unnecessary .collect() calls between operations

Required Output Schema

{
  "task_completion": {
    "task_1_refactor": {
      "files_modified": ["src/parser.rs"],
      "clones_eliminated": 0,
      "borrowing_conversions": 0,
      "clippy_perf_pass": false
    },
    "task_2_errors": {
      "files_created": ["src/errors.rs", "src/validator.rs"],
      "error_variants_defined": 0,
      "panic_count": 0,
      "result_propagation": false
    },
    "task_3_typestate": {
      "files_created": ["src/connection.rs"],
      "states_defined": 0,
      "compile_time_safety": false,
      "phantom_data_used": false
    },
    "task_4_performance": {
      "files_modified": ["src/processor.rs"],
      "files_created": ["benches/processor_bench.rs"],
      "intermediate_collections_removed": 0,
      "performance_improvement_percent": 0.0,
      "benchmark_exists": false
    },
    "task_5_review": {
      "files_reviewed": ["src/review_target.rs"],
      "violations_found": 0,
      "categories_covered": [],
      "chapter_references": 0,
      "report_generated": false
    }
  },
  "acceptance_criteria": {
    "compiles_without_warnings": false,
    "clippy_passes": false,
    "no_unwrap_in_prod": false,
    "all_public_documented": false,
    "proper_error_handling": false,
    "borrowing_optimized": false,
    "typestate_correct": false,
    "benchmarks_demonstrate_improvement": false,
    "review_comprehensive": false,
    "todos_have_issues": false,
    "test_naming_convention": false,
    "iterator_optimization": false
  },
  "clippy_report": {
    "warnings": 0,
    "errors": 0,
    "perf_issues": 0,
    "correctness_issues": 0
  },
  "documentation_coverage": {
    "public_items": 0,
    "documented_items": 0,
    "coverage_percent": 0.0
  }
}
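
The schema does not say how `coverage_percent` relates to the two counts beside it. One plausible reading, stated here purely as an assumption, is documented items divided by public items, times 100. The sketch below returns `None` when there are no public items rather than picking a value for that case.

```rust
/// Assumed derivation of `documentation_coverage.coverage_percent`:
/// documented / public * 100. Returns `None` when `public_items` is zero,
/// since the schema does not define that case.
pub fn coverage_percent(public_items: u32, documented_items: u32) -> Option<f64> {
    if public_items == 0 {
        return None;
    }
    Some(f64::from(documented_items) / f64::from(public_items) * 100.0)
}
```

For example, 6 documented items out of 8 public items would give 75.0 under this reading.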

Task execution overview

Refactor Clone-Heavy Code · completed

Start: 2026-02-15 00:19:42.506Z · End: 2026-02-15 00:20:16.917Z

Duration: 34s · Tokens: N/A

Implement Error Hierarchy with thiserror · completed

Start: 2026-02-15 00:20:16.917Z · End: 2026-02-15 00:21:12.818Z

Duration: 56s · Tokens: N/A

Type State Pattern Implementation · completed

Start: 2026-02-15 00:21:12.818Z · End: 2026-02-15 00:21:54.321Z

Duration: 42s · Tokens: N/A

Performance Optimization with Benchmarking · completed

Start: 2026-02-15 00:21:54.321Z · End: 2026-02-15 00:22:44.420Z

Duration: 50s · Tokens: N/A

Comprehensive Code Review · completed

Start: 2026-02-15 00:22:44.420Z · End: 2026-02-15 00:23:56.168Z

Duration: 1m 12s · Tokens: N/A
