update links under the root README file

medunes
2026-01-04 20:53:07 +01:00
parent 60a6e52449
commit 5b81ba8200
23 changed files with 53 additions and 10 deletions

@@ -0,0 +1,53 @@
# Kata 02: The Concurrent Map with Sharded Locks
**Target Idioms:** Concurrency Safety, Map Sharding, `sync.RWMutex`, Avoiding `sync.Map` Pitfalls
**Difficulty:** 🟡 Intermediate
## 🧠 The "Why"
Seasoned developers coming from Java might reach for `ConcurrentHashMap`-style solutions, while Pythonistas might think of GIL-protected dictionaries. In Go, you have three main options:
1. **Naive sync.Mutex around a map** (bottlenecks under high concurrency)
2. **sync.Map** (optimized for specific "append-only, read-heavy" cases, but opaque and often misused)
3. **Sharded maps** (manual control, maximized throughput)
The Go way is explicit control: if you know your access patterns, build a solution that fits. This kata forces you to understand *when* and *why* to choose sharding over sync.Map.
## 🎯 The Scenario
You're building a real-time **API Rate Limiter** that tracks request counts per user ID. The system handles 50k+ RPS with 95% reads (checking limits) and 5% writes (incrementing counters). A single mutex would serialize all operations, which is unacceptable. `sync.Map` might work, but it obscures memory usage and lacks type safety.
## 🛠 The Challenge
Implement `ShardedMap[K comparable, V any]` with configurable shard count that provides safe concurrent access.
### 1. Functional Requirements
* [ ] Type-safe generic implementation (Go 1.18+)
* [ ] `Get(key K) (V, bool)` - returns value and existence flag
* [ ] `Set(key K, value V)` - inserts or updates
* [ ] `Delete(key K)` - removes key
* [ ] `Keys() []K` - returns all keys (order doesn't matter)
* [ ] Configurable number of shards at construction
### 2. The "Idiomatic" Constraints (Pass/Fail Criteria)
* [ ] **NO `sync.Map`**: Implement sharding manually with `[]map[K]V` and `[]sync.RWMutex`
* [ ] **Smart Sharding**: Use FNV-64 hashing (`hash/fnv` or an inlined equivalent) for key distribution (don't rely on Go's randomized map iteration order)
* [ ] **Read Optimization**: Use `RLock()` for `Get()` operations when safe
* [ ] **Zero Allocation Hot-Path**: `Get()` and `Set()` must not allocate memory in the critical section (no string conversion, no boxing)
* [ ] **Clean `Keys()`**: Implement without data races, even while concurrent writes occur
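To make the target concrete, here is a minimal sketch of the sharding shape. It is not a full passing solution: keys are fixed to `string` so the FNV-1a hash can be inlined without a `[]byte` conversion (the generic-`K` version from the requirements needs a pluggable hash function), and `Delete`/`Keys` are left to you.

```go
package main

import (
	"fmt"
	"sync"
)

// fnv64a is an inlined FNV-1a hash so the hot path hashes a string key
// without the []byte conversion (and allocation) that hash/fnv would need.
func fnv64a(s string) uint64 {
	h := uint64(14695981039346656037) // FNV offset basis
	for i := 0; i < len(s); i++ {
		h ^= uint64(s[i])
		h *= 1099511628211 // FNV prime
	}
	return h
}

// ShardedMap splits keys across shards, each guarded by its own RWMutex,
// so operations on different shards never contend.
type ShardedMap[V any] struct {
	shards []shard[V]
}

type shard[V any] struct {
	mu sync.RWMutex
	m  map[string]V
}

func NewShardedMap[V any](n int) *ShardedMap[V] {
	sm := &ShardedMap[V]{shards: make([]shard[V], n)}
	for i := range sm.shards {
		sm.shards[i].m = make(map[string]V)
	}
	return sm
}

// shardFor picks a shard deterministically from the key's hash.
func (sm *ShardedMap[V]) shardFor(key string) *shard[V] {
	return &sm.shards[fnv64a(key)%uint64(len(sm.shards))]
}

func (sm *ShardedMap[V]) Get(key string) (V, bool) {
	s := sm.shardFor(key)
	s.mu.RLock() // readers of the same shard proceed in parallel
	defer s.mu.RUnlock()
	v, ok := s.m[key]
	return v, ok
}

func (sm *ShardedMap[V]) Set(key string, value V) {
	s := sm.shardFor(key)
	s.mu.Lock()
	s.m[key] = value
	s.mu.Unlock()
}

func main() {
	m := NewShardedMap[int](64)
	m.Set("user-42", 3)
	v, ok := m.Get("user-42")
	fmt.Println(v, ok) // 3 true
}
```

Note the design choice: locking happens per shard, never globally, which is exactly what the contention test below is meant to expose.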
## 🧪 Self-Correction (Test Yourself)
1. **The Contention Test**:
- Run 8 goroutines doing only `Set()` operations with sequential keys
- With 1 shard: Should see heavy contention (use `go test -bench=. -cpuprofile=cpu.out` to verify)
- With 64 shards: Should see near-linear scaling
2. **The Memory Test**:
- Store 1 million `int` keys with `interface{}` values
- **Fail Condition**: If your solution uses more than 50MB extra memory vs baseline map
- **Hint**: Avoid `string(key)` conversions; use type-safe hashing
3. **The Race Test**:
- Run `go test -race` with concurrent read/write/delete operations
- Any race condition = automatic failure
## 📚 Resources
* [How the Go Runtime Implements Maps Efficiently (Without Generics)](https://dave.cheney.net/2018/05/29/how-the-go-runtime-implements-maps-efficiently-without-generics)
* [Should I Use sync.Map?](https://dave.cheney.net/2017/07/30/should-i-use-sync-map)
* [Practical Sharded Maps](https://github.com/orcaman/concurrent-map)

@@ -0,0 +1,58 @@
# Kata 04: The Zero-Allocation JSON Parser
**Target Idioms:** Performance Optimization, `json.RawMessage`, Streaming Parsers, Buffer Reuse
**Difficulty:** 🟡 Intermediate
## 🧠 The "Why"
Developers from dynamic languages often parse JSON by unmarshaling entire documents into `map[string]interface{}` or generic structs. In high-throughput Go services, this creates:
1. Massive memory churn (GC pressure)
2. Unnecessary allocations for unused fields
3. Lost type safety
The Go way: **Parse only what you need, reuse everything**. This kata teaches you to treat JSON as a stream, not a document.
## 🎯 The Scenario
You're processing **10MB/s of IoT sensor data** with JSON like:
```json
{"sensor_id": "temp-1", "timestamp": 1234567890, "readings": [22.1, 22.3, 22.0], "metadata": {...}}
```
You only need `sensor_id` and the first reading value. Traditional unmarshal would allocate for all fields and the entire readings array.
## 🛠 The Challenge
Implement `SensorParser` that extracts specific fields without full unmarshaling.
### 1. Functional Requirements
* [ ] Parse `sensor_id` (string) and first `readings` value (float64) from JSON stream
* [ ] Process `io.Reader` input (could be HTTP body, file, or network stream)
* [ ] Handle malformed JSON gracefully (skip bad records, continue parsing)
* [ ] Benchmark at under 100 ns/op and 0 allocs/op per parsed object
### 2. The "Idiomatic" Constraints (Pass/Fail Criteria)
* [ ] **NO `encoding/json.Unmarshal`**: Use `json.Decoder` with `Token()` streaming
* [ ] **Reuse Buffers**: Use `sync.Pool` for `bytes.Buffer` or `json.Decoder`
* [ ] **Early Exit**: Stop parsing once required fields are found
* [ ] **Type Safety**: Return concrete struct `SensorData{sensorID string, value float64}`, not `interface{}`
* [ ] **Memory Limit**: Process arbitrarily large streams in constant memory (<1MB heap)
## 🧪 Self-Correction (Test Yourself)
1. **The Allocation Test**:
```go
go test -bench=. -benchmem -count=5
```
**Pass**: `allocs/op` = 0 for parsing loop
**Fail**: Any allocations in hot path
2. **The Stream Test**:
- Pipe 1GB of JSON through your parser (mock with repeating data)
- **Pass**: Memory usage flatlines after warm-up
- **Fail**: Memory grows linearly with input size
3. **The Corruption Test**:
- Input: `{"sensor_id": "a"} {"bad json here` (malformed second object)
- **Pass**: Returns first object, logs/skips second, doesn't panic
- **Fail**: Parser crashes or stops processing entirely
## 📚 Resources
* [Go JSON Stream Parsing](https://ahmet.im/blog/golang-json-stream-parse/)
* [json.RawMessage Tutorial](https://www.sohamkamani.com/golang/json/#raw-messages)
* [Advanced JSON Techniques](https://eli.thegreenplace.net/2019/go-json-cookbook/)

@@ -0,0 +1,36 @@
# Kata 11: The NDJSON Reader That Survives Long Lines
**Target Idioms:** Streaming I/O (`io.Reader`), `bufio.Reader` vs `Scanner`, Handling `ErrBufferFull`, Low Allocation
**Difficulty:** 🟡 Intermediate
## 🧠 The "Why"
Seasoned devs reach for `bufio.Scanner` and it “works”… until production sends a line > 64K and you get:
`bufio.Scanner: token too long`.
This kata forces you to implement a streaming reader that can handle **arbitrarily large lines** without falling over.
## 🎯 The Scenario
You ingest NDJSON logs from stdin or a file. Lines can be huge (hundreds of KB). You must process line-by-line.
## 🛠 The Challenge
Implement:
- `func ReadNDJSON(ctx context.Context, r io.Reader, handle func([]byte) error) error`
### 1. Functional Requirements
- [ ] Call `handle(line)` for each line (without the trailing newline).
- [ ] Stop immediately on `handle` error.
- [ ] Stop immediately on `ctx.Done()`.
### 2. The "Idiomatic" Constraints (Pass/Fail Criteria)
- [ ] **Must NOT** rely on default `bufio.Scanner` behavior.
- [ ] **Must** use `bufio.Reader` and correctly handle `ReadSlice('\n')` returning `ErrBufferFull`.
- [ ] **Must** avoid per-line allocations where possible (reuse buffers).
- [ ] Wrap errors with line number context using `%w`.
## 🧪 Self-Correction (Test Yourself)
- **If a 200KB line crashes with “token too long”:** you failed.
- **If cancellation doesn't stop promptly:** you failed.
- **If you allocate a new buffer each line:** you failed the low-allocation goal.
## 📚 Resources
- https://pkg.go.dev/bufio
- https://pkg.go.dev/io

@@ -0,0 +1,42 @@
# Kata 12: The sync.Pool Buffer Middleware
**Target Idioms:** `sync.Pool`, Avoiding GC Pressure, `bytes.Buffer` Reset, Benchmarks (`-benchmem`)
**Difficulty:** 🔴 Advanced
## 🧠 The "Why"
In Go, performance regressions often come from allocation/GC churn, not “slow CPU”.
People use `sync.Pool` incorrectly:
- pooling long-lived objects (wrong),
- forgetting to reset buffers (data leak),
- storing huge buffers back into the pool (memory bloat).
This kata is about **safe pooling** for high-throughput handlers.
## 🎯 The Scenario
You're writing an HTTP middleware that:
- reads up to 16KB of request body for audit logging
- must not allocate per-request in the hot path
## 🛠 The Challenge
Implement a middleware:
- `func AuditBody(max int, next http.Handler) http.Handler`
### 1. Functional Requirements
- [ ] Read up to `max` bytes of request body (do not consume beyond `max`).
- [ ] Log the captured bytes with `slog` fields.
- [ ] Pass the request downstream intact (body still readable).
### 2. The "Idiomatic" Constraints (Pass/Fail Criteria)
- [ ] **Must** use `sync.Pool` to reuse buffers.
- [ ] **Must** `Reset()`/clear buffers before putting back.
- [ ] **Must** bound memory: never keep buffers larger than `max` in the pool.
- [ ] Provide a benchmark showing reduced allocations (`go test -bench . -benchmem`).
## 🧪 Self-Correction (Test Yourself)
- **If a request leaks previous request content:** you failed (no reset).
- **If allocations are ~O(requests):** you failed pooling.
- **If buffers grow unbounded and stay in pool:** you failed memory bounds.
## 📚 Resources
- https://pkg.go.dev/sync
- https://go.dev/doc/gc-guide
- https://go.dev/blog/pprof