# Hostr App — Visual & Logic Audit

Full audit of the Hostr Flutter app, hostr_sdk, and models layer. Date: 2026-02-23. No code changes — findings and execution plan only.

# Table of Contents

Part A — Visual
- V1. Spacing & Padding
- V2. Typography & Font Sizes
- V3. Buttons — Types, Roles, Icons
- V4. Icon Sizes
- V5. Animations
- V6. Modals & Bottom Sheets
- V7. Image Loading & Placeholders
- V8. Loading Indicators
- V9. Translations / l10n
Part B — Logic
- L1. Error Handling
- L2. Stream & Listener Lifecycle
- L3. Caching, Batching & Use Cases
- L4. Load & Performance Hotspots
- L5. Nostr Protocol Future-Proofing
- L6. Test Infrastructure & Automation
Execution Plan

# Part A — Visual

# V1. Spacing & Padding

# Current State

A kDefaultPadding = 32 constant exists in app/lib/config/constants.dart, and a CustomPadding widget wraps Padding with multipliers of kDefaultPadding. This is a good foundation, but ~75% of spacing in the app bypasses it.

SizedBox hardcoded values found across presentation files:

| Value (px) | Approx. uses | Equivalent `kDefaultPadding` fraction |
|---|---|---|
| 4 | ~12 | 1/8 |
| 6 | ~5 | (non-standard — 3/16) |
| 8 | ~20 | 1/4 |
| 10 | ~3 | (non-standard — 5/16) |
| 12 | ~8 | 3/8 |
| 15 | 1 | (non-standard) |
| 16 | ~18 | 1/2 |
| 24 | ~9 | 3/4 |
| 32 | ~1 | 1× |
| 40 | ~1 | 5/4 |

That's 10 distinct spacing values, of which 3 (6, 10, 15) don't align to any clean fraction of the base grid.

Raw EdgeInsets with hardcoded values appear ~65 times across the presentation layer, duplicating values that CustomPadding already provides (e.g. EdgeInsets.symmetric(horizontal: 32) literally equals kDefaultPadding).

# Recommendation

Adopt a 4px base grid spacing scale (industry standard, used by Material 3):

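```
// Declared as double so they can be assigned to double fields such as Gap.size.
const double kSpace0 = 0;
const double kSpace1 = 4;   // kDefaultPadding / 8
const double kSpace2 = 8;   // kDefaultPadding / 4
const double kSpace3 = 12;  // kDefaultPadding * 3/8
const double kSpace4 = 16;  // kDefaultPadding / 2
const double kSpace5 = 24;  // kDefaultPadding * 3/4
const double kSpace6 = 32;  // kDefaultPadding
const double kSpace7 = 48;  // kDefaultPadding * 1.5
const double kSpace8 = 64;  // kDefaultPadding * 2
```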

Create a Gap widget (avoid the name Spacer, which Flutter already uses for a flex-based widget with different semantics):

```
import 'package:flutter/widgets.dart';

class Gap extends StatelessWidget {
  final double size;
  const Gap(this.size, {super.key});
  const Gap.xs({super.key}) : size = kSpace1;   //  4
  const Gap.sm({super.key}) : size = kSpace2;   //  8
  const Gap.md({super.key}) : size = kSpace4;   // 16
  const Gap.lg({super.key}) : size = kSpace6;   // 32
  const Gap.xl({super.key}) : size = kSpace7;   // 48

  // Square box: works as a vertical gap in a Column and a horizontal gap in a Row.
  @override
  Widget build(BuildContext context) => SizedBox(width: size, height: size);
}
```

Then replace all SizedBox(height: 16) with Gap.md(), etc. This eliminates magic numbers and makes spacing auditable via search.

# Files to Change (top offenders)

presentation/component/widgets/reservation/trade_header.dart — 8+ SizedBoxes
presentation/component/widgets/flow/payment/payment.dart — 6+ SizedBoxes
presentation/component/widgets/inbox/thread/thread_header.dart — 5+ SizedBoxes
presentation/component/widgets/listing/listing_list_item.dart — 4 SizedBoxes
presentation/screens/shared/listing/listing_view.dart — mixed SizedBox + EdgeInsets
presentation/screens/shared/profile/ — multiple files

# V2. Typography & Font Sizes

# Current State

The app uses Flutter's textTheme tokens in most places (good), but 7 distinct hardcoded fontSize values leak through:

| Hardcoded size | Files | Should be |
|---|---|---|
| 11 | `trade_header.dart` | `labelSmall` (11) |
| 12 | `trade_header.dart`, `listing_list_item.dart`, `price_tag.dart`, `inbox_item.dart` | `bodySmall` (12) |
| 14 | `trade_header.dart` | `bodyMedium` (14) |
| 16 | `trade_header.dart` | `bodyLarge` (16) or `titleMedium` (16) |
| 20 | `price.dart` | `titleLarge` (22) |
| 24 | `review_list_item.dart` (star icon context) | `headlineSmall` (24) |
| 28 | `price_marker.dart` | `headlineMedium` (28) |

# Best Practice — Type Scale

Material 3 defines exactly 15 text styles in 5 roles × 3 sizes. For a mobile accommodation app, you realistically need 5–7 distinct sizes to minimize cognitive load:

| Role | Token | Typical size | Use in Hostr |
|---|---|---|---|
| Display | `displayMedium` | 45 | Splash screen, hero numbers |
| Headline | `headlineSmall` | 24 | Section headers on detail pages |
| Title | `titleLarge` | 22 | Screen/section titles, form labels |
| Title | `titleMedium` | 16 | Card titles, list item primary text |
| Body | `bodyMedium` | 14 | Descriptions, message text |
| Body | `bodySmall` | 12 | Captions, timestamps, secondary info |
| Label | `labelSmall` | 11 | Badges, chips, minimal annotations |

Rule: Never use a raw fontSize: in widget code. Always use Theme.of(context).textTheme.bodySmall (with optional .copyWith(fontWeight: ...) for emphasis). This keeps the scale consistent and lets theme changes propagate everywhere.

# Files to Change

- presentation/component/widgets/reservation/trade_header.dart — worst offender, 4 distinct hardcoded sizes (11, 12, 14, 16)
- presentation/component/widgets/listing/price_tag.dart — hardcoded 12
- presentation/component/widgets/listing/price.dart — hardcoded 20
- presentation/component/widgets/search/price_marker.dart — hardcoded 28
- presentation/component/widgets/listing/listing_list_item.dart — hardcoded 12

# V3. Buttons — Types, Roles, Icons

# Current State

Five button types are used across the app:

| Type | Count | Primary use |
|---|---|---|
| `FilledButton` / `.tonal` / `.icon` | ~32 | CTAs, confirmations |
| `ElevatedButton` | ~7 | Swap flows, some CTAs |
| `TextButton` | ~8 | Cancel, clear, secondary actions |
| `OutlinedButton` | ~2 | Tertiary/toggles |
| `IconButton` | ~10 | Navigation, copy, close |

Problem: ElevatedButton and FilledButton are used interchangeably for primary CTAs. The swap flow screens (swap_in.dart, swap_out.dart) and dev.dart use ElevatedButton, while everything else uses FilledButton. This creates a visual inconsistency — ElevatedButton has elevation/shadow, FilledButton is flat.

# Best Practice — Button Hierarchy

| Role | Widget | When to use | Icon? |
|---|---|---|---|
| Primary CTA | `FilledButton` | One per screen max. The main action (Pay, Reserve, Submit) | Only if icon adds clarity (e.g. send ✈, not generic ✓) |
| Secondary | `FilledButton.tonal` | Supporting actions (Use Escrow, Edit) | Optional |
| Tertiary | `TextButton` | Cancel, Clear, Skip — low-commitment actions | Rarely |
| Destructive | `FilledButton` + red `backgroundColor` | Delete, Refund, Block | Icon for emphasis (⚠) |
| Icon-only | `IconButton` | Toolbar actions, close, copy, navigation | Always |

When to use icons on buttons:

✅ When the action has a universally recognized symbol (copy 📋, send ✈, close ✕)
✅ When used alongside other icon-only buttons in a row (toolbar)
❌ When the button already has clear text ("Pay" doesn't need a 💰 icon)
❌ When the icon is decorative rather than communicative

# Action Items

Replace all ElevatedButton with FilledButton across swap_in.dart, swap_out.dart, dev.dart
Define button presets in theme (or an AppButton wrapper) so primary/secondary/destructive styling is centralized
Audit icon usage on FilledButton.icon — ensure icons are communicative, not decorative

# V4. Icon Sizes

# Current State

10 distinct icon sizes are hardcoded across the app:

| Size | Where | Role |
|---|---|---|
| 12 | Copy icon, comment | Detail actions |
| 14 | Key icon, chips | Inline indicators |
| 16 | 6+ files | Default small icons |
| 18 | Amount input | Field icons |
| 20 | Multiple | Standard interactive |
| 30 | Search nav | Navigation |
| 32 | Detail view | Section icons |
| 40 | CircleAvatar fallback | Profile |
| 48 | Error icons | Status |
| 80 | Verified badge detail | Hero icon |

The same icon (Icons.copy) appears at sizes 12, 16, and 18 in different files.

# Recommendation

Define an icon size scale mirroring the spacing scale:

```
const kIconXs  = 14.0;  // Chips, inline labels
const kIconSm  = 16.0;  // List item trailing, copy actions
const kIconMd  = 20.0;  // Standard interactive icons
const kIconLg  = 24.0;  // Navigation bar, section headers (Material default)
const kIconXl  = 32.0;  // Empty states, feature icons
const kIconHero = 48.0; // Error/success status, onboarding
```

Standardize: all copy icons → kIconSm, all nav icons → kIconLg, etc.

# V5. Animations

# Current State

Animation constants are well-defined in config/constants.dart:

```
const kAnimationDuration = Duration(milliseconds: 300);
const kAnimationCurve = Curves.easeInOut;
const kStaggerDelay = Duration(milliseconds: 60);
```

The AnimatedListItem widget correctly defaults to these. Most AnimatedSwitcher usages reference kAnimationDuration.

Deviations:

| File | Duration | Curve | Issue |
|---|---|---|---|
| `listing_carousel.dart` | `300ms` (hardcoded) | `Curves.easeInOut` (hardcoded) | Should reference constants |
| `money_in_flight.dart` | `400ms` | `Curves.easeInOut` | Non-standard duration |
| `trade_timeline.dart` | `200ms` | `Curves.easeOut` | Different curve |
| `trade_header.dart` (shimmer) | `1500ms` | — | Correct for shimmer |
| `search_box.dart` | `1000ms` | — | Debounce, not animation (OK) |

# Recommendation

Replace hardcoded Duration(milliseconds: 300) in listing_carousel.dart with kAnimationDuration
Decide: is 400ms intentional for money_in_flight.dart? If not, use kAnimationDuration. If yes, define kAnimationDurationSlow = Duration(milliseconds: 400)
Consider adding kAnimationDurationFast = Duration(milliseconds: 150) for micro-interactions (button press feedback, chip toggles)
Standardize on one curve family. easeInOut is correct for most transitions. easeOut is appropriate for elements entering the screen (quick start, gentle stop)

# Preloading & Perceived Performance

Can filter screens be preloaded? Yes — create the filter bottom sheet widget eagerly in the parent and show/hide it rather than constructing on tap. The SearchFilterCubit state should already be warm. In practice, if the bottom sheet construction is < 16ms (one frame), preloading isn't necessary. Profile first with DevTools timeline.

Preloading images / placeholders:

Currently BlossomImage shows CircularProgressIndicator while loading and Flutter's Placeholder() (a colored cross) on error — both are jarring
Add FadeInImage-style crossfade from a shimmer/skeleton placeholder to the loaded image
Consider adding precacheImage() calls for above-the-fold listing images when the list screen initializes
Implement CachedNetworkImage (or equivalent) to avoid re-downloading on every screen revisit

# V6. Modals & Bottom Sheets

# Current State

15 showModalBottomSheet callsites exist. A ModalBottomSheet wrapper widget provides consistent internal layout. But:

| Issue | Affected files |
|---|---|
| `isScrollControlled` inconsistently set | 7 of 15 don't set it |
| `useSafeArea` only set in 1 of 15 callsites | All except `listing_view.dart` |
| Several callsites bypass `ModalBottomSheet` and build custom layouts | `listing_view.dart`, `trade_header.dart`, `search_box.dart` |
| No shared `showAppModalBottomSheet()` helper | Each callsite configures independently |

# Recommendation

Create a single entry point:

```
import 'package:flutter/material.dart';

// Single entry point so every bottom sheet gets the same defaults.
Future<T?> showAppModal<T>(BuildContext context, {
  required Widget child,
  bool isScrollControlled = true,
  bool useSafeArea = true,
  bool isDismissible = true,
}) => showModalBottomSheet<T>(
  context: context,
  isScrollControlled: isScrollControlled,
  useSafeArea: useSafeArea,
  isDismissible: isDismissible,
  builder: (_) => child,
);
```

Then replace all 15 callsites. This ensures consistent isScrollControlled and useSafeArea defaults.

# V7. Image Loading & Placeholders

# Current State

BlossomImage is the standard image widget — resolves SHA-256 hashes via Blossom server, falls back to Image.network
No disk caching — no CachedNetworkImage or equivalent anywhere in the codebase
Error state shows Flutter's Placeholder() widget (a colored diagonal cross) — not production-ready
Loading state shows a raw CircularProgressIndicator
Some files bypass BlossomImage and use raw Image.network (relay favicons, badge images)

# Recommendation

Add cached_network_image package — provides disk + memory caching, placeholder builders, and error builders out of the box
Replace Placeholder() with a branded error widget — e.g. a subtle grey rectangle with a broken-image icon
Replace loading CircularProgressIndicator with a shimmer skeleton matching the image's aspect ratio. This prevents layout shift when images load.
Wrap BlossomImage to use caching internally — so every BlossomImage benefits without changing callsites
Precache hero images — call precacheImage() for the first N listing images visible on the home/search screen

# V8. Loading Indicators

# Current State

27 CircularProgressIndicator instances across the app use 3 distinct strokeWidth values: 4 (both as the implicit default and set explicitly), 2, and 1.5. Additionally:

Some use .adaptive(), others don't
A custom AsymptoticProgressBar exists (nice!) but is used in only one place
A private _ShimmerSurface in trade_header.dart is not reusable

# Recommendation

Create a shared AppLoadingIndicator widget with size presets:
- .small() — strokeWidth: 2, 16x16, for inline/list contexts
- .medium() — strokeWidth: 3, 24x24, default
- .large() — strokeWidth: 4, 48x48, for full-page loading
Extract _ShimmerSurface into a reusable ShimmerPlaceholder widget
Create ShimmerListItem, ShimmerCard skeleton widgets for list/card loading states (prevents layout shift)
Use CircularProgressIndicator.adaptive() everywhere for platform-native feel on iOS

# V9. Translations / l10n

# Current State

~51 strings use AppLocalizations.of(context)! (translated)
~70 strings are hardcoded English Text('...') literals (not translated)
Only English ARB file exists (app_en.arb with ~68 keys)
No pluralization rules, no parameterized messages beyond simple string interpolation

Hardcoded string hotspots:

dev.dart (debug screen — acceptable)
payment.dart, payment_method.dart — "Pay directly", "Use Escrow", "Copy", "Open wallet"
swap_in.dart, swap_out.dart — "Confirm", "Continue"
listing_view.dart — "Blocked Dates", "Block Dates", "Retry"
background_tasks.dart — all debug strings (acceptable)
edit_review.dart — "Save"
Various error messages — "Error:", "Unknown message type", "No wallet connected"

# Recommendation

Immediate: Extract all user-facing hardcoded strings to app_en.arb. Debug-only strings (dev.dart, background_tasks.dart) can stay hardcoded
Naming convention: Use camelCase keys matching the semantic role: payDirectly, useEscrow, blockedDates, retryButton, noWalletConnected
Error messages: Create parameterized ARB entries: "errorGeneric": "Something went wrong: {details}" with @errorGeneric metadata for placeholders
Plurals: Add plural rules for counts: "reviewCount": "{count, plural, =0{No reviews} =1{1 review} other{{count} reviews}}"
When ready for multi-language: add app_es.arb, app_fr.arb, etc. The Flutter l10n tooling will generate all delegates automatically

# Part B — Logic

# L1. Error Handling

# Current State — 5+ Inconsistent Patterns

| Pattern | Cubits | Severity |
|---|---|---|
| `EntityCubitStateError(dynamic error)` | `EntityCubit`, `ProfileCubit` | ⚠️ `dynamic` type |
| Status enum + `String? error` field | `ReservationCubit`, `ThreadReplyCubit` | OK |
| Sealed class with error subclass | `OnboardingCubit`, `AvailabilityCubit` | ✅ Best |
| No error state at all | `ListCubit`, `CountCubit` | ⛔ Critical |
| SDK operations with typed failure + rethrow | `PayOperation`, `SwapInOperation`, `EscrowFundOperation` | ⚠️ Double-report |

# Critical Issues

# 1. `ListCubit` has NO error handling

The next() method has a try/finally with no catch, so exceptions propagate unhandled and no error state is ever emitted. The sync() subscription listener has no onError. This is the core data-fetching cubit, so any relay failure leaves the UI without a retryable error state.

# 2. `CountCubit.count()` has no `try/catch`

CountCubitStateError is defined but never emitted — dead code. Exceptions from nostrService.requests.count() propagate unhandled.

# 3. `PayOperation` double-reports errors

Each stage (resolve, finalize, complete) emits PayFailed AND rethrows the exception. If the caller also catches, the error surfaces twice. Additionally, complete() closes the cubit in a finally block — so PayFailed is emitted, then the cubit immediately closes, potentially causing a race condition in BlocBuilder.

# 4. Swap failures discard error details

The UI renders SwapInFailed / SwapOutFailed as hardcoded "Swap failed." strings, ignoring the error field that contains actionable information (e.g. "insufficient inbound liquidity", "invoice expired").

# 5. No global error boundary

runZonedGuarded only calls debugPrint. No crash reporting (Sentry, Crashlytics). No FlutterError.onError. No BlocObserver for cubit error monitoring.

# 6. Raw error strings shown to users

PayFailed and auth errors show e.toString() directly in the UI, exposing exception class names, internal messages, or cryptic relay errors to users.

# Recommendations

Standardize on sealed error states. Every cubit should use:

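```
sealed class MyState { const MyState(); }
class MyError extends MyState {
  const MyError(this.userMessage, {this.cause});
  final String userMessage; final Object? cause; // cause: original exception, for logs only
}
```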

Add try/catch to ListCubit.next() — emit an error state, enable retry
Wire up CountCubitStateError — emit it in the catch block
Remove rethrow from PayOperation — emit PayFailed only, don't rethrow. Let UI handle via BlocListener
Don't close the cubit in PayOperation.complete() on failure — let the UI decide when to dismiss
Map errors to user-friendly messages — create an ErrorMapper that converts known exceptions to localized strings. Unknown errors → generic "Something went wrong. Please try again."
Add global BlocObserver for logging all cubit transitions and errors
Add Sentry/Crashlytics in runZonedGuarded and FlutterError.onError
Use BlocListener for transient error toasts — complement BlocBuilder error rendering with snackbar notifications for errors the user should know about but that don't replace the screen
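
The `ErrorMapper` recommendation above could be sketched as follows. This is a minimal illustration, not the app's implementation: `InsufficientLiquidityException`, `InvoiceExpiredException`, `MappedError`, and `mapError` are hypothetical names, and the English strings stand in for what would be `AppLocalizations` lookups in the real app.

```dart
import 'dart:async';

/// Hypothetical exception types, standing in for whatever the SDK throws.
class InsufficientLiquidityException implements Exception {}

class InvoiceExpiredException implements Exception {}

/// A message that is safe to show, plus the original cause for logging.
class MappedError {
  const MappedError(this.userMessage, this.cause);
  final String userMessage;
  final Object cause;
}

/// Converts known exceptions to user-facing text. Anything unrecognized gets
/// the generic fallback, so raw `toString()` output never reaches the UI.
MappedError mapError(Object error) {
  final message = switch (error) {
    InsufficientLiquidityException() =>
      'Not enough inbound liquidity to receive this payment.',
    InvoiceExpiredException() =>
      'This invoice has expired. Please request a new one.',
    TimeoutException() => 'The request timed out. Please try again.',
    _ => 'Something went wrong. Please try again.',
  };
  return MappedError(message, error);
}
```

A cubit would call `mapError(e)` in its catch block, emit `userMessage` in its error state, and pass `cause` to the logger or crash reporter.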

# L2. Stream & Listener Lifecycle

# Current State — ✅ Generally Well-Managed

All cubits with subscriptions properly override close() and cancel subscriptions:

ThreadCubit — cancels all subs, closes participant cubits, deactivates trade
ListCubit — cancels 5 subscriptions, closes itemStream and nostr response
NwcConnectivityCubit — cancels connections subscription + per-cubit map
OnboardingCubit — cancels threads subscription, has reset()

All widgets with subscriptions cancel in dispose():

ListingListItemWidget — cancels reservation subscription, closes stream and cubits
SearchMapWidget — cancels list subscription
EscrowFundWidget — cancels selector subscription, closes operation and cubit

# Best Practice: `_subscriptions` list vs `takeUntil` vs individual fields

| Approach | When to use |
|---|---|
| Individual fields (`_sub1`, `_sub2`) | When you have 1–3 named subscriptions with distinct lifecycle |
| `List<StreamSubscription> _subscriptions` | When subscriptions are dynamic or numerous (e.g. `ThreadCubit`) |
| `takeUntil(dispose$)` with RxDart | When using RxDart extensively and want declarative cleanup. Requires a `PublishSubject<void> _dispose$` that emits in `close()` |
| `CompositeSubscription` (RxDart) | Alternative to list — `composite.add(stream.listen(...))`, then `composite.dispose()` |

The current codebase uses approaches 1 and 2, which is fine. No change required unless you adopt RxDart more heavily.

# One Risk Found

ProfileCubit doesn't override close() and holds no subscriptions — but it's created dynamically by ThreadCubit which is responsible for closing it. This delegation pattern is correct but fragile: if any other code creates a ProfileCubit without closing it, it will leak. Consider documenting this ownership convention.

# L3. Caching, Batching & Use Cases

# CRUD UseCase Architecture

CrudUseCase<T> is the backbone. Key behaviors:

| Feature | Status | Notes |
|---|---|---|
| In-flight query dedup | ✅ | `_inFlightQueries` keyed by serialized filter — identical concurrent queries share one stream |
| `getOne` batching | ✅ | 500ms debounce window, combines filters, matches results back to callers |
| `findByTag` batching | ✅ | 500ms debounce, merges tag values into one relay query |
| NDK cache (subscribe) | ✅ | Initial query uses `cacheRead: true`, live uses `cacheWrite: true` |
| NDK cache (one-shot query) | ❌ | `cacheRead: false` — every `getOne`/`list` hits the relay |
| Application-level cache | ❌ | No in-memory or disk cache for domain objects |
| Retry on failure | ❌ | No retry logic anywhere in the SDK |
| `count()` efficiency | ❌ | Fetches ALL events and calls `.length` — no relay-side COUNT support |

# Caching Strategy Recommendations

Enable cacheRead: true for query() — the NDK's MemCacheManager already supports this; flipping the flag would give free in-memory caching for repeated queries (e.g. viewing the same listing twice)
Add a TTL-based invalidation — cached items should expire after N minutes. Stale-while-revalidate pattern: return cached immediately, refetch in background, update if changed
Profile-level caching — user profiles (kind: 0) are fetched repeatedly; add a ProfileCache keyed by pubkey with a 5-minute TTL
Listing image caching — adopt cached_network_image for disk-level image caching (see V7)
Relay-side COUNT — NIP-45 defines COUNT messages. If your relays support it, implement a proper count() that doesn't download all events. The NDK may already support this — check Ndk.requests.count()
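
A TTL cache with stale-while-revalidate behavior might look like the sketch below. `TtlCache` and its injectable clock are assumptions made for illustration; they are not part of NDK or hostr_sdk.

```dart
import 'dart:async';

/// A cached value together with the time it was stored.
class _Entry<V> {
  _Entry(this.value, this.storedAt);
  final V value;
  final DateTime storedAt;
}

/// Minimal in-memory TTL cache with stale-while-revalidate semantics.
class TtlCache<K, V> {
  TtlCache({required this.ttl, DateTime Function()? now})
      : _now = now ?? DateTime.now;

  final Duration ttl;
  final DateTime Function() _now; // injectable clock, useful in tests
  final Map<K, _Entry<V>> _entries = {};

  /// Returns the cached value if one exists, even when it is stale.
  /// A stale hit starts a background refetch; a miss awaits [fetch].
  Future<V> get(K key, Future<V> Function() fetch) async {
    final entry = _entries[key];
    if (entry == null) {
      return _refresh(key, fetch);
    }
    final isStale = _now().difference(entry.storedAt) > ttl;
    if (isStale) {
      // Fire and forget: keep serving the old value if the refetch fails.
      unawaited(_refresh(key, fetch).catchError((Object _) => entry.value));
    }
    return entry.value;
  }

  Future<V> _refresh(K key, Future<V> Function() fetch) async {
    final value = await fetch();
    _entries[key] = _Entry(value, _now());
    return value;
  }
}
```

The profile cache suggested above would then be one `TtlCache` instance keyed by pubkey with `ttl: Duration(minutes: 5)`, with the relay query passed in as `fetch`.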

# Batching Tuning

The 500ms debounce window is a trade-off:

Pro: Maximizes batching — more calls coalesce into fewer relay queries
Con: Adds 500ms latency to every first request in a batch window

Consider adaptive debounce: start at 50ms, extend to 500ms only when under high load (>5 pending requests). For UI-triggered fetches (user taps a listing), 50ms is imperceptible; for background syncs, 500ms is fine.
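
One way to express the adaptive window is sketched below. `AdaptiveBatcher` is a hypothetical class, not the existing `CrudUseCase` batching code; the 50ms/500ms windows and the threshold of 5 come from the paragraph above.

```dart
import 'dart:async';

/// Collects keys and flushes them as one batch after a quiet period whose
/// length depends on how many requests are currently pending.
class AdaptiveBatcher<K> {
  AdaptiveBatcher(
    this.onFlush, {
    this.highLoadThreshold = 5,
    this.fast = const Duration(milliseconds: 50),
    this.slow = const Duration(milliseconds: 500),
  });

  final void Function(List<K> keys) onFlush;
  final int highLoadThreshold;
  final Duration fast;
  final Duration slow;

  final List<K> _pending = [];
  Timer? _timer;

  /// Short window for light load, long window once the queue is deep.
  Duration get currentWindow =>
      _pending.length > highLoadThreshold ? slow : fast;

  void add(K key) {
    _pending.add(key);
    // Debounce: every new key restarts the timer with the current window.
    _timer?.cancel();
    _timer = Timer(currentWindow, _flush);
  }

  void _flush() {
    final batch = List<K>.of(_pending);
    _pending.clear();
    onFlush(batch);
  }
}
```

Because the timer restarts on every `add`, a steady stream of requests can postpone the flush indefinitely; a production version would probably also cap the total wait.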

# L4. Load & Performance Hotspots

# 🔴 Critical

# 1. Subscription Explosion per Active Trade

Each ThreadTrade opens 3–5 concurrent Nostr subscriptions (all reservations, filtered reservations, reviews, zaps, escrow events). A user with 10 active trades therefore holds 30–50 concurrent relay subscriptions. Many relays cap concurrent subscriptions per connection (commonly around 10–20) and will start rejecting new subscriptions or closing older ones beyond that limit.

Fix: Multiplex trade subscriptions. Instead of per-trade subscriptions, open ONE subscription per kind that covers all active trades using combined filters, then dispatch events to the appropriate trade in-memory.
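
The in-memory dispatch half of that fix can be sketched in plain Dart. `TradeEvent` and `TradeEventRouter` are hypothetical names; in the real SDK the upstream stream would be the single combined-filter relay subscription, and the trade id would be read from an event tag.

```dart
import 'dart:async';

/// Hypothetical, simplified event carrying the id of the trade it belongs to.
class TradeEvent {
  const TradeEvent(this.tradeId, this.kind, this.payload);
  final String tradeId;
  final int kind;
  final String payload;
}

/// Fans one shared upstream subscription out to per-trade streams.
class TradeEventRouter {
  TradeEventRouter(Stream<TradeEvent> upstream) {
    _sub = upstream.listen(_dispatch);
  }

  late final StreamSubscription<TradeEvent> _sub;
  final Map<String, StreamController<TradeEvent>> _byTrade = {};

  /// Events for one trade. No additional relay subscription is opened.
  Stream<TradeEvent> eventsFor(String tradeId) => _byTrade
      .putIfAbsent(tradeId, () => StreamController<TradeEvent>.broadcast())
      .stream;

  void _dispatch(TradeEvent event) {
    // Events for trades that nobody asked for are dropped.
    _byTrade[event.tradeId]?.add(event);
  }

  Future<void> close() async {
    await _sub.cancel();
    for (final controller in _byTrade.values) {
      await controller.close();
    }
  }
}
```

With one router per event kind, the relay-side subscription count stays constant as the number of active trades grows.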

# 2. N+1 Query in `subscribeToMyReservations()`

For each reservation request message in the thread stream, a full getListingReservations() relay query fires. 20 messages = 20 queries.

Fix: Batch listing IDs from all messages, then fire a single findByTag query covering all listings at once.

# 3. `count()` Downloads Everything

CrudUseCase.count() fetches all matching events and calls .toList().length. For listings with hundreds of reservations, this is hugely wasteful.

Fix: Implement NIP-45 COUNT if relays support it, or cache counts locally with invalidation on new events.

# 🟡 Moderate

# 4. Thread Rebuild on Every `sync()`

Threads._rebuildThreadsFromMessages() clears all threads and re-processes every persisted message on each login. With 500+ messages, every startup repeats O(n) work over the full message history instead of processing only what is new.

Fix: Incremental thread updates — only process new messages since last sync timestamp.

# 5. Gift-wrap Fan-out

Each DM creates N+1 gift-wraps (one per recipient + self). A message to 3 recipients = 4 broadcasts. This is inherent to NIP-17 and can't be avoided, but it's worth monitoring.

# 6. No Query-level Caching for `query()`

Since query() uses cacheRead: false, the same listing/metadata is fetched repeatedly when navigating between screens.

# L5. Nostr Protocol Future-Proofing

# Event Versioning — Currently None

There is zero versioning infrastructure:

No version field in any event content JSON
No version tag on events
fromJson methods have no fallback for missing fields
Adding a required field to ListingContent would crash parsing of every existing listing on relays

Impact scenario: You add cancellationPolicy to ListingContent. Every old listing on relays fails to parse → fromJson throws → parser rethrows → stream crashes.

# Recommendations

# 1. Add a Content Version Field

```
{ "v": 1, "title": "...", "description": "...", ... }
```

Add "v" to all custom event content. Start at 1. Increment on breaking changes.

# 2. Make `fromJson` Tolerant

Use json["field"] ?? defaultValue for all fields. Amenities.fromJSON already does this correctly — propagate the pattern to ListingContent, ReservationContent, ReviewContent, etc.
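
Applied to a listing, the tolerant pattern could look like this. The class below is a simplified stand-in for the real `ListingContent`; the field set and the `'flexible'` default are assumptions chosen for the example.

```dart
/// Simplified stand-in for the real ListingContent model.
class ListingContent {
  const ListingContent({
    required this.version,
    required this.title,
    required this.description,
    required this.cancellationPolicy,
  });

  final int version;
  final String title;
  final String description;
  final String cancellationPolicy;

  /// Every field falls back to a default, so events published before a field
  /// existed still parse.
  factory ListingContent.fromJson(Map<String, dynamic> json) {
    return ListingContent(
      version: json['v'] as int? ?? 0, // 0 = published before versioning
      title: json['title'] as String? ?? '',
      description: json['description'] as String? ?? '',
      // 'flexible' is an assumed default, used only for this example.
      cancellationPolicy: json['cancellationPolicy'] as String? ?? 'flexible',
    );
  }
}
```

The `as T?` casts still throw when a field is present with the wrong type (for example `"v": "1"`), so this handles missing fields only; wrongly typed fields are left to the parser-level error handling.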

# 3. Add a Version Tag to Events

```
["v", "1"]
```

This allows relay-side filtering by version if needed, and lets clients ignore events they can't parse.

# 4. Implement a Migrator

When the app starts, query for own events with outdated versions, re-sign with updated content, and republish. Since Nostr events are immutable (signed), migration requires publishing new replaceable events (same d-tag, newer created_at).

```
class EventMigrator {
  Future<void> migrate(List<Nip01Event> myEvents) async {
    for (final event in myEvents) {
      // Events published before versioning have no 'v' tag: treat them as version 0.
      final version = int.tryParse(event.getTagValue('v') ?? '') ?? 0;
      if (version < currentVersion) {
        final migrated = migrateContent(event, from: version, to: currentVersion);
        await broadcast(migrated); // replaceable: same d-tag overwrites
      }
    }
  }
}
```

# 5. Parser Error Resilience

The parser currently rethrows on malformed events, crashing the entire stream. Change to:

```
T? safeParser<T>(Nip01Event event) {
  try {
    return parser<T>(event);
  } catch (e, st) {
    logger.warning('Skipping malformed event ${event.id}: $e');
    return null;  // Skip, don't crash
  }
}
```

Then filter nulls from the stream. This is critical for forward-compatibility — a newer client might publish events that an older client can't parse.
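
Filtering the nulls out needs only `dart:async` operators. The sketch below is generic over the event type; `parseSkippingMalformed` is a hypothetical helper name, and `print` stands in for the project's logger.

```dart
import 'dart:async';

/// Parses every event with [parse]. Events that fail to parse are logged and
/// dropped, so one malformed event cannot terminate the stream.
Stream<T> parseSkippingMalformed<E, T extends Object>(
  Stream<E> events,
  T Function(E event) parse,
) {
  T? safeParse(E event) {
    try {
      return parse(event);
    } catch (error) {
      print('Skipping malformed event: $error');
      return null;
    }
  }

  // Map to nullable results, drop the nulls, then narrow the type back to T.
  return events.map(safeParse).where((parsed) => parsed != null).cast<T>();
}
```

Downstream listeners keep a non-nullable `Stream<T>` and never see the failures, which are visible only in the log.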

# 6. Kind Number Issue

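kNostrKindEscrowService = 40021 sits outside every kind range that NIP-01 defines: regular is 1000–9999, replaceable 10000–19999, ephemeral 20000–29999, and addressable (parameterized replaceable) 30000–39999. Relay handling of kinds at or above 40000 is therefore unspecified, and the event will not be treated as replaceable by d-tag. Move it into the 30000–39999 range (parameterized replaceable).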

# 7. Tag Collision Risk

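Single-letter tags l, r, t, h overlap with existing or future NIP standardizations: NIP-32 already uses l for labels, and t (hashtag) and r (URL reference) are standardized tags as well. Options:

- Formally propose these tag usages in a NIP
- Switch to multi-character tags (e.g., listing, reservation). Note that NIP-01 only expects relays to index single-letter tags, so multi-character tags cannot be used in relay-side filters
- Accept the collision risk and handle it in the parser by checking event.kind before interpreting tags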

# Nostr Best Practices for Schema Evolution

Replaceable events are your friend — same pubkey + kind + d-tag naturally supersedes old versions
Content is opaque to relays — you can change JSON structure freely; relays only index single-letter tags
Tags are the public API — treat tag names and semantics as stable; content JSON as internal
Backwards-compatible additions — new optional fields with defaults are always safe
Breaking changes — require a new kind number or a version tag that old clients can filter out

# L6. Test Infrastructure & Automation

# Current State

| Area | Status | Notes |
|---|---|---|
| SDK unit tests | ✅ 11 files | Good coverage of core logic |
| SDK integration tests | ✅ 3 files | Real Docker stack, escrow + swap flows |
| App unit tests | ❌ Nearly empty | Only 1 smoke test (2+3=5) and 1 cubit test |
| App widget tests | ❌ Empty | Directory scaffolded but no tests |
| App integration tests | ⚠️ Minimal | 1 screenshot test with 6 screens |
| Widgetbook | ✅ | Well-structured, multi-device frames |
| Shared test helpers | ⚠️ | `_Fake*` classes duplicated across SDK test files |
| Visual regression | ❌ | No golden test comparison |

# Seed Data Architecture — Two Systems

System 1: Static Stubs (models/lib/stubs/) — Hardcoded mock data with 3 fixed keypairs. Used for Env.mock quick startup.

System 2: SeedPipeline (hostr_sdk/lib/seed/) — Sophisticated deterministic seed generation with configurable user count, host ratio, thread progression stages, per-user overrides. This is excellent but only used in SDK tests, not in app integration tests.

# What's Missing for Desired Workflows

# Flutter Drive to Specific Pages

The current integration test uses appRouter.navigate() which works but requires full app bootstrap. For surgical page testing:

Create a TestScenario class:

```
class TestScenario {
  const TestScenario({required this.name, required this.seedConfig, required this.pages});
  final SeedPipelineConfig seedConfig;
  final List<PageRouteInfo> pages;  // auto_route page definitions
  final String name;
}
```

Define scenarios:

```
final hostWithBookings = TestScenario(
  name: 'host-with-bookings',
  seedConfig: SeedPipelineConfig(
    seed: 42,
    userCount: 5,
    hostRatio: 0.5,
    threadStageSpec: ThreadStageSpec.allCompleted(),
  ),
  pages: [HostBookingsRoute()],
);
```

Run per-scenario: flutter test integration_test/scenarios/host_with_bookings_test.dart

# App Store Screenshot Pipeline

Device matrix: Run against multiple simulators/emulators — define in a shell script:

```
DEVICES=("iPhone 16 Pro Max" "iPhone SE" "iPad Pro 12.9")
for device in "${DEVICES[@]}"; do
  flutter test integration_test/ -d "$device"
done
```

Locale matrix: Before each screenshot set, switch locale:
```
await tester.binding.setLocale('es', 'ES');
```
Framing: Use screenshots or device_frame package to add device bezels, then composite with Fastlane's frameit or a custom script.
CI integration: On tagged commits, run the screenshot pipeline and upload to an artifact store. Fastlane deliver can submit to App Store Connect directly.

# Shared Test Setup/Teardown

Extract _Fake* classes from SDK test files into hostr_sdk/test/helpers/:

```
test/helpers/
  fake_requests.dart
  fake_auth.dart
  fake_messaging.dart
  test_fixtures.dart
```

Create app-level test helpers in app/test/helpers/:

```
test/helpers/
  pump_app.dart         — wraps MaterialApp + providers + router
  scenario_runner.dart  — seeds data + navigates to page
  mock_providers.dart   — pre-configured BlocProviders for widget tests
```

Use SeedPipeline in app tests — bridge the SDK's seed system into the app's TestRequests:

```
final pipeline = SeedPipeline(config);
final events = await pipeline.build();
final requests = TestRequests();
requests.seedEvents(events);
```

# Mock Relay vs Real Relay Strategy

| Test type | Data source | Speed | Reliability | Use for |
|---|---|---|---|---|
| Unit tests | `TestRequests` (in-memory) | ⚡ Fast | 100% deterministic | Business logic, cubits, parsers |
| Widget tests | `TestRequests` (in-memory) | ⚡ Fast | 100% deterministic | UI rendering, interaction |
| Integration (mock) | `MockRelay` (local WebSocket) | 🔶 Medium | ~99% deterministic | Full relay protocol, gift-wrap flow |
| Integration (real) | Docker stack (`./scripts/start_local.sh`) | 🐌 Slow | ~95% (depends on Docker) | Escrow, swaps, on-chain, end-to-end |

Strategy:

Default to TestRequests for all app tests (fast, deterministic)
Use MockRelay only when testing relay-specific behavior (subscription management, auth, reconnection)
Use real Docker stack only for escrow/swap integration tests and manual QA
Tag tests: @Tags(['unit']), @Tags(['integration']), @Tags(['e2e']) — run subsets in CI

# Execution Plan

# Phase 1 — Foundation (Week 1)

No visible UI changes, but enables everything else.

| # | Task | Priority | Estimated Effort |
|---|---|---|---|
| 1.1 | Define spacing scale constants (`kSpace0`–`kSpace8`) + `Gap` widget | 🔴 | 2h |
| 1.2 | Define icon size constants (`kIconXs`–`kIconHero`) | 🔴 | 30m |
| 1.3 | Add `kAnimationDurationFast` constant | 🟢 | 15m |
| 1.4 | Create `showAppModal()` helper with standard defaults | 🟠 | 1h |
| 1.5 | Create `AppLoadingIndicator` widget with `.small()/.medium()/.large()` | 🟠 | 1h |
| 1.6 | Extract `_ShimmerSurface` into reusable `ShimmerPlaceholder` | 🟠 | 1h |
| 1.7 | Create `AppErrorWidget` (replaces `Placeholder()` for image errors) | 🟠 | 30m |
| 1.8 | Create `ErrorMapper` service (exceptions → user-friendly strings) | 🔴 | 2h |

# Phase 2 — Visual Consistency (Week 2)

Systematic sweep across all presentation files.

| # | Task | Priority | Estimated Effort |
|---|---|---|---|
| 2.1 | Replace all hardcoded `SizedBox` spacing with `Gap.*` | 🔴 | 4h |
| 2.2 | Replace all hardcoded `fontSize:` with `textTheme.*` | 🔴 | 2h |
| 2.3 | Replace all `ElevatedButton` with `FilledButton` for primary CTAs | 🟠 | 1h |
| 2.4 | Replace all hardcoded icon sizes with `kIcon*` constants | 🟠 | 2h |
| 2.5 | Replace hardcoded animation durations/curves with constants | 🟢 | 30m |
| 2.6 | Replace all `showModalBottomSheet` with `showAppModal()` | 🟠 | 2h |
| 2.7 | Replace all `CircularProgressIndicator` with `AppLoadingIndicator` | 🟠 | 1h |
| 2.8 | Add `cached_network_image`, wire into `BlossomImage` | 🟠 | 2h |
| 2.9 | Add shimmer placeholders to image loading and list loading | 🟠 | 3h |
| 2.10 | Extract remaining hardcoded strings to `app_en.arb` | 🟡 | 2h |

# Phase 3 — Error Handling (Week 3)

| # | Task | Priority | Estimated Effort |
|---|---|---|---|
| 3.1 | Add `try/catch` + error state to `ListCubit` | ⛔ | 2h |
| 3.2 | Wire `CountCubitStateError` emission | 🔴 | 30m |
| 3.3 | Fix `PayOperation` — remove rethrow, don't close on failure | 🔴 | 2h |
| 3.4 | Show actual error details in swap failure UI | 🟠 | 1h |
| 3.5 | Standardize all cubit error states to sealed classes | 🟠 | 4h |
| 3.6 | Add `BlocObserver` for global error/transition logging | 🟠 | 1h |
| 3.7 | Integrate Sentry/Crashlytics in `runZonedGuarded` + `FlutterError.onError` | 🟠 | 2h |
| 3.8 | Add `BlocListener`-based error snackbars for transient errors | 🟡 | 3h |

# Phase 4 — Performance & Protocol (Week 4)

#	Task	Priority	Estimated Effort
4.1	Make parser error-resilient (skip malformed events, don't crash stream)	⛔	2h
4.2	Add `"v": 1` to all custom event content JSON	🔴	3h
4.3	Make all `fromJson` tolerant with `?? default` fallbacks	🔴	3h
4.4	Enable `cacheRead: true` for `query()` in `CrudUseCase`	🟠	1h
4.5	Fix N+1 in `subscribeToMyReservations()` — batch listing IDs	🟠	3h
4.6	Multiplex trade subscriptions to reduce subscription count	🟠	6h
4.7	Fix `kNostrKindEscrowService` kind number (40021 → 3xxxx)	🟠	1h
4.8	Implement NIP-45 COUNT if relays support it	🟡	2h
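
Tasks 4.1–4.3 fit together, and a minimal sketch of how they might combine is shown here. `ListingContent`, its field names, `kContentVersion`, `tryParseContent`, and `parseAll` are all hypothetical; the real models in hostr_sdk will differ. The point is only the pattern: write a `"v"` field, read every field with a fallback, and drop unparseable events instead of letting them terminate the stream.

```dart
import 'dart:convert';

/// Hypothetical current schema version written into every event's content.
const int kContentVersion = 1;

/// Hypothetical content model; field names are illustrative only.
class ListingContent {
  const ListingContent({
    required this.version,
    required this.title,
    required this.priceSats,
  });

  /// Schema version from "v"; 0 means the event predates versioning.
  final int version;
  final String title;
  final int priceSats;

  /// Tolerant factory: a missing or wrongly typed field falls back to a default.
  factory ListingContent.fromJson(Map<String, dynamic> json) {
    final rawVersion = json['v'];
    final rawTitle = json['title'];
    final rawPrice = json['price_sats'];
    return ListingContent(
      version: rawVersion is int ? rawVersion : 0,
      title: rawTitle is String ? rawTitle : '',
      priceSats: rawPrice is int ? rawPrice : 0,
    );
  }

  Map<String, dynamic> toJson() => {
        'v': kContentVersion,
        'title': title,
        'price_sats': priceSats,
      };
}

/// Parses one content string, returning null instead of throwing.
ListingContent? tryParseContent(String raw) {
  try {
    final decoded = jsonDecode(raw);
    if (decoded is! Map<String, dynamic>) return null;
    return ListingContent.fromJson(decoded);
  } on FormatException {
    return null; // Malformed JSON: the caller skips this event.
  }
}

/// Skips malformed entries so one bad event cannot end the stream.
Stream<ListingContent> parseAll(Stream<String> rawContents) async* {
  await for (final raw in rawContents) {
    final parsed = tryParseContent(raw);
    if (parsed != null) yield parsed;
  }
}
```

Type-checking with `is` rather than casting with `as` means a relay that returns `"v": "1"` as a string degrades to the default instead of throwing. Whether a silent default or a logged skip is the better fallback for a given field is a per-model decision.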

# Phase 5 — Test Infrastructure (Week 5)

#	Task	Priority	Estimated Effort
5.1	Extract shared `_Fake*` classes to `hostr_sdk/test/helpers/`	🟠	2h
5.2	Create `app/test/helpers/pump_app.dart` for widget test bootstrapping	🟠	2h
5.3	Bridge `SeedPipeline` into app integration tests via `TestRequests`	🟠	3h
5.4	Create `TestScenario` framework for page-specific flutter drive tests	🟠	4h
5.5	Implement device × locale screenshot matrix script	🟡	3h
5.6	Add golden image tests for key widgets	🟡	4h
5.7	Add test tags (`unit`, `integration`, `e2e`) + CI configuration	🟡	2h
5.8	Build `EventMigrator` for versioned event migration on app start	🟡	4h

Total estimated effort: ~88 hours across 5 phases (8.25h + 19.5h + 15.5h + 21h + 24h). Phases 1–2 are mostly visual and can be done in parallel with Phase 3 (error handling); the exception is task 1.8 (`ErrorMapper`), which belongs with the error-handling work. Phase 4 (performance/protocol) should come after Phase 3, since error resilience is a prerequisite. Phase 5 (testing) can begin at any time but benefits from having Phases 1–4 complete.
