Always Think Integration
Integration is where theory meets reality. Your beautifully designed subsystems must physically connect, electrically communicate, and logically cooperate. Most projects fail not because individual components are bad, but because they don't work together.
The key to successful integration: incremental assembly, relentless testing, and obsessive documentation of interfaces.
Design for Testability: Observability & Test Points
Where to Add Test Points
Power Rails
Every major voltage rail needs measurement access:
- Voltage test points (before and after regulators)
- Current sense resistors or Hall sensors
- Load switches to isolate subsystems
- Indicator LEDs for quick visual checks
Data Buses
Communication interfaces need probe access:
- SPI/I2C/UART: expose each bus's signal lines (SPI: SCLK, MOSI, MISO, CS; I2C: SDA, SCL; UART: TX, RX)
- CAN bus: CAN_H and CAN_L (the two lines of the differential pair)
- Ethernet: TX/RX pairs accessible for scope probing
- Logic analyzer headers for multi-signal capture
RF Chains
Radio paths need multiple measurement points:
- RF test points after PA, before antenna
- Directional couplers for TX power monitoring
- RSSI or AGC voltage monitoring for RX
- LNA bypass for troubleshooting RX issues
Debug Interfaces
Every processor needs debug access:
- JTAG or SWD exposed (don't bury it)
- UART console for printf debugging
- Dedicated debug connector (not shared pins)
- Boot mode selection (normal vs debug/recovery)
What to Measure
| Subsystem | Key Measurements | Why It Matters |
|---|---|---|
| Power | Voltage, current, ripple, efficiency | Detect brownouts, shorts, thermal issues |
| Communications | Signal integrity, bit error rate, latency | Verify protocol timing, catch data corruption |
| RF | TX power, RX sensitivity, SNR, frequency error | Ensure link budget margins, diagnose range issues |
| Timing | Clock frequency, jitter, phase noise | Real-time systems fail with timing violations |
| Thermal | Component temperatures, gradient mapping | Prevent thermal runaway, validate cooling |
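As a rough illustration of the Power row, the sketch below evaluates voltage samples captured at a rail test point. The function name `check_rail`, the 5% tolerance, and the 50 mV ripple limit are assumptions chosen for the example; real limits come from the datasheets of the parts on that rail.

```python
from statistics import mean


def check_rail(samples_v, nominal_v, tol_pct=5.0, max_ripple_v=0.05):
    """Summarize voltage samples from one rail test point.

    tol_pct and max_ripple_v are placeholder limits, not recommendations.
    """
    avg = mean(samples_v)
    # Peak-to-peak ripple over the captured window.
    ripple_pp = max(samples_v) - min(samples_v)
    low = nominal_v * (1 - tol_pct / 100)
    high = nominal_v * (1 + tol_pct / 100)
    return {
        "mean_v": avg,
        "ripple_pp_v": ripple_pp,
        "in_tolerance": low <= avg <= high,
        "ripple_ok": ripple_pp <= max_ripple_v,
    }


if __name__ == "__main__":
    print(check_rail([3.29, 3.31, 3.30, 3.32], nominal_v=3.3))
```

The same pattern extends to the other rows: capture samples at the test point, reduce them to a few numbers, and compare against limits written down in advance.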
Lock Interfaces First, Internals Second
What to Define in ICDs
Electrical Interfaces
- Voltage levels (3.3V, 5V, differential)
- Current draw (max, average, inrush)
- Connector type and pinout
- Impedance matching (for high-speed)
- Grounding and shielding strategy
- ESD protection requirements
Mechanical Interfaces
- Mounting hole patterns and spacing
- Envelope constraints (max dimensions)
- Connector orientation and access
- Cable routing and bend radius
- Thermal interface (heatsink contact)
- Keep-out zones for other subsystems
Data Interfaces
- Protocol (UART, SPI, CAN, Ethernet, USB)
- Baud rate or clock frequency
- Message formats (packet structure)
- Timing requirements (setup, hold, latency)
- Error handling (CRC, retry, timeout)
- Flow control mechanism
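To make the message-format and error-handling items concrete, here is a minimal sketch of a framed packet with a CRC. The layout (one sync byte, one message ID byte, a two-byte big-endian length, the payload, then a CRC-32) and the sync value `0xA5` are hypothetical; an actual ICD would specify its own fields and checksum.

```python
import struct
import zlib

SYNC = 0xA5                      # hypothetical sync byte
HEADER = struct.Struct(">BBH")   # sync, msg_id, payload length (big-endian)
CRC = struct.Struct(">I")        # CRC-32 over header + payload


def encode_frame(msg_id, payload):
    """Build a frame: header, payload, CRC-32."""
    header = HEADER.pack(SYNC, msg_id, len(payload))
    return header + payload + CRC.pack(zlib.crc32(header + payload))


def decode_frame(frame):
    """Validate a frame and return (msg_id, payload); raise ValueError on any error."""
    if len(frame) < HEADER.size + CRC.size:
        raise ValueError("frame too short")
    sync, msg_id, length = HEADER.unpack_from(frame, 0)
    if sync != SYNC:
        raise ValueError("bad sync byte")
    if len(frame) != HEADER.size + length + CRC.size:
        raise ValueError("length mismatch")
    body = frame[:HEADER.size + length]
    (crc,) = CRC.unpack_from(frame, HEADER.size + length)
    if zlib.crc32(body) != crc:
        raise ValueError("CRC mismatch")
    return msg_id, body[HEADER.size:]


if __name__ == "__main__":
    frame = encode_frame(0x10, b"\x01\x02\x03")
    print(decode_frame(frame))
```

Writing the encoder and decoder against the ICD before hardware exists gives both sides of the interface a shared, testable reference.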
Power Sequencing
- Startup order (which rails first)
- Delay requirements between rails
- Brownout behavior (what happens below threshold)
- Shutdown sequence (graceful vs emergency)
- Hot-swap capability (if required)
- Fault isolation (prevent cascade failures)
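The startup-order and delay items lend themselves to an automated check. The sketch below assumes rail rise times have already been measured (for example, from a scope capture) and compares them with the order and minimum delay from the ICD. The rail names and the 1 ms delay are made up for illustration.

```python
def check_sequence(rise_times_ms, required_order, min_delay_ms):
    """Return a list of human-readable violations; an empty list means pass.

    rise_times_ms maps rail name -> time the rail reached regulation.
    """
    problems = []
    for earlier, later in zip(required_order, required_order[1:]):
        if earlier not in rise_times_ms or later not in rise_times_ms:
            problems.append(f"missing measurement for {earlier} or {later}")
            continue
        gap = rise_times_ms[later] - rise_times_ms[earlier]
        if gap < min_delay_ms:
            problems.append(
                f"{later} rose {gap:.1f} ms after {earlier}; "
                f"need >= {min_delay_ms} ms"
            )
    return problems


if __name__ == "__main__":
    measured = {"5V0": 0.0, "3V3": 2.0, "1V8": 2.5}  # hypothetical capture
    for line in check_sequence(measured, ["5V0", "3V3", "1V8"], 1.0):
        print(line)
```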
ICDs are living documents. Track every change, get sign-off from affected teams, and maintain revision history. A single "quick fix" to an interface without updating the ICD causes integration disasters. Treat ICD changes like code changes: review, approve, document.
Incremental Integration Strategy
Integration Steps
- Power-On Test: Before anything else, verify power rails come up cleanly. Measure voltages, check for shorts, verify sequencing. No point testing data if power is broken.
- One Interface at a Time: Add subsystems sequentially. Computer boots → add sensor → verify I2C communication → add actuator → verify SPI. Isolate failures to specific interfaces.
- Known-Good Baselines: After each successful integration step, save that configuration. If the next step breaks, you can roll back to the last working state. Think of it as version control for hardware integration.
- Stub Out Subsystems: Don't wait for final hardware. Use simulators, dev boards, or dummy loads to test interfaces early. Replace stubs with real hardware incrementally.
- Integration Checkpoints: Define clear pass/fail criteria at each step. Document what works, what doesn't, and what workarounds were needed. Lessons learned feed forward.
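One way to combine the stub and checkpoint ideas above is sketched below: a stand-in sensor that replays scripted readings, and a checkpoint function with explicit pass/fail limits. The class name, the `read_celsius` method, and the temperature limits are all hypothetical; the point is that the real driver can later replace the stub without changing the checkpoint.

```python
class StubTemperatureSensor:
    """Stand-in for a sensor that has not been delivered yet.

    Replays a scripted list of readings, wrapping around at the end.
    """

    def __init__(self, readings_c):
        self._readings = list(readings_c)
        self._index = 0

    def read_celsius(self):
        value = self._readings[self._index % len(self._readings)]
        self._index += 1
        return value


def run_checkpoint(name, sensor, low_c, high_c, samples=5):
    """Take a few readings and record whether all fall inside the limits."""
    readings = [sensor.read_celsius() for _ in range(samples)]
    passed = all(low_c <= r <= high_c for r in readings)
    return {"checkpoint": name, "readings": readings, "passed": passed}


if __name__ == "__main__":
    stub = StubTemperatureSensor([21.5, 22.0, 21.8])
    print(run_checkpoint("I2C temperature sensor", stub, low_c=-10, high_c=60))
```

Any object that exposes the same `read_celsius` method, including the eventual hardware driver, can be passed to `run_checkpoint` unchanged.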
Building Without Hardware in Hand
Development Timeline (Without Final Hardware)
Weeks 1-4: Simulations & Models
Build software models before touching hardware. Mathematical models for RF link budgets, thermal simulations, power analysis. Software-in-the-loop (SIL) for algorithm development. Remove algorithmic uncertainty before hardware complexity.
Deliverable: Proven algorithms, validated assumptions, identified risks
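A link budget is a good first model because it fits in a few lines. The sketch below uses the standard free-space path loss formula, FSPL(dB) = 20·log10(d_km) + 20·log10(f_MHz) + 32.44, and ignores everything else (atmospheric loss, pointing error, polarization mismatch) except a single lumped `misc_losses_db` term. The function names and the example numbers are assumptions for illustration.

```python
import math


def fspl_db(distance_km, freq_mhz):
    """Free-space path loss in dB for distance in km and frequency in MHz."""
    return 20 * math.log10(distance_km) + 20 * math.log10(freq_mhz) + 32.44


def link_margin_db(tx_power_dbm, tx_gain_dbi, rx_gain_dbi,
                   distance_km, freq_mhz, rx_sensitivity_dbm,
                   misc_losses_db=0.0):
    """Received power minus receiver sensitivity; positive means the link closes."""
    received_dbm = (tx_power_dbm + tx_gain_dbi + rx_gain_dbi
                    - fspl_db(distance_km, freq_mhz) - misc_losses_db)
    return received_dbm - rx_sensitivity_dbm


if __name__ == "__main__":
    # Hypothetical link: 1 W transmitter, 437 MHz, 600 km slant range.
    margin = link_margin_db(30, 2, 12, 600, 437, -115, misc_losses_db=3)
    print(f"margin: {margin:.1f} dB")
```

Even a model this crude shows early whether the design has margin to spare or needs a bigger antenna before any hardware is ordered.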
Weeks 5-10: Development Boards / COTS
Use off-the-shelf hardware to validate interfaces early. Arduino, Raspberry Pi, STM32 dev boards—whatever's close enough. Prove software runs on real hardware with interrupts and timing constraints. Test interface protocols (SPI, I2C, UART) with actual devices.
Identify gotchas: Race conditions, buffer overflows, timing violations, real-world noise
Weeks 11-16: Breadboard / Proto PCBs
Build a functional approximation with target components. Use the actual chips, connectors, and power supplies you'll fly. Electrical verification: signal integrity, noise, power consumption. Integration testing between subsystems. Iterate quickly, because breadboards are disposable.
Find problems now: Before committing to expensive PCB fabrication
Weeks 17-24: Engineering Model
First real PCB, actual form factor, representative hardware. Not flight-ready, but electrically and mechanically similar. Full system integration: all subsystems talking. Environmental testing if available (thermal, vibration).
Purpose: Find design flaws before flight hardware commits
Week 25+: Flight Hardware
Final hardware arrives—but your software already works. Minimal surprises because you've tested approximations. Focus on qualification testing, not basic functionality. Timeline risk reduced: integration is incremental, not big-bang.
Result: Confidence from proven performance on similar hardware
Technology Choices: Popular vs Appropriate
Common Technology Trade-Offs
Message Queues: Kafka vs Alternatives
Kafka is great when:
- High throughput (millions of messages/sec)
- Distributed system with multiple consumers
- Need message replay and persistence
- Have ops team to manage cluster
Simpler alternatives when:
- RabbitMQ: Easier ops, good routing, lower throughput OK
- Redis Streams: In-memory speed, simpler setup
- NATS: Lightweight, low latency, embedded use cases
- Direct socket: Ultimate simplicity for point-to-point
Databases: SQL vs NoSQL vs File
PostgreSQL when:
- Need transactions and consistency
- Complex queries and joins
- Data has clear schema
Alternatives:
- SQLite: Single node, embedded, zero-config
- MongoDB: Schema flexibility, document-oriented
- InfluxDB: Time-series data (telemetry, logs)
- Flat files: Telemetry dumps, log rotation, simplicity
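To show how little setup the embedded option needs, here is a minimal sketch of telemetry storage using Python's standard `sqlite3` module. The table name, columns, and helper functions are hypothetical; the default path `:memory:` keeps the example self-contained, and a file path would be used for persistent storage.

```python
import sqlite3


def open_telemetry_db(path=":memory:"):
    """Open (or create) a telemetry database with a single table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS telemetry ("
        "ts REAL NOT NULL, channel TEXT NOT NULL, value REAL NOT NULL)"
    )
    return conn


def record(conn, ts, channel, value):
    """Insert one sample; the 'with' block commits or rolls back the transaction."""
    with conn:
        conn.execute("INSERT INTO telemetry VALUES (?, ?, ?)", (ts, channel, value))


def latest(conn, channel):
    """Return (ts, value) for the newest sample on a channel, or None."""
    return conn.execute(
        "SELECT ts, value FROM telemetry WHERE channel = ? "
        "ORDER BY ts DESC LIMIT 1",
        (channel,),
    ).fetchone()


if __name__ == "__main__":
    db = open_telemetry_db()
    record(db, 1.0, "bus_voltage", 3.31)
    record(db, 2.0, "bus_voltage", 3.29)
    print(latest(db, "bus_voltage"))
```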
Networking: REST vs gRPC vs Custom
REST when:
- Human-readable debugging matters
- Widely supported clients
- Request/response pattern sufficient
Alternatives:
- gRPC: Binary efficiency, streaming, type safety
- MQTT: Pub/sub, low bandwidth (IoT, embedded)
- WebSockets: Bidirectional, real-time updates
- Raw TCP/UDP: Ultimate control and efficiency
Processing: Microservices vs Monolith
Microservices when:
- Large teams, independent deployments
- Different scaling needs per service
- Polyglot requirements (multiple languages)
Monolith when:
- Small team (< 10 people)
- Simple deployment preferred
- Network latency matters
- Starting new project (defer complexity)
Documentation as Integration Tool
Essential Integration Documents
- Interface Control Documents (ICDs): Sacred contracts between subsystems. Version controlled, reviewed, signed-off.
- Integration procedures: Step-by-step instructions. Not "plug it in and see," but detailed, ordered, verified sequences.
- Test plans: What to verify at each integration milestone. Clear pass/fail criteria, no ambiguity.
- Troubleshooting guides: Common failure modes and diagnostic steps. "If X fails, check Y, measure Z."
- Configuration management: Track hardware revisions, software versions, which combinations work together.
- Lessons learned log: Capture surprises, workarounds, and root causes for next time.
Don't wait until after integration to document what you learned. Capture it immediately: "We had to add a 10µF cap on the 3.3V rail to fix SPI glitches." Six months later, you won't remember. Write it down now, in the ICD, with date and initials.
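The configuration-management item in the list above can start as something very small. The sketch below keeps a set of hardware and firmware combinations that have passed integration, with a note for each; the revision names, version strings, and notes are invented for the example.

```python
# Known-good combinations: (hardware revision, firmware version) -> note.
# All entries here are hypothetical placeholders.
KNOWN_GOOD = {
    ("rev_B", "1.4.2"): "baseline after power-on test",
    ("rev_B", "1.5.0"): "adds SPI actuator driver",
    ("rev_C", "1.5.0"): "rev_C adds 10 uF cap on 3.3V rail",
}


def is_known_good(hw_rev, fw_version):
    """True if this combination has passed integration and been recorded."""
    return (hw_rev, fw_version) in KNOWN_GOOD


def compatible_firmware(hw_rev):
    """List firmware versions recorded as working on a hardware revision."""
    return sorted(fw for (hw, fw) in KNOWN_GOOD if hw == hw_rev)


if __name__ == "__main__":
    print(is_known_good("rev_C", "1.4.2"))
    print(compatible_firmware("rev_B"))
```

Keeping this table in the same repository as the firmware means every recorded combination is reviewed and dated like any other change.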