Skip to main content

14 posts tagged with "haskell-simulation"

View All Tags

Weekly Summary – May 12, 2025

· 2 min read
William Wolff
Architect

This week, the team made significant progress on simulation improvements, trace verification, and a comprehensive analysis of Leios' transaction processing capacity.

Trace verification

  • Improved the trace verifier with better error handling and reporting
  • Added support for starting verification from non-initial states
  • Created manually curated test cases for the Leios trace verifier
  • Integrated the trace verifier into Nix infrastructure and CI builds
  • Removed deterministic conformance testing in favor of a trace-based approach.

Simulation improvements

Haskell simulation

  • Conducted an informal review assessing code quality, design, and implementation
  • Analyzed the simulation organization and identified areas for future improvement
  • Found that most prospective changes to the Leios protocol would only involve a small fraction of the codebase
  • Determined that adding memory pool and transactions would take approximately 100-200 hours of labor.

The review of the Haskell simulator was documented in detail in PR#353, covering its statistics, organization, code quality, design, implementation, and documentation aspects.

Rust simulation

  • Added tx-start-time and tx-stop-time parameters to avoid effects of slow starts or sudden terminations on transaction analysis
  • Created a new Leios variant full-without-ibs where endorser blocks directly reference transactions.

Documentation and analysis

The team ran higher excess-capacity simulations to test hypotheses about transaction inclusion. The transaction lifecycle simulations raised the question of whether duplicated transactions in IBs were preventing other transactions from ever being included. The team ran simulations with IBs produced at three times the normal rate to test this, providing ample space for transaction duplication.

Detailed analysis showed that transaction loss persisted despite increased capacity, indicating that other factors are preventing transactions from reaching the ledger. The results are documented in:

Weekly Summary – April 14, 2025

· 3 min read
William Wolff
Architect

This week, the team made substantial progress in both the Haskell and Rust simulations, refined cost estimates, and carried out detailed analyses of the transaction lifecycle and Full Leios simulations.

Simulation improvements

Haskell simulation

  • Completed the first draft of new mini protocols for Leios diffusion
    • Modeled protocols after block-fetch and node-to-node transaction submission from ouroboros-network
    • Included IB relay, EB relay, and vote relay for header diffusion and body announcements
    • Included IB fetch and EB fetch for body diffusion
    • Worked on the CatchUp protocol for older blocks
    • See simulation/docs/network-spec for full protocol details
  • Renamed short-leios command to leios as it now covers the full variant as well
    • short-leios is kept as alias for compatibility.

Rust simulation

  • Fixed conformance issues with the shared trace format
  • Fixed a bug in the voting logic that prevented EBs from receiving enough votes to be included on-chain
  • Updated visualizations to use smaller trace files in preparation for hosting on the documentation site.

Revisions to the cost dashboard

The cost dashboard was updated with lower and more realistic IO estimates.

Transaction lifecycle analysis

The Jupyter notebook Analysis of transaction lifecycle estimates the delay introduced at each of the seven stages of Full Leios as a transaction progresses from the memory pool to being referenced by a Praos block.

Key findings from the analysis:

  1. Reducing stage lengths below 10 slots offers little benefit
  2. The number of shards should remain low enough to maintain a high IB rate per shard relative to the stage length
  3. Low EB rates result in many orphaned IBs
  4. With realistic parameters, the delay from transaction submission to its inclusion in an RB is approximately two minutes.

Potential next steps:

  • Translate the model into Delta QSD to capture network effects
  • Compare the model's output with results from the Rust simulator
  • Extend the model to account for different memory pool and ledger variants under evaluation.

Simulation and analysis of Full Leios

The team conducted comprehensive simulations using both Haskell and Rust simulators at tag leios-2025w16. The simulations covered 648 scenarios of Full and Short Leios with varied parameters:

  • IB production rate
  • IB size
  • EB production rate
  • Stage length
  • CPU constraints.

Two new output files were generated:

  1. A summary of network, disk, and CPU resource usage over the course of the simulation
  2. The vertices and edges of the Leios graph, showing linkages between transactions, IBs, EBs, RBs, and votes (can be visualized as an interactive web page).

Key findings:

  • The Rust and Haskell simulations show generally close agreement
  • The Haskell simulation encounters network congestion at 16 IB/s, while the Rust simulation does not
  • The Rust simulation consumes more CPU at high IB rates than the Haskell simulation
  • In some cases, the Rust simulation does not produce enough votes to certify an EB.

Detailed results are available in the Jupyter notebook analysis/sims/2025w16/analysis.ipynb.

Weekly Summary – April 7, 2025

· 3 min read
William Wolff
Architect

This week, the team continued refining the protocol and its simulation capabilities, making significant progress in addressing various topics.

Simulation improvements

Haskell simulation

  • Started specifying a new relay protocol for IB header diffusion without the body
  • Improved the shared log format by removing redundancies and harmonizing naming
  • Added support for additional events required by conformance testing, including SlotEvent and NoBlockEvent
    • These events can be enabled using the --conformance-events flag with --shared-log-format.

Rust simulation

  • Updated traces to match the new standardized trace format
  • Fixed a critical bug in CPU scheduling where nodes were using more cores than allocated.

Analysis of workflow optimization

The team significantly improved the workflow for analyzing both Haskell and Rust simulations:

  • Replaced MongoDB with more efficient jq queries using map-reduce operations
  • Created reusable library functions for plotting with R
  • Revised and streamlined scripts for creating, executing, and analyzing simulations
  • Made the Jupyter notebook for analyses more generic and reusable
  • Successfully tested the new workflow on tag leios-2025w15.

These improvements will enable faster setup and execution of future simulation experiments, with quicker turnaround times for analysis. During this optimization work, several discrepancies between the Haskell and Rust simulations were identified and documented as GitHub issues for future investigation.

Edinburgh workshop recaps

The Edinburgh workshop documentation has been made available, covering key discussions and decisions:

Day 1 highlights

  • Explored ledger design options comparing labeled UTXOs (explicit shards) vs accounts (implicit shards) approaches
  • Discussed conformance testing strategies including QuickCheck dynamic and trace verification approaches
  • Analyzed critical edge cases for user onboarding and system properties.

Day 2 highlights

  • Conducted a detailed analysis of Leios node costs across different TPS levels
  • Key findings on resource usage:
    • At 10 TPS: 1.8x increase in egress and 6x increase in compute compared to Praos
    • At 1K TPS: significant scaling improvements with better resource efficiency
  • Provided recommendations for potential integration with Peras, particularly to optimize the voting mechanism
  • Discussed performance characteristics at both high and low throughput levels.

Day 3 highlights

  • Held an in-depth discussion on optimistic ledger state references, exploring three main approaches:
    1. RB reference: highest security but highest latency
    2. EB reference: balanced approach with medium security and latency
    3. EB-DAG: advanced approach using directed acyclic graph structure
  • Key advantages of the EB-DAG approach:
    • Achieves low latency while maintaining security
    • Provides strong inclusion guarantees for EBs
    • Enables efficient state management and reconstruction
    • Creates a complete, verifiable chain history
  • Discussed implementation considerations for state management and block ordering under the EB-DAG model.

For more information, please see the full workshop recaps in the Leios documentation.

Weekly Summary – March 31, 2025

· 3 min read
William Wolff
Architect

This week, the Leios team met for an in-person workshop in Edinburgh and continued their efforts in refining the protocol and its simulation capabilities. The team made significant progress in addressing various topics.

On day one, the team discussed topics such as ledger design and trade-offs, as well as two different ways to link the formal specification to the simulations. They explored various ledger design options, including labeled UTXOs and accounts approaches, with detailed consideration of fees, collateral, and conflict prevention mechanisms. The team also discussed conformance testing approaches, including QuickCheck dynamic and trace verification methods.

On day two, the team made significant progress towards estimating the cost of running a Leios node, considering different cost items such as network egress, CPU, and storage. They analyzed resource usage across different TPS levels, from 10 TPS to 1K TPS, and discovered that while there’s significant overhead at low throughput, the protocol becomes more efficient at higher TPS levels. The team hasn’t been able to finish all the cost items yet. The last two, IOPS and memory cost, will be added during this month.

On the last and third day, the team consolidated their options for how optimistic validation of input blocks can be accomplished. They defined three candidates, with one being favored. The main goal was to support the chaining of transactions with Leios, which requires defining a 'point in time' or stage of the protocol at which a subsequent or chained transaction can be built on top of an already submitted transaction. This can be achieved by having the node optimistically compute prospective ledger states using its local knowledge of input blocks referenced in certified endorser blocks or possibly ranking blocks.

Simulation progress

  • Haskell simulation
    • Added support for dishonest nodes that diffuse an unbounded amount of old IBs, enabling further analysis of freshest-first and oldest-first vote delivery scenarios
    • Identified and fixed a bug in configuration generation for simulation runs, which was causing inconsistencies in vote delivery between default and uniform/extended voting schemes
    • Added an adversarial field to the network topology schema, allowing for the simulation of unbounded IB diffusion by dishonest nodes.

Ongoing investigations

  • Investigating the effects of unbounded IB diffusion on IB delivery reliability and protocol performance under such conditions
  • Working on quantifying settlement times and their impact on protocol performance
  • Exploring integration possibilities with Ouroboros Peras, mainly focusing on potentially reusing their voting mechanism to reduce resource consumption.

Additional resources

Weekly Summary – March 24, 2025

· 2 min read
William Wolff
Architect

This week, the Leios team continued working on various aspects of the protocol and its simulation capabilities. They made progress in implementing and testing the Haskell and Rust simulators, focusing on protocol behavior under different network conditions.

Simulation progress

  • Haskell simulation

    • Moved configuration and topology parsers to the leios-trace-hs package for reuse in formal methods
    • Investigated differences in IBs referenced with Rust simulation: identified that inconsistencies were caused by the same sequence of random samples being used across different runs
    • Simplified sortition code by using an external statistics package
    • Tested Full Leios, resolving tension between r_EB/eb-max-age-slots and praos-chain-quality/η
    • Fixed cabal run ols -- generate-topology close-and-random, listing producers properly and decreasing variance in upstream peers.
  • Rust simulation

    • Investigated anomalies in simulation results: identified that earlier IB production failures were caused by low connectivity and lower CPU usage compared to the Haskell simulation
    • Refined Full Leios implementation
    • Added Full Leios support to the visualizer
    • Migrated the visualizer from Next.js to Vite.

Analysis of simulations

  • Tag leios-2025w13: simulated 198 Short Leios scenarios, varying IB production rate, IB size, network topology, CPU limits, and protocol flags
  • CPU limits: analyzed the impact of CPU constraints on IB propagation, finding that diffusion can be affected under stress conditions
  • Vote propagation: compared freshest-first and oldest-first vote propagation, with freshest-first potentially improving IB delivery reliability
  • Extended voting period: compared an extended voting period to a limited one in the Haskell simulation, observing minimal differences except for occasional improvements in reliable vote delivery.

Ongoing investigations

  • Investigating qualitative discrepancies between Haskell and Rust simulation results to determine whether they stem from differences in simulator resolution or simulation infidelities.

Additional resources

Weekly Summary – February 10, 2025

· 3 min read
William Wolff
Architect

This week, the Leios team made significant progress across multiple areas. Major developments included detailed DeltaQ analysis of network topologies, extensive BLS cryptography benchmarking, and improvements to both simulations. The team also explored succinct schemes for BLS key registration and conducted a detailed certificate performance analysis. Both Haskell and Rust simulations received substantial updates to improve visualization and support more realistic testing conditions.

DeltaQ analysis

  • Enhanced the topology-checker with ΔQSD analysis capabilities:
    • Extracts inter-node latencies from given topologies
    • Classifies latencies into near/far components
    • Builds parameterized ΔQ models
    • Outputs fitted models in delta_q web app syntax
  • Key findings from topology analysis:
    • Clear distinction between near/far components in examined topologies
    • Unexpectedly high hop counts in latency-weighted Dijkstra paths:
      • Min 4-5, max 8 for topology - 100
      • Min 8, max 20 for 'realistic' topology
    • Model fitting achieved rough shape matching but showed significant deviations at low latencies
    • Resource usage tracking goals remain unmet due to complexity in understanding load multiplication factors.

BLS cryptography

  • Completed comprehensive benchmarking of certificate operations:
    • Detailed performance analysis across committee sizes (500-1000 seats)
    • Certificate generation: 63.4ms - 92.5ms
    • Certificate verification: 104.8ms - 144.9ms
    • Certificate weighing: ~12ms consistently
  • Explored succinct schemes for key registration:
    • Proposed 90-day key evolution with 124-byte KZG commitments
    • Analyzed message sizes for key opening (316 bytes per pool)
    • Investigated SNARK-based alternatives for proof of possession
  • Added BLS crypto to the CI pipeline with automated testing
  • Documented parallelization strategies for certificate operations.

Formal methods

  • Added a conformance testing client for the executable Short Leios specification
  • Successfully merged the executable specification for Simplified Leios into main.

Haskell simulation

  • Updated configuration defaults for block sizes and timings
  • Added support for idealized simulation conditions:
    • Single-peer block body requests
    • TCP congestion window modeling
    • Mini-protocol multiplexing
    • Unlimited bandwidth links support
  • Enhanced simulation output and analysis:
    • Added raw field for accumulated data
    • Implemented block diffusion CDF extraction
    • Created multi-CDF plotting capabilities.

Rust simulation

  • Enhanced visualization capabilities:
    • Added block size breakdown display
    • Implemented total bytes sent/received tracking
    • Added total TX count and CPU time metrics
  • Improved event handling:
    • Updated to standard timestamp format (seconds)
    • Enhanced CPU task event structure
    • Added CBOR output support
  • Added support for multiple strategies:
    • Implemented ib-diffusion-strategy (freshest-first, oldest-first, peer-order)
    • Added relay-strategy affecting TXs, IBs, EBs, votes, and RBs
    • Enabled unlimited EB and vote bundle downloads from peers.

Weekly Summary – February 3, 2025

· 2 min read
William Wolff
Architect

This week, the Leios team worked on cryptography benchmarking and cost calculator improvements. The team completed a reference implementation for Leios cryptography and enhanced the online cost calculator with user-requested features. They also updated both Haskell and Rust simulations to improve visualization and network modeling capabilities.

Haskell simulation

  • Added support for Send and Receive voting stages, providing:
    • A new leios-vote-send-recv-stages configuration option
    • A configurable stage length via leios-stage-active-voting-slots
  • Implemented multiple diffusion strategies:
    • Added oldest-first strategy
    • Added configurable strategies for IBs, EBs, and votes via *-diffusion-strategy configurations
  • Created a new small scenario for 100 nodes with 2,000 kB links
    • Tuned IB parameters to utilize one-third of link capacity
    • Added configurations for both single-stage and send-recv voting
  • Fixed several simulation behaviors:
    • Improved block generation logic
    • Prevented duplicate EB inclusion in the base chain
    • Confirmed proper EB inclusion timing relative to vote diffusion
  • The main difference observed between single-stage and send-recv is that the former shows a longer tail in the CPU usage CDF when the simulation is run with unlimited cores.

Cryptography implementation

The Rust benchmarks for Leios cryptography were redesigned as a reference implementation:

  • Implemented the Fait Accompli sortition
  • Enhanced sortition to use rational arithmetic instead of quad-precision floats
  • Added Quickcheck tests for all capabilities
  • Added benchmarks for serialization
  • Optimized vote and certificate size.

Cost calculator improvements

The team enhanced the online Leios cost calculator:

  • Added support for both hyperscale and discount cloud providers
  • Made discount providers the default option
  • Added option to amortize storage costs perpetually
  • Updated defaults:
    • Single relay deployment
    • More conservative 50% disk compression
    • Perpetual storage cost amortization.

Throughput simulator

The team updated the Cardano throughput simulator with:

  • The latest cloud-computing cost model
  • Synchronized assumptions with an online cost calculator.

Rust simulation

  • Made minor fixes to the new graph generation strategy
  • Planned out a roadmap for visualization work focusing on the Leios transaction lifecycle.

Weekly Summary – January 27, 2025

· 2 min read
William Wolff
Architect

The Leios team continued refining Haskell and Rust simulations, standardizing inputs, outputs, and event logging for better comparability. The team defined standard formats for configuration parameters and network topology for running the Leios protocol. They also worked on logging identical simulation events to compare and feed them into the DeltaQ model and, consequently, the executable specification, ensuring alignment with formal methods.

Haskell simulation updates

  • The short-leios simulation now outputs diffusion latency data
  • Added support for different input block (IB) diffusion strategies:
    • freshest-first: higher slot numbers requested first
    • peer-order: requested in order of peer announcement
  • Added support for Vote (Send) and Vote (Recv) stages.

Rust simulation progress

  • Added an 'organic' topology generator that better matches mainnet topology
  • The generator creates clusters of colocated stake pools and relays
  • The simulation uses stake to determine relay connectivity
  • Topology insights gathered from stake pool owners:
    • Most pools have multiple relays (2,312 relays across 1,278 pools)
    • Pool operators often run multiple colocated pools sharing relays
    • Relays typically maintain ~25 active outgoing connections
    • Incoming connections scale with stake weight (10-400+ connections).

DeltaQ update

  • Wrote a comprehensive 2025-01 report covering work since September 2024.

Formal methods

  • Finalizing executable specifications for simplified and short Leios
  • Extracted short Leios specification to Haskell for conformance testing.

Weekly Summary – January 20, 2025

· 2 min read
William Wolff
Architect

Simulation progress

Haskell implementation

  • Enhanced parameter handling with support for reading configurations and topologies from disk
  • Added a new generate-topology command for random topology generation
  • Aligned Leios sortition with algorithms from sortition benchmarks and the technical report
  • Completed analysis comparing the Praos simulation with the benchmark cluster
    • Adoption times within 10% of measured values
    • Review of simulation parameters pending
  • Next steps identified:
    • Generate topologies with block producers behind relays
    • Begin comparison with the idealized diffusion model
    • Configure and run simulations for higher throughput.

Rust implementation

  • Completed the first pass of block-level visualization
  • Updated topology files to include baked-in latencies
  • Improved output with human-readable names from the shared topology format
  • Enhanced simulation output comparability across different simulations.

Analysis and research

Sortition analysis

  • Completed a detailed analysis of the 'Fiat Accompli' sortition scheme using mainnet stake distribution (Epoch 535)
  • Key findings for 500-vote committees:
    • 406 largest stake block-producers would be deterministic voters
    • ~88 voters would be randomly selected
    • Significant certificate size reduction achieved through deterministic voter selection.

Downstream impact assessment

Started comprehensive analysis of Leios's impact on the ecosystem:

  • Identified impacts on indexers, explorers, SDKs, and APIs resulting from ledger and node changes
  • Transaction construction and memory-pool sharding effects on DApps and wallets
  • Physical layer visibility considerations for sophisticated use cases
  • High throughput implications for event filtering efficiency
  • Transaction journey time considerations from memory pool to Praos block reference.

DeltaQ analysis

  • Successfully matched ΔQ model for IB diffusion across both simulations and implementations
  • Identified key differences in simulation approaches:
    • Haskell simulation includes bandwidth effects (328ms network delay per hop at 1MB/s)
    • Rust simulation currently excludes bandwidth effects
  • Enabled cross-simulation topology sharing for consistent testing.

Weekly Summary – January 13, 2025

· 2 min read
William Wolff
Architect

Cryptography benchmarks

  • Implemented and benchmarked the complete Leios cryptography suite in the leios_crypto_benchmarks Rust crate
  • Key VRF performance metrics:
    • Proving: 240 µs
    • Verifying: 390 µs
  • Sortition performance (excluding VRF):
    • Leadership checks (RB/IB/EB): 0.17 µs per slot/pipeline
    • Vote number calculation: 3.8 µs per pipeline
  • BLS operations benchmarked:
    • Key possession proof verification: 1.5 ms per key
    • Vote generation/verification: 280 µs / 1.4 ms per vote
    • Certificate operations (300-vote quorum): 50 ms generation, 90 ms verification.

Cryptography design progress

  • Optimized vote signature size to potentially as small as 192 bytes
  • Determined that 500-vote committee certificates (60% quorum) would fit within Praos blocks at ~58 kB
  • Explored potential synergies with KES rotation and Praos VRF BLS keys
  • Completed cryptography sections for the first technical report
  • Decision made to freeze current report content and move new findings to future documents.

Simulation development

Haskell simulation

  • Achieved diffusion latency comparable to benchmark cluster data for Praos blocks
  • Integrated agreed-upon simulation parameters with the Rust team
  • Added event log output functionality with JSON support
  • Implemented 'short-leios' simulation variant matching mainnet ranking block interval
  • Fixed coordination issues in Relay mini-protocol consumers
  • Completed the PI goal by adding total data transmitted per node visualization.

Rust simulation

  • Implemented more granular CPU simulation times
  • Fixed race condition in the simulated clock
  • Started consuming a new shared configuration file format
  • Established a shared configuration format with default parameters in data/simulation/default.yaml.