Opened 3 years ago

Last modified 2 years ago

#24 new defect

Evaluation / Testing results

Reported by: wes@…
Owned by: draft-ietf-tsvwg-l4s-arch@…
Priority: major
Milestone: L4S Suite - WGLC Preparation
Component: l4s-arch
Version:
Severity: -
Keywords:

Description (last modified by wes@…)

There are open questions about the testing results, and about which specific scenarios and comparisons to examine.

  • Typically comparisons against CUBIC have been used
  • There is interest in comparisons against other algorithms, including delay-based ones (e.g. BBRv2)

Bob has shared a draft paper submission.

Sebastian, Dave, and others have discussed desired test scenarios of interest.

During the IETF 105 week, a testbed was set up for collaborative testing of L4S and SCE.

Some working group participants are interested in making results easier to reproduce and compare.

This is closely related to the issue on implementation status.

This was discussed at IETF 105 (issue "I").

Change History (3)

comment:1 Changed 3 years ago by wes@…

  • Description modified (diff)
  • Summary changed from Testing results to Evaluation / Testing results

comment:2 Changed 3 years ago by chromatix99@…

The biggest issue is the lack of testing against CoDel AQMs, especially in competition with conventional TCPs.

comment:3 Changed 2 years ago by ietf@…

Here is a brief historical catch-up of the evaluation results published in the 4 years before you started taking note, including comparisons of DualPI2 with FQ_CoDel and using 'conventional TCPs'. Before even bringing L4S to the IRTF (in Mar 2015), let alone the IETF (in Jul 2015), we took great care to conduct tens of thousands of tests over a range of link rates, RTTs and traffic models, and to publish our full experimental setup so that our tests could be independently validated.

In Mar 2015, we presented an evaluation of the DualQ Coupled AQM in the IRTF iccrg. However, these results did not include a comparison with CoDel or FQ_CoDel, or the use of DCTCP/TCP Prague with FQ_CoDel or CoDel.

In Jun 2015, we published a tech report giving an extensive comparative evaluation of DualQ against FQ_CoDel (and other AQMs). It compares the use of DCTCP and Cubic, together and separately, in each AQM, for a matrix of different numbers of each flow and other short-flow traffic models. Many more experiments were conducted than those presented in that summary, with similar results. They covered Reno as well as Cubic, both ECN-capable and not, and the PIE and ARED AQMs, in all the AQM/CC combinations not shown as well as those shown. The presentation in the AQM WG proceedings points to this paper on its first slide and includes numerous slides summarizing these comparative evaluations. There was also a live demo in that session, switching between DualQ and FQ_CoDel, and a side-by-side demonstration. The whole session was recorded in the IETF meetecho service, and the recording is linked from the L4S landing page (although that link is currently timing out). These results were also referred to repeatedly up to and during the L4S BoF in Jul 2016.

Over the years since then, as each minor change was made to the code, we always checked for regression against these results. In Jul 2019, we posted a link to an updated paper on the tsvwg and other lists; it is also linked from the L4S landing page. As before, it compares DualPI2 with FQ_CoDel and other AQMs. To summarize the very large number of experiments, it gives the salient parameters of the statistical distribution of the results, rather than the numerous CDF plots and time series plots in the previous paper.
