Version 6 (modified by marie@…, 11 months ago)

--

Computing in the Network (COIN)

Introduction

We are experiencing a convergence between the concepts of networking and computing, triggered not only by the softwarization of networking functions (SDN, NFV) but also by the evolution of the network architecture. The move to distributed computing and networking at the edge is also encouraging the development of local networking and computing facilities to support the low-delay, low-loss services emerging from AR/VR, autonomous vehicles and intelligent/smart cities. Consequently, the idea of a “programmable” network is central to the evolution of the Internet. Programmable data planes are now available with acceptable and steadily improving performance. This evolution is confirmed by programmable switches and abstractions such as P4 in data center networks (DCN), by the rise of virtual network devices in NFV in both DCNs and carrier networks, and by the network processing units (NPUs) in traditional routers, which are also programmable to some degree.

Motivation

Distributed computing in the network provides new opportunities to enhance performance and availability as well as to develop new types of networked applications and systems.

Examples include:

1) Data plane performance: The research community is actively seeking innovations that use in-network compute and cache capabilities and expand the reach of programmable control and data planes to improve the performance of distributed systems. Examples include deep neural network (DNN) training for management, control and adaptation, distributed key-value stores, local loss coding (network and application layer codes) and distributed system consensus (for example Paxos and blockchains). The results show significant (up to 10x) performance improvements compared with centralized solutions, which suffer delay and losses caused by bottlenecks.

2) Decentralized, lightweight and dynamic computing: To complement the more centralized data centers, there is wide interest in decentralized computing such as edge/fog/pervasive (ubiquitous) computing, which profits from proximity to the end user to support local functionality or enhance DC services. In parallel, “serverless” computing and Function as a Service (FaaS) are changing the nature of the client-server model. The logic that was on the server side is moving to the client, while “application servers” are decomposed into “functions” that are triggered by client requests. Unlike a server, a virtual machine or even a container, these “functions” can be lightweight and can be instantiated on the order of 100 ms when they are requested.

All these new trends together are enabling innovative research and even a redefinition of what it means to be “in the network”, and are providing the tools to develop new networking and computing paradigms in a distributed architecture.
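As a toy illustration of the bottleneck argument in example 1) above, the following Python sketch compares how many messages reach a central server when workers send their vectors directly and when intermediate nodes sum them on the way. It is a minimal sketch under stated assumptions, not a description of any system listed in the references: the function names, the fan_in parameter and the two-level topology are hypothetical, and it models only the message count at the server, not delay or loss.

```python
from typing import List, Tuple

Vector = List[int]


def vector_sum(vectors: List[Vector]) -> Vector:
    """Element-wise sum of equally sized vectors."""
    return [sum(column) for column in zip(*vectors)]


def centralized(workers: List[Vector]) -> Tuple[Vector, int]:
    """Every worker sends its vector straight to the server.

    Returns the total and the number of messages the server receives.
    """
    return vector_sum(workers), len(workers)


def in_network(workers: List[Vector], fan_in: int) -> Tuple[Vector, int]:
    """Hypothetical switches each sum up to fan_in worker vectors and
    forward a single partial sum to the server.

    Returns the total and the number of messages the server receives.
    """
    partials = [
        vector_sum(workers[i:i + fan_in])
        for i in range(0, len(workers), fan_in)
    ]
    return vector_sum(partials), len(partials)


if __name__ == "__main__":
    workers = [[1, 2, 3] for _ in range(8)]
    print(centralized(workers))           # ([8, 16, 24], 8)
    print(in_network(workers, fan_in=4))  # ([8, 16, 24], 2)
```

Both calls return the same sum, but with fan_in=4 the server receives 2 messages instead of 8. Whether such a reduction translates into the performance gains reported in the literature depends on the workload and the hardware, which this sketch does not model.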

Research Challenges

The research questions that the COIN group wants to address include but are not limited to:

(1) Even within the traditional "end-to-end argument", will distributed computing in the network provide enough motivation and benefits to justify the introduction of non-forwarding functions into the network?

(2) Will forwarding functions eventually be integrated into the computing paradigm, for example with ML used for route determination?

(3) What level of abstraction must the programmable data plane offer for a network with non-forwarding functions? With new functionalities, the network, as an infrastructure, needs to be decoupled from some applications so that it can remain stable while applications evolve continuously.

(4) What will be the impact of these in-network functions on end-to-end transport protocols and security? Will transport start being hop-by-hop?

(5) With the network as a database, what will be the impact on the privacy of users' data and identities?

(6) What are the economic, social and environmental incentives for the network to add new computing capabilities and resources in an open ecosystem?

COIN Objectives

1) Understanding the use cases and different types of network programmability and their different characteristics (for example, DC switch programmability vs. distributed/edge computing).

2) Investigating architectural questions such as system architecture and protocol designs for in-network computing, for example the interactions of data and control planes, as well as overall system and protocol security.

3) Understanding the relationship to and impact on existing Internet protocols (transport, traffic steering) and frameworks (security, management).

4) Developing common terminology, concepts and potentially system elements such as data plane protocols and management concepts.

5) Providing guidance for potential future IETF work on distributed and in-network computing.

Draft Charter

The COIN research group wants to explore how new programmable data planes and distributed computing can be used to introduce non-forwarding functions and functional federation into networks, in order to improve network and application performance and user experience.

To achieve this goal, the methodology will include specific forward-looking use cases with their outcomes, the trade-offs between the benefits of the new functionality in the network and the extra cost to network devices, related research on edge computing applications that benefit from programmability, and research on applications that could be moved into the network to provide added functionality. Use cases will include the collaboration between centralized and controlled environments such as DCNs and the widely distributed networks characterized by edge/pervasive computing. While it is not mandatory, it is hoped that later in the RG's work the combination of both approaches in a common architecture may lead to common protocols.

The use cases and related research may lead to new architecture and layering designs, compared with traditional architectures in terms of complexity, performance and cost, and may create incentives for research into new abstractions of the data plane and the development of potential new protocols. Finally, the impacts of COIN on transport protocols, security and privacy in different environments, and the incentives for network providers to provide the capabilities and for application developers to use them, will also be investigated.

Scope

(1) Use case analysis/targeted research: DCN, edge networks, IoT networks etc. and the potential benefits to these networks from in-network non-forwarding functions such as computing, caching and management.

(2) Research on solutions to use current and coming programmable network devices to implement non-forwarding functions and demonstrate the relationship between performance and benefit/gains.

(3) Research on novel architectures, new data-plane abstractions and new protocol designs that make full use of the constrained compute and cache capabilities in programmable network devices, and on how to expand those capabilities.

(4) Research on novel architectures, new data-plane abstractions and new network protocol designs that can help to efficiently use the decentralized computing resources inside the network devices in the DC, the core and the edge, or even in the end-user devices.

(5) Research on new transport protocols and new privacy and security mechanisms enabled by in-network non-forwarding functions.

(6) Research on incentive mechanisms to encourage both the network providers to provide the capabilities and the application developers to use them.

Outcomes

COIN wants to build a forum to explore and discuss how the network architectures and protocols will adapt to the introduction of distributed systems and decentralized computing resources. Hence the following outcomes are proposed:

(1) An informative RFC on COIN in Datacenters

(2) An informative RFC on COIN at the Edge

(3) An informative RFC on COIN in Networked Applications

<other specific contributions to be added with the help of the community - to come>

Initial Meeting: IETF 103 Bangkok (10-12 am room Boromphimarn 3)

Meeting Chairs: Jeffrey He (Huawei) and Marie-José Montpetit (TriangleVideo)

Draft Agenda

Welcome and agenda

Review of the agenda (Chairs - 5 minutes)

Presentation of the motivation and draft charter (Jeffrey He - 10 minutes)

Computing in the network

In-network overview (Robert Soule to be confirmed - 10-15 minutes)

draft-he-coin-datacenter-00 (Jeffrey He - 10 minutes)

Topic TBD (Marco Canini - 10-15 minutes)

Topic TBD (Dave Oran - 10-15 minutes)

Applications

Machine Learning (to be confirmed - Rachel Chen - 10 minutes)

draft-montpetit-coin-XR (MJM - 10 minutes)

Next steps (Chairs - 5 minutes)

Documents

In-Network Data-Center Computing - draft-he-coin-datacenter-00, available at https://datatracker.ietf.org/doc/draft-he-coin-datacenter/

References

Chang, Michael Alan, Panda, Aurojit, Bottini, Domenic, Jian, Lisa, Kumar, Pranay and Shenker, Scott, "Network Support for DNN Training", SysML, Feb 2018, Palo Alto, California, https://www.sysml.cc/doc/182.pdf

Dang, Huynh Tu, Sciascia, Daniele, Canini, Marco, Pedone, Fernando and Soulé, Robert, "NetPaxos: Consensus at Network Speed", SOSR 2015, https://mcanini.github.io/papers/netpaxos.sosr15.pdf

Dang, Huynh Tu, Canini, Marco, Pedone, Fernando and Soulé, Robert, "Paxos Made Switch-y", Sigcomm CCR 2016, https://www.sigcomm.org/sites/default/files/ccr/papers/2016/April/0000000-0000002.pdf

Fan, Bin, Lim, Hyeontaek, Andersen, David G. and Kaminsky, Michael, "Small Cache, Big Effect: Provable Load Balancing for Randomly Partitioned Cluster Services", ACM SOCC 2011, www.istc-cc.cmu.edu/publications/papers/2011/loadbal-socc2011.pdf

Foster, Nate, <to come>

Graham, Richard L. et al. (16 authors), "Scalable Hierarchical Aggregation Protocol (SHArP): A Hardware Architecture for Efficient Data Reduction", COM-HPC 2016, https://ieeexplore.ieee.org/document/7830486

Hadoop Distributed File System, http://hadoop.apache.org/

Jin, Xin, Li, Xiaozhou, Zhang, Haoyu, Soulé, Robert, Lee, Jeongkeun, Foster, Nate, Kim, Changhoon and Stoica, Ion, "NetCache: Balancing Key-Value Stores with Fast In-Network Caching", SOSP 2017, https://www.cs.jhu.edu/~xinjin/files/SOSP17_NetCache.pdf

Li, Jialin, Michael, Ellis and Ports, Dan R.K., "Eris: Coordination-Free Consistent Transactions Using In-Network Concurrency Control" (University of Washington), SOSP 2017, https://syslab.cs.washington.edu/papers/eris-sosp17.pdf

Li, Xiaozhou, Sethi, Raghav, Kaminsky, Michael, Andersen, David G. and Freedman, Michael J., "Be Fast, Cheap and in Control with SwitchKV", NSDI 2016, https://dl.acm.org/citation.cfm?id=2930614

P4, p4.org

Ports, Dan R.K., Li, Jialin, Liu, Vincent, Sharma, Naveen Kr. and Krishnamurthy, Arvind, "Designing Distributed Systems Using Approximate Synchrony in Data Center Networks" (University of Washington), NSDI 2015, https://www.usenix.org/node/188949

Rexford, Jennifer, Sigcomm 2018 Keynote Address, https://youtu.be/t_5__v6CNYE?t=4652

Soulé, Robert, Netcompute Workshop Keynote Address, Sigcomm 2018, http://conferences.sigcomm.org/sigcomm/2018/files/slides/netcompute/2018-08-20-sigcomm.pdf

Sapio, Amedeo, Abdelaziz, Ibrahim, Aldilaijan, Abdulla, Canini, Marco and Kalnis, Panos, "In-Network Computation is a Dumb Idea Whose Time Has Come", HotNets 2017, https://dl.acm.org/citation.cfm?id=3152461

Subedi, Tara Nath, Nguyen, Kim Khoa and Chériet, Mohamed, "OpenFlow-based in-network Layer-2 adaptive multipath aggregation in data centers", Computer Communications, Volume 61, May 2015, https://www.sciencedirect.com/science/article/pii/S0140366414003715

István, Zsolt, Sidler, David, Alonso, Gustavo and Vukolić, Marko, "Consensus in a Box", NSDI 2016, https://dl.acm.org/citation.cfm?id=2930639
