ResearchHub | Open Science Community

P. Godfrey

Author with expertise in Software-Defined Networking and Network Virtualization

Achievements

Cited Author

Open Access Advocate

Key Stats

Upvotes received: 0

Publications: 10 (50% Open Access)

Cited by: 3,970

h-index: 39 / i10-index: 61

Reputation

Biology: < 1%

Chemistry: < 1%

Economics: < 1%

Publications

Network Coding for Distributed Storage Systems

Alexandros Dimakis et al., Aug 19, 2010

Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes. Application scenarios include data centers, peer-to-peer storage systems, and storage in wireless networks. Storing data using an erasure code, in fragments spread across nodes, requires less redundancy than simple replication for the same level of reliability. However, since fragments must be periodically replaced as nodes fail, a key question is how to generate encoded fragments in a distributed way while transferring as little data as possible across the network. For an erasure coded system, a common practice to repair from a single node failure is for a new node to reconstruct the whole encoded data object to generate just one encoded block. We show that this procedure is sub-optimal. We introduce the notion of regenerating codes, which allow a new node to communicate functions of the stored data from the surviving nodes. We show that regenerating codes can significantly reduce the repair bandwidth. Further, we show that there is a fundamental tradeoff between storage and repair bandwidth which we theoretically characterize using flow arguments on an appropriately constructed graph. By invoking constructive results in network coding, we introduce regenerating codes that can achieve any point in this optimal tradeoff.
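The repair-bandwidth argument in this abstract can be made concrete with a small calculation. The sketch below is illustrative only: the abstract gives no formulas, so the closed forms for the two extreme points of the storage/bandwidth tradeoff (minimum storage and minimum bandwidth) are taken from the commonly cited statement of the regenerating-codes result and should be checked against the paper. The names `M` (file size), `k` (fragments needed to reconstruct), and `d` (surviving nodes contacted during repair) are assumptions of this sketch.

```python
from fractions import Fraction


def naive_repair_bandwidth(M, k):
    """Conventional repair: download k fragments of size M/k, i.e. the whole file."""
    return k * Fraction(M, k)


def msr_point(M, k, d):
    """Minimum-storage point (assumed closed form): returns (storage per node, repair bandwidth)."""
    if d < k:
        raise ValueError("a newcomer must contact at least k nodes")
    alpha = Fraction(M, k)
    gamma = Fraction(M * d, k * (d - k + 1))
    return alpha, gamma


def mbr_point(M, k, d):
    """Minimum-bandwidth point (assumed closed form): storage equals repair bandwidth."""
    if d < k:
        raise ValueError("a newcomer must contact at least k nodes")
    alpha = gamma = Fraction(2 * M * d, 2 * k * d - k * k + k)
    return alpha, gamma


if __name__ == "__main__":
    M, k, d = 1, 5, 9  # hypothetical parameters: unit-size file, 10 nodes, contact 9 survivors
    print("naive repair bandwidth:", naive_repair_bandwidth(M, k))
    print("minimum-storage point (alpha, gamma):", msr_point(M, k, d))
    print("minimum-bandwidth point (alpha, gamma):", mbr_point(M, k, d))
```

With `d = k` the minimum-storage formula reduces to the naive cost `M`, which is a useful sanity check: under these formulas the savings come from contacting more than `k` survivors and downloading less from each.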

Theoretical Computer Science

Computer Networks And Communications

Finishing flows quickly with preemptive scheduling

Chi-Yao Hong et al., Aug 13, 2012

Today's data centers face extreme challenges in providing low latency. However, fair sharing, a principle commonly adopted in current congestion control protocols, is far from optimal for satisfying latency requirements.

Information Systems

Computer Networks And Communications

Network Coding for Distributed Storage Systems

Alexandros Dimakis et al., Jan 1, 2007

Peer-to-peer distributed storage systems provide reliable access to data through redundancy spread over nodes across the Internet. A key goal is to minimize the amount of bandwidth used to maintain that redundancy. Storing a file using an erasure code, in fragments spread across nodes, promises to require less redundancy and hence less maintenance bandwidth than simple replication to provide the same level of reliability. However, since fragments must be periodically replaced as nodes fail, a key question is how to generate a new fragment in a distributed way while transferring as little data as possible across the network. In this paper, we introduce a general technique to analyze storage architectures that combine any form of coding and replication, as well as presenting two new schemes for maintaining redundancy using erasure codes. First, we show how to optimally generate MDS fragments directly from existing fragments in the system. Second, we introduce a new scheme called regenerating codes which use slightly larger fragments than MDS but have lower overall bandwidth use. We also show through simulation that in realistic environments, regenerating codes can reduce maintenance bandwidth use by 25% or more compared with the best previous design - a hybrid of replication and erasure codes - while simplifying system architecture.

Computer Networks And Communications

Computer Science

VeriFlow

Ahmed Khurshid et al., Aug 13, 2012

Networks are complex and prone to bugs. Existing tools that check configuration files and data-plane state operate offline at timescales of seconds to hours, and cannot detect or prevent bugs as they arise.

Computer Networks And Communications

Debugging the data plane with Anteater

Haohui Mai et al., Aug 15, 2011

Diagnosing problems in networks is a time-consuming and error-prone process. Existing tools to assist operators primarily focus on analyzing control plane configuration. Configuration analysis is limited in that it cannot find bugs in router software, and is harder to generalize across protocols since it must model complex configuration languages and dynamic protocol behavior.

Theoretical Computer Science

Low latency via redundancy

Ashish Vulimiri et al., Dec 4, 2013

Low latency is critical for interactive networked applications. But while we know how to scale systems to increase capacity, reducing latency, especially the tail of the latency distribution, can be much more difficult. In this paper, we argue that the use of redundancy is an effective way to convert extra capacity into reduced latency. By initiating redundant operations across diverse resources and using the first result which completes, redundancy improves a system's latency even under exceptional conditions. We study the tradeoff with added system utilization, characterizing the situations in which replicating all tasks reduces mean latency. We then demonstrate empirically that replicating all operations can result in significant mean and tail latency reduction in real-world systems including DNS queries, database servers, and packet forwarding within networks.
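The "use the first result which completes" idea can be illustrated with a toy simulation. This is a minimal sketch under strong assumptions that are not from the paper: latencies are independent exponential draws, and the extra load created by the redundant copies (the tradeoff the abstract studies) is ignored entirely. The function names and parameters are hypothetical.

```python
import random


def simulate(num_requests=10_000, copies=2, mean_latency=1.0, seed=0):
    """Return (single, redundant) latency lists for the same sequence of requests.

    Each request is issued to `copies` independent servers; the redundant
    latency is the fastest reply, the single latency is the first server's reply.
    """
    rng = random.Random(seed)
    single, redundant = [], []
    for _ in range(num_requests):
        draws = [rng.expovariate(1.0 / mean_latency) for _ in range(copies)]
        single.append(draws[0])
        redundant.append(min(draws))
    return single, redundant


def percentile(values, p):
    """Nearest-rank style percentile, adequate for a rough comparison."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(p / 100 * len(ordered)))
    return ordered[index]


if __name__ == "__main__":
    single, redundant = simulate()
    for name, data in (("single", single), ("redundant", redundant)):
        mean = sum(data) / len(data)
        print(f"{name}: mean={mean:.3f} p99={percentile(data, 99):.3f}")
```

Because the redundant latency is a minimum over draws that include the single-copy draw, it can never be worse in this model; in a real system the added utilization can erase the benefit, which is the regime the abstract says the paper characterizes.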

Computer Networks And Communications

Computer Science

DRILL

Soudeh Ghorbani et al., Aug 4, 2017

The trend towards simple datacenter network fabric strips most network functionality, including load balancing, out of the network core and pushes it to the edge. This slows reaction to microbursts, the main culprit of packet loss in datacenters. We investigate the opposite direction: could slightly smarter fabric significantly improve load balancing? This paper presents DRILL, a datacenter fabric for Clos networks which performs micro load balancing to distribute load as evenly as possible on microsecond timescales. DRILL employs per-packet decisions at each switch based on local queue occupancies and randomized algorithms to distribute load. Our design addresses the resulting key challenges of packet reordering and topological asymmetry. In simulations with a detailed switch hardware model and realistic workloads, DRILL outperforms recent edge-based load balancers, particularly under heavy load. Under 80% load, for example, it achieves 1.3-1.4x lower mean flow completion time than recent proposals, primarily due to shorter upstream queues. To test hardware feasibility, we implement DRILL in Verilog and estimate its area overhead to be less than 1%. Finally, we analyze DRILL's stability and throughput-efficiency.
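A per-packet decision "based on local queue occupancies and randomized algorithms" could look roughly like the sketch below. It is an assumption-laden illustration, not DRILL's actual design: it samples `d` random output ports, adds up to `m` ports remembered from the previous decision, and sends the packet to the shortest queue among them. The parameters `d` and `m`, the function names, and the absence of any queue draining are all simplifications introduced here.

```python
import random


def choose_port(queue_lengths, d, remembered, rng):
    """Pick the least-loaded port among d random samples plus remembered ports.

    Returns (chosen_port, candidate_ports). Ties go to the lowest port index.
    """
    sampled = rng.sample(range(len(queue_lengths)), d)
    candidates = sorted(set(sampled) | set(remembered))
    port = min(candidates, key=lambda p: queue_lengths[p])
    return port, candidates


def forward_packets(num_ports, num_packets, d=2, m=1, seed=0):
    """Enqueue packets one at a time and return the final queue lengths.

    Queues are never drained, so each value is simply the number of packets
    assigned to that port.
    """
    rng = random.Random(seed)
    queues = [0] * num_ports
    remembered = []
    for _ in range(num_packets):
        port, candidates = choose_port(queues, d, remembered, rng)
        queues[port] += 1
        # Remember the m least-loaded candidates for the next decision.
        remembered = sorted(candidates, key=lambda p: queues[p])[:m]
    return queues


if __name__ == "__main__":
    print(forward_packets(num_ports=8, num_packets=1000))
```

The sketch leaves out the packet reordering and topological asymmetry problems that the abstract names as the key challenges of per-packet balancing.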

Information Systems

Computer Networks And Communications

TraceWeaver: Distributed Request Tracing for Microservices Without Application Modification

Sachin Ashok et al., Aug 4, 2024

Computer Networks And Communications

Computer Science

Opportunities and Challenges in Service Layer Traffic Engineering

Geun-Jho Lim et al., Nov 11, 2024

Information Systems

Lightweight Automated Reasoning for Network Architectures

Rahul Bothra et al., Nov 11, 2024

Electrical And Electronic Engineering

Computer Networks And Communications
