Recon: Verifying File System Consistency at Runtime
File system bugs that corrupt file system metadata on disk are
insidious. Existing file-system reliability methods, such as
checksums, redundancy, or transactional updates, merely ensure that
the corruption is reliably preserved. The typical workarounds, based
on using backups or repairing the file system, are painfully slow.
Worse, the recovery is performed long after the original error
occurred and thus may result in further corruption and data loss.
We present a system called Recon that protects file system metadata
from buggy file system operations. Our approach leverages modern file
systems that provide crash consistency using transactional updates. We
define declarative statements called consistency invariants for a file
system. These invariants must be satisfied by each transaction being
committed to disk to preserve file system integrity. Recon checks
these invariants at commit, thereby minimizing the damage caused by
buggy file systems.
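
As a rough illustration of what a commit-time consistency invariant might look like, the sketch below checks one hypothetical rule: within a transaction, a block is newly referenced by file system metadata if and only if it is also newly marked allocated. The `Transaction` layout and the `check_commit` function are assumptions made for this example; they are not Recon's actual interface.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    # Hypothetical summary of one transaction's metadata changes.
    # Blocks whose allocation bit flips from free to allocated.
    newly_allocated: set = field(default_factory=set)
    # Blocks that metadata pointers start referencing.
    newly_referenced: set = field(default_factory=set)


def check_commit(txn):
    """Return a list of violations; an empty list means the commit may proceed."""
    violations = []
    for block in sorted(txn.newly_referenced - txn.newly_allocated):
        violations.append(f"block {block} referenced but not marked allocated")
    for block in sorted(txn.newly_allocated - txn.newly_referenced):
        violations.append(f"block {block} marked allocated but not referenced")
    return violations
```

A checker along these lines would refuse to let a transaction reach disk whenever the returned list is non-empty, which is the point at which the damage is still contained.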
The major challenges to this approach are specifying invariants and
interpreting file system behavior correctly without relying on the
file system code. Recon provides a framework for file-system specific
metadata interpretation and invariant checking. We show the
feasibility of interpreting metadata and writing consistency
invariants for the Linux ext3 file system using this framework. Recon
can detect random as well as targeted file-system corruption at
runtime as effectively as the offline e2fsck file-system checker, with
Cosmic Rays Don't Strike Twice: Understanding the Nature of DRAM Errors and the Implications for System Design
Main memory is one of the leading hardware causes for machine crashes
in today's datacenters. Designing, evaluating and modeling systems that
are resilient against memory errors requires a good understanding of the
underlying characteristics of errors in DRAM in the field. While there
have recently been a few first studies on DRAM errors in production systems,
these have been too limited in either the size of the data set or the
granularity of the data to conclusively answer many of the open questions
on DRAM errors. Such questions include, for example, the prevalence of soft
errors compared to hard errors, or the analysis of typical patterns of hard errors.
In this project, we study data on DRAM errors collected on a diverse range of
production systems in total covering nearly 300 terabyte-years of main memory.
As a first contribution, we provide a detailed analytical study of DRAM error
characteristics, including both hard and soft errors. We find that a large
fraction of DRAM errors in the field can be attributed to hard errors and we
provide a detailed analytical study of their characteristics. As a second contribution,
we use the results from the measurement study to identify a number of promising
directions for designing more resilient systems and evaluate the potential of
different protection mechanisms in light of realistic error patterns. One of our
findings is that simple page retirement policies might be able to mask a large number
of DRAM errors in production systems, while sacrificing only a negligible fraction
of the total DRAM in the system.
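
To make the page retirement idea concrete, here is a minimal sketch that replays a stream of error reports and counts how many would have been masked if a page were retired after a fixed number of errors. The function name, the `threshold` parameter, and the input format are assumptions for illustration; the policies evaluated in the study may differ.

```python
from collections import Counter


def simulate_retirement(error_pages, threshold=1):
    """Replay page numbers that reported errors, in order of occurrence.

    A page is retired once it has reported `threshold` errors; any later
    error on a retired page counts as masked.
    Returns (masked_errors, retired_pages).
    """
    seen = Counter()
    retired = set()
    masked = 0
    for page in error_pages:
        if page in retired:
            masked += 1
            continue
        seen[page] += 1
        if seen[page] >= threshold:
            retired.add(page)
    return masked, retired
```

With a trace in which errors repeat on the same few pages, as hard errors tend to, most errors after the first are masked while only a handful of pages are given up.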
Understanding Network Failures in Data Centers: Measurement, Analysis, and Implications
We present the first large-scale analysis of failures in a data center network.
Through our analysis, we seek to answer several fundamental questions: which
devices/links are most unreliable, what causes failures, how do failures impact
network traffic and how effective is network redundancy? We answer these questions
using multiple data sources commonly collected by network operators. The key
findings of our study are that (1) data center networks show high reliability,
(2) commodity switches such as ToRs and AggS are highly reliable, (3) load
balancers dominate in terms of failure occurrences with many short-lived
software-related faults, (4) failures have potential to cause loss of many small packets
such as keep alive messages and ACKs, and (5) network redundancy is only 40%
effective in reducing the median impact of failure.
Comprehensive Kernel Instrumentation via Dynamic Binary Translation
Dynamic binary translation (DBT) is a powerful technique that enables fine-grained
monitoring and manipulation of an existing program binary. At the user level, it
has been employed extensively to develop various analysis, bug-finding, and security
tools. Such tools are currently not available for operating system (OS) binaries
since no comprehensive DBT framework exists for the OS kernel. To address this problem,
we have developed a DBT framework that runs as a Linux kernel module, based on the
user-level DynamoRIO framework. Our approach is unique in that it controls all kernel
execution, including interrupt and exception handlers and device drivers, enabling
comprehensive instrumentation of the OS without imposing any overhead on user-level
code. In this paper, we discuss the key challenges in designing and building an
in-kernel DBT framework and how the design differs from user-space.
We use our framework to build several sample instrumentations, including simple
instruction counting as well as an implementation of shadow memory for the kernel.
Using the shadow memory, we build a kernel stack overflow protection tool and a
memory addressability checking tool. Qualitatively, the system is fast enough and
stable enough to run the normal desktop workload of one of the authors for several weeks.
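
The shadow memory idea behind the addressability checker can be modeled in a few lines. The toy class below keeps one flag per byte of a small address space and rejects accesses that touch any unaddressable byte. It is only a user-level model with hypothetical names (`ShadowMemory`, `mark`, `check_access`); the in-kernel tool works on real memory accesses via instrumentation.

```python
class ShadowMemory:
    """Toy model: one addressability flag per byte of a small address space."""

    def __init__(self, size):
        # 0 = unaddressable, 1 = addressable
        self.addressable = bytearray(size)

    def mark(self, start, length, addressable):
        """Mark [start, start + length) as addressable or unaddressable."""
        if start < 0 or length < 0 or start + length > len(self.addressable):
            raise ValueError("range outside the modeled address space")
        value = 1 if addressable else 0
        self.addressable[start:start + length] = bytes([value]) * length

    def check_access(self, addr, length):
        """Return True only if every accessed byte is addressable."""
        if addr < 0 or addr + length > len(self.addressable):
            return False
        return all(self.addressable[addr:addr + length])
```

An allocator hook would call `mark(..., True)` on allocation and `mark(..., False)` on free, and every instrumented load or store would call `check_access` first.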
Beom Heyn (Ben) Kim
Unity-VM: a Single system image for an individual PC user
Today, PC users have a hard time managing their devices, since a user now
usually owns more than one or two PCs, unlike a few years ago. As
technology evolves, the price of PCs drops quickly while their
capabilities grow significantly. Users can enjoy the convenience of having
multiple devices for different purposes: desktop PCs for
performance-intensive tasks, laptops for versatility and mobile devices
for mobile computing. However, this brings the problem of managing every
device, each of which has a different and separately
managed system image including OS, Apps and all user data as well as
To deal with this problem, we are implementing Unity-VM, a tool that
provides a single system image across a user’s PC devices by migrating a
VM containing the whole personal computing environment. Because the entire
system image contained in the VM migrates along with the user, the user
can work on every device through a single computing environment and has
only one environment to manage.
Previous systems such as ISR and The Collective have used VM migration to
solve a similar problem. Yet they relied on either a central repository to
hold the suspended image of the machine or portable storage devices such
as USB keys. Above all, they cannot handle a distributed VM image. We
believe this limitation is due to the lack of a synchronization mechanism
for the VM image, including CPU state, memory state and disk state.
Because of this, existing solutions cannot support instant device
switching for users. They also depend on the reliability of the storage
medium used as the central repository of the VM image.
To resolve the disk consistency issue and support instant device
switching, Unity-VM keeps track of the location of the latest blocks
distributed over multiple devices and supports on-demand fetching of the
proper version of data objects. Unity-VM uses a centralized metadata
server, the directory server, to keep the location information. We will
also support automatic replication of each page of the VM image across
multiple nodes to improve reliability.
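
One way to picture the directory server is as a map from block number to the device holding the newest version. The sketch below is a guess at the general shape under that assumption; the class name, method names, and the simple per-block version counter are all hypothetical and not taken from the Unity-VM design.

```python
class DirectoryServer:
    """Tracks, for each block, the newest version and the device that holds it."""

    def __init__(self):
        self._latest = {}  # block number -> (version, device name)

    def record_write(self, block, device):
        """Register that `device` wrote a new version of `block`."""
        version = self._latest.get(block, (0, None))[0] + 1
        self._latest[block] = (version, device)
        return version

    def locate(self, block):
        """Return (version, device) for the newest copy, or None if unknown."""
        return self._latest.get(block)
```

A device that is missing a block, or holds a stale version, would ask `locate` where the newest copy lives and fetch it on demand from that device.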
Jettison: Efficient Idle Desktop Consolidation with Partial VM Migration
Idle desktop systems are frequently left powered, often because of
applications that maintain network presence or to enable potential
remote access. Unfortunately, an idle PC consumes up to 60% of its
peak power. Solutions have been proposed that perform consolidation
of idle desktop virtual machines. However, desktop VMs are often
large, requiring gigabytes of memory. Consolidating such VMs creates
bulk network transfers lasting on the order of minutes, and utilizes
server memory inefficiently. When multiple VMs migrate simultaneously,
each VM's experienced migration latency grows, and this limits the use
of VM consolidation to environments in which only a few daily migrations
are expected for each VM. This paper introduces Partial VM Migration,
a technique that transparently migrates only the working set of an
idle VM. Jettison, our partial VM migration prototype, can deliver
85% to 104% of the energy savings of full VM migration, while using
less than 10% as much network resources, and providing migration
latencies that are two to three orders of magnitude smaller.
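
The saving from moving only the working set can be illustrated with a toy model in which pages stay on the desktop and are copied to the consolidation server the first time the idle VM touches them. Fetching on first touch is an assumption made for this sketch, as are the names `PartialVM`, `read`, and `transferred_fraction`; it is not a description of how Jettison is implemented.

```python
class PartialVM:
    """Toy model of a VM whose pages are fetched only when first touched."""

    def __init__(self, home_pages):
        self._home = home_pages  # page number -> contents, left on the desktop
        self.local = {}          # pages already copied to the server

    def read(self, page):
        if page not in self.local:
            # First touch: copy this one page over the network.
            self.local[page] = self._home[page]
        return self.local[page]

    def transferred_fraction(self):
        """Fraction of the VM's pages that have crossed the network so far."""
        return len(self.local) / len(self._home)
```

If an idle VM only ever touches a small share of its pages, the transferred fraction stays small, which is the intuition behind the reduced network usage and migration latency.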
Billy Yi Fan Zhou
Information Leak Prevention for Android
Mobile devices play an increasingly important role in people’s lives. They are
entrusted with large amounts of private data, but their mobility also makes them
prone to loss and theft. The danger is even greater for mobile devices used in
the corporate or government environment where they hold valuable data and are a
target for espionage. Our research seeks to move sensitive user data off the
mobile device in order to prevent information leakage in the event of loss or theft.
Our first approach involves running Android on a secure server while streaming
the screen to a mobile device using a custom VNC protocol. Our second approach
involves partitioning Android applications such that the user interface portion
of the application runs on the mobile device while the portions of the application
containing sensitive information run in a server environment.