MixApart: Decoupled Analytics for Shared Storage Systems
We present MixApart, a scalable data processing framework for shared
enterprise storage systems. With MixApart, a single consolidated storage
back-end manages enterprise data and services all types of workloads, thus
simplifying data management and lowering hardware costs for enterprises.
In addition, MixApart enables the local storage performance required by
data analytics through an integrated data caching and scheduling solution.
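To make the integrated caching and scheduling idea concrete, here is a
minimal sketch of a cache-aware placement policy: prefer a compute node
that already caches a task's input, and otherwise let the chosen node
fetch (and cache) the data from the shared storage back-end. The Node
class, the place_task helper, and the policy itself are our own
illustrative assumptions, not MixApart's actual interfaces.

    from dataclasses import dataclass

    @dataclass
    class Node:
        node_id: str
        free_slots: int
        load: float = 0.0

    def place_task(input_block, nodes, cache_map):
        """Pick a compute node for a task that reads input_block.
        cache_map maps a block id to the set of node ids caching it."""
        cached_on = cache_map.get(input_block, set())
        # Prefer a node that already holds the input in its local cache.
        for node in nodes:
            if node.node_id in cached_on and node.free_slots > 0:
                return node, "cache-local"
        # Otherwise run on the least-loaded free node; it fetches (and
        # caches) the block from the consolidated storage back-end.
        node = min((n for n in nodes if n.free_slots > 0), key=lambda n: n.load)
        return node, "fetch-from-shared-storage"

    nodes = [Node("n1", 2, load=0.3), Node("n2", 1, load=0.1)]
    print(place_task("blk-42", nodes, {"blk-42": {"n1"}}))  # n1, "cache-local"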
Madalin Mihailescu has recently defended his PhD thesis in the
Department of Computer Science at the University of Toronto, under the
supervision of Professor Cristiana Amza. He is currently employed at
NetApp in Boston. His PhD research focused on the design of a new
Hadoop-like MapReduce framework that allows the compute tier to scale
independently of the storage tier by placing the compute tier's data
in a cache that can be deployed anywhere, including in the cloud.
James Zhen Huang
Ocasta: Precise and Efficient Configuration Recovery for Desktop Applications
Configuration errors are a leading cause of desktop system failures.
To recover from a configuration error, end users typically resort to
snapshot recovery or to reset recovery (resetting all configuration
settings of the faulty application to their default values). However,
both approaches inflict collateral damage: the user loses configuration
changes unrelated to the configuration error.
Ocasta addresses this problem with two novel techniques: configuration
setting clustering and semi-automatic search. We evaluated Ocasta with
traces of real application usage collected from 24 Linux desktops and
5 Windows desktops. Ocasta successfully fixed all 18 injected
configuration errors across 10 Linux and Windows desktop applications,
with on average 87% and 95% less collateral damage than snapshot
recovery and reset recovery, respectively.
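To make the clustering idea concrete, below is a minimal sketch of one
plausible way to group settings by co-modification in a usage trace, so
that recovery rolls back only the cluster containing the faulty setting
rather than every setting. The trace format, the time-window heuristic,
and the function names are our own assumptions; the abstract does not
specify Ocasta's actual algorithm.

    def cluster_settings(trace, window=5.0):
        """Group settings modified close together in time.
        trace  : list of (timestamp_seconds, setting_key) events, sorted by time.
        window : events within this many seconds of the previous one are
                 treated as part of the same configuration change.
        Returns a mapping setting_key -> cluster representative."""
        parent = {}                              # union-find over setting keys

        def find(k):
            parent.setdefault(k, k)
            while parent[k] != k:
                parent[k] = parent[parent[k]]    # path compression
                k = parent[k]
            return k

        prev_time, prev_key = None, None
        for t, key in trace:
            find(key)
            if prev_key is not None and t - prev_time <= window:
                parent[find(key)] = find(prev_key)
            prev_time, prev_key = t, key
        return {k: find(k) for k in parent}

    def settings_to_roll_back(trace, faulty_setting, window=5.0):
        """Return only the settings in the faulty setting's cluster."""
        clusters = cluster_settings(trace, window)
        target = clusters.get(faulty_setting, faulty_setting)
        return {k for k, c in clusters.items() if c == target}

    trace = [(0.0, "editor.font"), (1.0, "editor.theme"), (120.0, "proxy.host")]
    print(settings_to_roll_back(trace, "editor.theme"))  # font and theme only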
Zhen Huang is a fourth-year PhD student working with Professor David
Lie. He finished his Master's degree under the supervision of Professor
David Lie at the University of Toronto in 2009. In his past life, he
worked as a senior software developer at EMC, SAP Canada, and the Bank
of China. His research interests include software reliability, software
security, and operating systems.
Base-Delta-Immediate Compression: Practical Data Compression for On-Chip Caches
Cache compression is a promising technique to increase on-chip
cache capacity and to decrease on-chip and off-chip bandwidth
usage. Unfortunately, directly applying well-known compression
algorithms (usually implemented in software) leads to high hardware
complexity and unacceptable decompression/compression latencies,
which in turn can negatively affect performance. Hence,
there is a need for a simple yet efficient compression technique that
can effectively compress common in-cache data patterns, and has
minimal effect on cache access latency.
In this work, we introduce a new compression algorithm called
Base-Delta-Immediate (BΔI) compression, a practical technique
for compressing data in on-chip caches. The key idea is that, for
many cache lines, the values within the cache line have a low
dynamic range, i.e., the differences between values stored within
the cache line are small. As a result, a cache line can be represented
using a base value and an array of differences whose combined size
is much smaller than the original cache line (we call this the base+delta
encoding). Moreover, many cache lines intersperse such base+delta
values with small values - our BΔI technique efficiently incorporates
such immediate values into its encoding.
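To illustrate the encoding, below is a minimal software sketch of a
base+delta+immediate encoder for a single cache line, assuming 8-byte
words and 1-byte deltas; the function name, the fixed widths, and the
per-word mask are our own simplifications, whereas the BΔI hardware
evaluates several fixed (base size, delta size) configurations in
parallel and picks the best one.

    def bdi_encode(words, delta_bytes=1):
        """Try to encode a cache line (a list of fixed-width integer words)
        as one explicit base plus a small signed delta per word. Words that
        are themselves small ("immediates") are encoded as deltas from an
        implicit base of zero. Returns (base, deltas, uses_base), or None
        if some word cannot be expressed with the chosen delta width."""
        limit = 1 << (8 * delta_bytes - 1)       # signed range [-limit, limit)
        # The explicit base is the first word that is not a small immediate.
        base = next((w for w in words if not -limit <= w < limit), 0)
        deltas, uses_base = [], []
        for w in words:
            if -limit <= w - base < limit:       # delta from the explicit base
                deltas.append(w - base)
                uses_base.append(True)
            elif -limit <= w < limit:            # small immediate (base 0)
                deltas.append(w)
                uses_base.append(False)
            else:
                return None                      # not compressible at this width
        return base, deltas, uses_base

    # Eight 8-byte words mixing nearby pointers and small flags compress from
    # 64 bytes to one 8-byte base + eight 1-byte deltas + a one-byte mask.
    line = [0x7FFD00001000, 0x7FFD00001008, 0, 1,
            0x7FFD00001040, 2, 0x7FFD00001010, 3]
    print(bdi_encode(line))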
Our studies show that, compared to prior cache compression approaches,
BΔI strikes a sweet spot in the tradeoff between compression ratio,
decompression/compression latencies, and hardware complexity.
Our results show that BΔI compression improves performance for both
single-core (8.1% improvement) and multi-core workloads (9.5% / 11.2%
improvement for two/four cores). For many applications, BΔI provides
the performance benefit of doubling the cache size of the baseline system,
effectively increasing average cache capacity by 1.53X.
Gennady Pekhimenko is a third-year PhD student in the Computer Science
Department at Carnegie Mellon University, working with Professor Todd C.
Mowry and Professor Onur Mutlu. Before that (2008-2010), he worked at the
IBM Toronto Lab (Compilers Group) under the supervision of Dr. Yaoqing Gao.
He received his M.Sc. in Computer Science in 2008 from the University of
Toronto, where he was advised by Professor Angela Demke Brown, and his
B.S. in Applied Mathematics and Computer Science from Moscow State
University in 2004. His research interests are in computer architecture
with emphasis on efficient memory hierarchy designs. Gennady's work is
funded by Microsoft Research and NSERC CGS-D fellowships.
Thomas Sewell, University of New South Wales, Sydney, Australia
Verifying an OS Kernel down to the Binary Level
Bugs in crucial software systems are a constant, unfortunate nuisance. Formal
verification is an attractive theoretical solution to the problem, but one
which has been applied to only a tiny number of real programs. This talk begins
with a recap of one such verification: the proof of functional correctness of
the seL4 microkernel. We will take a brief look at the big picture of this
proof and some aspects of seL4's design that made the problem tractable.
The functional correctness proof for seL4 initially assumed the correctness of
the compiler, hardware control routines and initialization code. The body of
this talk will focus on recent work which addresses the compiler correctness
assumption by showing that the semantics of the compiled binary matches the
semantics of the C source which was assumed in the earlier proof. We target the
Cambridge ARM semantics, gcc (4.5.1) at optimization level 1, and the
previously-verified parts of seL4. We use a combination of proof-producing
conversions in Isabelle/HOL and HOL4 and a custom correspondence finder for
assembly-like programs powered by modern SMT solvers.
Thomas is a PhD student at the University of New South Wales in Sydney,
Australia. He is working on extensions to the verification of the seL4
microkernel. He was a core team member of the L4.verified project that
proved the functional correctness of the kernel. Since the completion of
that project, he has extended the functional correctness proof to include
the fastpath and worked on an access control model for seL4. He is now
working on eliminating the compiler correctness assumption from the seL4
proof by establishing that the compilation was faithful to the assumed
semantics of the C code. His upcoming PLDI paper, "Translation validation
for a verified OS kernel", can be found here:
http://www.nicta.com.au/pub?doc=6449
Reading between the lines of failure logs: Understanding how HPC systems fail
As the component count in supercomputing installations continues to increase,
system reliability is becoming one of the major issues in designing HPC systems.
These issues will become more challenging in future Exascale systems, which are
predicted to include millions of CPU cores. Efficiently running those systems will
require a good understanding of how different factors impact system reliability.
In this work we use a decade's worth of field data made available by Los Alamos
National Laboratory to study the impact of a diverse set of factors on the reliability of HPC systems.
We provide insights into the nature of correlations between failures, and we investigate the
impact of factors such as power quality, temperature, fan and chiller reliability, and system
usage and utilization, as well as external factors such as cosmic radiation, on system reliability.
Nosayba is a third-year PhD student in the Department of Computer Science, working
under the supervision of Professor Bianca Schroeder. Her research focuses on collecting
and analyzing large-scale field data from production machines and applying rigorous statistical
and analytical methods to explore opportunities for reliability and energy-efficiency improvements.