Jim Larus [ video coming soon ]
December 15, 2016
With the end of Dennard scaling and the imminent end of Moore’s Law, there is increasing interest in making better use of the billions of transistors on modern chips, so we can continue to address new and challenging problems that require large amounts of computation. Reconfigurable computing, which uses programmable logic devices such as FPGAs, is attracting increased attention. At Microsoft, I helped build Catapult, an FPGA accelerator for the Bing search engine, which increased its performance two-fold and led to the widespread adoption of FPGAs in Microsoft’s data centers. The process of building Catapult exposed some significant challenges, which are a focus of my research. FPGAs are not a one-for-one replacement for processors, since programming an FPGA is closer to hardware design than software development. Also, FPGAs are low-level devices that need to be integrated into a system, so that data is delivered at a sufficient speed, failures are handled, and tradeoffs involving cost and power are appropriate. And, finally, reconfigurable computing is not appropriate for all problems, so how do you evaluate its potential benefit before building a full system? Widespread adoption of reconfigurable computing awaits better solutions to these challenges. With Ed Bugnion and several students, I am investigating hardware acceleration for genomic data processing. This is a new application domain with large amounts of data and significant computation: an excellent domain in which to explore solutions to the problems of reconfigurable computing and also to build practical systems to accelerate research in personalized medicine.
Bryan Ford [ video coming soon ]
November 24, 2016
Security-critical online authorities - such as directories, notaries, and software update services - often represent high-value targets through which attacks can indirectly affect millions of devices at once. The DEDIS lab’s Collective Authority or “Cothority” project aims to decentralize critical authorities and split their trust among many - potentially thousands - of independent parties, remaining secure unless many participants collude. As a first step, our collective signing or CoSi protocol strengthens the transparency of existing authorities by ensuring that any authoritative statement is widely validated and logged by diverse witnesses before any client will accept it. We build on CoSi’s scalable signing techniques to create ByzCoin, a blockchain architecture with strong consistency, Byzantine fault tolerance, high throughput, and low transaction latencies. Finally, we outline RandHound and RandHerd, new protocols for scalable bias-resistant distributed randomness that could enable more trustworthy lotteries, elections, and even more scalable sharded blockchains.
Frédéric Kaplan [ video ]
November 17, 2016
I will report on our ongoing investigations of three large-scale cultural heritage datasets: 4 million Swiss newspaper articles covering a two-hundred-year period, 1 million photographs of artworks currently under digitisation at the Cini Foundation, and the Venice Time Machine's continuously expanding corpora covering documents from a thousand-year period. The Swiss newspaper archive is sufficiently large to test word embedding methods like Word2Vec, and to study how they perform in diachronic contexts in which words progressively change meanings as language itself evolves. On the artwork databases we are using convolutional neural networks to find similarities between paintings, engravings, drawings and sculptures, and designing architectures for efficiently spotting matching details. Eventually, we combine these two approaches to try to crack one of the hardest problems of the Venice Time Machine: the direct projection of graphical forms into semantic spaces without passing through the currently impractical full textual transcription of the digitised documents.
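The diachronic analysis described above can be sketched with a toy drift measure: train one embedding per time slice, align the spaces, and compare a word's vectors across periods. The vectors below are invented stand-ins for illustration, not the lab's actual pipeline.

```python
import math

def cosine(u, v):
    # Cosine similarity between two dense word vectors.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(a * a for a in v)))

# Hypothetical embeddings of the same word trained on two time slices
# of the corpus (toy 4-d vectors; real Word2Vec vectors have hundreds
# of dimensions, and the two spaces must first be aligned, e.g. with
# an orthogonal Procrustes mapping).
vec_1850 = [0.9, 0.1, 0.0, 0.2]
vec_1950 = [0.2, 0.8, 0.5, 0.1]

# A low cross-period similarity flags a candidate semantic shift.
drift = 1.0 - cosine(vec_1850, vec_1950)
print(f"semantic drift score: {drift:.3f}")
```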
Edouard Bugnion [ video ]
October 27, 2016
The conventional wisdom is that aggressive networking requirements, such as high packet rates for small messages and μs-scale tail latency, are best addressed outside the kernel, in a user-level networking stack. This talk presents IX, a dataplane operating system that provides high I/O performance and high resource efficiency while maintaining the protection and isolation benefits of existing kernels. IX uses hardware virtualization to separate management and scheduling functions of the kernel (control plane) from network processing (dataplane). The dataplane architecture builds upon a native, zero-copy API and optimizes for both bandwidth and latency by dedicating hardware threads and networking queues to dataplane instances, processing bounded batches of packets to completion, and by eliminating coherence traffic and multi-core synchronization. The control plane dynamically adjusts core allocations and voltage/frequency settings to meet service-level objectives. I will close with some open research questions that my group is actively working on.
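The run-to-completion, bounded-batch discipline at the heart of the dataplane can be illustrated with a minimal sketch; the packet pipeline and batch size below are placeholders, not IX's actual implementation.

```python
from collections import deque

BATCH_LIMIT = 16   # bound the batch so tail latency stays predictable

def process(pkt):
    # Stand-in for the full RX -> protocol -> application -> TX
    # pipeline; here we simply "echo" the packet.
    return ("tx", pkt)

def dataplane_iteration(rx_queue):
    """One run-to-completion iteration: take at most BATCH_LIMIT
    packets and carry each through the whole pipeline before
    returning, with no intermediate queueing between stages."""
    batch = []
    while rx_queue and len(batch) < BATCH_LIMIT:
        batch.append(rx_queue.popleft())
    return [process(p) for p in batch]

rx = deque(range(40))          # 40 pending packets
completed = []
while rx:
    completed.extend(dataplane_iteration(rx))
print(len(completed), "packets processed in bounded batches")
```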
Friedrich Eisenbrand [ video | slides ]
November 22, 2012
Our research at DISOPT focuses on the design and analysis of algorithms for discrete optimization problems. Discrete optimization problems are ubiquitous and often a serious hurdle on the way to an efficient tool. We contribute theoretical results as well as applications of our research to real-world problems. I will present some of our results in this area and describe some current challenges that drive us.
Michael Gastpar [ video | slides ]
May 31, 2012
Classical experiments in neuroscience have measured the response of a single neuron to a stimulus or a class of stimuli. Over the past decades, however, large datasets have emerged owing to new measurement techniques such as implanted tungsten microwire arrays and ECG. In classical studies, the natural problem is one of system identification, namely, the response characteristics of a single neuron. With contemporary data sets, simultaneously involving multiple neurons, many more questions can be asked, pertaining not only to the response characteristics, but also to the network behavior of the neurons. Fundamental properties like redundancy in neuron populations, causality and graph structure in networks of neurons, and functional relationships between multiple neurons can be estimated from data. Methods are emerging that can handle, and exploit the advantages of, large amounts of measurement data from multiple neurons. In this talk, we discuss statistical methods based on information measures such as mutual information, directed information, "information bottleneck," and Shannon-optimal communication.
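As a small illustration of the information measures mentioned above, the mutual information between a stimulus and a neuron's response can be computed directly from a joint distribution; the toy spike-count pmf below is invented for the example.

```python
import math

def mutual_information(joint):
    """Mutual information I(X;Y) in bits, from a joint pmf given as a
    dict {(x, y): p}. Marginals are derived from the joint."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# Toy pmf for a binary stimulus X and one neuron's binarized response
# Y: the response copies the stimulus with probability 0.9, so
# I(X;Y) = 1 - H(0.1) ~ 0.531 bits.
joint = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}
print(f"I(X;Y) = {mutual_information(joint):.3f} bits")
```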
Babak Falsafi [ video | slides ]
May 24, 2012
Information technology is now an indispensable pillar of modern-day society, thanks to the proliferation of digital platforms in the past several decades. The demand for robust and economical data processing and storage, however, is growing faster than technology can sustain. Moreover, while forecasts indicate that chip density scaling will continue for another decade, the diminishing returns in supply voltage scaling and the impending “energy wall” are leading server designers towards energy-centric solutions. In this talk, I will motivate the end of energy scaling in servers from parallelism alone, and then project promising research avenues to help scale beyond existing chip architectures based on emerging cloud and data-center workloads.
Matthias Grossglauser [ video | slides ]
February 23, 2012
Our group is broadly interested in models and algorithms for dynamic networks. Interesting dynamics arise, for example, as the result of discovery and sampling processes on networks, of network formation and evolution, and of the mobility of wireless nodes. We give an overview of several problems in this area. First, we seek methods for the efficient forwarding and routing of messages in mobile wireless networks, whose channels and topology are unpredictable and dynamic. Second, we provide an explanation for the phenomenon of network densification in many natural and technical networks. Third, we discuss network privacy, where an adversary tries to break node anonymity through structural side information.
Christina Fragouli [ video | slides ]
December 15, 2011
In my lab, we are interested in how we can reliably, efficiently and securely transfer information over networks. In network coding, in contrast to current practice, network nodes code across independent information streams, even if these streams are intended for different receivers. This paradigm shift drives us to rethink fundamental principles of information flow in networks, and raises exciting research directions; we will give examples of the challenge (and fun) involved in looking at such problems.
Anastasia Ailamaki [ video | slides ]
December 1, 2011
We are in the era of data deluge: Everyone complains about how much data they have and how they feel there's useful information hidden within, but they are not sure what it is or how to get it. The DIAS Squad comes to the rescue in two ways. Firstly, through building efficient data management systems, we harness the underlying hardware's storage and compute capabilities. We redesign data management software with hardware particularities in mind, but at the same time we explore radically controversial approaches, such as in-situ querying of data without first loading it into the system. Secondly, through systematic communication with other disciplines, we design infrastructure that enables efficient exploration of scientific data which leads to unprecedented discoveries. We can help domain scientists to quickly find the effect of underground seismic activity in a city, or quickly determine which are the next possible areas of interest while navigating the neurons of the human brain. By combining deep introspective systems research with reaching-out to demanding applications we aspire to bring the scientific method closer to the fourth paradigm.
Viktor Kuncak [ video | slides ]
November 17, 2011
Programming is harder than it should be. To make it easier, we propose implicit programming, a declarative programming model where automated reasoning plays a key role during: (1) interactive program development; (2) program compilation; and (3) program execution.
Energy-Proportional Networked Systems
November 3, 2011
Energy consumption of the Internet is already substantial and it is likely to increase as operators deploy faster equipment to handle popular bandwidth-intensive services, such as streaming and video-on-demand. Further, the CMOS technology that is predominantly used in the ICT equipment is reaching its energy efficiency limits. The net effect of these issues is that the Internet will shortly hit the power delivery and cooling limits while the hardware is trying to sustain the exponentially increasing traffic demand. In this talk, I will describe our efforts in designing energy-proportional networks that mitigate this problem by trying to use the minimal amount of energy to satisfy each offered load (traffic).
First, looking at the energy breakdown, I will concentrate on greening access networks. I will describe how we use traffic aggregation over wireless user gateways and switching at the ISP side to achieve substantial energy savings. The resulting powering off of gateways permits the remaining ones to synchronize at higher speeds due to reduced crosstalk - a surprising performance benefit. Second, I will focus on backbone and datacenter networks that are starting to reach power delivery limits. Existing approaches that recompute the optimal network configuration with each substantial change in traffic demand do not scale. To overcome the optimality-scalability trade-off, I will describe REsPoNse, an approach that precomputes a few energy-critical paths and uses online traffic engineering to redirect the traffic in a way that enables large parts of the network to enter a low-power state.
Learning from Data: Why Probability Matters
Matthias Seeger [ video | slides ]
October 20, 2011
Machine learning seeks to automate the processing of large complex datasets by adaptive computing, a core strategy to meet rapidly growing demands from science and applications. In order to successfully learn from data, we literally have to give up knowing (exactly) what we are doing, but instead quantify what we don't know and reason about our uncertainties. I will use an example from medical imaging to show how adaptive Bayesian computing and decision making from uncertain knowledge can be used in order to sequentially optimize the sampling of image bitmaps, with the goal of speeding up magnetic resonance image acquisition.
George Candea [ video | slides ]
May 26, 2011
When going from a small program to a real system, new fundamental challenges arise that cannot be addressed with the techniques that work at small scale. This talk will describe some of the problems that systems researchers have been working on for decades, along with some of my favorite ideas that came out of systems research.
Sampling in the Age of Sparsity
May 12, 2011
Sampling is a central topic not just in signal processing and communications, but in all fields where the world is analog but computation is digital. This includes sensing, simulating, and rendering the real world, estimating parameters, and using analog channels.
The question of sampling is very simple: when is there a one-to-one relationship between a continuous-time function and adequately acquired samples of this function? Sampling has a rich history, dating back to Whittaker, Nyquist, Kotelnikov, Shannon and others, and is an active area of contemporary research with fascinating new results. Classic results are on bandlimited functions, where taking measurements at the Nyquist rate is sufficient for perfect reconstruction. These results were extended to shift-invariant and multiscale spaces during the development of wavelets. All these methods are based on subspace structures, and on linear approximation. Irregular sampling, with known sampling times, relies on the theory of frames.
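The classical bandlimited result can be made concrete: reconstruction from uniform samples is a sinc-weighted sum, exact (up to truncation of the infinite series) when sampling above the Nyquist rate. A minimal numerical sketch, with an illustrative 1 Hz cosine sampled at 8 Hz:

```python
import math

def sinc(t):
    # Normalized sinc, the interpolation kernel of the sampling theorem.
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)

def reconstruct(samples, T, t):
    """Shannon reconstruction x(t) = sum_n x[n] * sinc((t - nT)/T),
    exact for signals bandlimited to 1/(2T) (here truncated to the
    available samples)."""
    return sum(x_n * sinc((t - n * T) / T)
               for n, x_n in enumerate(samples))

# Bandlimited test signal x(t) = cos(2*pi*f0*t) with f0 = 1 Hz,
# sampled at 8 Hz, i.e. well above the Nyquist rate of 2 Hz.
f0, T, N = 1.0, 1.0 / 8.0, 400
samples = [math.cos(2 * math.pi * f0 * n * T) for n in range(N)]

t = 5.3   # a point between samples, far from the truncation edges
approx = reconstruct(samples, T, t)
exact = math.cos(2 * math.pi * f0 * t)
print(f"exact {exact:.4f}  reconstructed {approx:.4f}")
```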
These classic results can be used to derive sampling theorems related to PDEs, to mobile sensing, as well as to sampling based on timing information. Recently, non-linear sampling methods have appeared. Non-linear approximation in wavelet spaces is powerful for approximation and compression. This indicates that functions that are sparse in a basis (but not necessarily on a fixed subspace) can be represented efficiently. The idea is even more general than sparsity in a basis, as pointed out in the framework of signals with finite rate of innovation. Such signals are non-bandlimited continuous-time signals, but with a parametric representation having a finite number of degrees of freedom per unit of time. This leads to sharp results on the sampling and reconstruction of such sparse continuous-time signals, allowing sampling at Occam's rate.
Among non-linear methods, compressed sensing and compressive sampling have generated a lot of attention. This is a discrete-time, finite-dimensional setup, with strong results on recovery by relaxing the l_0 norm into l_1 optimization, or by using greedy algorithms. These methods have the advantage of unstructured measurement matrices (actually, typically random ones) and therefore a certain universality, at the cost of some redundancy. We compare the two approaches, highlighting differences, similarities, and respective advantages.
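A one-step greedy recovery conveys the flavor of these methods: for a signal known to be 1-sparse, a single matching-pursuit iteration (pick the column most correlated with the measurements, estimate its coefficient) already recovers it exactly. The small matrix below is hand-chosen for the example; a compressed-sensing matrix would typically be random.

```python
def recover_1sparse(A, y):
    """One greedy (matching pursuit) step: select the column of A most
    correlated with y and estimate its coefficient by projection. For
    a 1-sparse signal this single step recovers it exactly."""
    m, n = len(A), len(A[0])
    best_j, best_score, best_coef = None, -1.0, 0.0
    for j in range(n):
        col = [A[i][j] for i in range(m)]
        dot = sum(yi * c for yi, c in zip(y, col))
        norm2 = sum(c * c for c in col)
        score = abs(dot) / norm2 ** 0.5    # normalized correlation
        if score > best_score:
            best_j, best_score, best_coef = j, score, dot / norm2
    return best_j, best_coef

# Hand-chosen 3x5 measurement matrix with distinct columns, and a
# 1-sparse unknown signal with x[3] = 2.0.
A = [[1, 0, 1, 1, 1],
     [0, 1, 1, 1, 0],
     [1, 1, 0, 1, 0]]
y = [A[i][3] * 2.0 for i in range(3)]      # y = A x
j, v = recover_1sparse(A, y)
print(f"recovered x[{j}] = {v}")
```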
We finish by looking at selected applications in practical signal processing and communication problems. These cover wideband communications, noise removal, distributed sampling, and super-resolution imaging, to name a few. In particular, we describe a recent result on multichannel sampling with unknown shifts, which leads to an efficient super-resolution imaging method.
Product Recommenders for Online Users
May 5, 2011
My research is multi-disciplinary and focuses on issues in the intersection of human computer interaction, artificial intelligence, and behavioral decision theories. In my earlier work, I considered the interaction of artificial intelligence and visualization techniques in various decision support systems, for example for aircraft allocation and network configuration.
In the last few years, I have focused my attention on designing product recommenders in online environments. First, I present product recommenders as a special case of multi-criteria utility problems: how to help a user find the most preferred item in a large set of options. Then I focus on the main user challenge in such recommender systems, which is users' expectation of high decision quality while they are unwilling or unable to accurately state their preferences. Solving this problem is crucial to motivating users to accept the items recommended to them, especially in e-commerce environments. I will describe the user task analysis process that guided us in coming up with an innovative solution, called Example Critiquing. I will highlight the design features of Example Critiquing and show how it is sensitive to users' behaviors and needs. I will also present some major outcomes of validating the system with real users.
Through the presentation of this research work, I hope to demonstrate the results and fun of combining user experience research from HCI and intelligent techniques from AI to achieve interactive intelligence: how humans and machines can combine their abilities to solve difficult problems.
Scala – How To Make Best Use of Functions and Objects
April 21, 2011
For a long time my research has been on how to unify functional and object-oriented programming. This research has led to the Scala language, which combines both technologies while staying completely interoperable with Java. It allows a terseness of style comparable to scripting languages but at the same time has a strong, static type system. In this way, it can express common programming patterns in a concise, elegant, and type-safe way. Scala is also a great vehicle for embedding other, domain-specific languages as high-level libraries.
In this talk I will give an introduction to Scala, highlighting its main features and how they lead to an integration of objects and functions. I will then talk about some of our research on embedding parallel domain-specific languages so that they can be executed on massively parallel hardware.
Mean Field Methods for Computer and Communication Systems
March 3, 2011
We consider a generic model of N interacting objects, where each object has a state and the evolution of the system depends only on the collection of states at any point in time. This is quite a general modeling framework, which was successfully applied to model many forms of distributed systems and communication protocols. When the number of objects N is large, one often uses simplifying assumptions called "mean field approximation", "fluid approximation", "fixed point method" or "decoupling assumption". In this talk we explain the meaning of these concepts and show that the first two, namely mean field approximation and fluid approximation, are generally valid (but not always). However, we also show that the last two, namely fixed point method and decoupling assumption, require more care, as they may not be valid, even in simple cases. We give sufficient conditions under which they are valid. We illustrate the concepts with some easy-to-follow examples.
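A small simulation makes the decoupling idea concrete. The toy epidemic-style model below (all parameters invented for illustration) compares the empirical fraction of N objects in state 1 with the one-dimensional mean field recursion, whose fixed point here is m = 1 - gamma/beta = 0.6.

```python
import random

random.seed(1)
N, beta, gamma, steps = 5000, 0.5, 0.2, 200

# Exact (stochastic) system: N objects, each in state 0 or 1, which
# interact only through m, the current fraction of objects in state 1.
state = [1 if i < N // 10 else 0 for i in range(N)]  # 10% start in state 1
for _ in range(steps):
    m = sum(state) / N
    state = [(1 if random.random() < beta * m else 0) if s == 0
             else (0 if random.random() < gamma else 1)
             for s in state]
empirical = sum(state) / N

# Mean field approximation: replace the stochastic system by a single
# deterministic recursion on the fraction m itself.
m = 0.1
for _ in range(steps):
    m = m + (1 - m) * beta * m - gamma * m
print(f"empirical fraction {empirical:.3f}  vs  mean field {m:.3f}")
```

For large N the empirical fraction concentrates around the mean field trajectory, which is exactly the sense in which the approximation is valid.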
Beautiful, But Is It Useful?
December 9, 2010
This talk will cover some of the history of the field of information theory and then describe an example of an information theory problem: how to use multiple antennas to increase the capacity of the corresponding wireless channel. In particular, I will explain why using multiple transmitting and/or receiving antennas for single-user communication yields (perhaps surprisingly) significant gains over single-antenna systems. This explains, in part, why modern WiFi access points sprout multiple antennas.
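The multi-antenna gain can be sketched with the standard log-det capacity formula for a fixed real 2x2 channel with equal power split; the channel matrix and SNR below are arbitrary illustrative values, not measurements.

```python
import math

def capacity_2x2(H, snr):
    """log2 det(I + (snr/2) * H H^T) for a fixed real 2x2 channel H:
    the standard MIMO log-det formula with power split equally
    across the two transmit antennas (bits/s/Hz)."""
    g00 = H[0][0] ** 2 + H[0][1] ** 2            # entries of G = H H^T
    g01 = H[0][0] * H[1][0] + H[0][1] * H[1][1]
    g11 = H[1][0] ** 2 + H[1][1] ** 2
    a = 1 + (snr / 2) * g00
    b = (snr / 2) * g01
    d = 1 + (snr / 2) * g11
    return math.log2(a * d - b * b)              # det of the 2x2 matrix

snr = 100.0                                      # 20 dB
siso = math.log2(1 + snr)                        # single-antenna baseline
mimo = capacity_2x2([[1.0, 0.2], [0.3, 1.0]], snr)
print(f"SISO {siso:.2f} vs 2x2 MIMO {mimo:.2f} bits/s/Hz")
```

At high SNR the 2x2 channel offers roughly twice the SISO capacity, which is the spatial multiplexing gain the talk alludes to.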
Rethinking the Foundations of Databases
November 25, 2010
Contemporary database query languages are ultimately founded on logic and feature an additive operation – usually a form of (multi)set union or disjunction – that is asymmetric in that additions or updates do not always have an inverse. This asymmetry puts a greater part of the machinery of mainstream mathematics for equation solving outside the reach of databases. However, such equation solving would be a key functionality to which problems such as query equivalence testing, the view update problem, and data integration could be reduced; in the presence of an asymmetric additive operation, they are undecidable. Moreover, query languages with a symmetric additive operation (i.e., one which has an inverse and is thus based on ring theory) would open up databases to a large range of scientific applications that are currently the domain of computer algebra systems. This talk motivates an effort to reinvent database management systems with a foundation in abstract algebra and specifically in ring theory. The presence of an additive inverse makes it possible to cleanly define differences between queries. This gives rise to a database analog of differential calculus that leads to radically new incremental and adaptive query evaluation algorithms that substantially outperform state-of-the-art techniques. These algorithms enable a new class of systems which I call Dynamic Data Management Systems. Such systems can maintain continuously fresh query views at extremely high update rates and have important applications in interactive large-scale data analysis. There is a natural connection between differences and updates, motivating the group-theoretic study of updates that will lead to better ways of creating out-of-core data processing algorithms for new storage devices. Basing queries on ring theory leads to a new class of systems, Algebraic Data Management Systems, which herald a convergence of database systems and computer algebra systems.
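A minimal sketch of the delta idea: representing relations as multisets over the integers makes a deletion an insertion with multiplicity -1, so a materialized join count can be maintained incrementally under updates in either direction. This is an illustration of the principle, not the systems described in the talk.

```python
from collections import Counter

# Relations R(a) and S(a) as Z-valued multisets: the additive inverse
# means deletions are simply inserts with multiplicity -1.
R, S = Counter(), Counter()
join_count = 0   # materialized view: |R JOIN S on a|

def update_R(a, delta):
    """Apply a change to R and maintain the view incrementally:
    d(view) = delta * S[a]; the join is never re-run from scratch."""
    global join_count
    join_count += delta * S[a]
    R[a] += delta

def update_S(a, delta):
    global join_count
    join_count += delta * R[a]
    S[a] += delta

update_R("x", 2)    # insert x twice into R
update_S("x", 3)    # insert x three times into S -> view = 2 * 3 = 6
update_R("x", -1)   # delete one x from R         -> view = 1 * 3 = 3
print("join count:", join_count)
```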
From Artificial Intelligence to Game Theory
November 11, 2010
Artificial intelligence has its biggest impact where it extends people's natural intelligence to the modern world, which is largely artificial. I give examples of selected results that I have obtained with my students, including automated engineering design, network routing and tools for electronic commerce. I then elaborate in more detail on more recent work on multi-agent systems that marry artificial intelligence with game theory.
Countering Denial of Service (And Why It's Hard)
October 28, 2010
One of the toughest problems in network systems research is dealing with denial-of-service attacks, where a large number of compromised hosts send unwanted traffic to a receiver in order to exhaust its resources and disrupt its communications. Despite more than a decade of research and a slew of counter-measure products, this remains a serious, unsolved problem for the Internet. I will use the denial-of-service topic as a backdrop for describing one of the fundamental challenges of network system design: how to maintain and manage state. I will (de)construct different denial-of-service counter-measures, describe where each one keeps its state and how it manages it, and discuss why every measure that stops one form of attack opens the door to another. This conundrum is an illustration of the kind of problems network systems researchers aim to tackle.
Modeling Brain Circuitry using Scales Ranging from Micrometer to Nanometer
October 14, 2010
Electron microscopes (EM) can now provide the nanometer resolution that is needed to image synapses, and therefore connections, while Light Microscopes (LM) see at the micrometer resolution required to model the 3D structure of the dendritic network. Since both the arborescence and the connections are integral parts of the brain's wiring diagram, combining these two modalities is critically important. In this talk, I will therefore present our approach to building the dendritic arborescence and to tracking migrating neurons from LM images, as well as to segmenting intra-neuronal structures from EM images.
The Development of Raptor Codes
September 29, 2010
Raptor Codes are a class of fountain codes with very efficient encoding and decoding algorithms. They are being successfully used today in applications where data has to be transmitted on an unreliable network from one or multiple senders to one or multiple receivers. Versions of these codes have been standardized by several standards bodies, including 3GPP, DVB, and various IPTV standardization committees. In this talk, I will explain some of the main ideas behind the design and analysis of these codes. The talk will cover all major versions of Raptor codes which have been developed for commercial applications in the last 10 years, and will also touch upon RaptorQ, the newest Raptor code in the family which is being productized and standardized by Qualcomm, Inc.
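The peeling decoder shared by LT and Raptor codes can be sketched in a few lines. For determinism the encoded symbols below are hand-picked XOR combinations of source blocks; a real fountain encoder draws each symbol's degree from a carefully designed distribution (e.g. the robust soliton).

```python
def peel_decode(symbols, k):
    """Peeling decoder: repeatedly find a symbol that covers exactly
    one still-unknown block, solve for it, and substitute it into the
    other symbols (here implicitly, via the recovered set)."""
    recovered = {}
    progress = True
    while progress and len(recovered) < k:
        progress = False
        for idx, val in symbols:
            unknown = idx - recovered.keys()
            if len(unknown) == 1:
                (i,) = unknown
                for j in idx - unknown:   # XOR out already-known blocks
                    val ^= recovered[j]
                recovered[i] = val
                progress = True
    return recovered

blocks = [0xA5, 0x3C, 0x77, 0x1B]         # four source blocks

def xor_of(ids):
    v = 0
    for i in ids:
        v ^= blocks[i]
    return v

# Hand-picked subsets chosen so that peeling is guaranteed to succeed.
symbols = [(ids, xor_of(ids)) for ids in ({0}, {0, 1}, {1, 2}, {2, 3})]
decoded = peel_decode(symbols, len(blocks))
print([decoded[i] for i in range(len(blocks))] == blocks)
```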
Modest Computing: Simple Groupware with Significant Effects
June 3, 2010
During the last KTN talk, Sabine explained how invisible light may improve our perception of visible objects. I will continue along the same lines. Demonstrating some of our work on augmented teamwork, I will present tools that make visible some features of team interactions that would otherwise remain invisible. The Reflect table shows a team how much each member has been speaking. It is really not smart, but studies revealed that it has a significant effect on the interaction of the team. The Lantern is a small lamp that shows how long each student has been waiting during an exercise session before an assistant becomes available. It is really not that smart either, but studies revealed that it reduces wasted time from 62% to 6%. Finally, using two synchronized eye trackers, we show player A what player B is looking at when they play Tetris together. Studies revealed that it was in fact detrimental to teamwork in Tetris, since the game is just too fast. We are currently performing the same experiments for pair programming in Eclipse. We call these tools "modest" because they don't aim to predict users' behavior or to guess what users need. Instead, they have a minimalist goal: make visible some elements that would otherwise remain invisible. This research also concerns hardware. Computers are not only these ugly rectangular boxes, but can be embedded in everyday objects, like tables or lamps, which remain discreet, in the background. I will show examples of interfaces based on simple tangible objects, which are more effective than multi-touch, and interfaces made of simple sheets of paper that are effective tools for controlling a complex simulation.
Improving the Visible with the Invisible
May 20, 2010
In the past, development in digital photography has resulted in cameras and associated image processing that made images look as "good" as film did. Only in the last few years did research start to address the fundamental differences and opportunities given by the nature of the digital signals we are working with. Computational photography, a term first used by Steve Mann at MIT to describe his research, encompasses a number of research projects in sensor design, illumination, optics, and processing that go beyond the paradigm of “just” exchanging the film for a sensor and exploit the inherent advantages of digital systems. In this talk, the different research directions in computational photography are briefly illustrated, followed by our current approach to the topic. We exploit the fact that silicon-based digital camera sensors exhibit significant sensitivity beyond the visible spectrum (400-700nm). They are able to capture wavelengths up to 1100 nm, i.e., they are sensitive to near-infrared (NIR) radiation. This additional information is conventionally treated as noise and is absorbed by a NIR-blocking filter affixed to the sensor. We show that retaining instead of removing NIR information can significantly improve certain image processing and computer vision tasks. Indeed, intrinsic properties of the NIR wavelength band guarantee that images can be sharper, less affected by man-made colorants, and more resilient to changing light conditions. We illustrate the benefits of using NIR images in conjunction with standard color images in applications such as de-hazing, single and multiple illuminant detection, shadow detection, material classification, and skin rendering. The design of a camera that can simultaneously capture visible and NIR information on one single sensor will also be briefly discussed.
May 12, 2010
Join me on a visual journey through one of our ongoing research projects. I will first give a brief overview of the field of computer graphics and present some of the research topics of the newly founded Computer Graphics and Geometry Lab. The main part of the talk will focus on one specific project in human performance capture and animation. Imagine you could control an arbitrary digital avatar! Your motion, gestures, and expressions are automatically transferred in realtime to your favorite superhero or cartoon character, or a (possibly enhanced) digital representation of yourself. Realizing such a system requires solving a number of fundamental research challenges in realtime 3D acquisition, human motion modeling, and animation transfer. I will report on our ongoing progress towards this goal, demonstrate some of our current results, and discuss research problems that are still unsolved. Once they are solved, fascinating new applications will become possible, e.g., in social interactions and human communication, computer gaming, performance art, or education.
From Biology to Learning Algorithms and Back: Distributed Local Rules for Behavioral Learning
May 6, 2010
I am interested in how the brain works. One of the most fascinating aspects of brain function is its capacity to learn from experience. In the brain, there are billions of neurons with a high degree of connectivity: each neuron receives input from 1,000 to 10,000 other neurons. Learning is implemented as a change in the weight or efficiency of the connections. From a formal point of view, learning is nothing more than optimizing some function. However, in order for an optimization algorithm to be realizable in biology, there are a couple of constraints, the first and most obvious one being that each connection can only use information that is locally available. What kind of learning problems can be solved under this condition? I review the basic background material in computational neuroscience and try to convey the big open problems that drive the field.
Understanding Evolution Through Computation
April 15, 2010
"Nothing makes sense in biology except in the light of evolution" was the title of a tutorial by the famous pioneering biologist Th. Dobzhansky. Understanding evolution itself has thus become a major research focus in the field. At the molecular level, increasing numbers of high-throughput instruments and pipelines are producing very large amounts of data, such as whole-genome sequences, genetic profiles, expression levels, etc. These data can be used to refine our understanding of evolutionary mechanisms through a process of model development, algorithm design, simulation, and testing. Research at the LCBB focuses on this process, with emphasis on algorithmic solutions. In this talk, I will introduce the modelling and computational problems arising from the study of evolution at the molecular level, present some of our recent results, and discuss one of our "grand challenge" problems.)
Distributed Computing: A Consensus Perspective (Three-Act Play)
March 25, 2010
Agreement between multiple parties is a difficult problem when the participants or their communication medium may experience failures, or when some participants lie. The solution is an algorithm called Paxos by Lamport, Viewstamped Replication by Oki/Liskov, Rotating Coordinator by Chandra/Toueg, or Partially Synchronous Consensus by Dwork/Lynch/Stockmeyer. This talk will present a simple version of this algorithm.
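A tiny sanity check of the property all of these protocols build on, namely that any two majority quorums of the same group intersect, so a value accepted by one majority is visible to every later one:

```python
from itertools import combinations

nodes = range(5)                    # replica identifiers
majority = len(nodes) // 2 + 1      # 3 out of 5

# Enumerate every possible majority quorum and check pairwise overlap,
# the safety core of Paxos-style consensus.
quorums = [set(q) for q in combinations(nodes, majority)]
overlap = all(q1 & q2 for q1 in quorums for q2 in quorums)
print(len(quorums), "majority quorums; all pairs intersect:", overlap)
```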