Parallel Programming through Dependence Analysis – Part I

 “As soon as an Analytical Engine exists, it will necessarily guide the future course of the science. Whenever any result is sought by its aid, the question will then arise — by what course of calculation can these results be arrived at by the machine in the shortest time?”

Charles Babbage (1864)

Points to Ponder

Would it not be wonderful, if we could write all our simulations as serial programs, and parallelized code (highly optimized for any given supercomputer) would be generated automatically by the compiler? Why is this not the case today? How come supercomputing centers require teams of highly trained developers to write simulations?


Scientists around the world develop mathematical models and write simulations to understand systems in nature. In many cases, simulation performance becomes an issue either as datasets (problem size) get larger, and/or when higher accuracy is required. In order to resolve the performance issues, parallel processing resources can be utilized. Since a large number of these simulations are developed using high level tools such as Matlab, Mathematica, Octave, etc., the obvious choice for the scientist is to use the parallel processing functions provided within the tool. A case in point is the parfor function in Matlab, which executes iterations of a for-loop in parallel. However, when an automation tool fails to parallelize a for-loop, it can be hard to understand why parallelization failed, and how one might change the code to help the tool with parallelization. Continue reading

Class participation incentive using e-tokens.

For the past two years I was the lab assistant for the “Information Systems Design and Implementation — Programming in Java” course, taught by my PhD supervisor Prof. Diomidis Spinellis at the Athens University of Economics and Business. To make the lesson more interesting and give an extra motivation to the students, me, Vassilios Karakoidas and Diomidis decided to distribute e-tokens to the students that actively participated. In return, the students were offered the possibility to better their grades by the end of the semester. In this post I will describe how we did this and I will provide some initial results based on the students feedback.

Continue reading

Programming != Computer Science

For most students, computer science means lots of high-level coding, screens with black backgrounds and green text, and an esoteric subject. When students hear the term computer science, many think about programming languages – Java, C++, Python to name a few. However, what those students are really thinking about is computer programming, an extension and application of computer science. Computer science uses code and programming languages and different numerical systems, but computer science itself is the study of logic, efficiency, and problem solving. With that, it is worth examining what the world of computer science truly encompasses and what purposes it serves to study computer science.
Continue reading

API Evaluation

API is the initials of “Application Programming Interface”. APIs are bundles of interfaces that developers must implement to build their applications. Common APIs are the Java, Python, and Ruby APIs, as well as the Android, iOS APIs and many other third-party libraries (e.g. jQuery and Google maps). Except for their source code, APIs come with their documentation. Then, client developers, from different programming levels, read this documentation to build distinct applications that use the same APIs. This means that an API should be unambiguous and useful in order to prevent developers from writing applications susceptible to crashes.

Continue reading

Open Source in Education

As mentioned in a previous post by Maria Kechagia, “Free, Libre and Open Source Software (FLOSS) is software released under a license that allows developers to: 1) access the software’s source code, 2) use the software for free, and 3) develop derived works based on software’s previous releases.” FLOSS software is very versatile and can be used for a variety of purposes. One such purpose is education and institutional learning, a world in which software and technology is prevalent. However, at the present day institutions of learning (i.e., from high school up to university) mostly use commercial software such as Microsoft Office and Adobe Photoshop. Meanwhile, open source software such as OpenOffice, GIMP, and Linux offer the same capabilities as their commercial counterparts, yet are not employed as much. Weighing the options, it becomes clear that educational institutions should move to using more FLOSS software. Why? Let’s review the benefits for students.
Continue reading

How Connected are Quantum Graphs?

Some students in my department this quarter hosted a reading group on quantum computing. Quantum computing is becoming more and more relevant and the topic attracted the participation of a diverse group of researchers. The best way to handle the scope of the topic and diversity of the participants was to invite volunteer speakers to give talks on the quantum analog of their own research area — “Quantum Circuit Lower Bounds,” “Quantum Game Theory,” and “QPSPACE” were among some of the topics. Naturally, I saw this as a great opportunity to understand more about quantum spectral graph theory. In this post I will outline some basic definitions and properties of quantum graphs, and as a follow up to my previous post on the connections between spectral geometry and graph theory, discuss isospectral properties of discrete and quantum graphs. Continue reading

Why to Get Involved in the Open Source Community?

Free, Libre and Open Source Software (FLOSS) is software released under a license that allows developers to: 1) access the software’s source code, 2) use the software for free, and 3) develop derived works based on software’s previous releases. FLOSS’s success can be attributed to the motivations of the individuals that are members of the open source community. However, FLOSS’s recent boom is also associated with the adoption of many business models, which rely on FLOSS, by modern companies. The aim of this post is to summarize the reasons that make individuals and companies to participate in the open source community and highlight the impact of FLOSS projects on computer science. To name some of the most popular FLOSS projects, consider: the Eclipse IDE, the Firefox browser, the Apache server, the Linux kernel, the Java, C/C++, Python programming languages, the MySQL, SQLite, PostgreSQL databases, the GNOME and KDE desktop environments, the Git version control system, the R-Project for statistical computing, the TeX system for publishing, etc.

Continue reading

The Geometric Origins of Spectral Graph Theory

Spectral graph theory is the study of the intimate relationship of eigenvalues to different properties of graphs. Most typically the eigenvalues are associated to eigenfunctions of matrices associated to graphs known as Laplacians. Let $latex G = (V, E)$ be a weighted or unweighted undirected graph (there are simple extensions to directed graphs for many problems). Let $latex A$ be the adjacency matrix and let $latex D$ be the diagonal degree matrix, that is, $latex (D)_{ii} = d_i$, the degree of vertex $latex i$. Then a common definition for the Laplacian is $latex L = D-A$.

As I develop in my research in spectral graph theory, I am consistently amazed by the truth that many results in spectral graph theory can be seen as discrete analogues to results in spectral geometry. I am not accustomed to thinking of graphs as geometric objects, but in fact graph Laplacian matrices are nicely related to Laplace operators on Riemannian manifolds. In this post, I’d like to discuss a few of these relationships. Continue reading

Language Bureaucracy

Laziness, impatience and hubris are the three great virtues that each programmer should have, at least according to Larry Wall [1]. My experience so far showed me that he was right. All programmers have these characteristics, if they do not, usually they are not real programmers. Since they are expressing these values with the usage of several programming languages, they tend to compare them. Usually this comparison ends up with a phenomenon called flame wars. The programmers are participating in endless quarrels, exchanging arguments regarding language features, their standard (or not) libraries, etc. Continue reading

Big Data, Communication and Lower Bounds

As the size of the available data increases, massive data sets cannot be stored in their entirety in memory of a single machine. Furthermore due to the small amount of memory and computation power available on a single node, one needs to distribute the data and computation among multiple machines when processing tasks.

However, transferring the big data is very expensive. In fact, it is more expensive than the computations on the datasets. Thus, in the distributed model, the amount of communication plays an important role in the total cost of an algorithm and the aim is to minimize the amount of communication among processors (CPUs). This is one of the main motivations to study the theory of Communication Complexity, which originates from Big Data processing. Continue reading