# 2019 Projects

## Analysis (AN) Group:

Project Title: Fractal reconstruction from distribution sampling

Project Description:

A fractal is a set with the property of being similar to scaled versions of itself. The ternary Cantor set is a well-known fractal. It is constructed from the unit interval by recursively removing the middle third interval, e.g. $$C_0 = [0,1]$$, $$C_1 = [0,1/3] \cup [2/3,1]$$, $$C_2 = [0,1/9] \cup [2/9,1/3] \cup [2/3,7/9] \cup [8/9,1]$$, etc. The limit of this procedure is the ternary Cantor set. The ternary Cantor set belongs to a larger class of Cantor sets, each constructed in the same manner with the exception that the intervals which are removed vary, e.g. removing the first and the third quarter-intervals where $$D_0 = [0,1]$$, $$D_1 = [1/4,1/2] \cup [3/4,1]$$, etc. The number of subdivisions of each interval and those sub-intervals which are removed are referred to as the parameters of the Cantor set which the aforementioned procedure generates. The cumulative distribution function (CDF) for a Cantor set is a function on the unit interval which describes the size of the Cantor set. For example, the value of the CDF of the ternary Cantor set is one-half at $$x = 1/3$$ to illustrate that half of the ternary Cantor set belongs to $$[0,1/3]$$. In fact, the CDF of the ternary Cantor set is the Devil's staircase, depicted below.

A sampling of the CDF means a finite collection of coordinate pairs from its graph. In this project, we aim to determine the parameters of a Cantor set from sampling its CDF. More generally, we will consider probability distributions that are supported on fractals, and we will attempt to approximate the CDF from sampling.

Prerequisites:

• Linear Algebra
• Probability is helpful but not necessary

## Data Science (DatSci) Group:

Project Title: Solving systems of equations from distributed data

Project Description:

Solving large systems of linear equations (expressed in matrix form as $$A \vec{x} = \vec{b}$$ ) is foundational to data analysis. In fact, many statistical models, which include linear, ridge regression and kernel-based, are built by solving systems of linear equations. Building such models requires access to all of the involved data. Often, this is not possible, especially when processing large spatiotemporal data sets. For example, resource limitations such as memory constraints give rise to data sets which are distributed across multiple servers. Efficiently analyzing distributed data then requires novel algorithms.

This project will consider the problem posed by distributed data sets. Specifically, we consider a system of linear equations given in the matrix form above, where the rows of the matrix A are indexed not by integers, but by the vertices of a graph. Here, the vertices of the graph represent nodes in a computer network, where each node possesses some (but not all) of the equations in the system. Supposing that the vertices are able to communicate with each other, but may not share their equations with each other, we wish to know how the solution -- or some suitable approximation -- to the system of equations can still be found. To this end, we consider the Kaczmarz algorithm, an iterative method for solving systems of equations. With small adjustments, this algorithm provides a mechanism for solving distributed systems of equations. The goal of this project will be to understand the rate of convergence of this adjusted algorithm and how that rate may be accelerated.

Prerequisites:

• Linear Algebra
• Programming experience is helpful but not required

## Modeling and Computation (ModCom) Group:

Project Title: Numerical solution of partial differential equations via the Radon transform

Project Description:

The Radon transform (RT) of a two-dimension function, $$f(x,y)$$, is given by the following formula: $\widehat{f}(\omega,s) = {\mathcal R}\left( f \right) := \int_{-\infty}^{\infty} f\left( s \cos(\omega) - z \sin(\omega), \, s \sin(\omega) + z \cos(\omega) \right) \, dz,$ where the independent variables on the transformed side, $$( \omega,s )$$, are related to the original independent variables, $$(x,y)$$, via the following mapping: $x(s,z; \omega) = s \cos(\omega) - z \sin(\omega) \quad \text{and} \quad y(s,z; \omega) = s \sin(\omega) + z \cos(\omega).$ Thus $$s$$ and $$z$$ represent orthogonal coordinates in a new coordinate system that is rotated by an angle $$\omega$$ relative to the original coordinates $$(x,y)$$; the function $$f(x,y)$$ is then integrated along $$z$$ to obtain the transformed function $$\widehat{f}(\omega,s)$$.

One critical feature of the Radon transform is that partial derivatives of $$f(x,y)$$ with respect to $$x$$ and $$y$$ both transform to derivatives with respect to $$s$$ of the Radon transform $$\widehat{f}(\omega,s)$$: ${\mathcal R}\left( \frac{\partial f}{\partial x} \right) = \cos(\omega) \frac{\partial \widehat{f}}{\partial s}(\omega,s) \quad \text{and} \quad {\mathcal R}\left( \frac{\partial f}{\partial y} \right) = \sin(\omega) \frac{\partial \widehat{f}}{\partial s}(\omega,s).$ This fact turns out to be very useful in the numerical solution of partial differential equations.

In this project we will study a version of the Radon transform known as the discrete Radon transform (DRT), as well as its inverse: the inverse discrete Radon transform (IDRT). This will be coupled to certain numerical methods in space and time for solving a broad class of time-dependent partial differential equations of the form: $\frac{\partial q}{\partial t} + A \frac{\partial q}{\partial x} + B \frac{\partial q}{\partial y} = 0,$ where the unknown function is $${q}(t,x,y): {\mathbb R} \times {\mathbb R} \times {\mathbb R} \mapsto {\mathbb R}^M$$ is vector-valued and $$A, B \in {\mathbb R}^{M \times M}$$ are real symmetric matrices.

Prerequisites:

• Basic programming skills (any programming language is fine)
• Some background in numerical analysis (introductory undergraduate level)
• Some familiarity with partial differential equations (introductory undergraduate level)
• Strong mathematical curiosity is expected of the participants
• The group will make use of the Python programming language (NumPy, SciPy, Matplotlib), although no prior knowledge is assumed.

## Spectral Graph Theory (SpeGra) Group:

Project Title: Variations of distance matrices for graphs