Spectral Features of Graph Signals
In the previous post, graph signals, the graph Laplacian matrix, and some of their exciting features were presented. In this post, two main spectral features of a graph and parametric filter localization of graph signals will be presented.
Spectral Filtering Is Independent of Dummy Nodes
If we want to make all given graphs have the same number of nodes, we can add dummy nodes that have no connection to the other nodes. Spectral filtering is independent of these newly added dummy nodes: no matter how many nodes we add, the filtering works exactly the same.
To test this feature, the same graph and filter from the previous post are used. In addition to that 100-node graph, 100 dummy nodes placed at position (0,0) without any connection to other nodes were created. Then the same filter was applied to the same signal. The result is shown in the following figure, and the demo is in the demo5.m script file.
Since all dummy nodes are located at the same position, all 100 of them appear as a single point with no connection to other nodes in the top-right figure. The bottom figure shows each node's signal value for both graphs: the values of the first 100 nodes (the original nodes) are identical, and the signals of the last 100 nodes (the dummy nodes) are all zero.
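A minimal MATLAB sketch of this check (variable names are illustrative; this is not the actual demo5.m, which uses the graph from the previous post):

    % build a random graph and filter a Dirac signal on it
    n = 100;
    A = double(rand(n) > 0.9);         % random adjacency matrix
    A = max(A, A');  A(1:n+1:end) = 0; % symmetric, no self-loops
    L = diag(sum(A, 2)) - A;           % combinatorial Laplacian
    s = zeros(n, 1);  s(1) = 1;        % Dirac signal on the first node

    [U, V] = eig(L);                                % eigenvectors / eigenvalues
    sf = U * diag(exp(-20 * diag(V))) * U' * s;     % flt = exp(-20*v)

    % add 100 isolated dummy nodes: their rows/columns in L are all zero
    m  = 100;
    L2 = blkdiag(L, zeros(m));
    s2 = [s; zeros(m, 1)];

    [U2, V2] = eig(L2);
    sf2 = U2 * diag(exp(-20 * diag(V2))) * U2' * s2;

    max(abs(sf - sf2(1:n)))            % ~0: original nodes are unaffected
    max(abs(sf2(n+1:end)))             % 0: dummy nodes stay at zero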
Spectral Filtering Is Independent of the Order of the Nodes
This feature is very important for graph signal processing, since a graph carries no information in the order of its nodes, only in their connections. Valid graph filtering must therefore not change if the order of the nodes is changed: it does not matter which node is first, which is second, and so on. To test this feature, we created the demo6.m script. Within that demo, we created 100 dummy nodes as before, then randomly reordered all 200 nodes and created a new graph. Since the result is exactly the same as in Fig. 1 above, we do not repeat it here.
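Continuing the sketch above, the reordering test might look like this (again illustrative, not the actual demo6.m):

    % randomly reorder all n+m nodes and filter the reordered graph
    p  = randperm(n + m);
    P  = eye(n + m);  P = P(p, :);     % permutation matrix, P*x == x(p)
    L3 = P * L2 * P';                  % Laplacian of the reordered graph
    s3 = P * s2;                       % the same signal in the new order

    [U3, V3] = eig(L3);
    sf3 = U3 * diag(exp(-20 * diag(V3))) * U3' * s3;

    max(abs(sf3 - P * sf2))            % ~0: identical result, just reordered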
Filter Localization
When the filters are to be learned, as in a convolutional neural network, we need to localize the filter; that is, we must somehow make sure that the same filter can be applied to all the signals, which come from different graphs. When U and the λ's are the eigenvectors and eigenvalues of the Laplacian of the given graph, f is the filter function with respect to the eigenvalue, s is the input signal, and sf is the filtered signal, spectral filtering of a graph signal can be written as follows.
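    s_f = U \, f(\Lambda) \, U^\top s, \qquad \Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)

Here f(Λ) applies the filter function f to each eigenvalue on the diagonal.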
In this equation, the function f is the one to be learned. All other variables depend on the given signal and graph. Note that the number of nodes n, the dimension of U, and the dimension of s can change for different graphs. Needless to say, even when the number of nodes is the same, the eigenvectors and eigenvalues can differ because the graphs are different.
The first idea that comes to mind is to define f with an independent coefficient for each given eigenvalue, which means f(Λ) consists of a vector of length n, where every number represents the coefficient of the corresponding eigenvector. If all graphs are brought to the same number of nodes by adding dummy nodes, this idea seems possible. At this point, the localization problem appears. Suppose the last eigenvalue of the first graph is 5, but the last eigenvalue of the second graph is 10. Then the same coefficient (the last element of the f vector) would be applied to completely different frequency components, which is totally wrong. Convolutional neural network theory tells us to apply the same filter effect (that is, shared coefficients) to different training samples and finally to learn those common, shared filter coefficients.
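Written out, this naive parametrization assigns one free coefficient to each eigenvalue:

    f(\Lambda) = \operatorname{diag}(\theta_1, \ldots, \theta_n)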
Filter Localization by Polynomial Filters
To solve the filter localization problem, we can use polynomial filters, which represent the filter coefficient as a polynomial in the given eigenvalue. The following formula expresses this representation, where K is the degree of the polynomial.
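    f(\lambda_i) = \sum_{k=0}^{K} \alpha_k \, \lambda_i^k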
We can also reformulate it as a matrix multiplication, where the matrix C contains the powers of the eigenvalues, as follows.
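    f(\boldsymbol{\lambda}) = C \alpha, \qquad
    C = \begin{bmatrix}
        1 & \lambda_1 & \lambda_1^2 & \cdots & \lambda_1^K \\
        \vdots & \vdots & \vdots & & \vdots \\
        1 & \lambda_n & \lambda_n^2 & \cdots & \lambda_n^K
    \end{bmatrix}, \qquad
    \alpha = (\alpha_0, \alpha_1, \ldots, \alpha_K)^\top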
Now, regardless of the eigenvalues of the graph, the learned common coefficients α are used to calculate the filter coefficient for each basis function. Within this representation, the filter is modeled by K+1 coefficients; let's say our filter has K+1 degrees of freedom which have to be learned.
To demonstrate this filtering, we used the predefined 50-node and 100-node graphs. The first node of each graph is initially set to 1 and the rest to 0, the so-called Dirac signal. A polynomial of degree K = 15 is used. The polynomial filter coefficients were calculated to give the same result as the standard filter in the previous demo, where flt = exp(-20*v), for the 100-node graph. All code is given in demo7.m. In that demo, we applied the same filter to graph1 and then reconstructed the polynomial coefficients by inverse calculation. We call these calculated coefficients the learned parameters. Then we applied the learned parameters to graph2. We also compared the learned filter's result with standard filtering using the given filter. The results are nearly identical, within a small error margin, because only a degree K = 15 polynomial is used.
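A rough sketch of this inverse calculation (illustrative, not the actual demo7.m; Lg1 and Lg2 are assumed to hold the two Laplacians and s2 the signal on graph2):

    K = 15;
    [U1, V1] = eig(Lg1);  v1 = diag(V1);     % spectrum of graph1
    C1 = v1 .^ (0:K);                        % power matrix, n1 x (K+1)
    alpha = C1 \ exp(-20 * v1);              % least-squares fit of flt = exp(-20*v)

    [U2, V2] = eig(Lg2);  v2 = diag(V2);     % graph2 may have a different size
    C2 = v2 .^ (0:K);
    sf2 = U2 * diag(C2 * alpha) * U2' * s2;  % learned filter applied to graph2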
Filter Localization by B-Spline
Since we represented the filter as a sum of powers of the concerned eigenvalue, the filter result changes dramatically for small changes in the higher-order coefficients. In other words, the filter is too sensitive to its high-order coefficients, which makes learning those coefficients difficult. Another smart way to localize the filter is to resample the eigenvalues over a common discrete basis (say, over λ* = 0, 0.2, 0.4, …, 2.0) and rewrite these newly created eigenvalues as weighted sums of the original eigenvalues. This solution immediately brings the idea of B-splines to mind, because a B-spline is a spline function of a given degree that can be expressed as a linear combination of B-splines of that degree. Roughly speaking, B-spline analysis gives us a linear transformation matrix B which transforms the original eigenvectors and eigenvalues into the desired eigenvalues and their corresponding eigenvectors. Assuming the B-spline coefficients are given by the n×m matrix B, and the coefficient vector to be learned is α, the filter f(Λ) can be defined as follows.
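    f(\Lambda) = \operatorname{diag}(B \alpha)

so that the filtering becomes s_f = U diag(Bα) U⊤ s, with one learned coefficient per resampling point instead of per eigenvalue.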
Let's test this idea on the graph from the first demo in the previous post. In that demo, the graph consists of 101 nodes, and each node is connected to the previous and next nodes, like one big circle. That graph's maximum eigenvalue is 4, but we want to obtain the eigenvectors for the 21 eigenvalues λ* = 0, 0.1, 0.2, …, 2.0. Here are some of the initial eigenvectors and the B-spline-transformed ones. You can find the code in the demo8.m script.
The above figure shows the 5th, 11th, and last original eigenvectors in the first row, and the spline-basis-transformed ones in the second row.
There are just two parameters in the B-spline transformation: the number of basis functions, which we chose as 21 in the above demo (this is the degree of freedom, the number of parameters to be learned), and the maximum eigenvalue, which we selected as 2. In this way, we only need to learn the coefficients of the transformed eigenvectors.
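Here is a rough sketch of the idea using the simplest degree-1 B-splines (hat functions); the actual demo8.m may use a higher spline degree, and Lg, s, and alpha are assumed inputs:

    knots = 0:0.1:2;                       % 21 resampling points: 0, 0.1, ..., 2.0
    m = numel(knots);
    [Ug, Vg] = eig(Lg);  v = diag(Vg);     % original spectrum of the graph
    B = zeros(numel(v), m);                % n x m spline basis matrix
    for j = 1:m
        % degree-1 basis: hat function centered at knots(j), width 0.1
        B(:, j) = max(0, 1 - abs(v - knots(j)) / 0.1);
    end
    Ut = Ug * B;                           % spline-transformed "eigenvectors"
    sf = Ug * diag(B * alpha) * Ug' * s;   % filtering with learned coefficients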
So, let's filter the second graph with the B-spline coefficients learned from the first graph. demo9.m is prepared for this purpose. We again calculated the B-spline filter coefficients from graph1 so that they give the same result as flt = exp(-20*v). Then we applied the learned coefficients to the graph2 signal. For comparison, we also applied the standard filter to that graph signal. We used 50 discrete eigenvalue points from 0 to 8, which means the learned coefficient vector consists of 50 numbers.
To make sure that filtering on the original eigenvector basis and on the transformed basis gives the same result, the demo10.m script is prepared. In this demo, we present the same filtering as demo5 on both the original basis and the spline-transformed basis with 25 coefficients. Finally, here are the node signal levels for both methods as a comparison.
Filter Localization by Chebyshev Expansion
So far, the B-spline approach sounds very good. However, for big graphs, where the number of nodes is far larger than the number of B-spline coefficients, it becomes less efficient in computation time because of the complexity of the eigenvector multiplication. In this case, we can instead use the Chebyshev expansion, a sequence of orthogonal polynomials defined by a recursive relation, in place of the eigenvectors. Roughly speaking, instead of calculating eigenvectors and eigenvalues, we can recursively build a basis for any graph's signal. The number of basis signals is our choice, and this number is the number of parameters to be learned.
According to this method, the Laplacian of the graph must first be normalized so that its eigenvalues lie in [-1, 1]. Then the desired number of basis signals can be created with the recursive equations below. Finally, the filtered signal sf is calculated by multiplying that basis matrix with the filter coefficients α. Since there are K+1 basis signals, the filter vector α consists of K+1 coefficients to be learned. When L is the given graph Laplacian, I is the identity matrix of the same size as L, s is the signal on that graph, K+1 is the desired number of filter basis signals, α is the filter coefficient vector, and sf is the filtered signal, Chebyshev filtering can be written with the following equations.
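    \tilde{L} = \frac{2}{\lambda_{\max}} L - I

    \bar{s}_0 = s, \qquad \bar{s}_1 = \tilde{L} s, \qquad \bar{s}_k = 2 \tilde{L} \bar{s}_{k-1} - \bar{s}_{k-2}

    s_f = \sum_{k=0}^{K} \alpha_k \, \bar{s}_k

where λmax is the largest eigenvalue of L.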
We prepared a demo in the same manner, named demo11.m. Again, we learned the Chebyshev filter coefficients from graph1 with flt = exp(-20*v). Then we used those coefficients to filter the graph2 signals. For comparison, the standard filtering results for graph2 are also shown in the following figure.
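A minimal sketch of Chebyshev filtering (illustrative, not the actual demo11.m; Lg and s are the graph Laplacian and signal, and alpha is an already learned (K+1)-vector of coefficients):

    lmax = eigs(Lg, 1, 'la');              % largest eigenvalue, no full eig needed
    Lt = (2 / lmax) * Lg - eye(size(Lg));  % eigenvalues rescaled into [-1, 1]

    K = 15;
    T = zeros(numel(s), K + 1);            % the K+1 recursive basis signals
    T(:, 1) = s;                           % T_0 term: s
    T(:, 2) = Lt * s;                      % T_1 term: L~ * s
    for k = 3:K+1
        T(:, k) = 2 * Lt * T(:, k-1) - T(:, k-2);   % Chebyshev recursion
    end
    sf = T * alpha;                        % filtered signal

Note that only matrix-vector products with the sparse Laplacian are needed, which is what makes this approach attractive for big graphs.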
General View of Filtering
As a result, we have four options for filtering: direct filtering, polynomial filtering, B-spline filtering, and Chebyshev filtering. Here is the general formula for each filter, respectively.
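    \text{direct:} \quad s_f = U \operatorname{diag}(W)\, U^\top s, \qquad W \in \mathbb{R}^n

    \text{polynomial:} \quad s_f = U \operatorname{diag}(C W)\, U^\top s, \qquad W \in \mathbb{R}^{K+1}

    \text{B-spline:} \quad s_f = U \operatorname{diag}(B W)\, U^\top s, \qquad W \in \mathbb{R}^{m}

    \text{Chebyshev:} \quad s_f = \sum_{k=0}^{K} W_k \, T_k(\tilde{L})\, s, \qquad W \in \mathbb{R}^{K+1}

where T_k(L̃)s = s̄_k is the k-th Chebyshev basis signal from the previous section.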
In all of the formulas, W is the coefficient vector to be learned, s is the input signal, sf is the output signal, and U is the eigenvector matrix. In the first formula, which is the ideal filtering for the given graph and its signal, the W coefficient vector has one variable per node, and each element is the coefficient of the corresponding eigenvalue. That is why the first one is the ideal filter, but it is not useful in a neural network because of the filter localization problem. In the remaining filters, the W coefficient vector consists of a fixed number of variables, and each variable corresponds to more or less the same frequency component regardless of the graph.
What is Next?
Now we are ready to create a spectral-based graph convolutional neural network for various signals that come from different graphs.
Note
This research was funded by the LITIS Lab at the University of Rouen-Normandy. It is part of the AGAC Project (Projet Régional RIN Recherche sur l’analyse de graphes et ses applications à la chemoinformatique).
You can find the demo code on my GitHub profile.