Sum Check Protocol - part 2: Triangle counting

To understand how the triangle counting can be represented by a polynomial, which is the form the sum check protocol described in Sum check - overview requires, it would be helpful to understand how the triangles can be represented using a adjacency matrix beforehand. The intuition around its matrix representation is fundamental to understanding its polynomial form from my view.

This post focuses explaining how to counting triangles in adjacency matrix, laying down the connection to the Sum check - benchmarks in which triangle counting will be applied into sum check protocol.

Below outlines the features of triangle counting using operations under matrix structure.

Overview

Adjacency matrix is an representation of undirected graph to denote relationship between vertices and edges.

Before we dive into understanding how the graph works at low level, here are some high level implications for the triangle counting use case.

Representation

Firstly, let's take a look at the polynomial representing triangle counting.

fA(x,y)fA(y,z)fA(z,x)

But what does this polynomial implies? What does fA means? What do x,y,z variable represent?

To clarify these questions, it would be helpful to take a look at the following examples of what they represent for A, A2 and A3 respectively.

Trips of A

This represents the counts of trips having one walk.
assets/Triangle counting/A 3.png|200

A can be represented as the function fA(x,y) for the following mappings with one walk paths.
(x=1,y=2)(12)
(x=1,y=3)(13)
(x=2,y=3)(23)
(x=2,y=4)(24)
(x=3,y=4)(34)

Corresponding, the following graph illustrates one walk paths.

flowchart LR
   1 --> 2 & 3
   2 --> 4
   2 --> 3
   3 --> 4

Since the adjacency matrix is symmetry, so the arrow directions can be reversed.

Trips of A2

This represents the counts of trips having two walks.

Round trip example

assets/Triangle counting/A^2-round.png
A2 can be represented as the function fA(x,y)fA(y,z)=fA(x,z) for the following mappings with two walk paths.

fA(x,y)
(x=1,y=2)(12)
(x=1,y=3)(13)

fA(y,z)
(y=2,z=1)(21)
(y=3,z=1)(31)

Corresponding, the following graph illustrates two walk paths for fA(x,y)fA(y,z).

flowchart LR
   1 --> 2 & 3--> 1

The example above also can be represented by (x=1,1y4,z=1)fA(x,y)fA(y,z). We can see there are 2 paths for round trips, and the result is denoted in the fA(x,z) where x=1,z=1, which is a point in the matrix trace.

Intermediate trip example

The values other than the ones from the trace of the matrix represent the number intermediate trips, instead of round trips.

assets/Triangle counting/A^2-intermediate.png

For point (2,3) in A2, that is fA2(2,3)=(x=2,1y4,z=3)fA(x,y)fA(y,z), there are 2 trips involves two walks as below.

flowchart LR
   2 --> 1 & 4--> 3

Trips of A3

This represents the number of trips having three walks. When it is a count for round trip, the value in the trace of A3, it represents the number of triangle, starting from a vertice.

Round trip example

assets/Triangle counting/A^3.png

For the point (2,2) in A3, that is fA3(2,2)=(x=2,1z4)fA2(x,z)fA(z,x), there are 3 round trips involves three walks as below.

flowchart LR
   2 --> 1 & 3 & 4--> 2

Seemingly, there only 3 round trips with two walks around vertice 2. But keep in mind the fA2 represents two walk trips, which can be expanded in a graph.

With the trips represented by fA2 expanded between the vertices 2 and 3, the following two triangles can be formed, as the there are 2 intermediate trips represented by fA2(2,3)fA(3,2)

flowchart LR
   2 --> 1 --> 3 --> 2
flowchart LR
   2 --> 4 --> 3 --> 2

With fA2(2,1)fA(1,2) expanded

flowchart LR
   2 --> 3 --> 1 --> 2

With fA2(2,4)fA(4,2) expanded

flowchart LR
   2 --> 3 --> 4 --> 2

Thus, there are total 4 undirected round trips as the fA3(2,2) denotes.

Matrix trace

Based on the method we observed in counting the round trips, to calculate the total number of triangles, it just need to sum up the values in the trace of A3 as the result C.

Because the adjacency matrix in our example is symmetry, so is undirected. The result C must be divided by 2. Also the triangle paths can start from any one of the 3 vertices and the paths will be redundant for counting a triangle. So the result C also needs to be divided by 3. Therefore, it needs to be divided by 6.

Into polynomial

The implication of summing up the trace values of matrix multiplication result A3 means counting round trips with 3 walks through the paths between vertices (x,y,z). Intuitively, the triangle counting can then be represented by a polynomial form

C=16x,y,z{0,1}lognfA(x,y)fA(y,z)fA(z,x)

In the Sum check - benchmarks, we will explain how this polynomial can be applied to the sum check protocol.