WSET L2 Studying Notes
My study notes for the WSET L2 test.
What is a Database Management System (DBMS)? In short, a DBMS should support the following functionalities:
What is Link Prediction? Given a snapshot of a network at time $t$, predict the edges that will be added in the interval $(t, t')$.
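As a minimal sketch of the idea, one simple baseline scores every pair of nodes that is not yet connected in the snapshot by the number of neighbors they share, and treats high-scoring pairs as likely future edges. The common-neighbors heuristic, the function name `common_neighbor_scores`, and the toy edge list are assumptions chosen for illustration; they do not come from the post itself.

```python
from itertools import combinations


def common_neighbor_scores(edges):
    """Rank non-adjacent node pairs by how many neighbors they share."""
    # Build an undirected adjacency map from the snapshot's edge list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Score only pairs that are not already connected.
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v not in adj[u]:
            scores[(u, v)] = len(adj[u] & adj[v])

    # Highest score first; ties broken by node names for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical snapshot of a small network at time t.
snapshot = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
ranking = common_neighbor_scores(snapshot)
print(ranking[0])  # the pair considered most likely to connect next
```

Here the pair `("a", "d")` shares two neighbors and is ranked first, so it would be the top prediction for the interval after $t$.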
In a graph structure, what can be called a community? The notion of community structure captures the tendency of nodes to organize into groups whose members are more similar to each other than to the rest of the network. Typically, a community in a graph or network is a set of nodes with more, better, or stronger connections among its members than to the rest of the network. However, there is no single widely accepted definition. It depends heavily on the application domain and the properties of the graph.
Labelling (annotating) data is very expensive. Therefore, we usually have only a small amount of labelled data and a large amount of unlabelled data. To predict the labels of the unlabelled data, we can use a graph-based semi-supervised machine learning technique called Label Propagation.
So what is label propagation?
Label Propagation Algorithm (LPA) is an iterative algorithm where we assign labels to unlabelled points by propagating labels through the dataset. For example, if we are given a graph with partially labelled nodes, for any unlabelled node, we can either adopt the dominant label in its neighborhood or wait until a label “propagates” to it if its neighborhood does not have any label.
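The rule described above can be sketched in a few lines. This is a simplified variant under stated assumptions: updates are synchronous, a node keeps its label once assigned, and ties are broken by picking the smallest label so the result is deterministic. The function name `propagate_labels` and the toy path graph are hypothetical.

```python
from collections import Counter


def propagate_labels(adj, seed_labels, max_iters=100):
    """Spread labels from labelled nodes to unlabelled ones.

    adj: dict mapping each node to an iterable of its neighbors.
    seed_labels: dict mapping the initially labelled nodes to their labels.
    """
    labels = dict(seed_labels)
    for _ in range(max_iters):
        updates = {}
        for node in adj:
            if node in labels:
                continue  # assumption: assigned labels are never revised
            counts = Counter(labels[n] for n in adj[node] if n in labels)
            if not counts:
                continue  # no labelled neighbor yet, so this node waits
            best = max(counts.values())
            # Adopt the dominant neighboring label (smallest label on ties).
            updates[node] = min(l for l, c in counts.items() if c == best)
        if not updates:
            break  # nothing changed, propagation has finished
        labels.update(updates)
    return labels


# Hypothetical path graph 1-2-3-4-5-6 with only the two end nodes labelled.
graph = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}
result = propagate_labels(graph, {1: "red", 6: "blue"})
print(result)
```

In the first round only nodes 2 and 5 have a labelled neighbor; nodes 3 and 4 wait one more round until a label reaches them.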
In this post, we first discuss the structure of the Web as a graph consisting of a large number of pages connected by hyperlinks. Then, to rank the pages of the World Wide Web, we introduce PageRank and discuss some of its weaknesses as a general approach.
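For orientation, here is a minimal sketch of PageRank computed by power iteration on a tiny link graph. The damping factor of 0.85, the uniform redistribution of rank from pages without outgoing links, the function name `pagerank`, and the four-page example are assumptions for illustration and may differ from the formulation used in the post.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iters=200):
    """links: dict mapping each page to the list of pages it links to."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}

    for _ in range(max_iters):
        # Rank held by pages with no outgoing links is spread over all pages.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        new = {u: (1 - damping) / n + damping * dangling / n for u in nodes}

        # Each page splits its rank equally among the pages it links to.
        for u in nodes:
            outs = links.get(u, [])
            for v in outs:
                new[v] += damping * rank[u] / len(outs)

        delta = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical four-page web.
web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

Page C is linked from three pages and ends up with the highest score, while D, which nobody links to, receives only the baseline share.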
A random walk is a stochastic process that describes a path consisting of a succession of random steps on some mathematical space:
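As a minimal illustration, the simplest such space is the integer line, where each step moves one unit left or right with equal probability. The function name `random_walk_1d` and the choice of a one-dimensional walk are assumptions made for this sketch.

```python
import random


def random_walk_1d(steps, seed=None):
    """Return the positions visited by a simple random walk starting at 0."""
    rng = random.Random(seed)  # seeded generator for reproducible walks
    position = 0
    path = [position]
    for _ in range(steps):
        position += rng.choice((-1, 1))  # step left or right with equal probability
        path.append(position)
    return path


walk = random_walk_1d(10, seed=42)
print(walk)
```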
The Watts-Strogatz model is a random graph generation model that produces graphs with small-world properties, including short average path lengths and high clustering. To explore a simulation of a small-world model, this website is very helpful.
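A rough sketch of the construction follows: start from a ring lattice in which every node is joined to its `k` nearest neighbors, then rewire each lattice edge with probability `p` to a randomly chosen new endpoint. The function name `watts_strogatz` and the parameter names are assumptions for this sketch, and details such as how rewiring targets are chosen may differ from other implementations.

```python
import random


def watts_strogatz(n, k, p, seed=None):
    """Return an adjacency dict for a Watts-Strogatz style small-world graph."""
    if k % 2 != 0 or k >= n:
        raise ValueError("k must be even and smaller than n")
    rng = random.Random(seed)

    # Ring lattice: each node is linked to k/2 neighbors on either side.
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for j in range(1, k // 2 + 1):
            v = (u + j) % n
            adj[u].add(v)
            adj[v].add(u)

    # Rewire each lattice edge (u, u + j) with probability p.
    for j in range(1, k // 2 + 1):
        for u in range(n):
            if rng.random() >= p:
                continue
            v = (u + j) % n
            # New endpoint must avoid self-loops and duplicate edges.
            candidates = [w for w in range(n) if w != u and w not in adj[u]]
            if not candidates:
                continue
            w = rng.choice(candidates)
            adj[u].discard(v)
            adj[v].discard(u)
            adj[u].add(w)
            adj[w].add(u)
    return adj


lattice = watts_strogatz(20, 4, 0.0)           # p = 0 keeps the regular ring
small_world = watts_strogatz(20, 4, 0.3, seed=1)
print(sum(len(nbrs) for nbrs in small_world.values()) // 2, "edges")
```

With `p = 0` the graph stays a regular lattice, and rewiring never changes the total number of edges, only where they point.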
In June 2006, MSN Messenger had 30 billion conversations among 240 million people. From the data, the MSN network is constructed as a communication graph with $n = 180 \text{ million}$ nodes and $m = 1.3 \text{ billion}$ undirected edges. To investigate the properties of large-scale networks, we take the real-world MSN network as an example and examine its macroscopic properties:
Kleinberg’s model presents an infinite family of navigable small-world networks that generalizes the Watts-Strogatz model. Moreover, Kleinberg’s model shows that short paths not only exist but can also be found with limited knowledge of the global network: decentralized search algorithms can find short paths with high probability.
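To make decentralized search concrete, here is a small sketch on an `n` by `n` grid. Each node knows its grid neighbors plus one long-range contact, drawn with probability proportional to $d^{-r}$ where $d$ is the grid distance, and it always forwards the message to whichever contact is closest to the target. The function names, the grid size, and the use of a single long-range link per node are assumptions for this sketch.

```python
import random


def manhattan(a, b):
    """Grid (lattice) distance between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def build_long_range_links(n, r, rng):
    """Give every grid node one long-range contact, chosen with weight d**-r."""
    nodes = [(x, y) for x in range(n) for y in range(n)]
    links = {}
    for u in nodes:
        others = [v for v in nodes if v != u]
        weights = [manhattan(u, v) ** -r for v in others]
        links[u] = rng.choices(others, weights=weights, k=1)[0]
    return links


def greedy_route(n, links, source, target):
    """Forward greedily using only local knowledge at each node."""
    path = [source]
    current = source
    while current != target:
        x, y = current
        contacts = [
            (x + dx, y + dy)
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= x + dx < n and 0 <= y + dy < n
        ]
        contacts.append(links[current])
        # Some grid neighbor is always strictly closer, so the search terminates.
        current = min(contacts, key=lambda v: manhattan(v, target))
        path.append(current)
    return path


rng = random.Random(0)
n = 10
links = build_long_range_links(n, r=2, rng=rng)
path = greedy_route(n, links, (0, 0), (9, 9))
print(len(path) - 1, "hops")
```

The exponent `r = 2` matches the grid dimension, which is the setting in which Kleinberg showed greedy routing to be efficient; other values of `r` can be tried by changing the argument.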