WSET L2 Studying Notes

By Chih-Ling Hsu

2023-05-20

My studying notes of WSET L2 Test.

Introduction to Database

By Chih-Ling Hsu

2021-06-01

What is Database Management System(DBMS)? In short, a DBMS should support the following functionalities:

store data (large amount of data!)
query data
modify data (add, update, delete, …)
enable durability (recover from failure)
control access of many users
- isolation (each transaction appears to execute in isolation from other transactions)
- atomicity (prevents updates to the database occurring only partially)
Allow users to create new databases and specify their schemas using a data definition language

Link Prediction

By Chih-Ling Hsu

2020-05-25

What is Link Prediction? Given a snapshot of a network at time $t$, predict edges added in the interval $(t,t’)$

Community Detection in Graphs

By Chih-Ling Hsu

2020-05-25

In a graph structure, what can be called a community? The notion of community structure captures the tendency of nodes to be organized into groups and members within a community are more similar among each other. Typically, a community in graphs/networks is a set of nodes with more/better/stronger connections between its members, than to the rest of the network. However, there is no widely accepted single definition. It depends heavily on the application domain and the properies of the graph.

Label Propagation

By Chih-Ling Hsu

2020-05-24

Labelling/Annotating data is very expensive. Therefore, usualy we have only small amounts of labelled data and large amounts of unlabelled data. To predict the labels fo the unlabelled data, we can use the graph-based semi-supervised machine learning technique called Label Propagation.

So what is label propagation?

Label Propagation Algorithm (LPA) is an iterative algorithm where we assign labels to unlabelled points by propagating labels through the dataset. For example, if we are given a graph with partially labelled nodes, for any unlabelled node, we can either adopt the dominant label in its neighborhood or wait until a label “propagates” to it if its neighborhood does not have any label.

Web Graph and Page Ranking

By Chih-Ling Hsu

2020-05-22

In this post, we first discuss the structure of the Web as a graph consisting a large number of pages and connected by hyperlinks. Then to rank the pages in the World Wide Web, we introduce PageRank and talk about some of its weakness as a general approach.

Random Walks on Graphs

By Chih-Ling Hsu

2020-05-20

A random walk is known as a stochastic or random process which describes a path that consists of a succession of random steps on some mathematical space:

given a graph and a starting point, select a neighbour at random
move to the selected neighbour and repeat the same process till a termination condition is verified
the random sequence of points selected in this way is a random walk of the graph

Watts-Strogatz Model of Small-Worlds

By Chih-Ling Hsu

2020-05-15

The Watts-Strogatz model is a random graph generation model that produces graphs with small-world properties, including short average path lengths and high clustering. To check the simulation of a small world model, this website is very helpful.

A Real-World Example of Network Graph

By Chih-Ling Hsu

2020-05-15

In June 2006, MSN Messenger had 30 billion conversations among 240 million people. From the data, the MSN network is constructed as a communication graph with $n = 180 \text{ million}$ nodes and $m = 1.3 \text{ billion}$ undirected edges. To investigate the properties of large-scale networks, we take the real-world data, MSN network, as an example and discover its macroscopic properties:

Diameter - What is the degrees of separation in this network?
Clustering coefficient - What fraction of my friends are also friends themselves?
Degree distribution - Are there many “hubs” in the network?
Connectivity - How many “islands” and how big are they?

Kleinberg's Model of Small-Worlds

By Chih-Ling Hsu

2020-05-15

Kleinberg’s model presents the infinite family of navigable Small-World networks that generalizes Watts-Strogatz model. Moreover, with Kleinberg’s model it is shown that short paths not only exist but can be found with limited knowledge of the global network. Decentralized search algorithms can find short paths with high probability.

An Explorer of Things

Hello, my name is Chihling :)

WSET L2 Studying Notes

Introduction to Database

Link Prediction

Community Detection in Graphs

Label Propagation

Web Graph and Page Ranking

Random Walks on Graphs

Watts-Strogatz Model of Small-Worlds

A Real-World Example of Network Graph

Kleinberg's Model of Small-Worlds