Important separators and parameterized algorithms

(1)


Dániel Marx¹

¹Institute for Computer Science and Control, Hungarian Academy of Sciences (MTA SZTAKI)

Budapest, Hungary

School on Parameterized Algorithms and Complexity Będlewo, Poland

August 22, 2014

(2)

Definition: δ(R) is the set of edges with exactly one endpoint in R.

Definition: A set S of edges is a minimal (X,Y)-cut if there is no X−Y path in G \ S and no proper subset of S breaks every X−Y path.

Observation: Every minimal (X,Y)-cut S can be expressed as S = δ(R) for some X ⊆ R and R ∩ Y = ∅.

[Figure: X, Y, a set R, and the cut δ(R)]

Important cuts


(3)

Definition

A minimal (X,Y)-cut δ(R) is important if there is no (X,Y)-cut δ(R′) with R ⊂ R′ and |δ(R′)| ≤ |δ(R)|.

Note: Whether a cut is important can be checked in polynomial time (δ(R) is important if R = R_max).

[Figure: X, Y, the sets R and R′, and the cuts δ(R) and δ(R′)]

(6)

The number of important cuts can be exponentially large.

Example:

[Figure: a graph between X and Y with parts labelled 1, 2, …, k/2]

This graph has 2^{k/2} important (X,Y)-cuts of size at most k.

Theorem

There are at most 4^k important (X,Y)-cuts of size at most k.
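A small brute-force check of the definition may help here. The sketch below is not an efficient algorithm: it enumerates every R with X ⊆ R and R ∩ Y = ∅ and keeps the cuts δ(R) that are minimal and important. The example graph is an assumption (the figure did not survive): X is joined to each of k/2 middle vertices by one edge, and each middle vertex is joined to Y by two parallel edges. The function names are hypothetical.

```python
from itertools import combinations

def boundary(edges, R):
    """Indices of the edges with exactly one endpoint in R, i.e. δ(R)."""
    return frozenset(i for i, (a, b) in enumerate(edges) if (a in R) != (b in R))

def connects(edges, removed, X, Y):
    """True if some X-Y path survives after deleting the edges in `removed`."""
    seen, stack = set(X), list(X)
    while stack:
        u = stack.pop()
        for i, (a, b) in enumerate(edges):
            if i in removed or u not in (a, b):
                continue
            w = b if u == a else a
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return bool(seen & set(Y))

def important_cuts(vertices, edges, X, Y, k):
    """Brute force (exponential time): important (X,Y)-cuts of size at most k."""
    free = [v for v in vertices if v not in X and v not in Y]
    sides = [set(X) | set(extra)
             for r in range(len(free) + 1)
             for extra in combinations(free, r)]
    found = {}
    for R in sides:
        cut = boundary(edges, R)
        if len(cut) > k:
            continue
        # minimal: putting back any single cut edge reconnects X and Y
        if any(not connects(edges, cut - {e}, X, Y) for e in cut):
            continue
        # important: no strictly larger side has a cut that is no bigger
        if any(R < R2 and len(boundary(edges, R2)) <= len(cut) for R2 in sides):
            continue
        found[cut] = R
    return found

# Assumed example for k = 4: two middle vertices, parallel edges towards y.
k = 4
edges = [("x", 1), (1, "y"), (1, "y"), ("x", 2), (2, "y"), (2, "y")]
cuts = important_cuts(["x", 1, 2, "y"], edges, {"x"}, {"y"}, k)
print(len(cuts), "important cuts; bound 4^k =", 4 ** k)
```

On this assumed graph each middle vertex independently contributes either its single edge from X or its two edges to Y, which is where the 2^{k/2} count comes from.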


(8)

A new technique used by several results:

Multicut [M. and Razgon STOC 2011]

Clustering problems [Lokshtanov and M. ICALP 2011]

Directed Multiway Cut [Chitnis, Hajiaghayi, M. SODA 2012]

Directed Multicut in DAGs [Kratsch, Pilipczuk, Pilipczuk, Wahlström ICALP 2012]

Directed Subset Feedback Vertex Set [Chitnis, Cygan, Hajiaghayi, M. ICALP 2012]

Parity Multiway Cut [Lokshtanov, Ramanujan ICALP 2012]

List homomorphism removal problems [Chitnis, Egri, and M. ESA 2013]

. . . more work in progress.

Randomized sampling of important cuts


(9)

We want to partition objects into clusters subject to certain requirements (typically: related objects are clustered together, bounds on the number or size of the clusters, etc.).

(p,q)-clustering

Input: A graph G, integers p, q.

Find: A partition (V₁, …, V_m) of V(G) such that for every i, |V_i| ≤ p and δ(V_i) ≤ q.

δ(V_i): number of edges leaving V_i.

Theorem

(p,q)-clustering can be solved in time 2^{O(q)}·n^{O(1)}.
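To make the problem statement concrete, here is a minimal sketch of a checker for a candidate solution. It assumes a simple undirected graph given as a dictionary from each vertex to its set of neighbours; the function names are hypothetical.

```python
def leaving_edges(adj, part):
    """δ(part): number of edges with exactly one endpoint in `part`."""
    return sum(1 for u in part for w in adj[u] if w not in part)

def is_pq_clustering(adj, parts, p, q):
    """True if `parts` partitions V(G) and each part has size <= p and δ <= q."""
    covered = [v for part in parts for v in part]
    if len(covered) != len(set(covered)) or set(covered) != set(adj):
        return False  # parts overlap or miss a vertex
    return all(len(part) <= p and leaving_edges(adj, set(part)) <= q
               for part in parts)

# Path 0 - 1 - 2 - 3
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(is_pq_clustering(path, [{0, 1}, {2, 3}], p=2, q=1))
```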

Clustering


(10)

Good cluster: size at most p and at most q edges leaving it.

Necessary condition:

Every vertex is contained in a good cluster.

But surprisingly, this is also a sufficient condition!

Lemma

Graph G has a (p,q)-clustering if and only if every vertex is in a good cluster.

A sufficient and necessary condition


(12)

Lemma

Graph G has a (p,q)-clustering if and only if every vertex is in a good cluster.

Proof: Find a collection of good clusters covering every vertex and having minimum total size. Suppose two clusters X and Y intersect.

δ(X) + δ(Y) ≥ δ(X \ Y) + δ(Y \ X) (posimodularity)

⇒ either δ(X) ≥ δ(X \ Y) or δ(Y) ≥ δ(Y \ X) holds.

If δ(X) ≥ δ(X \ Y), replace X with X \ Y, strictly decreasing the total size of the clusters.

If δ(Y) ≥ δ(Y \ X), replace Y with Y \ X, strictly decreasing the total size of the clusters.

In both cases the new set is still a good cluster and X ∪ Y remains covered, contradicting minimality. QED
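The uncrossing step of the proof can be run directly as a procedure. The sketch below is an illustration under assumptions (adjacency dictionary of a simple undirected graph, hypothetical function names): it repeatedly picks two intersecting clusters and shrinks one of them as in the proof, until the clusters are pairwise disjoint.

```python
def leaving_edges(adj, part):
    """δ(part): number of edges with exactly one endpoint in `part`."""
    return sum(1 for u in part for w in adj[u] if w not in part)

def uncross(adj, clusters):
    """Turn a cover by good clusters into pairwise disjoint good clusters."""
    clusters = [set(c) for c in clusters if c]
    while True:
        pair = next(((i, j)
                     for i in range(len(clusters))
                     for j in range(i + 1, len(clusters))
                     if clusters[i] & clusters[j]), None)
        if pair is None:
            return clusters
        i, j = pair
        X, Y = clusters[i], clusters[j]
        if leaving_edges(adj, X) >= leaving_edges(adj, X - Y):
            clusters[i] = X - Y
        else:
            # posimodularity gives δ(Y) >= δ(Y \ X) in this branch
            clusters[j] = Y - X
        clusters = [c for c in clusters if c]  # drop clusters that became empty

# Path 0 - 1 - 2 - 3 - 4 covered by two overlapping clusters
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
result = uncross(path, [{0, 1, 2}, {2, 3, 4}])
print(result)
```

Each iteration strictly decreases the total size of the clusters, so the loop terminates, and neither the size bound nor the boundary bound of any cluster ever increases.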

(16)

We have seen:

Lemma

Graph G has a (p,q)-clustering if and only if every vertex is in a good cluster.

All we have to do is to check if a given vertex v is in a good cluster. Trivial to do in time n^{O(q)}.

We prove next:

Lemma

We can check in time 2^{O(q)}·n^{O(1)} if v is in a good cluster.
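One way to read the "trivial" n^{O(q)} bound: guess the at most q edges leaving the cluster and look at the component of v after deleting them. The sketch below follows that reading; it is an assumption about what the slide has in mind, with hypothetical function names and a simple undirected graph given as an adjacency dictionary.

```python
from itertools import combinations

def component(adj, removed, v):
    """Vertices reachable from v when the edges in `removed` are deleted."""
    seen, stack = {v}, [v]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if frozenset((u, w)) not in removed and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def good_cluster_bruteforce(adj, v, p, q):
    """Try every set S of at most q edges; the component of v in G \\ S has
    all its leaving edges in S, so it is good as soon as its size is <= p."""
    edges = list({frozenset((u, w)) for u in adj for w in adj[u]})
    for r in range(q + 1):
        for S in combinations(edges, r):
            C = component(adj, set(S), v)
            if len(C) <= p:
                return C
    return None

# Path 0 - 1 - 2 - 3 - 4
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(good_cluster_bruteforce(path, 2, p=1, q=2))
```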

Finding a good cluster


(18)

Definition

Fix a distinguished vertex v in a graph G. A set X ⊆ V(G) is an important set if

v ∉ X,

there is no set X′ with X ⊂ X′, v ∉ X′ and δ(X′) ≤ δ(X).

Observation: X is an important set if and only if δ(X) is an important (x,v)-cut for every x ∈ X.

Consequence: Every vertex is contained in at most 4^k important sets of boundary size at most k.

Important sets


(24)

Lemma

If C is a good cluster of minimum size containing v, then every component of G \ C is an important set.

Thus C can be obtained by removing at most q important sets from V(G) (but there are n^{O(q)} possibilities, we cannot try all of them).

Pushing argument


(30)

Let 𝒳 be the set of all important sets of boundary size at most q in G.

Let 𝒳′ ⊆ 𝒳 contain each set with probability 1/2 independently.

Let Z = ⋃_{X ∈ 𝒳′} X.

Let B be the set of vertices in C with neighbors outside C.

Lemma

Let C be a good cluster of minimum size containing v. With probability 2^{−2^{O(q)}}, Z covers G \ C and is disjoint from B.

Random sampling


(32)

Lemma

Two events:

(E1) Z covers G \ C.

Each of the at most q components is an important set

⇒ all of them are selected with probability at least 2^{−q}.

(E2) Z is disjoint from B.

Each vertex of B is in at most 4^q members of 𝒳

⇒ none of them is selected with probability at least 2^{−q·4^q}.

The two events are independent (they involve different sets of 𝒳), thus the claimed probability follows.
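A short worked version of the final step, multiplying the two estimates. It uses |B| ≤ q, which holds because every vertex of B has an edge leaving C and δ(C) ≤ q; the explicit constant in the exponent is only one convenient choice.

```latex
\[
\Pr[E_1 \wedge E_2] \;\ge\; 2^{-q} \cdot 2^{-q \cdot 4^{q}}
\;=\; 2^{-(q + q \cdot 4^{q})} \;=\; 2^{-2^{O(q)}},
\qquad \text{since } q + q \cdot 4^{q} \le 2^{3q+1}.
\]
```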

Random sampling


(33)

Let C be a good cluster of minimum size containing v and assume

G \ C is covered by Z, and

Z is disjoint from B (hence no edge going out of C is contained in Z).

[Figure: the vertex v, the set Z, and the rest of the graph G \ Z]

Where is the good cluster C in the figure?

Observe: Components of G[Z] are either fully in the cluster or fully outside the cluster. What is this problem?

KNAPSACK!

Finding good clusters

(36)

We interpret the components V₁, …, V_t of G[Z] as items:

V_i has value δ(V_i) and

V_i has weight |V_i|.

The goal is to select items with total value at least δ(Z) − q and total weight at most p − |V(G) \ Z|.

Standard DP solves it in polynomial time: let T[i,j] be the maximum value of a subset of the first i items having total weight at most j. Recurrence:

T[i,j] = max{T[i−1,j], T[i−1,j−|V_i|] + δ(V_i)}

Finding good clusters by Knapsack
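The recurrence can be sketched as follows. This is a minimal illustration with hypothetical names: `sizes[i]` and `boundaries[i]` stand for |V_i| and δ(V_i), `outside_size` for |V(G) \ Z|, and δ(Z) is taken to be the sum of the δ(V_i), assuming there are no edges between different components of G[Z].

```python
def select_components(sizes, boundaries, p, q, outside_size):
    """Return indices of components to add to G \\ Z, or None if impossible."""
    capacity = p - outside_size            # total weight allowed
    target = sum(boundaries) - q           # total value required: δ(Z) - q
    if capacity < 0:
        return None
    t = len(sizes)
    # T[i][j]: maximum value of a subset of the first i items, weight <= j
    T = [[0] * (capacity + 1) for _ in range(t + 1)]
    for i in range(1, t + 1):
        w, val = sizes[i - 1], boundaries[i - 1]
        for j in range(capacity + 1):
            T[i][j] = T[i - 1][j]
            if j >= w:
                T[i][j] = max(T[i][j], T[i - 1][j - w] + val)
    if T[t][capacity] < target:
        return None
    # Walk the table backwards to recover one optimal selection.
    chosen, j = [], capacity
    for i in range(t, 0, -1):
        if T[i][j] != T[i - 1][j]:
            chosen.append(i - 1)
            j -= sizes[i - 1]
    return sorted(chosen)

print(select_components([2, 3, 1], [3, 4, 1], p=6, q=3, outside_size=2))
```

The table has (t+1)·(capacity+1) entries and capacity is at most p ≤ n, so the running time is polynomial in n.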

(38)

(p,q)-clustering

Input: A graph G, integers p, q.

Find: A partition (V₁, …, V_m) of V(G) such that for every i, |V_i| ≤ p and δ(V_i) ≤ q.

It is sufficient to check for each vertex v if it is in a good cluster.

Enumerate all the important sets.

Let Z be the union of random important sets.

The solution is obtained by extending G \ Z with some of the components of G[Z].

Knapsack.

Summary of algorithm


(39)

Let 𝒳′ ⊆ 𝒳 contain each set X with probability 4^{−|δ(X)|}.

Let Z = ⋃_{X ∈ 𝒳′} X.

Let C be a good cluster of minimum size containing v. With probability 2^{−O(q)}, Z covers G \ C and is disjoint from B.
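The sampling step itself is short. The sketch below assumes the important sets have already been enumerated together with their boundary sizes; the function name and the `rng` parameter (any object with a `random()` method returning a number in [0, 1)) are hypothetical.

```python
import random

def sample_union(important_sets, boundary_sizes, rng=random):
    """Pick each set X independently with probability 4^(-|δ(X)|);
    return Z, the union of the picked sets."""
    Z = set()
    for X, b in zip(important_sets, boundary_sizes):
        if rng.random() < 4.0 ** (-b):
            Z |= set(X)
    return Z

class FixedCoin:
    """Deterministic stand-in for a random source, for illustration only."""
    def random(self):
        return 0.5

# Boundary 0 is always picked (probability 1); boundary 1 needs a draw < 1/4.
print(sample_union([{1, 2}, {3}], [0, 1], FixedCoin()))
```

Sets with a small boundary are picked with higher probability than sets with a large boundary, instead of the uniform 1/2 used before.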

Random sampling — better probability


(40)

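Lemma

Let C be a good cluster of minimum size containing v. With probability 2^{−O(q)}, Z covers G \ C and is disjoint from B.

We need to bound the probability of two independent events:

(E1) Z covers G \ C.

(E2) Z is disjoint from B.

(E1) Z covers G \ C.

Probability of selecting every component K₁, …, K_t of G \ C:

\[
\prod_{i=1}^{t} 4^{-|\delta(K_i)|} = 4^{-\sum_{i=1}^{t} |\delta(K_i)|} = 4^{-|\delta(C)|} \ge 4^{-q}.
\]

(E2) Z is disjoint from B.

Recall: $\sum_{S \in \mathcal{S}} 4^{-|S|} \le 1$ holds for the set $\mathcal{S}$ of important cuts.

Probability that no important set containing w ∈ B is selected:

\[
\prod_{\substack{X \in \mathcal{X} \\ w \in X}} \bigl(1 - 4^{-|\delta(X)|}\bigr)
\approx \prod_{\substack{X \in \mathcal{X} \\ w \in X}} \exp\bigl(-4^{-|\delta(X)|}\bigr)
= \exp\Bigl(-\sum_{\substack{X \in \mathcal{X} \\ w \in X}} 4^{-|\delta(X)|}\Bigr)
\ge 1/e.
\]

Thus the probability that no vertex of B is covered is 2^{−O(|B|)}:

\[
\prod_{\substack{X \in \mathcal{X} \\ X \cap B \neq \emptyset}} \bigl(1 - 4^{-|\delta(X)|}\bigr)
\ge \prod_{w \in B} \prod_{\substack{X \in \mathcal{X} \\ w \in X}} \bigl(1 - 4^{-|\delta(X)|}\bigr)
= 2^{-O(|B|)} = 2^{-O(q)}.
\]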

(43)

Randomized 2^{O(q)}·n^{O(1)} time algorithm for (p,q)-clustering.

Derandomization is possible using standard techniques, but nontrivial to obtain 2^{O(q)} running time.

Parameterization by p: we can get a 2^{O(p)}·n^{O(1)} time algorithm.

Other variants: maximum degree in the cluster is at most p, etc.

(p,q)-clustering

(44)

Let G be a graph and let F be a set of subgraphs in G.

Definition

F-transversal: a set of edges or vertices intersecting each subgraph in F (i.e., “hitting” or “killing” every object in F).

Classical problems formulated as finding a minimum transversal:

s−t Cut: F is the set of s−t paths.

Multiway Cut: F is the set of paths between terminals.

(Directed) Feedback Vertex Set: F is the set of (directed) cycles.

Delete edges/vertices to make the graph bipartite: F is the set of odd cycles.

v is in a (p,q)-cluster: F is the set of all connected graphs of size p+1 containing v.

Transversal problems


(45)

Let F be a set of connected (not necessarily disjoint!) subgraphs, each intersecting a set T of vertices.

[Figure: terminals t₁, t₂, t₃, t₄, a transversal S, and its shadow]

The shadow of an F-transversal S is the set of vertices not reachable from T in G \ S.
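Computing the shadow of a given edge set is a plain graph search from T. A minimal sketch, assuming an undirected simple graph as an adjacency dictionary and S given as a collection of vertex pairs (the function name is hypothetical):

```python
from collections import deque

def shadow(adj, T, S):
    """Vertices not reachable from the terminal set T after deleting edges S."""
    removed = {frozenset(e) for e in S}
    seen = set(T)
    queue = deque(T)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if frozenset((u, w)) in removed or w in seen:
                continue
            seen.add(w)
            queue.append(w)
    return set(adj) - seen

# t1 - a - b - t2, with an extra vertex c attached to a
g = {"t1": {"a"}, "a": {"t1", "b", "c"}, "b": {"a", "t2"},
     "c": {"a"}, "t2": {"b"}}
print(shadow(g, ["t1", "t2"], [("t1", "a"), ("b", "t2")]))
```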

The setting


(47)

Shadow: Set of vertices not reachable from T in G \ S.

Condition: every F ∈ F is connected and intersects T.

Theorem

In 2^{O(k)}·n^{O(1)} time, we can compute a set Z with the following property. If there exists an F-transversal of at most k edges, then with probability 2^{−O(k)} there is a minimum F-transversal S with

the shadow of S is covered by Z and

no edge of S is contained in Z.

Note: The algorithm does not have to know F!

Proof idea: we can assume that every component of the shadow is an important set (solution can be pushed towards T). Random selection as in the clustering problem.

What is this good for?

The random sampling (undirected edge version)


(49)

F is the set of all connected graphs of size p+1 containing v.

v is in a (p,q)-cluster

⇕

an F-transversal of q edges exists.

[Figure: the vertex v and the set B]

Theorem

In 2^{O(k)}·n^{O(1)} time, we can compute a set Z with the following property. If there exists an F-transversal of at most k edges, then with probability 2^{−O(k)} there is a minimum F-transversal S with

the shadow of S is covered by Z and

no edge of S is contained in Z.

(p, q)-clusters as F -transversal


(52)

(Directed) Multiway Cut

Input: Graph G, set of vertices T, integer k

Find: A set S of at most k vertices such that G \ S has no (directed) t₁−t₂ path for any two distinct t₁, t₂ ∈ T

We have seen:

Theorem

Multiway Cut can be solved in time 4^k·n^{O(1)}.

Directed version:

Theorem

Directed Multiway Cut is FPT.

Can be formulated as minimum F-transversal, where F is the set of directed paths between vertices of T.

Multiway cut


(53)

Shadow: those vertices of G \ S that cannot be reached from T AND those vertices of G \ S from which T cannot be reached.

[Figure: a directed graph with the solution S and terminals t₁, t₂, t₃]
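In the directed vertex version the shadow has two parts. The sketch below reads the "AND" in the definition as a union of the two kinds of vertices (this reading is an assumption), and takes the digraph as a dictionary from every vertex to its set of out-neighbours; the function names are hypothetical.

```python
def reach(succ, sources, deleted):
    """Vertices reachable from `sources` along `succ`, avoiding `deleted`."""
    seen = {s for s in sources if s not in deleted}
    stack = list(seen)
    while stack:
        u = stack.pop()
        for w in succ[u]:
            if w not in deleted and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def directed_shadow(succ, T, S):
    """Vertices of G \\ S not reachable from T, together with those
    vertices of G \\ S from which T is not reachable."""
    S = set(S)
    pred = {v: set() for v in succ}        # reversed digraph
    for u in succ:
        for w in succ[u]:
            pred[w].add(u)
    from_T = reach(succ, T, S)             # reachable from T
    to_T = reach(pred, T, S)               # can reach T
    rest = set(succ) - S
    return (rest - from_T) | (rest - to_T)

# t1 -> a -> s -> b -> t2, plus c -> a
d = {"t1": {"a"}, "a": {"s"}, "s": {"b"}, "b": {"t2"}, "t2": set(), "c": {"a"}}
print(directed_shadow(d, ["t1", "t2"], ["s"]))
```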

Directed Multiway Cut


(54)

Shadow: those vertices of G \ S that cannot be reached from T AND those vertices of G \ S from which T cannot be reached.

Condition: for every F ∈ F and every vertex v ∈ F, there is a T → v and a v → T path in F.

Theorem

In f(k)·n^{O(1)} time, we can compute a set Z with the following property. If there exists an F-transversal of at most k vertices, then with probability 2^{−O(k²)} there is a minimum F-transversal S with

the shadow of S is covered by Z and

S ∩ Z = ∅.

Now:

T: terminals

F contains every directed path between two distinct terminals

The random sampling (directed vertex version)


(55)

We can assume that Z is disjoint from the solution, so we want to get rid of Z.

Deleting Z is not a good idea: it can make the problem easier.

To compensate for deleting Z, if there is an a → b path with internal vertices in Z, add a direct a → b edge.

[Figure: terminals t₁, t₂, t₃, t₄ and the set Z]

Crucial observation:

S remains a solution (since Z is disjoint from S) and

S is a shadowless solution (since Z covers the shadow of S).
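The compensation step can be sketched as a search from each remaining vertex that is only allowed to pass through Z. The representation (dictionary from every vertex to its set of out-neighbours) and the function name are assumptions made for illustration.

```python
def remove_with_shortcuts(succ, Z):
    """Delete Z from the digraph; whenever a path a -> ... -> b has all
    internal vertices in Z, keep a direct edge a -> b."""
    Z = set(Z)
    keep = [v for v in succ if v not in Z]
    new_succ = {v: set() for v in keep}
    for a in keep:
        stack, seen = list(succ[a]), set()
        while stack:
            w = stack.pop()
            if w in seen:
                continue
            seen.add(w)
            if w in Z:
                stack.extend(succ[w])      # keep walking inside Z
            elif w != a:
                new_succ[a].add(w)         # left Z (or direct edge): a -> w
    return new_succ

# a -> z1 -> z2 -> b and a -> c, with Z = {z1, z2}
d = {"a": {"z1", "c"}, "z1": {"z2"}, "z2": {"b"}, "b": set(), "c": set()}
print(remove_with_shortcuts(d, {"z1", "z2"}))
```

Edges between vertices outside Z are preserved, since a direct edge is a path with no internal vertices.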

Shadow removal


(59)

What does a shadowless solution look like?

[Figure: a shadowless solution S and the terminals]

It is an undirected multiway cut in the underlying undirected graph!

⇒ Problem can be reduced to undirected multiway cut.

Shadowless solutions


(62)

A simple (but essentially tight) bound on the number of important cuts.

Algorithmic results: FPT algorithms for

Multiway Cut in undirected graphs,

Skew Multicut in directed graphs,

Directed Feedback Vertex/Edge Set,

(p,q)-Clustering,

Directed Multiway Cut.

Summary
