Machine learning and portfolio selections. II.

László (Laci) Györfi
Department of Computer Science and Information Theory,
Budapest University of Technology and Economics, Budapest, Hungary

September 22, 2007

e-mail: gyorfi@szit.bme.hu
www.szit.bme.hu/~gyorfi
www.szit.bme.hu/~oti/portfolio
Dynamic portfolio selection: general case

$\mathbf{x}_i = (x_i(1), \dots, x_i(d))$ is the return vector on day $i$;
$\mathbf{b} = \mathbf{b}_1$ is the portfolio vector for the first day, and $S_0$ is the initial capital:
$$S_1 = S_0 \cdot \langle \mathbf{b}_1, \mathbf{x}_1 \rangle.$$
For the second day, $S_1$ is the new initial capital and the portfolio vector is $\mathbf{b}_2 = \mathbf{b}(\mathbf{x}_1)$:
$$S_2 = S_0 \cdot \langle \mathbf{b}_1, \mathbf{x}_1 \rangle \cdot \langle \mathbf{b}(\mathbf{x}_1), \mathbf{x}_2 \rangle.$$
On the $n$th day a portfolio strategy is $\mathbf{b}_n = \mathbf{b}(\mathbf{x}_1, \dots, \mathbf{x}_{n-1}) = \mathbf{b}(\mathbf{x}_1^{n-1})$, so
$$S_n = S_0 \prod_{i=1}^{n} \langle \mathbf{b}(\mathbf{x}_1^{i-1}), \mathbf{x}_i \rangle = S_0\, e^{n W_n(\mathbf{B})}$$
with the average growth rate
$$W_n(\mathbf{B}) = \frac{1}{n} \sum_{i=1}^{n} \ln \langle \mathbf{b}(\mathbf{x}_1^{i-1}), \mathbf{x}_i \rangle.$$
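As an illustration (mine, not from the slides), the wealth recursion above can be sketched in a few lines of Python; the strategy below is the constantly rebalanced uniform portfolio, and the return data are synthetic:

```python
import math
import random

def wealth_and_growth(returns, strategy, s0=1.0):
    """Run the wealth recursion S_n = S_0 * prod_i <b(x_1^{i-1}), x_i>.

    returns:  list of return vectors x_i (x_i(j) = price ratio of asset j on day i)
    strategy: maps the history x_1^{i-1} to a portfolio vector b (non-negative, sums to 1)
    Returns (S_n, W_n), where W_n = (1/n) ln(S_n / S_0) is the average growth rate.
    """
    s = s0
    for i, x in enumerate(returns):
        b = strategy(returns[:i])                    # b_i may depend on the past only
        s *= sum(bj * xj for bj, xj in zip(b, x))    # one day's growth factor <b, x_i>
        assert s > 0, "portfolio must keep strictly positive wealth"
    w = math.log(s / s0) / len(returns)
    return s, w

# Example: d = 2 assets, uniform (constantly rebalanced) portfolio, synthetic returns.
random.seed(0)
data = [(random.uniform(0.9, 1.1), random.uniform(0.95, 1.06)) for _ in range(250)]
sn, wn = wealth_and_growth(data, lambda past: (0.5, 0.5))
```

With constant returns $\mathbf{x}_i = (1, 1)$ the wealth stays at $S_0$ and $W_n = 0$, which is a convenient sanity check on the recursion.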
Log-optimum portfolio

$\mathbf{X}_1, \mathbf{X}_2, \dots$ are drawn from a vector-valued stationary and ergodic process.
The log-optimum portfolio $\mathbf{B}^* = \{\mathbf{b}^*(\cdot)\}$ satisfies
$$\mathbf{E}\{\ln \langle \mathbf{b}^*(\mathbf{X}_1^{n-1}), \mathbf{X}_n \rangle \mid \mathbf{X}_1^{n-1}\} = \max_{\mathbf{b}(\cdot)} \mathbf{E}\{\ln \langle \mathbf{b}(\mathbf{X}_1^{n-1}), \mathbf{X}_n \rangle \mid \mathbf{X}_1^{n-1}\},$$
where $\mathbf{X}_1^{n-1} = \mathbf{X}_1, \dots, \mathbf{X}_{n-1}$.
Optimality

Algoet and Cover (1988): If $S_n^* = S_n(\mathbf{B}^*)$ denotes the capital after day $n$ achieved by a log-optimum portfolio strategy $\mathbf{B}^*$, then for any portfolio strategy $\mathbf{B}$ with capital $S_n = S_n(\mathbf{B})$ and for any process $\{\mathbf{X}_n\}_{-\infty}^{\infty}$,
$$\limsup_{n \to \infty} \left( \frac{1}{n} \ln S_n - \frac{1}{n} \ln S_n^* \right) \le 0 \quad \text{almost surely};$$
for a stationary ergodic process $\{\mathbf{X}_n\}_{-\infty}^{\infty}$,
$$\lim_{n \to \infty} \frac{1}{n} \ln S_n^* = W^* \quad \text{almost surely, where}$$
$$W^* = \mathbf{E}\left\{ \max_{\mathbf{b}(\cdot)} \mathbf{E}\{\ln \langle \mathbf{b}(\mathbf{X}_{-\infty}^{-1}), \mathbf{X}_0 \rangle \mid \mathbf{X}_{-\infty}^{-1}\} \right\}$$
is the maximal growth rate of any portfolio.
Martingale difference sequences

For the proof of optimality we use the concept of martingale differences.

Definition. Let $\{Z_n\}$ and $\{X_n\}$ be two sequences of random variables such that
$Z_n$ is a function of $X_1, \dots, X_n$, and
$$\mathbf{E}\{Z_n \mid X_1, \dots, X_{n-1}\} = 0 \quad \text{almost surely.}$$
Then $\{Z_n\}$ is called a martingale difference sequence with respect to $\{X_n\}$.
A strong law of large numbers

Chow's theorem: If $\{Z_n\}$ is a martingale difference sequence with respect to $\{X_n\}$ and
$$\sum_{n=1}^{\infty} \frac{\mathbf{E}\{Z_n^2\}}{n^2} < \infty,$$
then
$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} Z_i = 0 \quad \text{a.s.}$$
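A quick numerical illustration (my own, not from the slides): i.i.d. centered variables are a special case of a martingale difference sequence with bounded $\mathbf{E}\{Z_n^2\}$, so Chow's condition holds and the running average tends to 0:

```python
import random

def running_average(zs):
    """Running averages (1/n) * sum_{i<=n} Z_i of a sequence."""
    total, out = 0.0, []
    for n, z in enumerate(zs, start=1):
        total += z
        out.append(total / n)
    return out

random.seed(1)
# Z_n uniform on [-1, 1]: i.i.d. with mean 0, hence a martingale difference
# sequence, with E{Z_n^2} = 1/3, so sum_n E{Z_n^2}/n^2 < infinity.
zs = [random.uniform(-1.0, 1.0) for _ in range(100_000)]
avgs = running_average(zs)
```

After $10^5$ draws the running average is well inside $\pm 0.05$, consistent with the a.s. convergence the theorem asserts.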
A weak law of large numbers

Lemma: If $\{Z_n\}$ is a martingale difference sequence with respect to $\{X_n\}$, then the $\{Z_n\}$ are uncorrelated.

Proof. Put $i < j$. Then
$$\mathbf{E}\{Z_i Z_j\} = \mathbf{E}\{\mathbf{E}\{Z_i Z_j \mid X_1, \dots, X_{j-1}\}\} = \mathbf{E}\{Z_i\, \mathbf{E}\{Z_j \mid X_1, \dots, X_{j-1}\}\} = \mathbf{E}\{Z_i \cdot 0\} = 0.$$

Corollary:
$$\mathbf{E}\left\{\left(\frac{1}{n}\sum_{i=1}^{n} Z_i\right)^2\right\} = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} \mathbf{E}\{Z_i Z_j\} = \frac{1}{n^2}\sum_{i=1}^{n} \mathbf{E}\{Z_i^2\} \to 0$$
if, for example, $\mathbf{E}\{Z_i^2\}$ is a bounded sequence.
Constructing a martingale difference sequence

Let $\{Y_n\}$ be an arbitrary sequence such that $Y_n$ is a function of $X_1, \dots, X_n$. Put
$$Z_n = Y_n - \mathbf{E}\{Y_n \mid X_1, \dots, X_{n-1}\}.$$
Then $\{Z_n\}$ is a martingale difference sequence: $Z_n$ is a function of $X_1, \dots, X_n$, and
$$\mathbf{E}\{Z_n \mid X_1, \dots, X_{n-1}\} = \mathbf{E}\{Y_n - \mathbf{E}\{Y_n \mid X_1, \dots, X_{n-1}\} \mid X_1, \dots, X_{n-1}\} = 0$$
almost surely.
Optimality

The log-optimum portfolio $\mathbf{B}^* = \{\mathbf{b}^*(\cdot)\}$:
$$\mathbf{E}\{\ln \langle \mathbf{b}^*(\mathbf{X}_1^{n-1}), \mathbf{X}_n \rangle \mid \mathbf{X}_1^{n-1}\} = \max_{\mathbf{b}(\cdot)} \mathbf{E}\{\ln \langle \mathbf{b}(\mathbf{X}_1^{n-1}), \mathbf{X}_n \rangle \mid \mathbf{X}_1^{n-1}\}.$$
If $S_n^* = S_n(\mathbf{B}^*)$ denotes the capital after day $n$ achieved by a log-optimum portfolio strategy $\mathbf{B}^*$, then for any portfolio strategy $\mathbf{B}$ with capital $S_n = S_n(\mathbf{B})$ and for any process $\{\mathbf{X}_n\}_{-\infty}^{\infty}$,
$$\limsup_{n \to \infty} \left( \frac{1}{n} \ln S_n - \frac{1}{n} \ln S_n^* \right) \le 0 \quad \text{almost surely.}$$
Proof of optimality

$$\frac{1}{n} \ln S_n = \frac{1}{n} \sum_{i=1}^{n} \ln \langle \mathbf{b}(\mathbf{X}_1^{i-1}), \mathbf{X}_i \rangle
= \frac{1}{n} \sum_{i=1}^{n} \mathbf{E}\{\ln \langle \mathbf{b}(\mathbf{X}_1^{i-1}), \mathbf{X}_i \rangle \mid \mathbf{X}_1^{i-1}\}
+ \frac{1}{n} \sum_{i=1}^{n} \left( \ln \langle \mathbf{b}(\mathbf{X}_1^{i-1}), \mathbf{X}_i \rangle - \mathbf{E}\{\ln \langle \mathbf{b}(\mathbf{X}_1^{i-1}), \mathbf{X}_i \rangle \mid \mathbf{X}_1^{i-1}\} \right)$$
and
$$\frac{1}{n} \ln S_n^* = \frac{1}{n} \sum_{i=1}^{n} \mathbf{E}\{\ln \langle \mathbf{b}^*(\mathbf{X}_1^{i-1}), \mathbf{X}_i \rangle \mid \mathbf{X}_1^{i-1}\}
+ \frac{1}{n} \sum_{i=1}^{n} \left( \ln \langle \mathbf{b}^*(\mathbf{X}_1^{i-1}), \mathbf{X}_i \rangle - \mathbf{E}\{\ln \langle \mathbf{b}^*(\mathbf{X}_1^{i-1}), \mathbf{X}_i \rangle \mid \mathbf{X}_1^{i-1}\} \right).$$
In both decompositions the second sum is an average of martingale differences, so it tends to 0 a.s. by Chow's theorem, while the first sum for $\mathbf{B}^*$ dominates the first sum for $\mathbf{B}$ term by term, by the definition of the log-optimum portfolio.
Universally consistent portfolio

These limit relations give rise to the following definition.

Definition. An empirical (data-driven) portfolio strategy $\mathbf{B}$ is called universally consistent with respect to a class $\mathcal{C}$ of stationary and ergodic processes $\{\mathbf{X}_n\}_{-\infty}^{\infty}$ if, for each process in the class,
$$\lim_{n \to \infty} \frac{1}{n} \ln S_n(\mathbf{B}) = W^* \quad \text{almost surely.}$$
Empirical portfolio selection

$$\mathbf{E}\{\ln \langle \mathbf{b}^*(\mathbf{X}_1^{n-1}), \mathbf{X}_n \rangle \mid \mathbf{X}_1^{n-1}\} = \max_{\mathbf{b}(\cdot)} \mathbf{E}\{\ln \langle \mathbf{b}(\mathbf{X}_1^{n-1}), \mathbf{X}_n \rangle \mid \mathbf{X}_1^{n-1}\}$$
For a fixed integer $k > 0$,
$$\mathbf{E}\{\ln \langle \mathbf{b}(\mathbf{X}_1^{n-1}), \mathbf{X}_n \rangle \mid \mathbf{X}_1^{n-1}\} \approx \mathbf{E}\{\ln \langle \mathbf{b}(\mathbf{X}_{n-k}^{n-1}), \mathbf{X}_n \rangle \mid \mathbf{X}_{n-k}^{n-1}\}$$
and
$$\mathbf{b}^*(\mathbf{X}_1^{n-1}) \approx \mathbf{b}_k(\mathbf{X}_{n-k}^{n-1}) = \arg\max_{\mathbf{b}(\cdot)} \mathbf{E}\{\ln \langle \mathbf{b}(\mathbf{X}_{n-k}^{n-1}), \mathbf{X}_n \rangle \mid \mathbf{X}_{n-k}^{n-1}\}.$$
Because of stationarity,
$$\mathbf{b}_k(\mathbf{x}_1^k) = \arg\max_{\mathbf{b}(\cdot)} \mathbf{E}\{\ln \langle \mathbf{b}(\mathbf{x}_1^k), \mathbf{X}_{k+1} \rangle \mid \mathbf{X}_1^k = \mathbf{x}_1^k\} = \arg\max_{\mathbf{b}} \mathbf{E}\{\ln \langle \mathbf{b}, \mathbf{X}_{k+1} \rangle \mid \mathbf{X}_1^k = \mathbf{x}_1^k\},$$
which is the maximization of the regression function
$$m_{\mathbf{b}}(\mathbf{x}_1^k) = \mathbf{E}\{\ln \langle \mathbf{b}, \mathbf{X}_{k+1} \rangle \mid \mathbf{X}_1^k = \mathbf{x}_1^k\}.$$
Regression function

$Y$ is real-valued, $\mathbf{X}$ is the observation vector.
Regression function: $m(\mathbf{x}) = \mathbf{E}\{Y \mid \mathbf{X} = \mathbf{x}\}$.
i.i.d. data: $D_n = \{(\mathbf{X}_1, Y_1), \dots, (\mathbf{X}_n, Y_n)\}$.
Regression function estimate: $m_n(\mathbf{x}) = m_n(\mathbf{x}, D_n)$.
Local averaging estimates:
$$m_n(\mathbf{x}) = \sum_{i=1}^{n} W_{ni}(\mathbf{x}; \mathbf{X}_1, \dots, \mathbf{X}_n)\, Y_i.$$
L. Györfi, M. Kohler, A. Krzyzak, H. Walk (2002). A Distribution-Free Theory of Nonparametric Regression. Springer-Verlag, New York.
Correspondence

$$\mathbf{X} \sim \mathbf{X}_1^k, \qquad Y \sim \ln \langle \mathbf{b}, \mathbf{X}_{k+1} \rangle,$$
$$m(\mathbf{x}) = \mathbf{E}\{Y \mid \mathbf{X} = \mathbf{x}\} \;\sim\; m_{\mathbf{b}}(\mathbf{x}_1^k) = \mathbf{E}\{\ln \langle \mathbf{b}, \mathbf{X}_{k+1} \rangle \mid \mathbf{X}_1^k = \mathbf{x}_1^k\}.$$
Partitioning regression estimate

Partition $\mathcal{P}_n = \{A_{n,1}, A_{n,2}, \dots\}$; $A_n(\mathbf{x})$ is the cell of the partition $\mathcal{P}_n$ into which $\mathbf{x}$ falls:
$$m_n(\mathbf{x}) = \frac{\sum_{i=1}^{n} Y_i\, I_{[\mathbf{X}_i \in A_n(\mathbf{x})]}}{\sum_{i=1}^{n} I_{[\mathbf{X}_i \in A_n(\mathbf{x})]}}.$$
Let $G_n$ be the quantizer corresponding to the partition $\mathcal{P}_n$: $G_n(\mathbf{x}) = j$ if $\mathbf{x} \in A_{n,j}$. With the set of matches
$$I_n(\mathbf{x}) = \{i \le n : G_n(\mathbf{x}) = G_n(\mathbf{X}_i)\},$$
$$m_n(\mathbf{x}) = \frac{\sum_{i \in I_n(\mathbf{x})} Y_i}{|I_n(\mathbf{x})|}.$$
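The match-based form of the estimate can be sketched directly (my own illustration, assuming 1-D data and a uniform grid of cell width $h$ as the partition):

```python
def partitioning_estimate(x, data, h=0.5):
    """Partitioning regression estimate on a uniform 1-D grid of cell width h.

    G_n(x) = floor(x / h) plays the role of the quantizer; the estimate is the
    average of the Y_i whose X_i fall into the same cell as x (the "matches").
    """
    cell = int(x // h)                                          # G_n(x)
    matches = [y for (xi, y) in data if int(xi // h) == cell]   # I_n(x)
    if not matches:                                             # empty cell
        return 0.0
    return sum(matches) / len(matches)

# Toy data with Y = X, so within a cell the estimate is the mean of the X_i there.
data = [(0.1, 0.1), (0.2, 0.2), (0.8, 0.8), (1.3, 1.3)]
est = partitioning_estimate(0.15, data, h=0.5)   # cell [0, 0.5): averages 0.1 and 0.2
```

The return value 0.0 on an empty cell is an arbitrary convention for this sketch; the slides' estimator is simply undefined when no $\mathbf{X}_i$ shares the cell.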
Partitioning-based portfolio selection

Fix $k, \ell = 1, 2, \dots$; let $\mathcal{P}_\ell = \{A_{\ell,j},\, j = 1, 2, \dots, m_\ell\}$ be finite partitions of $\mathbb{R}^d$, and let $G_\ell$ be the corresponding quantizer: $G_\ell(\mathbf{x}) = j$ if $\mathbf{x} \in A_{\ell,j}$. Write $G_\ell(\mathbf{x}_1^n) = G_\ell(\mathbf{x}_1), \dots, G_\ell(\mathbf{x}_n)$, and define the set of matches
$$J_n = \{k < i < n : G_\ell(\mathbf{x}_{i-k}^{i-1}) = G_\ell(\mathbf{x}_{n-k}^{n-1})\}.$$
Then
$$\mathbf{b}^{(k,\ell)}(\mathbf{x}_1^{n-1}) = \arg\max_{\mathbf{b}} \sum_{i \in J_n} \ln \langle \mathbf{b}, \mathbf{x}_i \rangle$$
if the set $J_n$ is non-void, and $\mathbf{b}_0 = (1/d, \dots, 1/d)$ otherwise.
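A rough sketch of $\mathbf{b}^{(k,\ell)}$ for $d = 2$ assets (my own illustration: the quantizer is a simple one-bit-per-asset rule, and the arg max is a 1-D grid search over the simplex rather than a proper convex solver):

```python
import math

def quantize(x):
    """Toy quantizer G_l for d = 2 assets: one bit per asset (return >= 1 or not)."""
    return tuple(int(xj >= 1.0) for xj in x)

def elementary_portfolio(returns, k, grid=100):
    """Sketch of b^(k,l)(x_1^{n-1}) for d = 2: collect the past days whose
    preceding k quantized return vectors match today's, then grid-search the
    portfolio maximizing the sum of log-returns over those days."""
    pattern = [quantize(x) for x in returns[-k:]]        # G_l(x_{n-k}^{n-1})
    matches = [t for t in range(k, len(returns))         # J_n, as list indices
               if [quantize(x) for x in returns[t - k:t]] == pattern]
    if not matches:
        return (0.5, 0.5)                                # b_0 = (1/d, ..., 1/d)
    best_b, best_val = (0.5, 0.5), -math.inf
    for j in range(grid + 1):                            # search the 2-simplex
        b = (j / grid, 1.0 - j / grid)
        val = sum(math.log(b[0] * returns[t][0] + b[1] * returns[t][1])
                  for t in matches)
        if val > best_val:
            best_b, best_val = b, val
    return best_b
```

In practice the inner maximization is a concave program over the simplex and would be solved with a convex optimizer; the grid search here only keeps the sketch dependency-free.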
Elementary portfolios

For fixed $k, \ell = 1, 2, \dots$, the strategies $\mathbf{B}^{(k,\ell)} = \{\mathbf{b}^{(k,\ell)}(\cdot)\}$ are called elementary portfolios.

That is, $\mathbf{b}_n^{(k,\ell)}$ quantizes the sequence $\mathbf{x}_1^{n-1}$ according to the partition $\mathcal{P}_\ell$ and browses through all past appearances of the last-seen quantized string $G_\ell(\mathbf{x}_{n-k}^{n-1})$ of length $k$. Then it designs a fixed portfolio vector according to the returns on the days following the occurrences of the string.
Combining elementary portfolios

How to choose $k$ and $\ell$?
small $k$ or small $\ell$: large bias;
large $k$ and large $\ell$: few matches, large variance.
Machine learning: combination of experts.
N. Cesa-Bianchi and G. Lugosi, Prediction, Learning, and Games. Cambridge University Press, 2006.
Exponential weighting

Combine the elementary portfolio strategies $\mathbf{B}^{(k,\ell)} = \{\mathbf{b}_n^{(k,\ell)}\}$. Let $\{q_{k,\ell}\}$ be a probability distribution on the set of all pairs $(k, \ell)$ such that $q_{k,\ell} > 0$ for all $k, \ell$. For $\eta > 0$, put
$$w_{n,k,\ell} = q_{k,\ell}\, e^{\eta \ln S_{n-1}(\mathbf{B}^{(k,\ell)})};$$
for $\eta = 1$,
$$w_{n,k,\ell} = q_{k,\ell}\, e^{\ln S_{n-1}(\mathbf{B}^{(k,\ell)})} = q_{k,\ell}\, S_{n-1}(\mathbf{B}^{(k,\ell)}),$$
and
$$v_{n,k,\ell} = \frac{w_{n,k,\ell}}{\sum_{i,j} w_{n,i,j}}.$$
The combined portfolio $\mathbf{b}$:
$$\mathbf{b}_n(\mathbf{x}_1^{n-1}) = \sum_{k,\ell} v_{n,k,\ell}\, \mathbf{b}_n^{(k,\ell)}(\mathbf{x}_1^{n-1}).$$
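A small sketch of the $\eta = 1$ mixture (my own illustration): each expert's weight is its prior times its accumulated wealth, and the combined portfolio is the weighted average of the experts' current vectors:

```python
def combine_experts(priors, wealths, expert_portfolios):
    """Exponentially weighted mixture of expert portfolios with eta = 1.

    priors:            q_{k,l}             (positive, summing to 1)
    wealths:           S_{n-1}(B^{(k,l)})  accumulated wealth of each expert
    expert_portfolios: b_n^{(k,l)}         each expert's portfolio for today
    Returns b_n = sum_{k,l} v_{n,k,l} * b_n^{(k,l)} with v proportional to q * S.
    """
    w = [q * s for q, s in zip(priors, wealths)]   # w_{n,k,l} = q_{k,l} * S_{n-1}
    total = sum(w)
    v = [wi / total for wi in w]                   # normalized weights v_{n,k,l}
    d = len(expert_portfolios[0])
    return tuple(sum(v[e] * expert_portfolios[e][j] for e in range(len(v)))
                 for j in range(d))

# Two experts on d = 2 assets: the wealthier expert dominates the mixture.
b = combine_experts(priors=[0.5, 0.5], wealths=[3.0, 1.0],
                    expert_portfolios=[(1.0, 0.0), (0.0, 1.0)])
# v = (0.75, 0.25), so b = (0.75, 0.25)
```

Since the weights $v_{n,k,\ell}$ sum to 1 and each expert portfolio lies in the simplex, the combined vector is again a valid portfolio.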