Bounding the degree - Effective results for Diophantine problems over finitely generated domains

In this section we shall prove (7.11) of Proposition 7.3 and (7.14) of Proposition 7.4.

We recall some results on function fields in one variable. Let k be an algebraically closed field of characteristic 0, z a transcendental element over k and M a finite extension of k(z). Denote by g_{M/k} the genus of M, and by M_M the collection of valuations of M/k, that is, the discrete valuations of M with value group Z that are trivial on k. Recall that these valuations satisfy the sum formula

∑_{v∈M_M} v(α) = 0 for α ∈ M^∗.

For a finite subset S of M_M, an element α ∈ M is called an S-integer if v(α) ≥ 0 for all v ∈ M_M \ S. The S-integers form a ring in M, denoted by O_S. The (homogeneous) height of a = (α₁, . . . , α_l) ∈ M^l relative to M/k is defined by

H_M(a) = H_M(α₁, . . . , α_l) := − ∑_{v∈M_M} min(v(α₁), . . . , v(α_l)),

and we define the height H_M(f) of a polynomial f ∈ M[X] to be the height of the vector of coefficients of f. Further, we shall write H_M(1, a) := H_M(1, α₁, . . . , α_l). We note that

H_M(α_i) ≤ H_M(a) ≤ H_M(α₁) + · · · + H_M(α_l), i = 1, . . . , l. (7.17)

By the sum formula,

H_M(αa) = H_M(a) for α ∈ M^∗. (7.18)

The height of α ∈ M relative to M/k is defined by

H_M(α) := H_M(1, α) = − ∑_{v∈M_M} min(0, v(α)).

It is clear that H_M(α) = 0 if and only if α∈k. Using the sum formula, it is easy to prove that the height has the properties

H_M(α^l) = |l| H_M(α),

H_M(α + β) ≤ H_M(α) + H_M(β), H_M(αβ) ≤ H_M(α) + H_M(β) (7.19)

for all non-zero α, β ∈ M and for every integer l.

If L is a finite extension of M, we have

H_L(α₀, . . . , α_l) = [L : M] H_M(α₀, . . . , α_l) for α₀, . . . , α_l ∈ M. (7.20)

By deg f we denote the total degree of f ∈ k[z]. For f₀, . . . , f_l ∈ k[z] with gcd(f₀, . . . , f_l) = 1 we have

H_{k(z)}(f₀, . . . , f_l) = max(deg f₀, . . . , deg f_l). (7.21)
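The sum formula, (7.18) and (7.21) can be illustrated in the special case of rational functions in k(z) whose zeros and poles are integers. The following is a minimal Python sketch under that assumption. The divisor representation (a dict mapping each zero or pole a to its order), the label INF and the helper names valuations, height and times are hypothetical and chosen for this example only.

```python
INF = "inf"  # hypothetical label for the place at infinity, where v(z) < 0


def valuations(divisor):
    """Return {place: v(alpha)} for alpha = c * prod (z - a)^e given as {a: e}."""
    vals = {a: e for a, e in divisor.items() if e != 0}
    # v_inf(alpha) = deg(denominator) - deg(numerator)
    vals[INF] = -sum(divisor.values())
    return vals


def height(*alphas):
    """Homogeneous height H(alpha_1, ..., alpha_l) = -sum_v min_i v(alpha_i)."""
    vals = [valuations(alpha) for alpha in alphas]
    places = set().union(*vals)
    return -sum(min(v.get(p, 0) for v in vals) for p in places)


def times(d1, d2):
    """Divisor of a product: orders add up place by place."""
    return {a: d1.get(a, 0) + d2.get(a, 0) for a in set(d1) | set(d2)}


f0 = {0: 1, 1: 1}      # z(z - 1), degree 2
f1 = {2: 3}            # (z - 2)^3, degree 3
f2 = {}                # the constant 1
alpha = {0: 1, 5: -2}  # z / (z - 5)^2

# Sum formula: the valuations of a non-zero element add up to 0.
assert sum(valuations(alpha).values()) == 0

print(height(f0, f1, f2))                                # (7.21): max(2, 3, 0) = 3
print(height(*(times(alpha, f) for f in (f0, f1, f2))))  # (7.18): still 3
print(height({}, alpha))                                 # H(alpha) = H(1, alpha) = 2
```

Only the place at infinity contributes in the first call, because coprime polynomials have no common zero; this is exactly why (7.21) holds.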

Lemma 7.6. Let

F = f₀X^l + f₁X^{l−1} + · · · + f_l ∈ M[X]

be a polynomial with f₀ ≠ 0 and with non-zero discriminant. Let L be the splitting field over M of F. Then

g_{L/k} ≤ [L : M] · (g_{M/k} + l H_M(F)).

In particular, if M = k(z) and f₀, . . . , f_l ∈ k[z], we have

g_{L/k} ≤ [L : M] · l max(deg f₀, . . . , deg f_l).

Proof. The second assertion follows by combining the first assertion with (7.21). We now prove the first assertion. Our proof is a generalization of that of Lemma H of Schmidt [67].

For v ∈ M_M, put v(F) := min(v(f₀), . . . , v(f_l)). Let D_F denote the discriminant of F. Since D_F is a homogeneous polynomial of degree 2l − 2 in f₀, . . . , f_l, we have

v(D_F) ≥ (2l − 2)v(F). (7.22)
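The homogeneity behind (7.22) can be checked numerically in small cases. The sketch below uses the classical closed formulas for the discriminants of a quadratic and a cubic (l = 2 and l = 3) and confirms that scaling all coefficients by t multiplies the discriminant by t^{2l−2}. The function names are illustrative and not taken from the text.

```python
def disc_quadratic(a, b, c):
    """Discriminant of a*X^2 + b*X + c."""
    return b * b - 4 * a * c


def disc_cubic(a, b, c, d):
    """Discriminant of a*X^3 + b*X^2 + c*X + d."""
    return (b * b * c * c - 4 * a * c ** 3 - 4 * b ** 3 * d
            - 27 * a * a * d * d + 18 * a * b * c * d)


def scales_correctly(disc, coeffs, t):
    """Check D(t*F) == t^(2l-2) * D(F), where l = deg F = len(coeffs) - 1."""
    l = len(coeffs) - 1
    scaled = [t * c for c in coeffs]
    return disc(*scaled) == t ** (2 * l - 2) * disc(*coeffs)


print(disc_cubic(1, 0, -1, 0))  # X^3 - X has discriminant 4
print(scales_correctly(disc_quadratic, (3, 5, -2), 7))
print(scales_correctly(disc_cubic, (2, -1, 5, 7), 3))
```

Since D_F is a sum of integer multiples of monomials of degree 2l − 2 in the coefficients, each monomial has valuation at least (2l − 2)v(F), which gives (7.22).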

Let S be the set of v ∈ M_M with v(f₀) > v(F) or v(D_F) >(2l−2)v(F). We show that L/M is unramified over every valuation v ∈ M_M \S.

Take v ∈ M_M \S. Let

O_v :={x∈M : v(x)≥0}, m_v :={x∈M : v(x)>0}

denote the local ring at v, and the maximal ideal of O_v, respectively. The residue class field O_v/m_v is equal to k since k is algebraically closed. Let ϕ_v : O_v → k denote the canonical homomorphism.

Without loss of generality, we assume v(F) = 0. Then v(f₀) = 0, v(D_F) = 0. Let ϕ_v(F) := ∑_{j=0}^{l} ϕ_v(f_j)X^{l−j}. Then ϕ_v(f₀) ≠ 0 and ϕ_v(F) has discriminant ϕ_v(D_F) ≠ 0.

Since D_F ≠ 0, the polynomial F has l distinct zeros in L, α₁, . . . , α_l, say. Further, ϕ_v(F) has l distinct zeros in k, a₁, . . . , a_l, say.

Denote by Σ_l the permutation group on (1, . . . , l). Choose c₁, . . . , c_l ∈ k such that the numbers

α_σ := c₁α_{σ(1)} + · · · + c_lα_{σ(l)} (σ ∈ Σ_l)

are all distinct, and the numbers

a_σ := c₁a_{σ(1)} + · · · + c_la_{σ(l)} (σ ∈ Σ_l)

are all distinct. Let α := c₁α₁ + · · · + c_lα_l. Then L = M(α), and the monic minimal polynomial of α over M divides G := ∏_{σ∈Σ_l} (X − α_σ), which by the theorem on symmetric functions belongs to M[X]. The image of G under ϕ_v is ∏_{σ∈Σ_l} (X − a_σ), and this has only simple zeros. This implies that L/M is unramified at v.

For v ∈ M_M and any valuation V ∈ M_L above v, denote by e(V|v) the ramification index of V over v. Recall that ∑_{V|v} e(V|v) = [L : M], where the sum is taken over all valuations of L lying above v. Since L/M is unramified outside S, the Riemann-Hurwitz formula implies that

2g_{L/k} − 2 = [L : M](2g_{M/k} − 2) + ∑_{v∈S} ∑_{V|v} (e(V|v) − 1) ≤ [L : M](2g_{M/k} − 2 + |S|), (7.23)

where |S| denotes the cardinality of S. It remains to estimate |S|. By the sum formula and (7.22) we have

|S| ≤ ∑_{v∈M_M} (v(f₀) − v(F)) + ∑_{v∈M_M} (v(D_F) − (2l − 2)v(F)) = (2l − 1)H_M(F).

By inserting this into (7.23) we arrive at an inequality which is stronger than what we wanted to prove. □

In the sequel we keep the notation of Proposition 6.1. To prove (7.11) and (7.14) we may suppose that q > 0 since the case q = 0 is trivial. Let again K₀ := Q(z₁, . . . , z_q), K :=

We mention that in view of Propositions 6.1, 7.2,

d₁ ≤ (nd)^{exp O(r)}. (7.25)

7.2.1 Thue equations

As before, k is an algebraically closed field of characteristic 0, z a transcendental element over k and M a finite extension of k(z). Further, g_M/k denotes the genus of M, M_M the collection of valuations of M/k, and for a finite subset S of M_M, O_S denotes the ring of S-integers in M. We denote by |S| the cardinality of S.

Consider now the Thue equation

F(x, y) = 1 in x, y ∈ O_S, (7.26)

where F is a binary form of degree n ≥ 3 with coefficients in M and with non-zero discriminant.

Proposition 7.7. Every solution x, y ∈ O_S of (7.26) satisfies

max(H_M(x), H_M(y)) ≤ 89H_M(F) + 212g_{M/k} + |S| − 1. (7.27)

Proof. This is Theorem 1 (ii) of Schmidt [67]. □

We note that from Mason's fundamental inequality concerning S-unit equations over function fields (see Mason [55]) one could deduce (7.27) with smaller constants than 89 and 212. However, this is irrelevant for the bounds in (3.5).

Now we use Proposition 7.7 to prove the statement (7.11) of Proposition 7.3.

Proof of (7.11). Recall that w⁽¹⁾ := w, . . . , w^(D) are the conjugates of w over K₀, and for α ∈ K we denote by α⁽¹⁾, . . . , α^(D) the conjugates of α corresponding to w⁽¹⁾, . . . , w^(D).

Recall also that for i = 1, . . . , q we defined k_i := Q(z₁, . . . , z_{i−1}, z_{i+1}, . . . , z_q) and k̄_i denotes its algebraic closure. Further, M_i denotes the splitting field of the polynomial X^D + F₁X^{D−1} + · · · + F_D over k̄_i(z_i). We put ∆_i := [M_i : k̄_i(z_i)] and define

S_i := {v ∈ M_{M_i} : v(z_i) < 0 or v(g) > 0}.

The conjugates w^(j) (j = 1, . . . , D) lie in M_i and are all integral over k_i[z_i]. Hence they belong to O_{S_i}. Further, g⁻¹ ∈ O_{S_i}. Consequently, if α ∈ B = A₀[w, g⁻¹], then α^(j) ∈ O_{S_i} for j = 1, . . . , D, i = 1, . . . , q.

Let x, y be a solution of equation (7.10). Put F⁰ := δ⁻¹F, and let F⁰^(j) be the binary form obtained by taking the j-th conjugates of the coefficients of F⁰. Let j ∈ {1, . . . , D}, i ∈ {1, . . . , q}. Then clearly, F⁰^(j) ∈ M_i[X, Y], and

F⁰^(j)(x^(j), y^(j)) = 1, x^(j), y^(j) ∈ O_{S_i}.

So by Proposition 7.7 we obtain that

max(H_{M_i}(x^(j)), H_{M_i}(y^(j))) ≤ 89H_{M_i}(F⁰^(j)) + 212g_{M_i} + |S_i| − 1. (7.28)

We estimate the various parameters in this bound. We start with H_{M_i}(F⁰^(j)). We recall that F⁰(X, Y) = δ⁻¹(a₀Xⁿ + a₁Xⁿ⁻¹Y + · · · + a_nYⁿ). Using (7.18), (7.17) and Lemma 6.7 we infer that

H_{M_i}(F⁰^(j)) = H_{M_i}(a₀^(j), . . . , a_n^(j)) ≤ H_{M_i}(a₀^(j)) + · · · + H_{M_i}(a_n^(j))
≤ ∆_i(2D(deg a₀ + · · · + deg a_n) + n(2d₀)^{exp O(r)}).

By Lemma 6.3 we have

deg a_i ≤ (2d^∗)^{exp O(r)} for i = 0, . . . , n,

where d^∗ := max(d₀, deg ã_i) ≤ d. Further, we have d₀ ≤ d, D ≤ d₀^{r−q} ≤ d^r. Thus we obtain that

H_{M_i}(F⁰^(j)) ≤ ∆_i(2D(n + 1)(2d)^{exp O(r)} + n(2d)^{exp O(r)}) ≤ ∆_i(nd)^{exp O(r)}. (7.29)

Next, we estimate the genus. Using Lemma 7.6 with F(X) = X^D + F₁X^{D−1} + · · · + F_D, applying Proposition 6.1, and using d₀ ≤ d, D ≤ d₀^r ≤ d^r, we infer that

g_{M_i} ≤ ∆_i D max_{1≤k≤D} deg_{z_i} F_k ≤ ∆_i D(2d₀)^{exp O(r)} ≤ ∆_i(nd)^{exp O(r)}. (7.30)

Lastly, we estimate |S_i|. Each valuation of k̄_i(z_i) can be extended to at most [M_i : k̄_i(z_i)] = ∆_i valuations of M_i. Thus M_i has at most ∆_i valuations v with v(z_i) < 0 and at most ∆_i deg g valuations v with v(g) > 0. Hence using Proposition 7.2, we get

|S_i| ≤ ∆_i + ∆_i deg_{z_i} g ≤ ∆_i(1 + deg g) ≤ ∆_i(nd)^{exp O(r)}. (7.31)

By inserting the bounds (7.29), (7.30) and (7.31) into (7.28), we infer

max(H_{M_i}(x^(j)), H_{M_i}(y^(j))) ≤ ∆_i(nd)^{exp O(r)}. (7.32)

In view of Lemma 6.6, (7.32), D ≤ d^r, q ≤ r and (7.25) we deduce that

deg x, deg y ≤ qDd₁ + ∑_{i=1}^{q} ∆_i^{−1} ∑_{j=1}^{D} max(H_{M_i}(x^(j)), H_{M_i}(y^(j))) ≤ (nd)^{exp O(r)}.

This proves (7.11). □

7.2.2 Hyper- and superelliptic equations

Recall the notation introduced at the beginning of Section 7.2. Again, k is an algebraically closed field of characteristic 0, z a transcendental element over k, M a finite extension of k(z), and S a finite subset of M_M.

Proposition 7.8. Let F ∈ M[X] be a polynomial with non-zero discriminant and m ≥ 3 a given integer. Put n := deg F and assume n ≥ 2. All solutions of the equation

F(x) = y^m in x, y ∈ O_S (7.33)

have the property

H_M(x) ≤ (6n + 18)H_M(F) + 6g_{M/k} + 2|S|, (7.34)
mH_M(y) ≤ (6n² + 18n + 1)H_M(F) + 6ng_{M/k} + 2n|S|. (7.35)

Proof. First assume that F splits into linear factors over M, and that S consists only of the infinite valuations of M, that is, the valuations of M with v(z) < 0. Under these hypotheses, Mason [55, p.118, Theorem 15] proved that for every solution x, y of (7.33) we have

H_M(x) ≤ 18H_M(F) + 6g_{M/k} + 2(|S| − 1). (7.36)

But Mason's proof remains valid without any changes for an arbitrary finite set of places S. That is, (7.36) holds if F splits into linear factors over M, without any condition on S.

We reduce the general case, where the splitting field of F may be larger than M, to the case considered by Mason. Let L be the splitting field of F over M, and T the set of valuations of L that extend those of S. Then |T| ≤ [L : M] · |S|, and by Lemma 7.6, we have g_{L/k} ≤ [L : M] · (g_{M/k} + nH_M(F)). Note that (7.36) holds, but with L, T instead of M, S. It follows that

[L : M] · H_M(x) = H_L(x) ≤ 18H_L(F) + 6g_{L/k} + 2(|T| − 1)
≤ [L : M]((6n + 18)H_M(F) + 6g_{M/k} + 2|S|),

which implies (7.34). Further,

mH_M(y) = H_M(y^m) = H_M(F(x)) ≤ H_M(F) + nH_M(x), (7.37)

which gives (7.35). □
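The constants in (7.34) and (7.35) come from routine bookkeeping, which the following Python sketch re-checks on a grid of sample values. It inserts H_L(F) = [L : M]H_M(F), the genus bound from Lemma 7.6 and |T| ≤ [L : M]|S| into the right-hand side of (7.36) over L. The function names and the sample ranges are assumptions made for this illustration.

```python
def mason_bound_over_L(n, H, g, S, deg):
    """Right-hand side of (7.36) over L after inserting the three estimates."""
    H_L = deg * H            # (7.20): H_L(F) = [L:M] H_M(F)
    g_L = deg * (g + n * H)  # Lemma 7.6 with l = n
    T = deg * S              # |T| <= [L:M] |S|
    return 18 * H_L + 6 * g_L + 2 * (T - 1)


def bound_734(n, H, g, S):
    """Right-hand side of (7.34)."""
    return (6 * n + 18) * H + 6 * g + 2 * S


def bound_735(n, H, g, S):
    """Right-hand side of (7.35)."""
    return (6 * n * n + 18 * n + 1) * H + 6 * n * g + 2 * n * S


def constants_agree(n, H, g, S, deg):
    """(7.36) over L gives [L:M] * (7.34) minus 2, and (7.37) turns (7.34) into (7.35)."""
    step_734 = mason_bound_over_L(n, H, g, S, deg) == deg * bound_734(n, H, g, S) - 2
    step_735 = H + n * bound_734(n, H, g, S) == bound_735(n, H, g, S)
    return step_734 and step_735


print(all(constants_agree(n, H, g, S, deg)
          for n in range(2, 8) for H in range(0, 5) for g in range(0, 4)
          for S in range(1, 4) for deg in (1, 2, 6)))
```

The surplus term −2 in the first comparison is simply discarded in the proof, which is why (7.34) has 2|S| rather than 2(|S| − 1).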

Proposition 7.9. Let F ∈ M[X] be a polynomial with non-zero discriminant. Put n := deg F and assume n ≥ 3. Then the solutions of

F(x) = y² in x, y ∈ O_S (7.38)

have the property

H_M(x) ≤ (42n + 37)H_M(F) + 8g_{M/k} + 4|S|, (7.39)
H_M(y) ≤ (21n² + 19n)H_M(F) + 4ng_{M/k} + 2n|S|. (7.40)

Proof. First assume that F splits into linear factors over M, that S consists only of the infinite valuations of M, that F is monic, and that F has its coefficients in O_S. Under these hypotheses, Mason [55, p.30, Theorem 6] proved that for every solution of (7.38) we have

H_M(x) ≤ 26H_M(F) + 8g_{M/k} + 4(|S| − 1). (7.41)

An inspection of Mason's proof shows that his result is valid for arbitrary finite sets of valuations S, not just the set of infinite valuations. This leaves only the conditions imposed on F.

We reduce the general case to the special case to which (7.41) is applicable. Let F = a₀Xⁿ + · · · + a_n. Let L be the splitting field of F · (X² − a₀) over M. Let T be the set of valuations of L that extend either the valuations in S or the valuations v ∈ M_M with v(F) < 0. Further, let F⁰ = Xⁿ + a₁Xⁿ⁻¹ + a₀a₂Xⁿ⁻² + · · · + a₀ⁿ⁻¹a_n, and let b be such that b² = a₀ⁿ⁻¹. Then F⁰(a₀X) = a₀ⁿ⁻¹F(X), so for every solution x, y of (7.38) we have

F⁰(a₀x) = (by)², a₀x, by ∈ O_T,

and moreover, F⁰ ∈ O_T[X], F⁰ is monic, and F⁰ splits into linear factors over L. So by (7.41),

H_L(a₀x) ≤ 26H_L(F⁰) + 8g_{L/k} + 4(|T| − 1). (7.42)

First notice that

H_L(F⁰) = [L : M]H_M(F⁰) ≤ [L : M] · nH_M(F).

Further,

|T| ≤ [L : M](|S| − ∑_{v∈M_M} min(0, v(F))) ≤ [L : M](|S| + H_M(F)).

Finally, by H_M(F · (X² − a₀)) ≤ 2H_M(F) and Lemma 7.6, we have

g_{L/k} ≤ [L : M](g_{M/k} + (n + 2) · 2H_M(F)).

By inserting these bounds into (7.42), we infer

[L : M]H_M(x) ≤ [L : M](H_M(a₀x) + H_M(F))
= H_L(a₀x) + [L : M]H_M(F)
≤ [L : M]((42n + 37)H_M(F) + 8g_{M/k} + 4|S|).

This implies (7.39). The other inequality (7.40) follows by combining (7.39) with (7.37) with m = 2. □
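Two steps of this proof lend themselves to a quick mechanical check: the substitution that makes F monic, and the assembly of the constant 42n + 37. The Python sketch below assumes that the coefficient of X^{n−i} in the monic polynomial is a₀^{i−1}a_i (so the coefficient of X^{n−2} is a₀a₂) and tests the identity a₀^{n−1}F(x) = (monic polynomial)(a₀x) on integer samples. The helper names are hypothetical.

```python
def eval_poly(coeffs, x):
    """Horner evaluation; coeffs = [a0, a1, ..., an] with the leading coefficient first."""
    acc = 0
    for c in coeffs:
        acc = acc * x + c
    return acc


def monic_transform(coeffs):
    """[a0, ..., an] -> coefficients of X^n + a1 X^(n-1) + a0 a2 X^(n-2) + ... + a0^(n-1) an."""
    a0 = coeffs[0]
    return [1] + [a0 ** (i - 1) * coeffs[i] for i in range(1, len(coeffs))]


def assembled_739(n, H, g, S):
    """Per unit of [L:M]: 26 H(monic F) + 8 g_L + 4 |T| + H(F), using the estimates of the proof."""
    return 26 * (n * H) + 8 * (g + (n + 2) * 2 * H) + 4 * (S + H) + H


def bound_739(n, H, g, S):
    """Right-hand side of (7.39)."""
    return (42 * n + 37) * H + 8 * g + 4 * S


def bound_740(n, H, g, S):
    """Right-hand side of (7.40)."""
    return (21 * n * n + 19 * n) * H + 4 * n * g + 2 * n * S


F = [3, -2, 0, 5, 7]  # sample polynomial 3X^4 - 2X^3 + 5X + 7, so a0 = 3 and n = 4
n = len(F) - 1
print(all(eval_poly(monic_transform(F), F[0] * x) == F[0] ** (n - 1) * eval_poly(F, x)
          for x in range(-3, 4)))
print(all(assembled_739(k, H, g, S) == bound_739(k, H, g, S)
          for k in range(3, 9) for H in range(5) for g in range(4) for S in range(1, 4)))
# (7.37) with m = 2: 2 H(y) <= H(F) + n H(x), which is at most twice the bound (7.40).
print(all(H + k * bound_739(k, H, g, S) <= 2 * bound_740(k, H, g, S)
          for k in range(3, 9) for H in range(5) for g in range(4) for S in range(1, 4)))
```

The second check reproduces 26n + 16n + 32 + 4 + 1 = 42n + 37, where the term −4 from 4(|T| − 1) has been dropped.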

The final step of this subsection is to prove statement (7.14) in Proposition 7.4.

Proof of (7.14). We closely follow the proof of statement (7.11) in Proposition 7.3, and use the same notation. In particular, k_i, M_i, S_i, ∆_i will have the same meaning, and for α ∈ B, j = 1, . . . , D, the j-th conjugate α^(j) is the one corresponding to w^(j). Put F⁰ := δ⁻¹F, and let F⁰^(j) be the polynomial obtained by taking the j-th conjugates of the coefficients of F⁰.

We keep the argument together for both hyper- and superelliptic equations by using the worse bounds everywhere. Let x, y ∈ B be a solution of (3.6), where m, n ≥ 2 and n≥3 if m= 2. Then

F⁰^(j)(x^(j)) = (y^(j))^m, x^(j), y^(j) ∈ O_{S_i}.

By combining Propositions 7.8 and 7.9 we obtain the generous bound

H_{M_i}(x^(j)), mH_{M_i}(y^(j)) ≤ 80n²(H_{M_i}(F⁰^(j)) + g_{M_i/k̄_i} + |S_i|).

For H_{M_i}(F⁰^(j)), g_{M_i/k̄_i}, |S_i| we have precisely the same estimates as (7.29), (7.30), (7.31).
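That 80n² is indeed generous can be confirmed by comparing it with every coefficient appearing in (7.34), (7.35), (7.39) and (7.40), the last one doubled because the bound is stated for mH(y) with m = 2. The Python sketch below does this for a range of n; the function names and the tested range are assumptions of this illustration.

```python
def coefficients(m, n):
    """Coefficients of H(F), the genus and |S| in the bounds for H(x) and m*H(y)."""
    if m >= 3:
        # Proposition 7.8: (7.34) and (7.35)
        return [6 * n + 18, 6, 2, 6 * n * n + 18 * n + 1, 6 * n, 2 * n]
    # Proposition 7.9 with m = 2: (7.39), and (7.40) multiplied by m = 2
    return [42 * n + 37, 8, 4, 2 * (21 * n * n + 19 * n), 2 * 4 * n, 2 * 2 * n]


def generous_bound_holds(m, n):
    """True if 80 n^2 dominates every coefficient for the given m and n."""
    return max(coefficients(m, n)) <= 80 * n * n


print(all(generous_bound_holds(3, n) for n in range(2, 200)))  # m >= 3 requires n >= 2
print(all(generous_bound_holds(2, n) for n in range(3, 200)))  # m = 2 requires n >= 3
```

The largest coefficient for m = 2 is 42n² + 38n, which is at most 80n² for every n ≥ 1, so the comparison holds for all admissible n and not only in the tested range.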

Then a similar computation as in the proof of (7.11) leads to

H_{M_i}(x^(j)), mH_{M_i}(y^(j)) ≤ ∆_i(nd)^{exp O(r)}. (7.43)

Now employing Lemma 6.6 and ignoring for the moment m we get, similarly as in the proof of (7.11),

deg x, deg y ≤ (nd)^{exp O(r)}.

It remains to estimate m deg y. If y ∈ Q we have deg y = 0. Assume that y ∉ Q. Then y ∉ k̄_i for at least one index i. Since y ∈ B ⊂ k̄_i(z_i, w) and [k̄_i(z_i, w) : k̄_i(z_i)] ≤ D, we have

H_{M_i}(y) = [M_i : k̄_i(z_i, w)]H_{k̄_i(z_i,w)}(y) ≥ [M_i : k̄_i(z_i, w)] ≥ ∆_i/D.

Together with (7.43) and D ≤ d^r this implies

m ≤ (nd)^{exp O(r)}.

This concludes the proof of (7.14). □

In document: Effective results for Diophantine problems over finitely generated domains, DSc dissertation (pages 89-98)