
4.5 A Monte Carlo method for finding the radical

In this section we prove Theorem 4.3. Throughout the section we assume that K is a sufficiently large perfect field, given together with an efficient method for finding the square-free part of polynomials of degree n with SFK(n) operations. Also, K0 stands for an algebraic closure of K and A0 = K0 ⊗K A. We think of A as embedded into A0. The input is the same as described in Section 4.3. We assume that random elements of A are generated independently according to a distribution satisfying condition AlgRand(A, n2, δ) defined in the introductory part of this chapter. The cost of selecting a single random element of A is denoted by R(A). The algorithm follows the lines of the method described in Section 4.3.

We describe the main ingredients using the notation of Section 4.2.

4.5.1 Jordan decomposition

Let u ∈ Mn(K) be a matrix. Since K is perfect, there exist a semisimple matrix us ∈ Mn(K) and a nilpotent matrix un ∈ Mn(K) such that [us, un] = 0 and u = us + un (cf. Propositions 1.4.6 and 1.4.10 of [94]). Furthermore, us and un are unique with these properties and both belong to the matrix algebra generated by u. The decomposition u = us + un is referred to as the Jordan decomposition of u. The matrices us and un are called the semisimple and the nilpotent part of u, respectively. In this section it will be more convenient to denote us by Js(u) and un by Jn(u). In [3] a method based on the Newton–Hensel lifting procedure is presented which calculates a polynomial s(x) ∈ K[x] of degree less than n from the square-free part of the minimal polynomial of u such that s(u) = Js(u). Combining this with Giesbrecht's Las Vegas methods [41] for calculating the minimal polynomial and for evaluating s(u), we can compute Js(u) with O(MM(n) polylog n + SFK(n)) operations.
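The sketch below illustrates the Newton–Hensel idea on matrices over Q. For simplicity it works with the square-free part of the characteristic polynomial (which has the same radical as the minimal polynomial), so it should be read as an illustration of the principle behind [3], not as the O(MM(n) polylog n + SFK(n)) procedure quoted above; the helper names are ad hoc.

```python
import math
import sympy as sp

def poly_at(coeffs, m):
    """Evaluate a polynomial, given by its coefficients (highest degree first),
    at a square matrix m using Horner's rule."""
    res = sp.zeros(m.shape[0])
    for c in coeffs:
        res = res * m + c * sp.eye(m.shape[0])
    return res

def semisimple_part(u):
    """Newton-Hensel sketch for the semisimple part Js(u) of a matrix u over Q.
    g is the square-free part of the characteristic polynomial (same radical as
    the minimal polynomial), so g'(u) is invertible, u - Js(u) is nilpotent,
    and about log2(n) Newton steps suffice."""
    n = u.shape[0]
    x = sp.Symbol('x')
    p = u.charpoly(x).as_expr()
    g = sp.quo(p, sp.gcd(p, sp.diff(p, x)), x)      # square-free part of p
    gc = sp.Poly(g, x).all_coeffs()
    dgc = sp.Poly(sp.diff(g, x), x).all_coeffs()
    s = u
    for _ in range(math.ceil(math.log2(n)) + 1):    # Newton iteration
        s = s - poly_at(gc, s) * poly_at(dgc, s).inv()
    return s

# A 2x2 Jordan block: semisimple part is the identity, nilpotent part is u - I.
u = sp.Matrix([[1, 1], [0, 1]])
print(semisimple_part(u))       # Matrix([[1, 0], [0, 1]])
print(u - semisimple_part(u))   # Matrix([[0, 1], [0, 0]])
```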

4.5.2 Finding a maximal torus

We show that the semisimple part of a random element generates a maximal torus with a good chance. The argument used here is a simplified (and improved) version of a proof given by Eberly and Giesbrecht [30] for a special case.

Lemma 4.14. Let d stand for the dimension of a maximal torus in A0. There exists a polynomial function f : A0 → K0 of degree d2 − d such that for u ∈ A0 the subalgebra T0 generated by the semisimple part Js(u) of u and the identity matrix is a maximal torus of A0 if and only if f(u) ≠ 0.

Proof. By Wedderburn's theorem A0/Rad(A0) ≅ Mn1(K0) ⊕ ··· ⊕ Mns(K0). A maximal torus in Mni(K0) is conjugate to the set of diagonal matrices. It follows that d = n1 + ··· + ns. We assume that Mn1(K0) ⊕ ··· ⊕ Mns(K0) is embedded into Md(K0) in the natural way. Let φ : A0 → Md(K0) be the composition of the natural projection A0 → A0/Rad(A0) with this embedding. Observe that φ commutes with taking the semisimple part: φ(Js(u)) = Js(φ(u)) for every u ∈ A0. We claim that the torus T generated by the identity of A and the semisimple part Js(u) of u has dimension d if and only if φ(u) has d distinct eigenvalues. Indeed, since ker φ = Rad(A0) and T ∩ Rad(A0) = (0), T and φ(T) are isomorphic. On the other hand, φ(T) is generated by φ(Js(u)) = Js(φ(u)) and the identity, hence the dimension of φ(T) is the degree of the minimal polynomial of Js(φ(u)), which equals the number of distinct eigenvalues of φ(u).

Let χu(x) denote the characteristic polynomial of the adjoint action ad φ(u) : w ↦ φ(u)w − wφ(u) of φ(u) on Md(K0). We claim that the nullity of ad φ(u) is at least d, and that equality holds if and only if φ(u) has d distinct eigenvalues. Indeed, we may assume that φ(u) is in Jordan normal form. One easily verifies that ad φ(u) acts nilpotently on the block diagonal matrices whose blocks correspond to the Jordan blocks of φ(u). This implies the inequality and the "only if" part of the claim concerning the equality. The "if" part is even easier.

It follows that φ(u) has d distinct eigenvalues if and only if the coefficient cu of the term xd in χu(x) is not zero. (The algebraic multiplicity of the eigenvalue 0 of ad φ(u) is always at least its nullity; when φ(u) has d distinct eigenvalues, ad φ(u) is semisimple, so the two coincide.) Let f(u) stand for this coefficient. It is known that the coefficient of xl in the characteristic polynomial of a linear transformation on a vector space W is a homogeneous polynomial function on End(W) of degree dim W − l. In our case dim W = d2 and l = d.

Our function f, being the composition of the linear maps φ and ad with a homogeneous polynomial function of degree d2 − d, is either zero or homogeneous of degree d2 − d. An element u ∈ A0 such that φ(u) is a diagonal matrix with distinct eigenvalues witnesses that this polynomial is not identically zero.
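For the special case A0 = Md(K0), where φ is the identity map, f(u) can be evaluated directly as in the proof. The following sketch (with ad hoc helper names) builds the matrix of ad u under column stacking and extracts the coefficient of xd from its characteristic polynomial; it is nonzero exactly when u has d distinct eigenvalues.

```python
import sympy as sp

def kron(a, b):
    """Kronecker product of two sympy matrices."""
    m, n = a.shape
    p, q = b.shape
    return sp.Matrix(m * p, n * q, lambda i, j: a[i // p, j // q] * b[i % p, j % q])

def torus_certificate(u):
    """For A0 = Md(K0) (phi = identity): the coefficient of x^d in the
    characteristic polynomial of ad u, nonzero exactly when u has d distinct
    eigenvalues, i.e. when Js(u) and 1 generate a maximal torus."""
    d = u.shape[0]
    x = sp.Symbol('x')
    # matrix of ad u : w -> u*w - w*u under column stacking of w
    ad = kron(sp.eye(d), u) - kron(u.T, sp.eye(d))
    chi = sp.Poly(ad.charpoly(x).as_expr(), x)
    return chi.nth(d)

print(torus_certificate(sp.Matrix([[1, 0], [0, 2]])))  # -1 (distinct eigenvalues)
print(torus_certificate(sp.Matrix([[1, 1], [0, 1]])))  #  0 (repeated eigenvalue)
```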

Thus a semisimple matrix u ∈ A such that the torus T generated by u and the identity is probably maximal (with error probability δ) can be found with O(MM(n) polylog n + SFK(n) + R(A)) operations. The error probability can be pushed under a prescribed bound ε by repeating this procedure O(log(1/ε)) times independently and taking the element which has minimal polynomial of maximal degree, see Lemma 4.2.
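A minimal sketch of this repetition strategy, assuming a user-supplied sampler `sample()` in place of an AlgRand-compliant generator; the dimension of the torus generated by Js(u) and the identity is read off as the degree of the square-free part of the characteristic polynomial of u.

```python
import random
import sympy as sp

def torus_dimension(u):
    """Degree of the square-free part of the characteristic polynomial of u,
    i.e. the dimension of the algebra generated by Js(u) and the identity."""
    x = sp.Symbol('x')
    p = u.charpoly(x).as_expr()
    g = sp.quo(p, sp.gcd(p, sp.diff(p, x)), x)
    return sp.degree(g, x)

def probable_torus_generator(sample, trials):
    """Repetition strategy sketched above: draw `trials` random elements with
    the supplied sampler and keep one generating a torus of maximal dimension."""
    return max((sample() for _ in range(trials)), key=torus_dimension)

# Toy sampler for the algebra of 2x2 upper triangular matrices over Q
# (illustration only; it carries no AlgRand guarantee).
def sample():
    a, b, c = (random.randint(-3, 3) for _ in range(3))
    return sp.Matrix([[a, b], [0, c]])

u = probable_torus_generator(sample, 8)
print(u, torus_dimension(u))   # dimension 2 with high probability
```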

In the steps described in the rest of this section we assume that we are provided with an element u which generates a maximal torus T. We keep the notation introduced in Section 4.2 (C, S, H, N). We denote dimK T by d.

4.5.3 Calculating C

We follow the method suggested by Theorem 4.12. First we calculate the subspace L = [A, A]∩T. The next two lemmas provide us with a tool for generating random elements of L.

Lemma 4.15. The map a ↦ Js(ΦT a) is a linear map of A onto T and the map a ↦ Jn(ΦT a) is a linear map from A onto Rad(H). Furthermore, Jn(ΦT a) = a for every a ∈ Rad(H), and Js(ΦT a) = a for every a ∈ T.

Proof. We know that ΦT is a linear projection of A onto H. Also, ΦT Rad(A) = Rad(A) and Js is zero on Rad(H). By Wedderburn–Malcev, H = T + Rad(H), a direct sum of vector spaces. Let π : H → T and μ : H → Rad(H) stand for the projections corresponding to this decomposition. It remains to show that Js and Jn (restricted to H) coincide with π and μ, respectively. For every a ∈ H, π(a) is semisimple and μ(a) is nilpotent. Since H centralizes T, π(a) commutes with a and the same holds for μ(a) = a − π(a). From the uniqueness of the Jordan decomposition we infer that π(a) = Js(a) and μ(a) = Jn(a).

Lemma 4.16. Js(ΦT [A, A]) = L = [A, A] ∩ T.

Proof. Since Js(ΦT a) = a for every a ∈ T, it suffices to show that Js(ΦT a) ∈ [A, A] for every a ∈ [A, A]. By Corollary 2.1, A = B + Rad(A) (a direct sum as vector spaces) for some semisimple subalgebra B ≤ A containing T. Since Js(ΦT a) = 0 for every a ∈ Rad(A), it is sufficient to prove the assertion for the semisimple algebra B in place of A. By Lemma 4.7 we have T = CB(T), hence Js(ΦT a) = ΦT a for every a ∈ B. We know that ΦT a − a ∈ [T, B] ⊆ [B, B]. Hence ΦT a ∈ [B, B] if and only if a ∈ [B, B].

We calculate a basis of L by generating sufficiently many random elements of the form Js(ΦT [a, b]).

Lemma 4.17. Let k ≤ dimK L, let 0 < ε, δ < 1, and let h ≥ k⌈(log k + log(1/ε))/log(1/δ)⌉. Assume that the elements a11, b11, . . . , a1h, b1h, . . . , ak1, bk1, . . . , akh, bkh are chosen independently from A according to a probability distribution which satisfies the condition AlgRand(A, dimK L, δ). Then with probability at least 1 − ε, the set {Js(ΦT [aij, bij′]) | i = 1, . . . , k, j, j′ = 1, . . . , h} contains at least k linearly independent elements of L.

Proof. Let l = dimK L. By fixing a K-basis b1, . . . , bl of L we identify L with Kl. For a tuple (y1, z1, . . . , yk, zk) ∈ A2k let Y stand for the l × k matrix whose columns are Js(ΦT [yi, zi]) (i = 1, . . . , k). Let Γ be the family of all k-element subsets of {1, . . . , l}. For each γ ∈ Γ let fγ(y1, z1, . . . , yk, zk) be the determinant of the k × k minor of Y which consists of the rows indexed by the elements of γ. Obviously fγ is a multilinear function. We observe that all the functions fγ (γ ∈ Γ) vanish on a particular tuple (y1, z1, . . . , yk, zk) ∈ A2k if and only if the elements Js(ΦT [yi, zi]) (i = 1, . . . , k) are linearly dependent over K. By Lemma 4.16 this cannot be the case for every (y1, z1, . . . , yk, zk) ∈ A2k, and hence there exists at least one γ ∈ Γ such that fγ is not identically zero. By Lemma 4.2, with probability at least 1 − ε there exist indices j1, . . . , jk, j1′, . . . , jk′ such that fγ(a1j1, b1j1′, . . . , akjk, bkjk′) ≠ 0. Then the elements Js(ΦT [a1j1, b1j1′]), . . . , Js(ΦT [akjk, bkjk′]) are linearly independent.

As in Section 4.4, it will be convenient to perform calculations in T in terms of the basis 1, u, . . . , ud−1. If it has not been done before, we calculate the Frobenius normal form Frob(u) of u together with a transition matrix b such that b−1ub = Frob(u), using Giesbrecht's method with O(MM(n) polylog n) field operations. Then we can read off the coordinates of an element z ∈ T in terms of the basis 1, u, . . . , ud−1 from the first column of the first block of b−1zb.
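As a stand-in for the Frobenius-normal-form computation (which sympy does not provide directly), the following sketch recovers the coordinates of z ∈ T in the basis 1, u, . . . , ud−1 by solving a small linear system against the vectorized powers of u; it is asymptotically slower but produces the same coordinates. Helper names are ad hoc.

```python
import sympy as sp

def coords_in_powers(z, u, d):
    """Coordinates of z in the basis 1, u, ..., u^(d-1) of T = K[u], obtained
    from the normal equations of the system M*c = vec(z), where the columns of
    M are the vectorized powers of u (exact, since z lies in T)."""
    n = u.shape[0]
    cols = [(u ** k).reshape(n * n, 1) for k in range(d)]
    M = sp.Matrix.hstack(*cols)
    b = z.reshape(n * n, 1)
    return list((M.T * M).LUsolve(M.T * b))

# Toy usage: u is the companion matrix of x^2 - x - 2, so d = 2.
u = sp.Matrix([[0, 1], [2, 1]])
z = 3 * sp.eye(2) + 5 * u
print(coords_in_powers(z, u, 2))   # [3, 5]
```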

We find a basis of L with O(log(1/ε) dimK L (MM(n) + R(A) + SFK(n)) polylog n) operations (even if dimK L is not known a priori) as follows. Set h = ⌈(log d + log(d/ε))/log(1/δ)⌉. For k = 1, 2, 4, . . . , 2^⌈log2 d⌉ select a maximal linearly independent system from {Js(ΦT [aij, bij′]) | i = 1, . . . , k, j, j′ = 1, . . . , h}, where the aij, bij are random elements of A chosen independently according to a distribution which satisfies AlgRand(A, d, δ). We stop if we obtain fewer than k elements, otherwise we proceed with 2k in place of k. By the lemma, the probability that we stop with a system which does not generate L is at most ε.
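The doubling procedure itself is plain linear algebra once a sampler for Js(ΦT [a, b]) is available. The sketch below assumes such a sampler is supplied as a callable; the toy sampler at the end merely produces vectors spanning a plane, so no AlgRand property of Lemma 4.17 is claimed for it.

```python
import random
import sympy as sp

def max_independent(vectors):
    """Greedily keep a maximal linearly independent subset of column vectors."""
    basis = []
    for v in vectors:
        if sp.Matrix.hstack(*(basis + [v])).rank() == len(basis) + 1:
            basis.append(v)
    return basis

def doubling_search(sample, h, d):
    """Doubling strategy of this subsection: for k = 1, 2, 4, ... draw k*h*h
    samples, keep a maximal independent subset, and stop once fewer than k
    independent elements are found (or k reaches d).  `sample` stands in for
    the assumed procedure producing vec(Js(Phi_T([a, b]))) from random a, b."""
    k = 1
    while True:
        basis = max_independent([sample() for _ in range(k * h * h)])
        if len(basis) < k or k >= d:
            return basis
        k *= 2

# Toy sampler whose values span a 2-dimensional subspace of Q^4.
def sample():
    a, b = random.randint(-3, 3), random.randint(-3, 3)
    return sp.Matrix([a, b, a + b, 0])

print(len(doubling_search(sample, h=2, d=4)))   # 2 with high probability
```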

Note that A/Rad(A) is commutative if and only if L = (0); in this case C = T. Otherwise assume that we have a basis b1, . . . , bl of L. We choose linear functions f1, . . . , fd−l : T → K such that L = ker f1 ∩ · · · ∩ ker fd−l. Then C = {z ∈ T | zL ⊆ L} = {z ∈ T | fi(zbj) = 0 (i = 1, . . . , d − l, j = 1, . . . , l)}, whence we obtain a basis of C by solving a system of l(d − l) linear equations in d variables. This costs O(MM(d)l(d − l)/d) = O(dMM(d)) operations. Finally we find an element u0 ∈ C which generates C as an algebra with identity by taking a random linear combination of these basis elements. (By Lemma 4.14, a random element of C will generate C. Note that we can verify whether u0 generates C with O(MM(d) polylog d) operations by testing linear independence of the powers 1, u0, . . . , u0^(dim C − 1).)
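A sketch of this linear-algebra step follows. The functionals cutting out L are taken on the whole matrix space rather than on T (which imposes the same condition, since zbj lies in T anyway), and the resulting homogeneous system in the d unknown coordinates of z is solved by a nullspace computation. Helper names and the toy data are illustrative only.

```python
import sympy as sp

def subalgebra_C(u, L_basis, d):
    """Compute C = { z in T | z*L ⊆ L }, where T = K[u] has basis
    1, u, ..., u^(d-1) and L is spanned by the matrices in L_basis.
    Returns a list of matrices spanning C."""
    n = u.shape[0]
    powers = [u ** k for k in range(d)]
    B = sp.Matrix.hstack(*[b.reshape(n * n, 1) for b in L_basis])
    funcs = B.T.nullspace()       # functionals vanishing exactly on span(L_basis)
    rows = []
    for b in L_basis:
        for f in funcs:
            # equation sum_k c_k * f(u^k * b) = 0 for the coordinates c of z
            rows.append([(f.T * (p * b).reshape(n * n, 1))[0] for p in powers])
    coeffs = sp.Matrix(rows).nullspace() if rows else sp.eye(d).columnspace()
    return [sum((c[k] * powers[k] for k in range(d)), sp.zeros(n)) for c in coeffs]

# Toy usage with illustrative data (not an actual [A, A] ∩ T):
u = sp.diag(1, 2, 3)              # T = diagonal matrices, d = 3
L_basis = [sp.diag(1, -1, 0)]
for z in subalgebra_C(u, L_basis, 3):
    print(z)                      # diagonal matrices of the form diag(a, a, b)
```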

The total cost of the algorithm described in this subsection amounts to O(log(1/ε) d (MM(n) + R(A) + SFK(n)) polylog n) operations in K. If A/Rad(A) happens to be commutative, then O(log(1/ε)(MM(n) + R(A) + SFK(n)) polylog n) operations are sufficient.

4.5.4 Generating elements of N

Throughout this subsection we assume that we are provided with an element u0 which generates C as an algebra with identity.

Lemma 4.18. Assume that a1, . . . , am generate A as an algebra with identity. Then the elements {[u0, a1], . . . , [u0, am]} generate ANA as an ideal of A.

Proof. Let J be the ideal generated by [u0, a1], . . . , [u0, am]. Obviously J ⊆ A[u0, A]A ⊆ A[C, A]A = ANA. Observe that u0 + J centralizes the generators ai + J of the factor algebra A/J. Hence [u0, A] ⊆ J and, since C is generated by u0, we have [C, A] ⊆ J. By definition N = [C, A].

Hence generators of ANA can be calculated with O(mMM(n)) operations by taking [u0, g1], . . . , [u0, gm].
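This step is a direct computation; a short sketch with hypothetical generator matrices:

```python
import sympy as sp

def commutator_generators(u0, gens):
    """Ideal generators [u0, g_i] of ANA (cf. Lemma 4.18)."""
    return [u0 * g - g * u0 for g in gens]

# Toy usage with 2x2 upper triangular generators:
print(commutator_generators(sp.diag(1, 2), [sp.Matrix([[0, 1], [0, 0]]), sp.eye(2)]))
```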

4.5.5 Generating elements of Rad(H)

We generate elements of Rad(H) as follows. From a random element a ∈ A we first calculate ΦT a using the method described in Section 4.4. Then we compute the nilpotent part Jn(ΦT a) of ΦT a. The cost is O(MM(n) polylog n + SFK(n)) operations. Note that because of the linearity of the map a ↦ Jn(ΦT a) (cf. Lemma 4.15) the method can be considered as a way to generate "random" elements of Rad(H). To be more specific, if we choose the element a according to a distribution satisfying AlgRand(A, D, δ), then the distribution of Jn(ΦT a) satisfies condition AlgRand(Rad(H), D, δ).

We are going to give an upper bound for the number of elements from Rad(H) which, in addition to the generators of ANA, are sufficient to generate Rad(A) as an ideal.

The following elementary lemma is well known. A proof can be obtained by combining Corollary 4.1b of [80] and Lemma 4.2 of loc. cit.

Lemma 4.19. Let B be a finite dimensional K-algebra and M ⊆ Rad(B). Then Rad(B) = BMB if and only if Rad(B) = BMB + Rad(B)2. In other words, the ideal generated by M is Rad(B) if and only if the same holds modulo Rad(B)2.

Lemma 4.20. Assume that A/Rad(A) is a central simple K-algebra of dimension d2. Then Rad(A) as an ideal of A can be generated by ⌈dimK Rad(A)/d3⌉ elements from Rad(H). Furthermore, A as an algebra with identity cannot be generated by fewer than ⌈dimK Rad(A)/d4⌉ elements.

Proof. Let ψ stand for the natural projection A → A/Rad(A)2. Then ψ(T) is a maximal torus in A/Rad(A)2. We have Cψ(A)(ψ(T)) = Φψ(T)(ψ(A)) = ψ(ΦT A) = ψ(H). In view of this, together with Lemma 4.19, it is sufficient to prove the assertion for A/Rad(A)2 in place of A. In other words, we may assume that Rad(A)2 = (0). By Wedderburn–Malcev, there exists a subalgebra D ≤ A such that A = D + Rad(A) (a direct sum as vector spaces). Assume that A is generated by a1, . . . , am. Let ai = bi + ci where bi ∈ D and ci ∈ Rad(A). One easily verifies that c1, . . . , cm generate Rad(A) as an ideal. On the other hand, since Rad(A)2 = 0 we have AciA = (D + Rad(A))ci(D + Rad(A)) = DciD, whence dimK AciA ≤ (dimK D)2 = d4. This implies the inequality m ≥ ⌈dimK Rad(A)/d4⌉.

To prove the first assertion we use a refinement of the argument of the proof of Theorem 4.11. We consider Rad(A) as a D ⊗K D-module in the natural way. Then the ideals of A contained in Rad(A) are exactly the D ⊗K D-submodules, and the elements b of Rad(H) = Rad(A) ∩ CA(T) are characterized by the property that (1 ⊗ a)b = (a ⊗ 1)b for every a ∈ T. We know that D ⊗K D ≅ Md2(K) and that Rad(A) as a D ⊗K D-module is isomorphic to Dh, the direct sum of h copies of the simple D ⊗K D-module D (with the natural module structure). Here h = dimK Rad(A)/d2. We claim that if a1, . . . , ar are linearly independent elements of D then (a1, . . . , ar) generates the D ⊗K D-module Dr. This can be verified at once if we identify D ⊗K D with Md2(K) and D with the standard Md2(K)-module Kd2. Let r ≤ d and choose r linearly independent elements a1, . . . , ar from T. Then by the claim, b = (a1, . . . , ar) generates Dr as a D ⊗K D-module and (1 ⊗ a)b = (a1a, . . . , ara) = (aa1, . . . , aar) = (a ⊗ 1)b. Hence ⌈h/d⌉ generators of Rad(A) with the required property can be constructed by distributing the irreducible summands of Rad(A) into appropriate blocks and taking a single generator in each block.

Corollary 4.21. Assume that A as an algebra with identity is generated by m elements. Suppose that the simple components of A/Rad(A) are Ã1, . . . , Ãr with dimensions dimK Ãi/dimK Z(Ãi) = di2. Then there exists a subset M ⊆ Rad(H) of size at most

Max{Min(mdi, ⌈dimK A/di3⌉) | i = 1, . . . , r} ≤ ⌈(dimK A)1/4 m3/4⌉ ≤ ⌈n1/2 m3/4⌉

such that A(M + N)A = Rad(A).

Proof. As in the proof of Lemma 4.20 we can assume that Rad(A)2 = (0). Then N2 = (0) as well, and by Proposition 4.8, N is an ideal of A. Hence S ≅ A/N and S is also generated by m elements. This means that for the rest of the proof we may further assume that N = (0), or, equivalently, A = S. By Proposition 4.10, A is a direct sum of subalgebras A1, . . . , Ar, where Ai/Rad(Ai) ≅ Ãi. Assume that Mi ⊆ Rad(Hi) = Rad(H) ∩ Hi is such that AiMiAi = Rad(Ai) (i = 1, . . . , r). It is easy to construct a set M ⊆ Rad(H) of cardinality Max |Mi| such that πi(M) = Mi for every i ∈ {1, . . . , r}, where π1, . . . , πr are the projections corresponding to the direct decomposition of A. It is immediate that such an M generates Rad(A) as an ideal. Hence it is sufficient to prove the assertion in the special case where Ã = A/Rad(A) is a simple K-algebra. Then C is a field contained in Z(A) and we can consider A as a C-algebra. The statement now follows from Lemma 4.20, applied to A as a C-algebra. (The bound independent of the di is obtained by taking an appropriate weighted geometric mean of mdi and dimK A/di3.)

The next lemma gives a bound on the number of random elements of Rad(H) which probably generate Rad(A) modulo the ideal ANA. We omit the proof, which is rather technical and can be carried out in a fashion similar to the proof of Lemma 4.17.

Lemma 4.22. Assume that there exists a subset M ⊆ Rad(H) of size k such that A(M + N)A = Rad(A). Let 0 < ε, δ < 1 and h ≥ k⌈(log k + log(1/ε))/log(1/δ)⌉. Assume that the elements a1, . . . , ah ∈ A are chosen independently according to a probability distribution satisfying AlgRand(A, dimK Rad(A), δ). Then with probability at least 1 − ε the set N ∪ {Jn(ΦT ai) | i = 1, . . . , h} generates Rad(A) as an ideal of A.

4.5.6 Computing Rad(A)

Here we summarize the algorithm for computing Rad(A) and conclude the proof of Theorem 4.3. The input consists of matrices g1, . . . , gm such that A is the matrix algebra generated by the identity matrix and g1, . . . , gm. We assume that random elements of A are generated independently according to a probability distribution satisfying condition AlgRand(A, n2, δ) for a constant 0 < δ < 1, say 1/2. An error probability bound 0 < ε < 1 is also given as a part of the input. We require that each of the three big steps of the algorithm which make use of randomization works correctly with probability at least 1 − ε/3.

First we find a semisimple matrix u which generates a maximal torus T by the method of Subsection 4.5.2. Then we calculate the subalgebra C ≤ T (and a generator u0 of C) using the method described in Subsection 4.5.3. If C = T then we set k = m, otherwise k = ⌈n1/2 m3/4⌉. Then we calculate the commutators [u0, gi] (i = 1, . . . , m) as well as Jn(ΦT a) for O(log(3/ε) k log k) random elements a ∈ A. (The exact constant is given in Lemma 4.22.) These elements generate Rad(A) as an ideal with probability at least 1 − ε. This finishes the proof of Theorem 4.3.