Theoretical Computer Science
www.elsevier.com/locate/tcs
Generating clause sequences of a CNF formula
Kristóf Bérczi a,∗, Endre Boros b, Ondřej Čepek c, Khaled Elbassioni d, Petr Kučera c, Kazuhisa Makino e
a MTA-ELTE Egerváry Research Group, Department of Operations Research, Eötvös Loránd University, Budapest, Hungary
b MSIS Department and RUTCOR, Rutgers University, NJ, USA
c Charles University, Faculty of Mathematics and Physics, Department of Theoretical Computer Science and Mathematical Logic, Praha, Czech Republic
d EECS Department, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
e Research Institute for Mathematical Sciences (RIMS), Kyoto University, Kyoto, Japan
Article info
Article history:
Received 16 February 2020
Received in revised form 23 November 2020
Accepted 7 December 2020
Available online 14 December 2020
Communicated by L.M. Kirousis
Keywords: CNF formulas; Clause sequences; Enumeration; Generation

Abstract
Given a CNF formula $\Phi$ with clauses $C_1, \dots, C_m$ and variables $V = \{x_1, \dots, x_n\}$, a truth assignment $a : V \to \{0,1\}$ of $\Phi$ leads to a clause sequence $\sigma_\Phi(a) = (C_1(a), \dots, C_m(a)) \in \{0,1\}^m$ where $C_i(a) = 1$ if clause $C_i$ evaluates to 1 under assignment $a$, otherwise $C_i(a) = 0$. The set of all possible clause sequences carries a lot of information on the formula, e.g. SAT, MAX-SAT and MIN-SAT can be encoded in terms of finding a clause sequence with extremal properties.
We consider a problem posed at Dagstuhl Seminar 19211 “Enumeration in Data Management” (2019) about the generation of all possible clause sequences of a given CNF with bounded dimension. We prove that the problem can be solved in incremental polynomial time. We further give an algorithm with polynomial delay for the class of tractable CNF formulas. We also consider the generation of maximal and minimal clause sequences, and show that generating maximal clause sequences is NP-hard, while minimal clause sequences can be generated with polynomial delay.
© 2020 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
1. Introduction
The concept of well-designed pattern trees was introduced by Letelier et al. [1] as a convenient graphic representation of conjunctive queries extended by the optional operator. The nodes of such a tree correspond to the queries, while the tree itself represents the optional extensions. Well-designed pattern trees have been studied from a complexity point of view in several aspects. One of the most interesting problems in the context of query languages is the generation problem, that is, generating the solutions one after the other without repetition.
Previous work. The generation problem was studied for First-Order and Conjunctive Queries [2–5] and for well-designed pattern trees [1]. Recently, Kröll et al. [6] initiated a systematic study of the complexity of the generation problem of well-designed pattern trees. They identified several tractable and intractable cases of the problem both from a classical and from a parameterized complexity point of view. One class of pattern trees however remained unclassified. For a class $\mathcal{C}$ of conjunctive queries, a well-designed pattern tree $T$ is globally in $\mathcal{C}$ if for every subtree $T'$ of $T$ the corresponding conjunctive query is also in $\mathcal{C}$. The treewidth of a conjunctive query is the treewidth of its Gaifman graph [7]. In [6], the complexity of the generation problem for the class of well-designed pattern trees falling globally in the class of queries of treewidth at most $k$ and having $c$-semi-bounded interface was left open (see [6, Table 1 on page 16]).

∗ Corresponding author. E-mail addresses: berkri@cs.elte.hu (K. Bérczi), endre.boros@rutgers.edu (E. Boros), cepek@ktiml.mff.cuni.cz (O. Čepek), khaled.elbassioni@ku.ac.ae (K. Elbassioni), kucerap@ktiml.mff.cuni.cz (P. Kučera), makino@kurims.kyoto.ac.jp (K. Makino).
https://doi.org/10.1016/j.tcs.2020.12.021
At the Dagstuhl Seminar 19211 “Enumeration in Data Management”, Kröll proposed an open problem on the generation of clause sequences of CNF formulas [8, Problem 4.7]. The problem is motivated by the fact that it can be reduced to the above mentioned unsolved case of pattern trees, thus any bound on the generation complexity would be helpful in understanding the general problem. A generation algorithm outputs the objects in question one by one without repetition. We call it a polynomial delay procedure if the computing time between any two consecutive outputs is bounded by a polynomial of the input size. We call it incrementally polynomial, if for any $k$ the first $k$ objects can be generated in polynomial time in the input size and $k$. Finally, it is called total polynomial if all $N$ objects are generated in polynomial time in the input size and $N$.
The problem studied in this paper can be formalized as follows. Let $V = \{x_1, \dots, x_n\}$ be a set of $n$ Boolean variables. We denote by $\bar{x}_j = 1 - x_j$ the negation of variable $x_j$. Variables and their negations together are called literals. A clause is an elementary disjunction of literals and a CNF is a conjunction of clauses. We view clauses also as subsets of the literals, and CNFs as sets of clauses. Given a CNF $\Phi = C_1 \wedge \dots \wedge C_m$ and an assignment $a : V \to \{0,1\}$, the corresponding binary sequence $\sigma_\Phi(a) = (C_1(a), \dots, C_m(a))$ is called a signature¹ of $\Phi$, that is, $C_i(a) = 1$ if clause $C_i$ evaluates to 1 under assignment $a$, and $C_i(a) = 0$ otherwise. In particular, this means that $\Phi$ is satisfiable if and only if there exists some assignment $a$ with $\sigma_\Phi(a) = (1, \dots, 1)$. Moreover, MAX-SAT and MIN-SAT can be encoded by asking for a signature with the largest and smallest sum of elements, respectively.

As an example, consider the CNF formula $\Phi = C_1 \wedge C_2 \wedge C_3 \wedge C_4$, where $C_1 = x_1 \vee \bar{x}_3$, $C_2 = \bar{x}_2$, $C_3 = x_1 \vee x_2 \vee x_3$ and $C_4 = x_2 \vee \bar{x}_3$. Then assignment $a_1 = \{x_1 \to 1, x_2 \to 1, x_3 \to 1\}$ leads to signature $\sigma_\Phi(a_1) = (1, 0, 1, 1)$, while assignment $a_2 = \{x_1 \to 0, x_2 \to 0, x_3 \to 1\}$ leads to signature $\sigma_\Phi(a_2) = (0, 1, 1, 0)$. It is easy to see that $\Phi$ has six different signatures. In general, if the number of signatures is $\Omega(2^n)$, then generating them in total polynomial time is not difficult. However, their number may be $o(2^n)$, presenting a potential challenge for generation.

Given a CNF $\Phi = C_1 \wedge \dots \wedge C_m$, the number of literals in clause $C_i$ is denoted by $|C_i|$. We denote by $\dim(\Phi) = \max_{i=1,\dots,m} |C_i|$ the dimension of $\Phi$. We call $\Phi$ a $d$-CNF if $\dim(\Phi) \le d$. The number of clauses and the number of literals appearing in $\Phi$ are denoted by $|\Phi|$ and $\|\Phi\|$, respectively. The occurrence of a variable in a CNF is the number of clauses involving that variable or its negation. Vectors are written using bold fonts throughout, e.g. $\mathbf{x}$. The problem asked in [8] is for $d$-CNF formulas where $d$ is a fixed positive integer, but we also consider the same problem for general CNFs.
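The definitions above can be checked on the four-clause example with a short script. This is only an illustrative sketch: the DIMACS-style encoding of literals (`+j` for $x_j$, `-j` for its negation) and the helper names `signature` and `all_signatures` are assumptions made here, and the enumeration is a plain brute force over all $2^n$ assignments.

```python
from itertools import product

# Assumed encoding for this sketch: +j stands for x_j, -j for its negation.
PHI = [(1, -3), (-2,), (1, 2, 3), (2, -3)]  # C1, C2, C3, C4 of the example


def signature(cnf, assignment):
    """Return (C_1(a), ..., C_m(a)) for an assignment given as {j: 0 or 1}."""
    def literal_value(lit):
        value = assignment[abs(lit)]
        return value if lit > 0 else 1 - value
    # A clause evaluates to 1 iff at least one of its literals is 1.
    return tuple(max(literal_value(lit) for lit in clause) for clause in cnf)


def all_signatures(cnf, n):
    """Collect every signature by trying all 2^n assignments (exponential)."""
    found = set()
    for bits in product((0, 1), repeat=n):
        a = {j + 1: bits[j] for j in range(n)}
        found.add(signature(cnf, a))
    return found


a1 = {1: 1, 2: 1, 3: 1}
a2 = {1: 0, 2: 0, 3: 1}
print(signature(PHI, a1))            # (1, 0, 1, 1)
print(signature(PHI, a2))            # (0, 1, 1, 0)
print(len(all_signatures(PHI, 3)))   # 6
```

The brute force is only useful for tiny instances; the point of the paper is to avoid spending time proportional to $2^n$ when the number of signatures is much smaller.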
Motivated by MAX-SAT and MIN-SAT, we also consider maximal and minimal signatures. A signature of a CNF $\Phi$ is called maximal (resp. minimal) if an inclusionwise maximal (resp. minimal) subset of the clauses takes value 1.
Our results. We give a polynomial delay algorithm for the class of tractable CNF formulas in Section 2. Section 3 discusses CNFs with bounded dimension. For the class of formulas with bounded dimension and variable occurrences, we give an incremental polynomial algorithm in Section 3.1. In Section 3.2, we show that the signature generation problem GS($\Phi$) can be solved in incremental polynomial time for formulas with a bounded dimension, thus answering the open problem posed by Kröll. The generation of maximal and minimal signatures is considered in Section 4. Finally, we conclude the paper in Section 5, where a ‘reversed’ variant of the problem is proposed as an open question.

2. Tractable CNFs
Let $\Phi$ and $\Psi$ be two CNFs. If the clauses of $\Psi$ are also clauses in $\Phi$, then $\Psi$ is called a sub-CNF of $\Phi$, denoted by $\Psi \subseteq \Phi$. We call a family of CNFs tractable if for any CNF $\Phi$ in this family the satisfiability of any sub-CNF of $\Phi$ can be decided in polynomial time even after fixing any subset of the variables at arbitrary values. For example, the classes of 2-CNFs or Horn CNFs are tractable.
Theorem 1. If $\Phi$ belongs to a tractable family and has $m$ clauses, then its signatures can be generated with polynomial delay.

¹ We prefer the term signature over the term clause sequence proposed by Kröll, since it is a binary string, not a sequence of clauses. Therefore we use the term signature in the rest of the paper.
Proof. The idea is to apply the so-called ‘flashlight’ approach in the signature space, using SAT as a ‘flashlight’ [9]. Let $\Phi = \bigwedge_{i=1}^{m} C_i$. We are going to build a binary tree in which the paths from the root to the vertices of the tree correspond to binary values of initial segments of the set of clauses, that is, $C_1, \dots, C_k$ for some $1 \le k \le m$. The tree is built layer by layer, each new layer corresponding to a new clause being added to the examined prefix. There exists a signature with this prefix if and only if the CNF formed by the clauses set to value one in this sequence is satisfiable even after all the forced fixing of variables that appear in clauses whose value is zero (note that a clause has value 0 if and only if all the literals in it are 0). If such a CNF is not satisfiable, we backtrack and do not explore the subtree rooted at this vertex as there exists no signature with this prefix. If the CNF is satisfiable, we continue building the corresponding subtree which in this case is guaranteed to contain at least one signature. The algorithm will not backtrack above this vertex before outputting all (at least one) signatures in this subtree. It is not difficult to verify that after at most $2m$ calls to SAT we can output a new signature not generated before. After outputting the last signature, the procedure terminates after at most $m$ SAT calls.

By the above, the signatures of $\Phi$ can be generated with a delay of $O(m)$ SAT-calls. As $\Phi$ belongs to a tractable family, this implies a polynomial delay algorithm, concluding the proof of the theorem.
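The prefix-tree search in the proof can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: literals are encoded DIMACS-style (`+j` for $x_j$, `-j` for its negation), the names `brute_force_sat`, `prefix_feasible` and `flashlight_signatures` are hypothetical, and the SAT oracle is a brute-force stand-in. For a tractable family one would plug in the polynomial-time satisfiability test of that family instead.

```python
from itertools import product


def brute_force_sat(clauses, fixed, n):
    """Stand-in SAT oracle (exponential): can `clauses` be satisfied by an
    assignment that agrees with the partial assignment `fixed`?"""
    free = [j for j in range(1, n + 1) if j not in fixed]
    for bits in product((0, 1), repeat=len(free)):
        a = dict(fixed)
        a.update(zip(free, bits))
        if all(any(a[abs(l)] == (1 if l > 0 else 0) for l in c) for c in clauses):
            return True
    return False


def prefix_feasible(cnf, prefix, n, sat=brute_force_sat):
    """Is there an assignment giving the first len(prefix) clauses the
    values listed in `prefix`?"""
    fixed = {}
    for clause, value in zip(cnf, prefix):
        if value == 0:  # a clause is 0 iff all of its literals are 0
            for lit in clause:
                forced = 0 if lit > 0 else 1
                if fixed.setdefault(abs(lit), forced) != forced:
                    return False  # two zero clauses force opposite values
    ones = [c for c, v in zip(cnf, prefix) if v == 1]
    return sat(ones, fixed, n)


def flashlight_signatures(cnf, n, prefix=()):
    """Depth-first walk of the prefix tree that only enters feasible branches."""
    if len(prefix) == len(cnf):
        yield prefix
        return
    for bit in (0, 1):
        candidate = prefix + (bit,)
        if prefix_feasible(cnf, candidate, n):
            yield from flashlight_signatures(cnf, n, candidate)


PHI = [(1, -3), (-2,), (1, 2, 3), (2, -3)]  # the example from the introduction
sigs = list(flashlight_signatures(PHI, 3))
print(sigs)
```

Because every branch that is entered is known to contain a signature, the search never wastes more than a bounded number of oracle calls between two outputs, which is where the delay bound in terms of SAT calls comes from.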
Remark 2. Let us remark that the family of monotone CNFs is tractable, but for this case there is a more efficient polynomial delay generation of the signatures. Indeed, in this case we can view a clause as a subset of the variables. Consequently, the set of zeros in a signature corresponds to a union of clauses. It is easy to see that such unions can be generated with $O(nm)$ delay and $O(n)$ average delay, where $n$ is the number of variables, and $m = |\Phi|$ is the number of clauses, see [10, Proposition 11] and [11, Theorem 15]. Note that in this case Theorem 1 guarantees only an $O(m\|\Phi\|)$ delay, because every SAT call requires $O(\|\Phi\|)$ time.

3. CNFs with bounded dimension

3.1. Bounded variable occurrence
Given a CNF $\Phi$, we denote by $H_\Phi = (\Phi, E)$ the conflict graph of $\Phi$. The vertices of $H_\Phi$ are the clauses of $\Phi$ and edges are exactly the conflicting pairs of clauses, i.e., pairs $(C_i, C_j)$ for which there exists a literal $u \in C_i$ such that $\bar{u} \in C_j$. Let $S \subseteq \Phi$ be a maximal independent set of $H_\Phi$, and let $L(S) = \bigcup_{C_i \in S} C_i$ denote the set of literals appearing in the clauses of $S$. We define a partial assignment $a_S : L(S) \to \{0,1\}$ by setting all literals of $L(S)$ to zero (and hence the complementary literals are set to 1). The signature associated to $S$ is then defined as $\sigma_\Phi(S) := \sigma_\Phi(a_S) = (y_1, \dots, y_m) \in \{0,1\}^m$. The coordinates of $\sigma_\Phi(S)$ are well-defined as $y_i = 0$ if and only if $C_i \in S$ for $i = 1, \dots, m$. We will dismiss the subscript $\Phi$ whenever the CNF in question is clear from the context. Note that for different maximal independent sets $S \ne S'$ of $H$ we have $\sigma(S) \ne \sigma(S')$. It is worth mentioning that all maximal independent sets of $H$ can be generated with polynomial delay [12–14], which is hence a good start for CNF signature generation.

Assume that $\Phi$ has bounded dimension, i.e., for a constant $d$ we have $|C_i| \le d$ for all $i = 1, \dots, m$. Let us define $X_j = \{C_i \in \Phi \mid x_j \in C_i \text{ or } \bar{x}_j \in C_i\}$. We say that $\Phi$ is of $\omega$-bounded occurrence if $|X_j| \le \omega$ for $j = 1, \dots, n$ and $\omega$ is a fixed constant.

Theorem 3. If $\Phi$ has bounded dimension and occurrence, then its signatures can be generated in incremental polynomial time.
Proof. An induced matching of an undirected graph is a matching which forms an induced subgraph, that is, no two of its edges are joined by an edge of the graph. A maximal induced matching can be found by a simple greedy approach: repeatedly select an edge $e$, add it to the solution, and delete the end-vertices of the edge together with the set of vertices adjacent to at least one of them.
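The greedy step just described can be written down directly. The following is a sketch under the assumption that the graph is given as a vertex list and an edge list; the function name `greedy_maximal_induced_matching` is introduced here for illustration only.

```python
def greedy_maximal_induced_matching(vertices, edges):
    """Pick an edge, keep it, then delete its end-vertices together with
    every vertex adjacent to at least one of them; repeat."""
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    alive = set(vertices)
    matching = []
    for u, v in edges:
        # An edge is still selectable only if both end-vertices survive.
        if u in alive and v in alive:
            matching.append((u, v))
            alive -= {u, v} | adjacency[u] | adjacency[v]
    return matching


# A path on six vertices: 1 - 2 - 3 - 4 - 5 - 6.
path_edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
print(greedy_maximal_induced_matching(range(1, 7), path_edges))  # [(1, 2), (4, 5)]
```

Removing the neighbours of both end-vertices is what makes the result induced: no later edge can touch a vertex adjacent to an earlier matched edge.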
Let $M \subseteq E$ be a maximal induced matching in $H$. Let us denote by $\mu$ the number of edges in $M$ and by $N$ the $2\mu$ vertices incident to edges in $M$. Note that $H$ has at least $2^{\mu}$ maximal independent sets (by choosing arbitrarily one of the endpoints from each edge of $M$ and then greedily extending this set to a maximal independent set) and hence at least this many signatures can be generated with polynomial delay, as explained above. We denote by $W \subseteq \Phi$ the set of clauses that have edges in $H$ connecting them to some of the clauses in $N$, formally
\[
W = \{ C \in \Phi \mid \exists\, C' \in N : (C, C') \in E \}.
\]
Note that $N \subseteq W$. Moreover, let us denote $U = \Phi \setminus W$. It is easy to see that $U$ is an independent set in $H$, since any edge with both endpoints in $U$ could be used to enlarge $M$, thus contradicting its maximality.

Assume that $|C_i| \le d$ for all $i = 1, \dots, m$, and $|X_j| \le \omega$ for all $j = 1, \dots, n$, where $d$ and $\omega$ are fixed constants. Observe that with these assumptions all clauses in $N$ contain together at most $2\mu d$ literals and so we have $|W| \le 2\mu d\omega$. Let $K$ be the set of variables involved in clauses of $W$ and let us denote $n' = |K|$. Now we have $n' \le d|W|$, implying
\[
n' \le 2\mu d^2 \omega .
\]
Finally, let us denote by $L$ the (possibly empty) set of variables that appear only in clauses of $U$. Clearly, all variables in $L$ are monotone in $\Phi$ (some variables appear only positively while some others appear only negatively) since $U$ is an independent set in $H$. Thus all literals in $\Phi$ that correspond to variables in $L$ can be simultaneously assigned zero.
The procedure that generates all signatures of $\Phi$ works in three steps. In the first step, all literals in $L$ are set to 0 and the resulting CNF in $n'$ variables from $K$ is denoted $\Phi'$. Then we generate with polynomial delay the maximal independent sets $S_\ell$, $\ell = 1, \dots, k'$ of $H_{\Phi'}$ (see [12–14]), and the corresponding signatures $\sigma_{\Phi'}(S_\ell)$, $\ell = 1, \dots, k'$ of $\Phi'$, where $k' \ge 2^{\mu}$ (note that a signature of $\Phi'$ in this case extends uniquely to a signature of $\Phi$ by adding zeros for clauses that consist only of variables from $L$).

In the second step we generate all the remaining signatures stemming from assignments where all literals in $L$ are set to zero. We try all $2^{n'}$ binary assignments to the variables in $K$ and check for each of them whether a new signature was generated. Note that $k' \ge 2^{\mu} \ge (2^{n'})^{1/(2d^2\omega)}$ using the inequality $n' \le 2\mu d^2\omega$, which in turn implies $2^{n'} \le (k')^{2d^2\omega}$. Hence this step can be done in an incremental polynomial time, in particular in $O(mn\,(k')^{2d^2\omega})$ time since generating a signature for a given assignment takes $O(mn)$ time.

For the third step let us assume that $k'' \ge k'$ distinct signatures were generated in the first and second step. By switching the assignment to literals in $L$, we may get new signatures, resulting from changing some of the zeros in a signature to one. For any partial assignment to the variables in $K$ we obtain a monotone CNF on variables in $L$, and hence this is a set-union generation problem that can be solved with polynomial delay as observed in the previous section. Since it suffices to consider only those partial assignments to the variables in $K$ which produced a signature in the first two steps, we may get in this way the same signature multiple times, but no more than $k''$ times, and thus at this step the additional signatures are also generated in incremental polynomial time.
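The counting argument used in the second step can be spelled out as a chain of implications. The symbols $n'$ (the number of variables in $K$) and $k'$ (the number of signatures produced in the first step) are notation assumed here for readability; $\mu$, $d$ and $\omega$ are as in the proof.

```latex
\[
n' \le 2\mu d^{2}\omega
\;\Longrightarrow\;
\mu \ge \frac{n'}{2d^{2}\omega}
\;\Longrightarrow\;
k' \ge 2^{\mu} \ge \bigl(2^{n'}\bigr)^{1/(2d^{2}\omega)}
\;\Longrightarrow\;
2^{n'} \le (k')^{2d^{2}\omega}.
\]
```

Since $d$ and $\omega$ are constants, the exhaustive search over $2^{n'}$ assignments is therefore polynomial in the number of signatures already output.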
3.2. Unbounded occurrence

In the previous section, we considered CNFs with bounded dimension and occurrence. The running time of the algorithm provided by Theorem 3 depends exponentially on $\omega$, hence it is not suitable for handling the general case. In the present section, a more general procedure is given based on a different approach.

For a CNF $\Phi$, we denote by $G_\Phi = (\Phi, E)$ the so called dual graph of $\Phi$ [15]. The vertices of $G_\Phi$ are the clauses of $\Phi$ and edges are exactly the pairs of clauses $(C_i, C_j)$ for which there exists a variable that occurs in both $C_i$ and $C_j$ (complemented or not). If $S \subseteq \Phi$ is an independent set of $G_\Phi$, then the clauses of $S$ have pairwise disjoint sets of variables involved.
Theorem 4. There exists an algorithm $A$ that generates the signatures of a CNF $\Phi$ consisting of $m$ clauses in $n$ binary variables in $O\bigl(d m^2 n k^{\binom{d}{2}}\bigr)$ total time, where $d = \dim(\Phi)$ and $k$ is the number of signatures.

Proof. We prove the claim by induction on $d$. For $d \le 2$ the claim follows by Theorem 1.

Assume now that we already proved the claim for all $d' < d$, and let us consider a CNF $\Phi = C_1 \wedge C_2 \wedge \dots \wedge C_m$ with $\dim(\Phi) = d$. Let us associate to $\Phi$ its dual graph $G$ as defined above. Let $S \subseteq \Phi$ be a maximal independent set of $G$. Such a set can be obtained by a simple greedy procedure in polynomial time in the size of $\Phi$. Note that clauses in $S$ involve pairwise disjoint sets of variables, due to the fact that $S$ is an independent set of $G$. Thus, we can choose a literal $u_C \in C$ for each clause $C \in S$, set all other literals in $C$ to zero, set all other variables not occurring in clauses of $S$ to zero, and make all possible truth assignments to the literals $u_C$, $C \in S$. This way we obtain $k_0 = 2^{|S|}$ different binary signatures of $\Phi$. Note that we can output these $k_0$ signatures with polynomial delay.
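The first phase of this proof, building a maximal independent set $S$ of the dual graph greedily and reading off $2^{|S|}$ signatures, can be sketched as follows. The encoding (`+j` for $x_j$, `-j` for its negation), the helper names, and the choice of $u_C$ as the first literal of each clause are assumptions of this sketch; it also assumes that no clause contains a variable together with its negation.

```python
from itertools import product


def variables(clause):
    return {abs(lit) for lit in clause}


def greedy_variable_disjoint_clauses(cnf):
    """Greedy maximal independent set of the dual graph: keep a clause
    whenever it shares no variable with the clauses kept so far."""
    chosen, used = [], set()
    for index, clause in enumerate(cnf):
        if not variables(clause) & used:
            chosen.append(index)
            used |= variables(clause)
    return chosen


def initial_signatures(cnf, n):
    """Yield 2^|S| signatures by toggling one literal u_C per clause C in S."""
    chosen = greedy_variable_disjoint_clauses(cnf)
    base = {j: 0 for j in range(1, n + 1)}        # variables outside S are set to 0
    for i in chosen:
        for lit in cnf[i]:
            base[abs(lit)] = 0 if lit > 0 else 1  # every literal of C evaluates to 0
    for bits in product((0, 1), repeat=len(chosen)):
        a = dict(base)
        for i, bit in zip(chosen, bits):
            u = cnf[i][0]                          # u_C: first literal of C (arbitrary)
            a[abs(u)] = bit if u > 0 else 1 - bit  # make the literal u_C equal to `bit`
        yield tuple(
            int(any(a[abs(l)] == (1 if l > 0 else 0) for l in c)) for c in cnf
        )


PHI = [(1, -3), (-2,), (1, 2, 3), (2, -3)]  # the example from the introduction
first_phase = list(initial_signatures(PHI, 3))
print(first_phase)
```

On the example, the greedy pass keeps $C_1$ and $C_2$, so four signatures are produced; they are pairwise distinct because each clause of $S$ takes exactly the value chosen for its literal $u_C$.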
The total number of variables involved in clauses of $S$ is $n' \le d|S|$. Hence we can assign in all possible ways values to these variables, and produce $2^{n'} \le k_0^d$ subproblems $\Phi_j$, $j = 1, \dots, 2^{n'}$ in the remaining variables in $O(mn2^{n'}) = O(mnk_0^d)$ time which is polynomial in the input size and $k_0$, since $d$ is a fixed constant. Note that each of these subproblems is obtained from $\Phi$ by fixing the $n'$ variables at a binary assignment, and such a substitution can be done in $O(mn)$ time. Note also, that each of these residual problems is of dimension at most $d - 1$. Indeed, each of the clauses not in $S$ shares at least one variable with the clauses of $S$, since $S$ is a maximal independent set of $G$, and now that shared variable is fixed at a binary value.

We apply algorithm $A$ to each of the residual sub-CNFs $\Phi_j$, $j = 1, \dots, 2^{n'}$, one by one. This way we produce signatures that extend the pattern on $S$ defined by $\mathbf{x}^j \in \{0,1\}^{n'}$, for all $j = 1, \dots, 2^{n'}$ one by one. Two of these extended signatures may coincide, but only if they are extensions of two different $\mathbf{x}^j$-s, since for a fixed $\mathbf{x}^j$ algorithm $A$ generates pairwise distinct extensions. Thus, we may produce the same signature no more than $2^{n'}$ times. Since $2^{n'} = O(k_0^d)$, we can show that this procedure works in total polynomial time.

To see this let us introduce some additional notation. We denote by $X_j \subseteq Y = \{0,1\}^{n'}$, $j = 1, \dots, 2^{|S|}$ the nonempty sets of (partial) assignments that produce the same signature on the clauses of $S$. Note that the $X_j$-s partition the set $Y$ of partial truth assignments. For $\mathbf{x} \in Y$, let us denote by $\Phi(\mathbf{x})$ the residual CNF, and by $k(\mathbf{x})$ the number of signatures of $\Phi(\mathbf{x})$. We denote by $g(\Phi)$ the running time of the above described recursive algorithm on CNF $\Phi$ and let $G(m, n, d, k)$ be the maximum of $g(\Phi)$ over all CNFs $\Phi$ with at most $m$ clauses on $n$ variables having $\dim(\Phi) \le d$ and having at most $k$ signatures.

The total computational time in the first phase of the above procedure that ends with producing a list of $2^{n'}$ residual CNFs, each of $\dim \le d - 1$, is bounded by
\[
O(m^2 n) + O(mnk_0) + O(mnk_0^d) \le K m^2 n k_0^d
\]
for a suitable constant $K$ that does not depend on $m$, $n$, and $k_0$. The first term on the left hand side is the time to build $G$ and to find a maximal independent set $S$. The second term is the time we need to generate the $k_0$ initial signatures. The third term is the time to generate the $2^{n'} \le k_0^d$ subproblems.

For $\mathbf{x} \in X_j$ and $\mathbf{x}' \in X_{j'}$ with $j \ne j'$ the CNFs $\Phi(\mathbf{x})$ and $\Phi(\mathbf{x}')$ cannot share signatures, since those must already differ on $S$ by the definition of the sets $X_j$ for $j = 1, \dots, k_0$. However, for $\mathbf{x}, \mathbf{x}' \in X_j$ CNFs $\Phi(\mathbf{x})$ and $\Phi(\mathbf{x}')$ may share (some, even many) signatures. Discounting the one signature we already produced with the given 0-1 values on $S$, we can still expect $k_j$ different signatures produced by algorithm $A$ when we use it for CNFs $\Phi(\mathbf{x})$, $\mathbf{x} \in X_j$, where
\[
\max_{\mathbf{x} \in X_j} [k(\mathbf{x}) - 1] \le k_j \le \sum_{\mathbf{x} \in X_j} [k(\mathbf{x}) - 1].
\]
Thus, in total we get $k = k_0 + k_1 + \dots + k_{2^{|S|}}$ different signatures for $\Phi$.

The total running time on CNFs $\Phi(\mathbf{x})$, $\mathbf{x} \in X_j$ can be bounded by
\[
\sum_{\mathbf{x} \in X_j} g(\Phi(\mathbf{x})) \le |X_j| \, G(m, n, d-1, k_j).
\]
Thus, for the total running time of algorithm $A$ on $\Phi$ we get
\[
g(\Phi) \le G(m, n, d, k) \le K m^2 n k_0^d + \sum_{j=1}^{k_0} |X_j| \, G(m, n, d-1, k_j) \le K m^2 n k_0^d + k_0^d \, G(m, n, d-1, k),
\]
where for the last inequality we used $k_j \le k$ for all $j = 1, \dots, k_0$, implying $G(m, n, d-1, k_j) \le G(m, n, d-1, k)$, which allows this quantity to be factored out of the sum, that can be then upper bounded by $\sum_{j=1}^{k_0} |X_j| = 2^{n'} \le k_0^d$. Using this we can show by induction on $d$ that $G(m, n, d, k) \le L d m^2 n k^{\binom{d}{2}}$ for some constant $L$ (we will choose $L \ge K$) which will complete the proof of our claim. Now
\[
\begin{aligned}
G(m, n, d, k) &\le K m^2 n k_0^d + k_0^d \, G(m, n, d-1, k) \\
&\le K m^2 n k_0^d + k_0^d \, L (d-1) m^2 n k^{\binom{d-1}{2}} \\
&\le L m^2 n k^d + k^d \, L (d-1) m^2 n k^{\binom{d-1}{2}} \\
&\le L m^2 n k^d + L (d-1) m^2 n k^{\binom{d-1}{2} + d} \le L d m^2 n k^{\binom{d}{2}}.
\end{aligned}
\]
Remark 5. Since 2-CNFs are tractable, the running time of the algorithm can be slightly improved by stopping the recursion when $d = 2$, as pointed out by Strozecki [16].

Corollary 6. The algorithm constructed in the above proof works in incremental polynomial time.
Proof. Using the above theorem, we can prove this claim by induction on the dimension $d$. When $d = 1$, the claim is trivially true.

Consider now the general case, as in the proof of the above theorem. As we remarked there, producing the first $k_0 = 2^{|S|}$ signatures in fact can be done with polynomial delay. After this we start processing the CNFs $\Phi(\mathbf{x})$ for $\mathbf{x} \in X_j$, $j = 1, \dots, k_0$. Note that the signatures produced from $\Phi(\mathbf{x})$, $\mathbf{x} \in X_j$ and $\Phi(\mathbf{x}')$, $\mathbf{x}' \in X_{j'}$ are all different if $j \ne j'$. Note also that $\dim(\Phi(\mathbf{x})) \le d - 1$ for all $\mathbf{x} \in X_j$, $j = 1, \dots, k_0$, and thus we can assume by induction that their signatures can be produced in incremental polynomial time in the size of $\Phi(\mathbf{x})$, which is bounded by the size of $\Phi$. Thus, if $X_j = \{\mathbf{x}^1, \dots, \mathbf{x}^\ell\}$, then we can produce $k(\mathbf{x}^1)$ new signatures in incremental polynomial time, in fact regardless how many we produced previously (including the $k_0$ we have from the first phase). Let us denote by $q(m, n, k(\mathbf{x}^1))$ the polynomial bounding the total time processing $\Phi(\mathbf{x}^1)$. If $k(\mathbf{x}^2) > k(\mathbf{x}^1)$, then maybe the first $k(\mathbf{x}^1)$ signatures produced from $\Phi(\mathbf{x}^2)$ coincide with the ones we already generated from $\Phi(\mathbf{x}^1)$, but still after at most $q(m, n, k(\mathbf{x}^1))$ time we get a new signature. In the worst case, we have $k_j = k(\mathbf{x}^1) \ge k(\mathbf{x}^i)$ for all $\mathbf{x}^i \in X_j$, $i \ne 1$, in which case processing $\Phi(\mathbf{x}^i)$, $i = 2, \dots, \ell$ may not produce any new signatures. Since $\ell \le k_0^d$, this means that the largest gap between the output of the last signature of $\Phi(\mathbf{x}^1)$ and next new signature is not more than $k_0^d \, q(m, n, k(\mathbf{x}^1))$, at a moment when we have already produced $k \ge k_0 + k(\mathbf{x}^1)$ signatures. Thus this largest time gap between two outputs is still bounded by a polynomial of $m$, $n$, and the number of signatures $k \ge k_0 + k(\mathbf{x}^1)$ produced so far. As both $n$ and $m$ are bounded by the input size, the corollary follows.
4. Generating maximal and minimal signatures

Generation of maximal signatures is difficult as it includes SAT as a special case.

Theorem 7. Unless P = NP, the maximal signatures cannot be generated in total polynomial time.
Proof. Let us consider a CNF $\Phi$, and observe that its unique maximal signature is the all-one vector if and only if $\Phi$ is satisfiable. Assume by contradiction that we have a total polynomial time algorithm for generating all maximal signatures, and denote by $t(k, \ell)$ the polynomial bound for its termination when the input has size $k$ and output involves exactly $\ell$ signatures. Let us run this algorithm for $t(\|\Phi\|, 1)$ time, which is polynomial in the input size for input $\Phi$. If this algorithm outputs the all-one vector then $\Phi$ is satisfiable, and otherwise it is not. Hence it would decide the satisfiability of $\Phi$ in polynomial time. As SAT is NP-complete [17], the theorem follows.
It turns out that minimal signatures can be generated efficiently.

Theorem 8. Minimal signatures can be generated with polynomial delay for arbitrary CNF formulas.
Proof. We claim that there is a one-to-one correspondence between minimal signatures of a CNF $\Phi$ and maximal independent sets of its conflict graph $H$. Since $H$ can be built in polynomial time from $\Phi$ and maximal independent sets of a graph can be generated with polynomial delay [12–14], this would prove the theorem.
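The claimed correspondence can be checked by brute force on the four-clause example from the introduction. The sketch below is illustrative only: the encoding (`+j` for $x_j$, `-j` for its negation) and all function names are assumptions, and both enumerations are exponential, unlike the polynomial delay methods of [12–14].

```python
from itertools import combinations, product

PHI = [(1, -3), (-2,), (1, 2, 3), (2, -3)]  # C1, C2, C3, C4


def conflicts(c1, c2):
    """Two clauses conflict if one contains a literal and the other its negation."""
    return any(-lit in c2 for lit in c1)


def maximal_independent_sets(cnf):
    """All maximal independent sets of the conflict graph, by exhaustive search."""
    m = len(cnf)

    def independent(s):
        return not any(conflicts(cnf[i], cnf[j]) for i, j in combinations(s, 2))

    result = []
    for size in range(m + 1):
        for s in combinations(range(m), size):
            if independent(s) and all(
                not independent(s + (v,)) for v in range(m) if v not in s
            ):
                result.append(set(s))
    return result


def minimal_signatures(cnf, n):
    """Signatures whose set of ones is inclusionwise minimal, by exhaustive search."""
    sigs = set()
    for bits in product((0, 1), repeat=n):
        a = dict(zip(range(1, n + 1), bits))
        sigs.add(tuple(
            int(any(a[abs(l)] == (1 if l > 0 else 0) for l in c)) for c in cnf
        ))

    def strictly_below(s, t):
        return s != t and all(x <= y for x, y in zip(s, t))

    return {t for t in sigs if not any(strictly_below(s, t) for s in sigs)}


# Signature of a maximal independent set S: zero exactly on the clauses of S.
from_mis = {
    tuple(0 if i in s else 1 for i in range(len(PHI)))
    for s in maximal_independent_sets(PHI)
}
print(sorted(from_mis))
print(from_mis == minimal_signatures(PHI, 3))
```

On this example the conflict graph has the three maximal independent sets $\{C_3\}$, $\{C_1, C_2\}$ and $\{C_1, C_4\}$, and the two sets of signatures coincide.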
To see the above claim, assume first that a signature $\sigma = \{\sigma_C \mid C \in \Phi\}$ is a minimal signature of $\Phi$. Note that the set $S = \{C \in \Phi \mid \sigma_C = 0\}$ is an independent set in $H$. For any $C \in \Phi$ with $\sigma_C = 1$ there must exist a conflict between $C$ and some $C' \in S$, since otherwise we could set $\sigma_C$ to zero without forcing any of the clauses in $S$ to change their values, contradicting the minimality of $\sigma$. Thus $S$ must be a maximal independent set.

The other direction follows from the fact that if $S$ is a maximal independent set of $H$ and we set all the clauses in $S$ to zero, then all other clauses of $\Phi$ are forced to take value one due to the conflicts between $S$ and other vertices of $H$.

5. Conclusions
In this paper we show that all signatures of a given CNF with a bounded dimension can be generated in incremental polynomial time, answering an open problem posed by Kröll [8, Problem 4.7]. A faster incremental polynomial algorithm is provided for the class of formulas where both the dimension and the occurrence are bounded. Moreover, it is also shown that the same task can be done with polynomial delay if the input CNF is from a tractable class (in this case no bound on dimension or occurrence is necessary). Finally, it is proved that maximal signatures cannot be generated in total polynomial time unless P = NP, while minimal signatures can be generated with polynomial delay for arbitrary CNF formulas.
In this context it is interesting to note that given a 3-CNF $\Phi$ with $m$ clauses and the vector $\mathbf{y} = (1, 1, \dots, 1) \in \{0,1\}^m$ it is NP-hard to test whether $\mathbf{y}$ is a signature of $\Phi$, or not ($\mathbf{y}$ is a signature if and only if $\Phi$ is satisfiable). On the other hand, our results show that generating all signatures of $\Phi$ can be done in incremental polynomial time. This is a rather unusual behavior for a generation problem. Typically, if all solutions of a given problem can be generated in incremental polynomial time, checking if a given candidate is a solution or not is computationally easy.
An additional problem related to CNF signatures was stated at the Dagstuhl Seminar 19211 by Turán. Given a set $S \subseteq \{0,1\}^m$, does there exist a CNF with $m$ clauses such that $S$ is exactly its set of all signatures? If yes, can such a CNF be computed efficiently? This ‘reverse’ problem (get the signatures, output clauses) to the problem presented in this paper (get the clauses, output signatures) is to the best of our knowledge completely open.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
We are truly grateful to Yann Strozecki for his valuable observations and suggestions that helped us to improve the paper. We would also like to thank the anonymous referees for their careful reading of the manuscript and their insightful comments.
Kristóf Bérczi was supported by the János Bolyai Research Fellowship of the Hungarian Academy of Sciences and by the ÚNKP-19-4 New National Excellence Program of the Ministry for Innovation and Technology. Ondřej Čepek and Petr Kučera gratefully acknowledge support by the Czech Science Foundation (Grant 19-19463S). Projects no. NKFI-128673 and “Application Domain Specific Highly Reliable IT Solutions” have been implemented with the support provided from the National Research, Development and Innovation Fund of Hungary, financed under the FK_18 and the Thematic Excellence Programme TKP2020-NKA-06 (National Challenges Subprogramme) funding schemes, respectively. This work was supported by the Research Institute for Mathematical Sciences, an International Joint Usage/Research Center located in Kyoto University.
References
[1] A. Letelier, J. Pérez, R. Pichler, S. Skritek, Static analysis and optimization of semantic web queries, ACM Trans. Database Syst. (TODS) 38 (4) (2013) 25.
[2] A.A. Bulatov, V. Dalmau, M. Grohe, D. Marx, Enumerating homomorphisms, J. Comput. Syst. Sci. 78 (2) (2012) 638–650.
[3] A. Durand, N. Schweikardt, L. Segoufin, Enumerating answers to first-order queries over databases of low degree, in: Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, ACM, 2014, pp. 121–131.
[4] W. Kazana, L. Segoufin, Enumeration of first-order queries on classes of structures with bounded expansion, in: Proceedings of the 32nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, ACM, 2013, pp. 297–308.
[5] L. Segoufin, Enumerating with constant delay the answers to a query, in: Proceedings of the 16th International Conference on Database Theory, ACM, 2013, pp. 10–20.
[6] M. Kröll, R. Pichler, S. Skritek, On the complexity of enumerating the answers to well-designed pattern trees, in: 19th International Conference on Database Theory, ICDT 2016, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2016.
[7] J.-L. Guigues, V. Duquenne, Familles minimales d’implications informatives résultant d’un tableau de données binaires, Math. Sci. Hum. 95 (1986) 5–18.
[8] E. Boros, B. Kimelfeld, R. Pichler, N. Schweikardt, Enumeration in data management (Dagstuhl seminar 19211), Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2019.
[9] E. Boros, K. Elbassioni, V. Gurvich, Algorithms for Generating Minimal Blockers of Perfect Matchings in Bipartite Graphs and Related Problems, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 3221, 2004, pp. 122–133.
[10] Y. Strozecki, A. Mary, Efficient enumeration of solutions produced by closure operations, Discret. Math. Theor. Comput. Sci. 21 (3) (2019) 1–30.
[11] F. Capelli, Y. Strozecki, Enumerating models of DNF faster: breaking the dependency on the formula size, Discrete Appl. Math. (2020), https://doi.org/10.1016/j.dam.2020.02.014.
[12] S. Tsukiyama, M. Ide, H. Ariyoshi, I. Shirakawa, A new algorithm for generating all the maximal independent sets, SIAM J. Comput. 6 (3) (1977) 505–517.
[13] D.S. Johnson, M. Yannakakis, C.H. Papadimitriou, On generating all maximal independent sets, Inf. Process. Lett. 27 (3) (1988) 119–123.
[14] K. Makino, T. Uno, New algorithms for enumerating all maximal cliques, in: Scandinavian Workshop on Algorithm Theory, Springer, 2004, pp. 260–272.
[15] M. Samer, S. Szeider, Algorithms for propositional model counting, J. Discret. Algorithms 8 (1) (2010) 50–64, https://doi.org/10.1016/j.jda.2009.06.002.
[16] Y. Strozecki, personal communication, 2020.
[17] S.A. Cook, The complexity of theorem-proving procedures, in: Proceedings of the Third Annual ACM Symposium on Theory of Computing, 1971, pp. 151–158.