List of Publications - Audio to visual speech conversion, Gergely Feldhoffer

International transactions

• Gergely Feldhoffer, Tamás Bárdi: Conversion of continuous speech sound to articulation animation as an application of visual coarticulation modeling, Acta Cybernetica, 2007

• Gergely Feldhoffer, Attila Tihanyi, Balázs Oroszi: A comparative study of direct and ASR based modular audio to visual speech systems, Phonetician 2010 (submitted)

International conferences

• Gyorgy Takacs, Attila Tihanyi, Tamas Bardi, Gergely Feldhoffer, Balint Srancsik: Database Construction for Speech to Lip-readable Animation Conversion, Proceedings 48th International Symposium ELMAR, Zadar, 2006

• G. Takács, A. Tihanyi, T. Bárdi, G. Feldhoffer, B. Srancsik: Signal Conversion from Natural Audio Speech to Synthetic Visible Speech, Int. Conf. on Signals and Electronic Systems, Lodz, Poland, September 2006

• G. Takács, A. Tihanyi, T. Bárdi, G. Feldhoffer, B. Srancsik: Speech to facial animation conversion for deaf applications, 14th European Signal Processing Conf., Florence, Italy, September 2006.

• Takács György, Tihanyi Attila, Bárdi Tamás, Feldhoffer Gergely: Feasibility of Face Animation on Mobile Phones for Deaf Users, Proceedings of the 16th IST Mobile and Wireless Communication Summit, Budapest 2007

• Gergely Feldhoffer, Balázs Oroszi, György Takács, Attila Tihanyi, Tamás Bárdi: Inter-speaker Synchronization in Audiovisual Database for Lip-readable Speech to Animation Conversion, 10th International Conference on Text, Speech and Dialogue, Plzen 2007

• Gergely Feldhoffer, Tamás Bárdi, György Takács and Attila Tihanyi: Temporal Asymmetry in Relations of Acoustic and Visual Features of Speech, 15th European Signal Processing Conf., Poznan, Poland, September 2007

• Takács, György; Tihanyi, Attila; Feldhoffer, Gergely; Bárdi, Tamás; Oroszi, Balázs: Synchronization of acoustic speech data for machine learning based audio to visual conversion, 19th International Congress on Acoustics, Madrid, 2-7 September 2007

• Gergely Feldhoffer: Speaker Independent Continuous Voice to Facial Animation on Mobile Platforms, Proceedings 49th International Symposium ELMAR, Zadar, 2007.

Hungarian publications

• Bárdi T., Feldhoffer G., Harczos T., Srancsik B., Szabó G. D.: Audiovizuális beszéd-adatbázis és alkalmazásai, Híradástechnika 2005/10

• Feldhoffer G., Bárdi T., Jung G., Hegedűs I. M.: Mobiltelefon alkalmazások siket felhasználóknak, Híradástechnika 2005/10.

• Takács György, Tihanyi Attila, Bárdi Tamás, Feldhoffer Gergely, Srancsik Bálint: Beszédjel átalakítása mozgó száj képévé siketek kommunikációjának segítésére, Híradástechnika 3. 2006

• Takács György, Tihanyi Attila, Bárdi Tamás, Feldhoffer Gergely, Srancsik Bálint: MPEG-4 modell alkalmazása szájmozgás megjelenítésére, Híradástechnika 8. 2006

• Feldhoffer Gergely, Bárdi Tamás: Látható beszéd: beszédhang alapú fejmodell animáció siketeknek, IV. Magyar Számítógépes Nyelvészeti Konferencia, Szeged, 2006.

Bibliography

[1] K-E Spens, B. Granström, I. Karlsson. Synface - a project presentation. Proc of Fonetik - TMH-QPSR, 44:93–96, 2002. 1, 1.2.1, 2.5.1

[2] O. N. Garcia, R. Gutierrez-Osuna, P. Kakumanu, A. Esposito. A comparison of acoustic coding models for speech-driven facial animation. Speech Communication, 48:598–615, 2006. 1.1.1

[3] V. Libal, P. Scanlon, G. Potamianos and S. M. Chu. Mutual information based visual feature selection for lipreading. In Proc. of ICSLP, 2004. 1.1.1

[4] A. Robinson-Mosher, E. Sifakis, A. Selle and R. Fedkiw. Simulating speech with a physics-based facial muscle model. ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA), pages 261–270, 2006. 1.1.1, 1.2.6, 1.5.2

[5] O. Garcia, P. Kakumanu, A. Esposito and R. Gutierrez-Osuna. A comparison of acoustic coding models for speech-driven facial animation. Speech Communication, 48:598–615, 2005. 1.1.1

[6] J. Kewley, J. Beskow, I. Karlsson and G. Salvi. Synface - a talking head telephone for the hearing-impaired. Computers Helping People with Special Needs, pages 1178–1186, 2004. 1.1.1, 6.1.2

[7] M. De Smet, S. Al Moubayed and H. Van Hamme. Lip synchronization: from phone lattice to PCA eigen-projections using neural networks. In Proceedings of Interspeech 2008, Brisbane, Australia, Sep 2008. 1.1.1

[8] T. Bárdi, G. Feldhoffer, Gy. Takács, A. Tihanyi and B. Srancsik. Speech to facial animation conversion for deaf customers. In 14th European Signal Processing Conf., Florence, Italy, 2006. 1.1.1, 2.3.3, 2.3.5, 6.1.3, 6.1.4

[9] J. Yamagishi, G. Hofer and H. Shimodaira. Speech-driven lip motion generation with a trajectory HMM. In Proc. Interspeech 2008, pages 2314–2317, Brisbane, Australia, 2008. 1.1.1, 1.2.2

[10] A. Fusaro, P. Cosi and G. Tisato. Lucia a new Italian talking-head based on a modified Cohen-Massaro's labial coarticulation model. In Proceedings of Eurospeech 2003, Geneva, Switzerland, September 1-4, 2003, volume III, pages 2269–2272, 2003. 1.2.3, 2.2


[11] K. Madany and S. Fagel. Objective and perceptual evaluation of parameterizations of 3D motion captured speech data. In Proceedings of AVSP, 2008. 1.2.3

[12] L. R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. In Proceedings of the IEEE, volume 77. 1.2.4

[13] http://www.facegen.com. 1.2.5, 6.1.1

[14] M. Odisio and G. Bailly. Shape and appearance models of talking faces for model-based tracking. In AVSP, pages 105–110, St Jorioz, France, 2003. 1.2.5

[15] Hedvig Kjellström and Olov Engwall. Audiovisual-to-articulatory inversion. Speech Communication, 51(3):195–209, 2009. 1.5.1

[16] S. A. King and R. E. Parent. A 3d parametric tongue model for animated speech. Journal Of Visualization And Computer Animation, 12:107–116, 2001. 1.5.1

[17] http://www.genarts.com. 1.5.2

[18] P. Badin, G. Bailly, L. Revéret, M. Baciu, C. Segebarth, and C. Savariaux. Three-dimensional linear articulatory modeling of tongue, lips and face, based on MRI and video images. Journal of Phonetics, 30(3):533–553, 2002. 1.5.3

[19] A. Riecker, H. Ackermann, D. Wildgruber, J. Meyer, G. Dogil, H. Haider, and W. Grodd. Articulatory/phonetic sequencing at the level of the anterior perisylvian cortex: A functional magnetic resonance imaging (fMRI) study. Brain and Language, 75(2):259–276, 2000. 1.5.3

[20] R. Zunino, D. Anguita, G. Parodi. An efficient implementation of BP on RISC-based workstations. Neurocomputing, 6(1):57–65, 1994. 2.3.4

[21] http://digitus.itk.ppke.hu/~flugi/johnnie. 2.4

[22] M. M. Cohen and D. W. Massaro. Modeling coarticulation in synthetic visual speech. In Models and Techniques in Computer Animation, pages 139–156. Springer-Verlag, 1993. 2.5.1

[23] T. Kuratate and K. Kinoshita. Real-time talking head system based on principal component analysis. Journal of the Institute of Image Electronics Engineers of Japan, 34:336–343, 2005. 2.5.2

[24] T. Harczos, B. Srancsik, G. Szabó, T. Bárdi, G. Feldhoffer. Audiovizuális beszé


[27] T. Öhman and G. Salvi. Using HMMs and ANNs for mapping acoustic to visual speech. TMH-QPSR, pages 45–50, 1999. 2.5.2, 3.4, 3.1.4

[28] B. Theobald, S. Fagel, F. Elisei, and G. Bailly. LIPS2008: Visual speech synthesis challenge. In Proceedings of Interspeech, pages 1875–1878, 2008. 3.1.1

[29] B. Németh, P. Mihajlik, T. Fegyó and V. Trón. Towards automatic transcription of large spoken archives in agglutinating languages: Hungarian ASR for the MALACH project. In Speech and Dialogue: 10th International Conference, Pilsen, Czech Republic, 2007. 3.1.2

[30] Z. Tüske, P. Mihajlik, Z. Tobler and G. Gordos. Evaluation and optimization of noise robust front-end technologies for the automatic recognition of Hungarian telephone speech. In Interspeech 2005 - Eurospeech: 9th European Conference on Speech Communication and Technology, Lisboa, Portugal, 2005. 3.1.2

[31] http://alpha.tmit.bme.hu/speech/hdbMRBA.php. 3.1.2

[32] T. Révész, P. Mihajlik and T. Tatai. Phonetic transcription in automatic speech recognition. Acta Linguistica Hungarica, pages 407–425, 2003. 3.1.2

[33] F. Pereira, M. Mohri and M. Riley. Weighted finite-state transducers in speech recognition. Computer Speech and Language, pages 69–88, 2002. 3.1.2

[34] L. Czap and J. Mátyás. Virtual speaker. Híradástechnika Selected Papers, Vol LX/6:2–5, 2005. 3.1.2, 3.1.5

[35] http://avisynth.org. 3.1.2

[36] A. S. House, C. Williams, M. H. L. Hecker, and K. D. Kryter. Psychoacoustic speech tests: A modified rhyme test. The Journal of the Acoustical Society of America, 35(11):1899–1899, 1963. 3.1.4

[37] G. Breton, G. Bailly, O. Govokhina and F. Elisei. A trainable trajectory formation model TD-HMM parameterized for the LIPS2008 challenge. In Proceedings of Interspeech 2008, Brisbane, Australia, Sep 2008. 3.1.5

[38] Gy. Takács, G. Feldhoffer, T. Bárdi and A. Tihanyi. Temporal asymmetry in relations of acoustic and visual features of speech. In 15th European Signal Processing Conf., Poznan, Poland, 2007. 3.1.5, 4.2, 6.1.2, 6.2.3

[39] A. Tihanyi, G. Feldhoffer and B. Oroszi. A comparative study of direct and ASR based modular audio to visual speech systems (accepted). Phonetician, 2010. 3.2, 6.2.3

[40] A. Esposito, O. N. Garcia, A. Bojorquez, J. L. Castillo, R. Gutierrez-Osuna, P. K. Kakumanu and I. Rudomin. Speech-driven facial animation with realistic dynamics. IEEE Transactions on Multimedia, 7. 4.1.1


[41] O. N. Garcia, R. Gutierrez-Osuna, P. Kakumanu, A. Esposito. A comparison of acoustic coding models for speech-driven facial animation. Speech Communication, 48. 4.1.1

[42] G. Salvi. Truncation error and dynamics in very low latency phonetic recognition. In Proc of ISCA workshop on Non-linear Speech Processing, 2003. 4.1.1

[43] V. Libal, S. M. Chu, P. Scanlon, G. Potamianos. Mutual information based visual feature selection for lipreading. In Int. Conf. on Spoken Language Processing, 2004. 4.1.1

[44] T. Bárdi, G. Feldhoffer, B. Srancsik, G. Takács, A. Tihanyi. Database construction for speech to lipreadable animation conversion. In ELMAR Zadar, pages 151–154, 2006. 5.1.1

[45] L. Rabiner and B.-H. Juang. Fundamentals of speech recognition. Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1993. 5.1.1

[46] G. Feldhoffer. Speaker independent continuous voice to facial animation on mobile platforms. In 49th International Symposium ELMAR, Zadar, Croatia, 2007. 5.2, 6.2.3

[47] J. Valin and C. Montgomery. Improved noise weighting in CELP coding of speech - applying the Vorbis psychoacoustic model to Speex. In 120th Convention AES, Paris, France, 2006. 6, 6.1.6

[48] http://www.bethsoft.com. 6.1.1

[49] www.counter-strike.net. 6.1.1

[50] http://www.thirdwishsoftware.com/magpiepro.html. 6.1.2

[51] J. P. Lewis, Matt Cordner, and Nickson Fong. Pose space deformation: a unified approach to shape interpolation and skeleton-driven deformation. In SIGGRAPH '00: Proceedings of the 27th annual conference on Computer graphics and interactive techniques, pages 165–172, New York, NY, USA, 2000. ACM Press/Addison-Wesley Publishing Co. 6.1.3

[52] G. Feldhoffer and B. Oroszi. An efficient voice driven face animation method for cyber telepresence applications. In 2nd International Symposium on Applied Sciences in Biomedical and Communication Technologies, Bratislava, Slovak Republic, 2009. 6.2, 6.2.3
