CONFUCIUS: An Intelligent MultiMedia Storytelling Interpretation and Presentation System

Minhua Eunice Ma
Supervisor: Prof. Paul Mc Kevitt
School of Computing and Intelligent Systems, Faculty of Engineering, University of Ulster, Magee
Faculty Research Student Conference, Jordanstown, 15 Jan 2004

Outline
- Related research
- Overview of CONFUCIUS
- Automatic generation of 3D animation
- Semantic representation
- Natural language processing
- Current state of implementation
- Relation to other work
- Conclusion & Future work

Related research
- 3D visualisation
  - Virtual humans & embodied agents: Jack, Improv, BEAT
  - MultiModal interactive storytelling: AesopWorld, KidsRoom, Larsen & Petersen's Interactive Storytelling, computer games
  - Automatic Text-to-Graphics Systems: WordsEye, CD-based language animation
- Related research in NLP
  - Lexical semantics
  - Levin's verb classes
  - Jackendoff's Lexical Conceptual Structure
  - Schank's scripts

Objectives of CONFUCIUS
Storywriter/playwright -> Movie/drama script -> CONFUCIUS -> 3D animation -> User/story listener
- To interpret natural language sentences/stories and to extract conceptual semantics from the natural language
- To generate 3D animation and virtual worlds automatically from natural language
- To integrate 3D animation with speech and non-speech audio, to form an intelligent multimedia storytelling system

Architecture of CONFUCIUS
- Input: natural language stories (script writer), script parser
- Knowledge base (prefabricated objects)
  - Language knowledge: LCS lexicon, grammar
  - Visual knowledge (3D graphic library): 3D authoring tools, existing 3D models & character models
  - Mapping between language knowledge and visual knowledge
- Natural Language Processing, producing semantic representations
- Animation generation (using visual knowledge), Text To Speech, sound effects
- Synchronizing & fusion
- Output: 3D world with audio in VRML

Software & Standards
- Java
  - parsing semantic representation
  - changing VRML code to add/modify animation
  - integrating modules
- Natural language processing tools
  - Connexor Machinese FDG parser (morphological and syntactic parsing)
  - WordNet (lexicon, semantic inference)
- 3D graphic modelling
  - Existing 3D models (virtual human/object) on the Internet
  - Authoring tools
    - Humanoid characters: Character Studio
    - Props & stage: 3D Studio Max
    - Narrator: Microsoft Agent
- Modelling language & standard
  - VRML 97 for modelling geometry of objects, props, environment
  - H-Anim specifications for humanoid modelling

Agents and Avatars: How much autonomy?
- Autonomous agents have higher requirements for sensing, memory, reasoning, planning, behaviour control & emotion (sense-emotion-control-action structure)
- "User-controlled" avatars require fewer autonomous actions; basic naïve physics such as collision detection and reaction is still required
- Virtual characters in non-interactive storytelling fall between agents and avatars: their behaviours, emotions and responses to the changing environment are described in the story input
- Virtual humans by autonomy & intelligence, from low to high: avatars, characters in non-interactive storytelling, interface agents, autonomous agents

Graphics library
- Objects/props: simple geometry files
- Characters: geometry & joint hierarchy files (H-Anim)
- Motions: animation library (key frames)
- Instantiation

Level of Articulation (LOA) of H-Anim
- CONFUCIUS adopts LOA1 in human animation
- The animation engine adds ROUTEs dynamically, based on H-Anim's joints & animation keyframes
- CONFUCIUS' human animation can be adapted for other LOAs
- Figures: joints and segments of LOA1; example site nodes on hands (pushing objects, holding objects)

Semantic representations
- Categories
  - general knowledge representation & reasoning
  - physical knowledge representation & reasoning (inc. spatial/temporal reasoning)
- Knowledge representations
  - rule-based representation
  - Decomposite
  - FOPC (First Order Predicate Calculus)
  - semantic networks
  - Schank's scripts
  - frame-based representations
  - XML-based representations
  - Conceptual Dependency (CD)
  - event-logic truth conditions
  - x-schema and f-structure
  - Lexical-Conceptual Structure (LCS)
  - Lexical Visual Semantic Representation (LVSR)
- Typical applications
  - expert systems
  - sentence representation, expert systems
  - lexical semantics
  - story understanding
  - multimodal semantics
  - dynamic vision (movement) recognition & generation

Lexical Visual Semantic Representation
- Lexical Visual Semantic Representation (LVSR): a semantic representation between language syntax and 3D models
- LVSR is based on Jackendoff's LCS, adapted to the task of language visualization (enhanced with Schank's scripts)
- Ontological categories: OBJ, HUMAN, EVENT, STATE, PLACE, PATH, PROPERTY
  - OBJ: props/places (e.g. buildings)
  - HUMAN: human beings/other articulated animated characters (e.g. animals), as long as their skeleton hierarchy is defined
  - EVENT: actions, movements and manners
  - STATE: static existence
  - PROPERTY: attributes of OBJ/HUMAN

PATH & PLACE predicates
- Interpret spatial movement of OBJ/HUMANs
- 62 common English prepositions
- 7 PATH predicates & 11 PLACE predicates

PATH predicate   Direction feature   Termination feature
to               1                   1
from             0                   1
toward           1                   0
away_from        0                   0
via              n/a                 0
across           n/a                 n/a
along            n/a                 n/a

PLACE predicate   Contact/attach feature
at                unmarked
behind            <-contact>
end_of            n/a
in                unmarked
in_front_of       <-contact>
near              <-contact>
on                <+contact>
out               unmarked
over              <-contact>
top_of            n/a
under             unmarked

NLP in CONFUCIUS
- Pre-processing
- Connexor FDG parser: part-of-speech tagger, syntactic parser, morphological parser
- Semantic inference: WordNet, LCS database
- Disambiguation
- FEATURES
- Coreference resolution
- Temporal reasoning: lexical temporal relations, post-lexical temporal relations

Visual valency & verb ontology
2.2.1. Human action verbs
2.2.1.1. One visual valency (the role is a human, (partial) movement)
2.2.1.1.1. Biped kinematics: arm actions (wave, scratch), leg actions (walk, jump, kick), torso actions (bow), combined actions (climb)
2.2.1.1.2. Facial expressions & lip movement, e.g. laugh, fear, say, sing, order
2.2.1.2. Two visual valency (at least one role is human)
2.2.1.2.1. One human and one object (vt. or vi. + instrument), e.g. throw, push, kick, open, eat, drink, bake, trolley
2.2.1.2.2. Two humans, e.g. fight, chase, guide
2.2.1.3. Visual valency ≥ 3 (at least one role is human)
2.2.1.3.1. Two humans and one object (inc. ditransitive verbs), e.g. give, show
2.2.1.3.2. One human and 2+ objects (vt. + object + implicit instr./goal/theme), e.g. cut, write, butter, pocket, dig, cook
2.2.1.4. Verbs without distinct visualisation when out of context: verbs of trying, helping, letting, creating/destroying
2.2.1.5. High level behaviours (routine events), political and social activities, e.g. interview, eat out (go to restaurant), go shopping

Level-of-Detail (LOD): basic-level verbs & troponyms
- EVENT
- Event level verbs: go, cause, …
- Manner level verbs: walk, climb, run, jump, …
- Troponym level verbs: limp, stride, trot, swagger, jog, romp, skip, bounce, hop

Current status of implementation
- Collision detection example (contact verbs: hit, collide, scratch, touch)
  - "The car collided with a wall."
  - using ParallelGraphics' VRML extension for object-to-object collision
  - non-speech sound effects
- H-Anim examples
  - 3 visual valency verbs: "John put a cup of coffee on the table."
    - H-Anim Site node
    - locative tags of object (on_table tag for table object)
  - 2 visual valency verbs: "John pushed the door." "John ate the bread." "Nancy sat on the chair."
  - 1 visual valency verbs: The waiter came to me: "Can I help you, Sir?"
    - speech modality & lip synchronization
    - camera direction (avatar's point-of-view)

Relation to other work
- Domain-independent, general purpose humanoid character animation
- CONFUCIUS' character animation focuses on the language-to-humanoid animation process rather than on human modelling & motion alone
- Implementable semantic representation LVSR, connecting linguistic semantics to visual semantics & suitable for action execution (animation)
- Categorization and visualisation of eventive verbs based on visual valency
- Reusable common sense knowledge base to elicit implied actions, instruments, goals and themes underspecified in the language input

Conclusion & Future work
- Humanoid animation explores problems in language visualization & automatic animation production
- Formalizes the meaning of action verbs and spatial prepositions
- Maps language primitives to visual primitives
- Reusable common sense knowledge base for other systems
- Further work
  - Discourse level interpretation
  - Action composition for simultaneous activities
  - Verbs concerning multiple characters' synchronization & coordination (e.g. introduce)
- Prospective applications
  - Children's education
  - Movie/drama production
  - Multimedia presentation
  - Computer games
  - Virtual Reality
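
The LOA slide states that the animation engine adds ROUTEs dynamically, based on H-Anim joints and animation keyframes. The Java sketch below illustrates what generating such ROUTE statements could look like; it is not the CONFUCIUS implementation. The class name, method name and DEF naming scheme (`walk_timer`, `walk_l_hip_interp`, `hanim_l_hip`) are assumptions made for illustration. Only the ROUTE syntax and the event names (`fraction_changed`, `set_fraction`, `value_changed`, `set_rotation`) come from VRML 97, and `l_hip` and `l_knee` are LOA1 joint names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not from the slides): builds the VRML 97 ROUTE
// statements that drive one H-Anim joint from keyframe data.
class RouteSketch {

    // Returns two ROUTE lines for one joint of one motion:
    // TimeSensor -> OrientationInterpolator -> Joint rotation.
    static List<String> routesFor(String motion, String joint) {
        String timer = motion + "_timer";                  // assumed DEF name of a TimeSensor
        String interp = motion + "_" + joint + "_interp";  // assumed DEF name of an OrientationInterpolator
        String jointNode = "hanim_" + joint;               // assumed DEF name of the H-Anim Joint node

        List<String> routes = new ArrayList<>();
        // The timer's fraction selects a position along the keyframe sequence.
        routes.add("ROUTE " + timer + ".fraction_changed TO " + interp + ".set_fraction");
        // The interpolated rotation is written to the joint.
        routes.add("ROUTE " + interp + ".value_changed TO " + jointNode + ".set_rotation");
        return routes;
    }

    public static void main(String[] args) {
        // Print the ROUTEs for two LOA1 leg joints of a hypothetical "walk" motion.
        for (String joint : new String[] {"l_hip", "l_knee"}) {
            routesFor("walk", joint).forEach(System.out::println);
        }
    }
}
```

Generating the ROUTE text from joint names, rather than hard-coding it per character, is one way the same keyframe library could be reused across models that share the H-Anim joint hierarchy.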