■Behavioral Digital Twins for Smart Cities Workshop
The 1st Workshop on Behavioral Digital Twins for Smart Cities will be held in conjunction
with the IEEE International Conference on Automatic Face and Gesture Recognition 2023 (IEEE FG2023).
The conference will be held in Waikoloa, Hawaii, USA, from January 5 to 8, 2023.
■Program
Date: 14:00 to 17:00, January 6, 2023 (half-day workshop in the afternoon)
Location: Paniolo III
・Agenda
- 14:10-14:20 Introduction
- 14:20-15:20 Keynote speaker session 1: Dr. Angjoo Kanazawa
- 15:50-16:50 Keynote speaker session 2: Dr. Michael Zollhoefer
■Description
A digital twin is a virtual environment that allows us to simulate real-world problems. One of its most important ingredients is human behavior modeling, which is required by many digital twin applications such as crime and accident prevention, mitigation strategies for natural disasters, autonomous driving, health coaching, and sports simulation. However, because of the complexity of human behavior, many challenges remain unsolved. Addressing them requires multiple research fields to work together, including computer vision, behavioral science, human-computer interaction, and AR/VR. This topic is germane to both the computer vision and computational behavior communities.
In this workshop, we aim to facilitate further discussion of this emerging research field from both technological and application perspectives. The outcomes of this workshop are relevant to building behavioral digital twins of pedestrians in smart cities equipped with sensor networks. During crises such as natural disasters and the COVID-19 pandemic, modeling behavioral patterns can lead to more efficient and effective mitigation policies, identify cost-effective ways to deliver public services, improve government accountability vis-à-vis citizens, and track progress and impact. This workshop will serve as a catalyst to bring diverse stakeholders together so that new scientific languages and ideas can be established, in an effort to address the societal challenge of creating behavior sensing systems that account for the diversity of people and their environments. The topics of interest include but are not limited to:
- Human pose estimation/tracking
- Human action recognition
- Human pose/action/trajectory forecasting
- Visualizations of human trajectory/action
- Human-human interaction analysis/forecasting
- Human-object interaction detection/forecasting
- Multi-sensor fusion for human behavior understanding
- Applications of human behavior understanding/forecasting
■Keynote Speaker
・Angjoo Kanazawa
Angjoo Kanazawa is an Assistant Professor in the Department of Electrical Engineering and Computer Science at the University of California, Berkeley. Her research is at the intersection of Computer Vision, Computer Graphics, and Machine Learning, focusing on the visual perception of the dynamic 3D world behind everyday photographs and video. Previously, she was a research scientist at Google NYC with Noah Snavely, and prior to that she was a BAIR postdoc at UC Berkeley advised by Jitendra Malik, Alyosha Efros, and Trevor Darrell. She completed her PhD in Computer Science at the University of Maryland, College Park, with her advisor David Jacobs. She also spent time at the Max Planck Institute for Intelligent Systems with Michael Black. She has been named a Rising Star in EECS and is a recipient of the Anita Borg Memorial Scholarship, the Best Paper Award at Eurographics 2016, a Google Research Scholar Award (2021), and a Spark Award (2022).
Title: Perceiving 3D People in Video
Abstract
We live in a 3D world that is dynamic and full of life, with people interacting with each other and the environment. Capturing this complex world in 3D from everyday images or video has huge potential for many applications, such as novel content creation tools for artists and digital worlds, marker-less motion capture from everyday devices, compelling mixed reality applications that can interact with people and objects, as well as robots that can learn to act by visually observing people, and more. In this talk, I will discuss the recent directions my lab has been exploring on the topic of perceiving 3D people and understanding video. I will discuss the latest advances in 3D human perception from challenging video sequences, such as those found in storytelling (movies and other edited media) as well as in-the-wild videos with many people.
・Michael Zollhoefer
Michael Zollhoefer is a Research Scientist at Reality Labs Research (RLR) in Pittsburgh leading the Completeness Group. His north star is fully immersive remote communication and interaction in the virtual world at a level that is indistinguishable from reality. To this end, he develops key technology that combines fundamental computer vision, machine learning, and graphics research based on a novel neural reconstruction and rendering paradigm. Before joining RLR, Michael was a Visiting Assistant Professor at Stanford University and a Postdoctoral Researcher at the Max Planck Institute for Informatics. He received his PhD from the University of Erlangen-Nuremberg for his work on real-time reconstruction of static and dynamic scenes.
Title: Complete Codec Telepresence
Abstract
Imagine two people, each of them within their own home, being able to communicate and interact virtually with each other as if they are both present in the same shared physical space. Enabling such an experience, i.e., building a telepresence system that is indistinguishable from reality, is one of the goals of Reality Labs Research (RLR) in Pittsburgh. To this end, we develop key technology that combines fundamental computer vision, machine learning, and graphics techniques based on a novel neural reconstruction and rendering paradigm. In this talk, I will cover our advances towards a neural rendering approach for complete codec telepresence that includes metric avatars, binaural audio, photorealistic spaces, as well as their interactions in terms of light and sound transport. In the future, this approach will bring the world closer together by enabling anybody to communicate and interact with anyone, anywhere, at any time, as if everyone were sharing the same physical space.
■Paper Submission
・Evaluation
All submissions will be double-blind peer-reviewed by at least three members of the technical program committee.
・Guidelines
Submitted papers longer than six pages will be subject to an extra fee (100 USD per page); the maximum number of pages is eight. All submitted papers must be in PDF format, and the maximum file size is 10 MB. All accepted papers will be published as part of the IEEE FG2023 proceedings and will be included in IEEE Xplore; they should therefore follow the same paper guidelines as the main conference.
・Submission site
Authors should upload their full papers (in the format of the FG 2023 main conference) via the CMT submission system.
・Dates
The paper submission and review schedule is below. (The submission deadline has been extended!)
- September 12, 2022: Submission of full paper (original deadline)
- October 12, 2022: Submission of full paper (extended deadline)
- October 19, 2022: Notification of acceptance
- October 31, 2022: Camera-ready full paper
■Workshop chairs
- Koichiro Niinuma, Fujitsu Research of America, USA
- Laszlo A. Jeni, Carnegie Mellon University, USA
- Takahisa Yamamoto, Fujitsu Research of America, USA
- Ryosuke Kawamura, Fujitsu Limited, Japan