Can a robot make music that moves you? Can social media help us understand the effectiveness of anti-depressant drugs? Can artificial intelligence predict stock markets?
These questions and many others are being answered by NYU Shanghai students and faculty studying and using the powerful and rapidly evolving tools of data science.
Every day, billions of Internet-connected devices and services are generating gigabytes of raw data – from what and when people are buying goods, to where they are traveling and how long they are staying, to what topics they are researching at any given moment in time. Never before has humankind been able to amass so much raw data. Never before has humankind had the power to process it.
Over the past decade, computer scientists, mathematicians, statisticians, and other experts from diverse fields have drawn on techniques such as machine learning algorithms and artificial intelligence methods, as well as specific domain knowledge, to make sense of the information chaos. The methodology and discipline they’ve developed – data science – uses mathematics, statistics, and computer science to analyze massive amounts of data collected in fields as diverse as transportation, marketing, anthropology, political science, music and literature – and human behavior.
The insights gained from data science are changing the way we work, play, and socialize, and are also driving innovation in almost all industries, says Dean of Engineering and Computer Science Keith Ross.
“Whether you’re working in marketing, R&D, or the finance industry, knowledge about data science is now mandatory in almost every industry,” says Assistant Professor Guo Li, who joined the NYU Shanghai faculty after stints as a data scientist for Kelley Blue Book and Taobao.
Enric Junqué de Fortuny, Assistant Professor of Information Systems and Business Analytics, focuses on modeling human-generated data and its applications to business.
“Businesses need to make thousands of decisions every day. Acme marketing wants to predict who is going to be interested in its products. A diabetes app will want to do pro-active disease progression prognosis, and a social media company for teenagers needs to know whether someone is lying about their age,” says Assistant Professor of Information Systems and Business Analytics Enric Junqué de Fortuny. “These scenarios require making decisions and judgments about humans in environments with many unknowns. Data science reduces uncertainty in such decision-making processes. The promise is that as we complete the (responsible) digital transformation of business and society, consumers will ultimately benefit by getting superior services and products.”
A world of possibilities, sophisticated coursework
It’s no surprise, then, that data science has become NYU Shanghai’s fastest-growing major. Since its introduction in 2016, more than one in ten students have declared data science as a primary or secondary major. The university is one of the first liberal arts-centered universities to offer a major in data science at the undergraduate level, Ross says.
The data science program is built on a recognition of the field’s interdisciplinary nature. It’s simply not enough to master the underlying statistical methodologies and algorithms – data scientists must also master the field to which they are applying the data tools. With nine sub-concentrations such as genomics, artificial intelligence, and economics available to data science majors, NYU Shanghai offers one of the most flexible and wide-ranging undergraduate data science programs in the world. To support this interdisciplinary program, NYU Shanghai currently has 12 data science core faculty members, with expertise not only in computer science and mathematics, but also in application domains such as finance, urban planning, and neuroscience, with plans to add new faculty members in social science and data science methodology.
Data science major Kelly Marshall ’20 spent his summer vacation working with Dean of Engineering and Computer Science Keith Ross on new algorithms for reinforcement learning.
Kelly Marshall ’20, a data science major specializing in AI, says that the ability to go in-depth within the field to which you want to apply data science methodology is part of what attracted him to the major. Marshall, who is also minoring in Chinese and mathematics, says the two humanities classes he’s been able to take along with his data science courses every semester have only strengthened his performance in his major.
“If you see a data science project go terribly wrong, the problem is usually that the designers are trusting computer predictions too much, and they’re not applying any sort of domain knowledge whatsoever,” he says. “So it’s really important that our data science major includes this emphasis on knowing the subject where you’re going to be applying your analysis.”
Last summer, Marshall worked closely with Ross writing new algorithms for deep reinforcement learning, an exciting branch of AI that uses neural networks and informed trial and error to find good strategies for sequential decision problems, such as in the control of robots or playing the game Go. Typically, deep reinforcement learning isn't taught at the undergraduate level. But because many NYU Shanghai students have taken advanced courses in probability theory and computer science, they are ready for such graduate-level topics. “Even our introductory machine learning course includes topics that are normally taught at the graduate level,” says Ross.
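To make "informed trial and error" concrete, here is a minimal sketch of tabular Q-learning, a classical reinforcement learning method. It is a simplified stand-in, not deep reinforcement learning and not the algorithms Marshall and Ross worked on: a lookup table replaces the neural network, and the five-position corridor, the reward, and every parameter value are assumptions chosen purely for illustration.

```python
import random

N_STATES = 5        # hypothetical corridor: positions 0..4, position 4 is the goal
ACTIONS = (-1, +1)  # index 0 = step left, index 1 = step right


def step(state, action):
    """Move within the corridor; reaching the last position pays 1 and ends the episode."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by repeatedly trying actions and observing the outcome."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Trial: occasionally explore a random action.
                a = rng.randrange(2)
            else:
                # Informed: otherwise pick the best-known action, breaking ties randomly.
                best = max(q[state])
                a = rng.choice([i for i in range(2) if q[state][i] == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Error: nudge the estimate toward reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# Greedy strategy for each non-goal position (1 means "step right").
policy = [max(range(2), key=lambda i: row[i]) for row in q_table[:-1]]
```

After training, `policy` selects "step right" from every position, which is the shortest route to the goal. Deep reinforcement learning follows the same loop but swaps the table for a neural network, so that it can cope with problems far too large to enumerate, such as Go.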
Students say faculty members are supportive – encouraging them to advance as quickly as their interest and ability take them. “I took part in an independent study group on deep learning that Professor Gus Xia organized in my sophomore year. He really piqued my curiosity in class by showing us these incredible videos of his own work with robots and music, and then telling us, ‘You can get started learning how to do this right now,’” Marshall says. “[He] made some really complex concepts tangible and accessible, and it was amazing to start working closely with a professor so early in college. That was an experience I never really anticipated having as an undergrad.”
Science without boundaries
The cross-disciplinary nature of NYU Shanghai’s data science program means that faculty research touches almost every facet of the human experience.
Bruno Abrahao, Assistant Professor of Information Systems and Business Analytics, discussing a data science project with students from his Network Analytics class.
Bruno Abrahao, Assistant Professor of Information Systems and Business Analytics, joined researchers from Harvard, Georgia Tech, and Microsoft to analyze millions of Twitter posts to identify the long-term effects of psychiatric medications on patients. Using a natural language processing approach, the team identified Twitter users who were taking psychiatric medications and created an artificial intelligence model to find patterns in these users’ language. Their work received the Outstanding Study Design Award from the Web and Social Media Conference, held by the Association for the Advancement of Artificial Intelligence. They found that posts by Twitter users who self-report taking an antidepressant medication showed significant, consistent differences in emotional and cognitive outcomes related to their language before and after they started taking it. “Our study showed that in the future, healthcare providers can use this kind of social media data analysis to improve their real treatment choices and provide more individualized medicine,” Abrahao says.
Assistant Professor of Computer Science Gus Xia spoke on music intelligence, creativity, and whether artificial intelligence can make people more creative at a panel on “Creativity” in April 2019 at Zaojiu Talk, a Chinese TED-style idea-sharing platform.
Meanwhile, in the Music X Lab at NYU Shanghai, Assistant Professor of Computer Science Gus Xia is training AI musicians to perform collaboratively alongside human counterparts. Using a type of algorithm called representation learning, Xia’s AI identifies an individual musician’s patterns of notes, tones, volumes, and tempos and compares them with the many millions of combinations of notes, timing, and structures embedded in music. Then it produces an accompaniment that fits the human musician’s own style. Xia’s AI is even capable of improvising, leading its human collaborator in new musical directions.
Xia, a musician himself, hopes that his work will help more people appreciate, perform, and compose music. “It’s the perfect way to bridge humanity and technology,” Xia says.
A “haptic flute” designed by Xia and his students to provide more flexible tutoring methods based on the proficiency of the player.
Enric Junqué de Fortuny, Assistant Professor of Information Systems and Business Analytics, has focused on modeling human-generated data and its applications to business. “In one study, we analyzed fine-grained behavioral data of millions of subjects and were then tasked with making accurate predictions about those same subjects: Whether they are satisfied with their life, their political inclinations, or whether they would be prone to substance abuse in the near future. As it turns out, we can predict such seemingly ephemeral things with surprising accuracy!”
And Guo Li is looking into how statistical analysis and deep learning can be combined to advance facial recognition technology: “I’m working on predicting age and gender based on face images. Since age prediction is an ordinal regression task, it's possible to combine the ordinal regression techniques in traditional statistics with deep learning to see if we can improve the current precision of predicting age and gender.”
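One common way to express an ordinal task for a neural network is to replace the single question "which age band?" with a series of ordered yes/no questions: "is the person older than band 0?", "older than band 1?", and so on. The sketch below illustrates that encoding only. It is not a description of Li's method, and the age bands, function names, and probability values are all hypothetical.

```python
AGE_BANDS = ["0-17", "18-29", "30-44", "45-59", "60+"]  # hypothetical bands


def encode_ordinal(rank, num_ranks):
    """Encode a 0-based rank as num_ranks - 1 binary answers to 'is rank > j?'."""
    return [1 if rank > j else 0 for j in range(num_ranks - 1)]


def decode_ordinal(probabilities, threshold=0.5):
    """Recover a rank by counting how many 'is rank > j?' answers are 'yes'."""
    return sum(1 for p in probabilities if p > threshold)


# Training target for someone in the "30-44" band (rank 2).
targets = encode_ordinal(2, len(AGE_BANDS))

# Made-up network outputs, one probability per 'is rank > j?' question.
predicted_band = AGE_BANDS[decode_ordinal([0.97, 0.81, 0.34, 0.05])]
```

Here `targets` is `[1, 1, 0, 0]` and `predicted_band` is `"30-44"`. Unlike ordinary classification, this encoding preserves the ordering of the bands: mistaking a 25-year-old for someone over 60 disagrees with the target on more questions than placing them in the neighboring band.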
Assistant Professor Guo Li joined the NYU Shanghai faculty in 2019 after stints as a data scientist for Kelley Blue Book and Taobao.
Transformative technology in an evolving city
For faculty and students alike, it’s not just what’s happening on campus that is most exciting about studying data science at NYU Shanghai. The city of Shanghai itself is an incubator for both raw data and advances in the discipline. “There is a lot of investment and interest in AI companies in China, and the start-up scene is booming here. At the same time, the flow of exchange between data sources, academia, and industry is more flexible here than in other parts of the world,” Abrahao says.
NYU Shanghai has already found several opportunities to collaborate with high-profile Shanghai-based companies, most recently in the April 2019 Hack the Pearl competition, sponsored by HSBC China (HBCN). Student teams were challenged to leverage real data provided by HBCN to develop models predicting consumers’ likelihood of purchasing investment products. Their answers were compared to real outcomes revealed by HBCN, and winners could land internships at the company.
This kind of real-world experience is invaluable and blossoms in a vibrant city like Shanghai, says Junqué de Fortuny, who served as a judge and advisor in the competition.
“Shanghai, specifically, has transitioned from being a financial and shipping hub to becoming a global technology powerhouse in a matter of mere years. From robo-cocktail bars to AI chip design studios and self-driving vehicles, you name it, and you'll find it being developed somewhere here,” Junqué de Fortuny says. “And the only way to truly experience the excitement and pace of it all is to be a part of it. Working at NYU Shanghai has opened my eyes in terms of the impact China’s digital transformation is having on its citizens.”