From Perceptron to Cognition: Towards Next Generation Intelligent Agent
Date:
Over the last few decades, ML research has been characterized by the development of isolated systems that solve task-specific problems. These systems operate only at the “perceptron” level: they can perceive input signals and predict output labels by learning feature mappings from the training data, but they lack the mental processes needed to understand the physical world in which the task is grounded. Recently, the surge of NLP applications powered by large language models (LLMs) has demonstrated great potential for understanding the world at the “cognition” level and following general human requests. This talk will summarize my previous research on improving language model techniques and making these models cognitively more intelligent. The work falls into four topics: (1) teaching language models to think using latent-variable models; (2) teaching language models to generalize using few-shot learning; (3) teaching language models to specialize based on task-specific requirements; and (4) productionizing language models. I will introduce the key ideas behind each topic, summarize the challenges, and conclude with a vision for future intelligent systems and suggested directions for future research.