A biped robot is a walking robot that mimics human locomotion. It can operate in the irregular environments that are common in the real world. Its applications include disaster relief, muddy terrain, nuclear power plants, and structures built for humans. However, controlling a biped robot is harder than human walking because of its structure: humans can vary the center of pressure (CoP) for stable walking with the help of the upper body, hands, and leg muscles, whereas the biped robot consists of a lower body only and must vary the CoP using its joint motors alone.
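To make the role of the CoP concrete, the following minimal sketch computes it as the force-weighted average of contact-point positions and checks whether it lies inside a rectangular foot. The four-corner sensor layout, the foot dimensions, and the function names are illustrative assumptions, not part of the framework described here.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def center_of_pressure(points: Sequence[Point], forces: Sequence[float]) -> Point:
    """Force-weighted average of contact points (standard CoP definition)."""
    if len(points) != len(forces):
        raise ValueError("points and forces must have the same length")
    total = sum(forces)
    if total <= 0.0:
        raise ValueError("total normal force must be positive (foot not in contact)")
    x = sum(p[0] * f for p, f in zip(points, forces)) / total
    y = sum(p[1] * f for p, f in zip(points, forces)) / total
    return (x, y)


def inside_foot(cop: Point, length: float, width: float) -> bool:
    """True if the CoP lies within a rectangular foot with one corner at the origin."""
    return 0.0 <= cop[0] <= length and 0.0 <= cop[1] <= width


# Hypothetical foot: 0.20 m x 0.10 m with one force sensor at each corner.
SENSORS = [(0.0, 0.0), (0.2, 0.0), (0.2, 0.1), (0.0, 0.1)]

if __name__ == "__main__":
    # More load on the two sensors at x = 0 shifts the CoP toward that edge.
    cop = center_of_pressure(SENSORS, [30.0, 10.0, 10.0, 30.0])
    print(cop, inside_foot(cop, 0.2, 0.1))
```

In this sketch the robot can move the CoP only by redistributing the contact forces, which in practice it does through its joint torques; a human can additionally shift the upper body and arms.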
The kinematic structure and dynamics of a biped robot are very complex because of its multiple joints and floating base. Generating closed-loop trajectories for multiple tasks, such as walking on flat and uneven terrain, dancing, and logistic support in industry, is therefore a major challenge. Researchers have developed various methodologies for generating closed-loop trajectories. However, the resulting biped robots have failed to adapt to new situations encountered in the real world, which can be hazardous. A learning framework that can learn in real time is therefore required, and it should also be data-efficient. Fig. 1 presents a pictorial representation of the adaptive learning framework.
The presented framework has the potential to become a first step toward artificial general intelligence in biped robots. However, it faces several challenges: (a) how to structure the reward function in real time, (b) how to incorporate additional parameters for regulating the policy structure, and (c) how to design an efficient training algorithm.