The focus of this research is to develop a novel holistic framework for estimating gaze direction. The gaze direction of an infant in a face-to-face interaction is classified as either 1) looking at the parent’s face or 2) looking away from the parent’s face. In our approach, we track facial images in captured videos using an Active Appearance Model (AAM). We obtain an eye patch by cropping the facial image, using the AAM mesh nodes surrounding the eye region as the boundary. The appearance component of the eye patch serves as our representation for estimating gaze direction. Despite the high dimensionality of the visual data, events such as gaze shifting lie on a low-dimensional manifold embedded in the high-dimensional space. We adopt the spectral regression technique to learn projection functions that map AAM appearance representations into a subspace termed the gaze direction subspace. The reduced feature vectors in this subspace are then used to estimate gaze direction with a Support Vector Machine (SVM) classifier.
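The projection-and-classification stage described above might be sketched as follows. This is an illustrative sketch, not the authors' implementation: the appearance vectors are synthetic stand-ins for AAM eye-patch features, and the supervised spectral regression step is approximated by ridge-regressing the data onto centered class-indicator responses (the closed-form target vectors that the spectral step reduces to in the fully supervised case), followed by a linear SVM on the projected features.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for AAM appearance vectors of eye patches:
# 200 frames, 500-dimensional features, two gaze classes
# (0 = looking away, 1 = looking at the parent's face).
n, d = 200, 500
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, d)) + 0.5 * y[:, None]  # class-shifted clouds

# Supervised spectral regression (sketch): with class labels available,
# the graph-embedding responses reduce to centered class indicators;
# a regularized least-squares fit then yields the projection functions.
responses = np.stack([(y == c).astype(float) for c in (0, 1)], axis=1)
responses -= responses.mean(axis=0)
proj = Ridge(alpha=1.0, fit_intercept=False).fit(X, responses)

# Map appearance vectors into the low-dimensional "gaze direction subspace".
Z = proj.predict(X)

# Classify gaze direction in the reduced subspace with an SVM.
clf = SVC(kernel="linear").fit(Z, y)
print("training accuracy:", clf.score(Z, y))
```

In practice the projection would be learned on training frames only and applied to held-out frames before classification; the sketch fits and evaluates on the same data purely to show the data flow.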