Although many approaches have been proposed for human action recognition, challenges such as illumination variation, occlusion, camera viewpoint change and background clutter keep this topic open for further research. Devising a robust descriptor that represents an action well enough to yield good classification accuracy is a demanding task. In this work, a new feature descriptor, named the Spatio-Temporal Shape-Texture-Motion (STSTM) descriptor, is introduced. The STSTM descriptor follows a hybrid approach, combining local and global features. Salient points are extracted using the Spatio-Temporal Interest Points (STIP) algorithm and are further encoded using the Discrete Wavelet Transform (DWT); the resulting DWT coefficients represent the local motion information of the object. Shape and texture features are extracted using the Histogram of Oriented Gradients (HOG) and Local Binary Pattern (LBP) algorithms, respectively. To achieve dimensionality reduction, Principal Component Analysis (PCA) is applied separately to each of the three feature types. A feed-forward neural network performs the classification. The proposed algorithm is extensively tested on the well-known KTH, Weizmann and UCF Sports datasets, and its performance is found to be better than that of many methods reported in the literature.
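The fusion step described above (PCA applied separately to the shape, texture and motion features, whose reduced forms are then combined into one descriptor) can be sketched as follows. This is a minimal illustration with synthetic data: the feature dimensions, sample count and number of retained components are placeholder values, not taken from the paper, and the SVD-based PCA stands in for any standard PCA implementation.

```python
import numpy as np

def pca_reduce(X, k):
    """Project feature matrix X (n_samples x n_features) onto its top-k principal components."""
    Xc = X - X.mean(axis=0)
    # SVD of the centered data; rows of Vt are the principal axes
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

rng = np.random.default_rng(0)
n = 50  # hypothetical number of video clips
# Hypothetical per-clip descriptors (dimensions are illustrative only)
hog_feats = rng.normal(size=(n, 324))  # shape features (HOG)
lbp_feats = rng.normal(size=(n, 59))   # texture features (LBP histogram)
dwt_feats = rng.normal(size=(n, 128))  # motion features (DWT-encoded STIP)

# PCA is applied to each feature type separately; the reduced vectors
# are concatenated to form the combined STSTM-style descriptor
ststm = np.hstack([pca_reduce(F, 20) for F in (hog_feats, lbp_feats, dwt_feats)])
print(ststm.shape)  # (50, 60)
```

Reducing each feature type separately, rather than concatenating first, keeps the lower-variance texture features from being dominated by the higher-dimensional shape and motion features during the projection.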