The body of literature on classification methods that estimate boundaries between groups (classes) by optimizing a function of the Lp-norm distances of the observations in each group from these boundaries is maturing fast. The number of published research articles on this topic, especially on mathematical programming (MP) formulations and techniques for Lp-norm classification, is now sizable. This paper highlights historical developments that have defined the field and looks ahead at challenges that may shape new research directions in the next decade. In the first part, the paper summarizes basic concepts and ideas and briefly reviews past research. Throughout, an attempt is made to integrate a number of the most important Lp-norm methods proposed to date within a unified framework, emphasizing their conceptual differences and similarities rather than mathematical detail. In the second part, the paper discusses several potential directions for future research in this area. The long-term prospects of Lp-norm classification (and discriminant) research may well hinge on whether the channels of communication improve between two groups: on the one hand, researchers active in Lp-norm classification, who tend to have their roots primarily in the decision sciences, the management sciences, computer science and engineering; and on the other hand, practitioners and researchers in the statistical classification community. This paper offers potential reasons for the lack of communication between these groups and suggests ways in which Lp-norm research may be strengthened from a statistical viewpoint. The results obtained in Lp-norm classification studies are clearly relevant and important to all researchers and practitioners active in classification and discriminant analysis.
The paper also briefly discusses artificial neural networks, a promising non-traditional classification method that has recently emerged, and suggests that it may be useful to explore hybrid classification methods that take advantage of the complementary strengths of different approaches, e.g., neural network and Lp-norm methods.