Features vs attributes, classes vs labels

Submitted by Xilodyne on Sun, 01/15/2017 - 10:35

While recently reviewing the Naïve Bayes Java routine I wrote last summer, I realized that I had mixed up a number of data and method definitions involving attributes, features, labels, classes, training and prediction. The mess came from basing my routine on the description given in Wikipedia, which describes features associated with classes, while at the same time trying to translate Python's sklearn, which uses features and labels, into Java. Since I've also been using Weka, there is a third terminology to deal with: attributes and classes (more clearly defined in the ARFF description).

Category	sklearn	Wikipedia	Weka
Data title	features	features	attributes
Class association	label	class	class
Train method	fit	training	train
Test method	predict	testing	test

It is surprising that there can be so many ways of describing the same thing. The best answer I've found so far comes from a response by Zeeshan Zia on Quora:

Zeeshan Zia, Research Scientist at NEC Laboratories America
Written Apr 9, 2015

... What is called a "feature" in machine learning or pattern recognition is traditionally called an "attribute" in data mining!

As Udacity uses Python for machine learning, I'll stick with feature, label, fit and predict for my Java routines.
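To make the naming concrete, here is a minimal sketch of what a Java routine using the sklearn vocabulary might look like. The names LabelClassifier, MajorityLabelClassifier and TerminologyDemo are made up for illustration and are not taken from sklearn, Weka or the routine discussed above. The classifier is a placeholder baseline that always predicts the most frequent training label, not Naïve Bayes; it is only there to show where features, labels, fit and predict would sit.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface: method and parameter names follow the sklearn
// vocabulary (features, labels, fit, predict) rather than Weka's
// (attributes, classes).
interface LabelClassifier {
    // Learn from one row of feature values per sample and one label per row.
    void fit(double[][] features, String[] labels);

    // Return the predicted label for a single row of feature values.
    String predict(double[] features);
}

// Placeholder baseline (an assumption for this sketch, not Naive Bayes):
// it ignores the feature values and predicts the most frequent label.
class MajorityLabelClassifier implements LabelClassifier {
    private String majorityLabel;

    @Override
    public void fit(double[][] features, String[] labels) {
        if (features.length != labels.length) {
            throw new IllegalArgumentException("one label per feature row is required");
        }
        Map<String, Integer> counts = new HashMap<>();
        int best = 0;
        for (String label : labels) {
            // merge returns the updated count for this label
            int count = counts.merge(label, 1, Integer::sum);
            if (count > best) {
                best = count;
                majorityLabel = label;
            }
        }
    }

    @Override
    public String predict(double[] features) {
        if (majorityLabel == null) {
            throw new IllegalStateException("call fit before predict");
        }
        return majorityLabel;
    }
}

class TerminologyDemo {
    public static void main(String[] args) {
        double[][] features = {{1.0, 0.0}, {0.0, 1.0}, {1.0, 1.0}};
        String[] labels = {"spam", "ham", "spam"};

        LabelClassifier clf = new MajorityLabelClassifier();
        clf.fit(features, labels);
        System.out.println(clf.predict(new double[] {0.0, 0.0})); // prints "spam"
    }
}
```

In Weka's terms the same two arrays would be described as attributes and classes, and in Wikipedia's as features and classes; only the names change, not the data.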
