At http://lib.stat.cmu.edu/datasets/bodyfat you can find a dataset of various body measurements of 252 men, collected to estimate their percentage of body fat. Accurate measurements of body fat percentage can be obtained by underwater weighing, a very costly technique (nowadays body fat scales and ultrasound measurements are available, but they are not as accurate). To avoid these costs, the percentage of body fat should be estimated from the given body measurements (such as weight, height, and age). Thoroughly analyse the dataset by measuring the information contained in the various features. If necessary, reduce the dimensionality with any feature selection method you find appropriate. Compare different regression techniques on this data using cross-validation. Describe which parameters you chose for the algorithms and how you arrived at these values. Interpret your results and derive simple rules of thumb for estimating the body fat value from the given measurements. Finally, perform a cluster analysis on selected attributes and see whether you can identify larger groups of people who have a particularly high or low body fat percentage. Visualize your results.
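As a starting point for the regression comparison, the following is a minimal sketch of k-fold cross-validation over ordinary least squares and ridge regression, written with NumPy only. It is one possible setup, not a prescribed solution: the helper names (`kfold_indices`, `fit_ridge`, `cv_rmse`), the choice of 10 folds, and the grid of `alpha` values are assumptions made for illustration. The data is a synthetic stand-in with 252 rows and 5 made-up features, so the printed errors say nothing about the real body fat dataset.

```python
import numpy as np


def kfold_indices(n_samples, n_folds, rng):
    """Shuffle the sample indices and split them into n_folds roughly equal parts."""
    return np.array_split(rng.permutation(n_samples), n_folds)


def fit_ridge(X, y, alpha):
    """Fit ridge regression with an unpenalized intercept.

    alpha = 0 reduces to ordinary least squares (assuming X has full column rank).
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Solve the regularized normal equations (Xc^T Xc + alpha * I) w = Xc^T yc.
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    intercept = y_mean - x_mean @ w
    return w, intercept


def cv_rmse(X, y, alpha, n_folds=10, seed=0):
    """Return the root mean squared error averaged over the held-out folds."""
    rng = np.random.default_rng(seed)
    folds = kfold_indices(len(y), n_folds, rng)
    errors = []
    for i, test_idx in enumerate(folds):
        # Train on every fold except the i-th, then evaluate on the i-th.
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        w, b = fit_ridge(X[train_idx], y[train_idx], alpha)
        pred = X[test_idx] @ w + b
        errors.append(np.sqrt(np.mean((pred - y[test_idx]) ** 2)))
    return float(np.mean(errors))


# Synthetic stand-in data (NOT the real dataset): 252 samples, 5 invented features.
rng = np.random.default_rng(42)
n = 252
X = rng.normal(size=(n, 5))
true_w = np.array([3.0, -1.5, 0.0, 0.0, 2.0])  # hypothetical coefficients
y = 19.0 + X @ true_w + rng.normal(scale=1.0, size=n)

# Compare regularization strengths; alpha = 0.0 is plain least squares.
results = {alpha: cv_rmse(X, y, alpha) for alpha in (0.0, 1.0, 10.0, 100.0)}
for alpha, rmse in results.items():
    print(f"alpha={alpha:6.1f}  CV RMSE={rmse:.3f}")
```

Using the same seed for every `alpha` means all models are evaluated on identical folds, which makes the comparison fairer. To apply the sketch to the actual task, replace the synthetic `X` and `y` with the measurements and body fat column parsed from the dataset file, and consider standardizing the features first, since the ridge penalty is sensitive to feature scale.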