Please email me a 1-page outline for your course project by 4 November. Include:
- a paragraph describing your objective (e.g. optimize a process, investigate the effect of robust preprocessing on several datasets)
- the data you have available (see some potential data sources below)
- if you are simulating the data, let me know what the columns will contain, and how your simulation works in general
- how you plan to use latent variable methods (and possibly other methods) in your data analysis plan
You can always change your project topic after this date, so don't feel locked into a particular area.
There are plenty of freely available data sets:
- https://kaggle.com contains some excellent, real-world data sets
- Landsat image data
- I collect some smaller teaching data sets on my own website: https://openmv.net. While these won't be suitable for a course project, they may give you some ideas.
- I also have industrial-scale data sets that I can offer, depending on your topic.
You will give a 10-minute presentation on your project's progress to the class on either 9 or 16 December, followed by 5 minutes of questions from the class.
A project report, both printed and as a PDF, is due by 9 January 2012 (this date may still change). The report must be no more than 25 pages, all inclusive.