
Project 3 – Ensemble Methods and Unsupervised Learning
In this project you will explore some techniques in unsupervised learning as well as ensemble methods. It is important to realize that understanding an algorithm or technique requires understanding how it behaves under a variety of circumstances. You will go through the process of choosing and exploring two classification datasets, tuning the algorithms you have learned about, writing a thorough analysis of your findings, and presenting your findings. The most crucial part of this assignment is the analysis and your ability to explain and justify your results.
I. Choosing Datasets
The first task in this assignment is choosing two interesting classification datasets; these can be binary or multiclass. The features can be of any type, and it is recommended that you choose datasets with diverse feature sets. I don’t care where you get the data from. You can download some, take some from your own research, or make some up on your own. What I do care about is that the datasets are interesting. They should contain a decent number of features and a sufficiently large number of examples. Do not choose an “easy” dataset, but don’t go crazy trying to find the perfect one either. Your two datasets should also differ in some way so that you can compare and contrast your results between the two. You should also follow standard machine learning practice by splitting each dataset into training and testing sets, and only touching the testing set at the very end when you are ready to report results. (Cross validation is highly recommended.)
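As a rough illustration of the split-then-cross-validate practice described above, here is a minimal sketch using scikit-learn. The wine dataset, the random forest, the 80/20 split, and the 5 folds are all placeholder assumptions, not requirements; substitute your own data and models.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# Stand-in dataset (an assumption); substitute your own features X and labels y.
X, y = load_wine(return_X_y=True)

# Hold out a test set once, stratified so class proportions are preserved.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Make all modeling decisions with cross-validation on the training split only.
model = RandomForestClassifier(n_estimators=100, random_state=0)
cv_scores = cross_val_score(model, X_train, y_train, cv=5)
print(f"CV accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")

# Only at the very end: fit on the full training split and score the test set once.
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.3f}")
```

The point of the structure is that the test split never influences a tuning decision; it is scored exactly once, after everything else is fixed.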
II. Coding (10%)
After choosing your datasets, you will write code to apply the machine learning algorithms you have learned about. Your code must be written in Python, but you may use any libraries that have already implemented the machine learning algorithms (e.g., scikit-learn). You are not expected to code the algorithms from scratch, and in fact I would highly discourage it. What you may not do is copy code from the internet. Below are the analyses you are required to run.
1) Run K-means and Hierarchical Clustering on your datasets and analyze what you observe.

2) Run two dimensionality reduction algorithms (PCA and t-SNE) on your datasets. Observe and analyze the results.
3) Re-run the K-means and Hierarchical Clustering on your dimensionality reduced datasets and compare the results to part (1).
4) Tune and train two ensemble models (AdaBoost and Random Forests) on both your original and dimensionality reduced datasets. Compare and analyze the results.
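The four analyses above could be wired together roughly as follows. This is only a sketch under stated assumptions: the digits subsample, the number of clusters (set to the number of classes), the 10 PCA components, the t-SNE perplexity, the silhouette and adjusted Rand metrics, and the small hyperparameter grids are all illustrative choices that you should justify or replace in your own work. Note that scikit-learn's TSNE has no `transform` method for unseen data, so this sketch uses only PCA as the reduced representation for the ensemble models in step 4.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.manifold import TSNE
from sklearn.metrics import adjusted_rand_score, silhouette_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data (an assumption): the first 500 samples of scikit-learn's digits set.
X, y = load_digits(return_X_y=True)
X, y = X[:500], y[:500]
n_classes = len(np.unique(y))
X_scaled = StandardScaler().fit_transform(X)


def cluster_report(features, labels, name):
    """Fit K-means and Ward hierarchical clustering; return {method: (silhouette, ARI)}."""
    models = {
        "kmeans": KMeans(n_clusters=n_classes, n_init=10, random_state=0),
        "hierarchical": AgglomerativeClustering(n_clusters=n_classes, linkage="ward"),
    }
    results = {}
    for method, model in models.items():
        assignments = model.fit_predict(features)
        sil = silhouette_score(features, assignments)
        # The true labels are used only to evaluate the clusters, never to fit them.
        ari = adjusted_rand_score(labels, assignments)
        results[method] = (sil, ari)
        print(f"{name:<9}{method:<13}silhouette={sil:.3f}  ARI={ari:.3f}")
    return results


# (1) Cluster the original (scaled) features.
baseline = cluster_report(X_scaled, y, "original")

# (2) Reduce dimensionality with PCA and t-SNE.
X_pca = PCA(n_components=10, random_state=0).fit_transform(X_scaled)
X_tsne = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X_scaled)

# (3) Re-run the clustering on each reduced representation.
reduced = {"pca": cluster_report(X_pca, y, "pca"),
           "tsne": cluster_report(X_tsne, y, "tsne")}

# (4) Tune AdaBoost and Random Forests on original and PCA-reduced features.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
# Fit the scaler and PCA on the training split only, then apply them to the test split.
reducer = make_pipeline(StandardScaler(), PCA(n_components=10, random_state=0))
feature_sets = {
    "original": (X_train, X_test),
    "pca": (reducer.fit_transform(X_train), reducer.transform(X_test)),
}
searches = {
    "adaboost": (AdaBoostClassifier(random_state=0),
                 {"n_estimators": [50, 100], "learning_rate": [0.5, 1.0]}),
    "random_forest": (RandomForestClassifier(random_state=0),
                      {"n_estimators": [50, 100], "max_depth": [None, 10]}),
}
test_scores = {}
for feature_name, (train_f, test_f) in feature_sets.items():
    for model_name, (estimator, grid) in searches.items():
        search = GridSearchCV(estimator, grid, cv=3).fit(train_f, y_train)
        test_scores[(feature_name, model_name)] = search.score(test_f, y_test)
        print(feature_name, model_name, search.best_params_,
              f"test accuracy={test_scores[(feature_name, model_name)]:.3f}")
```

The numbers this prints are not the interesting part; what matters for the report is explaining why the clusterings and classifiers change (or do not change) between the original and reduced features.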
Your code does not have to be pretty or well written. However, it must be written in Python, and I must be able to run one script ( that will produce all the results and figures in your report.
III. Report (80%)
You will then produce a report describing and analyzing your methods and results. Here you will describe the datasets you have chosen and why they are interesting. You will then provide an analysis on how the different machine learning algorithms performed on each dataset. The report must be limited to 10 pages maximum. Plots and figures are highly recommended. It is up to you how you wish to demonstrate your understanding of the machine learning algorithms you have explored, but below I have listed some potential ideas for analysis and items you may wish to include in the report.
• A description of your two datasets and why you feel that they are interesting.
• Hypotheses on how you believe the learning algorithms will perform on each dataset and why.
• How you dealt with different features in your datasets? missing data? different scalings?
• Training and testing error rates you obtained for your various learning algorithms (some sort of cross validation is highly recommended)
• The effect of hyperparameters on performance
• Comparing and contrasting results between datasets
• Comparing and contrasting results between learning algorithms
• Training and testing error rates as a function of training dataset size
• Timing analysis of how long it takes to train/test each algorithm
• Conclusions
• Ideas for future analyses
• What you may have done differently
• References
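Two of the ideas above, error rates as a function of training set size and timing analysis, might be sketched like this. The breast cancer dataset, the random forest, and the five training-set fractions are placeholder assumptions. The curve here reports cross-validated error rather than test error, and in your project you would pass only your training split to it.

```python
import time

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import learning_curve

# Stand-in dataset (an assumption); in your project pass only your training split.
X, y = load_breast_cancer(return_X_y=True)
model = RandomForestClassifier(n_estimators=100, random_state=0)

# Cross-validated accuracy at five increasing training-set sizes.
train_sizes, train_scores, val_scores = learning_curve(
    model, X, y,
    train_sizes=np.linspace(0.1, 1.0, 5),
    cv=5, shuffle=True, random_state=0,
)

# Convert accuracy to error rate and average over the folds.
train_error = 1.0 - train_scores.mean(axis=1)
val_error = 1.0 - val_scores.mean(axis=1)
for n, tr, va in zip(train_sizes, train_error, val_error):
    print(f"n={n:4d}  train error={tr:.3f}  validation error={va:.3f}")

# Simple wall-clock timing of fitting and predicting.
start = time.perf_counter()
model.fit(X, y)
fit_seconds = time.perf_counter() - start

start = time.perf_counter()
model.predict(X)
predict_seconds = time.perf_counter() - start
print(f"fit: {fit_seconds:.3f} s  predict: {predict_seconds:.3f} s")
```

Plotting `train_error` and `val_error` against `train_sizes` gives the kind of figure that makes overfitting or underfitting easy to discuss in the report.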

You are NOT being graded on how well the algorithms perform on your datasets. What is most important is WHY. You should be explaining and justifying all of your figures and results, and demonstrating that you understand the intricate details of the machine learning process and the machine learning algorithms you are using.
IV. Presentation (10%)
Finally, you will give a presentation of your results lasting at most 7 minutes (you will be cut off exactly at the 7 minute mark). In this presentation you will describe your datasets, your methods, and any interesting results you found!
What to turn in?
Below is a list of items you will be required to turn in via Canvas. Please make sure all documents are named as described below.
• report.pdf – Your maximum 10 page report in PDF format. Do not use a super tiny or large font. No specific formatting is required, but use common sense.
• presentation.pptx or presentation.key – Your presentation slides as either a PowerPoint or Keynote document.
• – A zip file with all of the code you have written. Within the folder there should be a file called README.txt that contains instructions on how to run your code, and a Python file called that will produce all figures and plots in your report/presentation. I should be able to reproduce your results easily.
• – A zip file that contains the two datasets you have chosen.
You are being scored on your analysis more than anything else. Roughly speaking, implementing everything and getting it to run is worth very little for this assignment. Of course though, analysis without proof of working code makes the analysis suspect. The key thing is that your explanations should be both thorough and concise, and your analysis should prove to me you have a deep understanding of the machine learning process and the machine learning algorithms you are using.

