EazyML Blog

Easy To Use, Easy To Learn

How To Get Going With Machine Learning In 35 Minutes

July 20, 2020

William Shabecoff

Guest Blogger, Stanford University

As a prospective Math and CS major, I’m very interested in learning machine learning. I decided that making an algorithm that can differentiate between real and fake news headlines would be a fun and simple project. Since my coding chops are still a little bit feeble, I set out to make a model on EazyML.

I browsed the scores of datasets on Kaggle looking for an interesting problem to model, and came across one on fake and real news that piqued my interest: Kaggle Fake News Dataset. The dataset contained tens of thousands of headlines from the past decade, sorted into separate "real" and "fake" Excel spreadsheets. I have long been interested in the debate over whether platforms like Facebook should remove fake news articles from their websites. I personally feel that Facebook should be doing much more to fight disinformation, because conspiracy theories like Holocaust denial and “Pizzagate” have led to acts of violence (Holocaust Denial, Pizzagate). In general, fake news can only poison the political discourse and cover actual societal ills (The impact of fake news). I wanted to see if I could build a program to tell the difference between fake and real news using machine learning.

When I started combing through the data, I found noticeable differences between fake news and real news headlines. Fake news headlines were often in all caps and were accompanied by exclamation points and phrases like “must read”. I realized that EazyML would be able to build an accurate predictor with just the headlines, so I decided to use headlines as my training data.
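To make those surface cues concrete, here is a minimal Python sketch that measures them for a single headline. It is only an illustration of the patterns I noticed, not something EazyML does internally; the function name `headline_features` and the feature names are my own assumptions.

```python
import re


def headline_features(title: str) -> dict:
    """Compute a few surface cues for one headline (illustrative only)."""
    letters = [c for c in title if c.isalpha()]
    upper = sum(1 for c in letters if c.isupper())
    return {
        # Share of alphabetic characters that are upper case (1.0 = all caps).
        "caps_ratio": upper / len(letters) if letters else 0.0,
        # Number of exclamation points in the headline.
        "exclamations": title.count("!"),
        # Whether the phrase "must read" (or "must-read") appears, in any case.
        "has_must_read": bool(
            re.search(r"\bmust[\s-]+read\b", title, re.IGNORECASE)
        ),
    }


loud = headline_features("MUST READ: THIS IS SHOCKING!!")
calm = headline_features("Senate passes budget bill")
```

On these two made-up headlines, the first scores high on every cue and the second scores low on all of them, which is the kind of gap that makes headlines alone a plausible training signal.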

I sampled about 5,000 headlines from different years as my training data. I sampled from a smattering of time periods so that the finished program would not be biased for or against any specific one. (If I had taken all of my ‘true’ headlines from the period when the Affordable Care Act was passed, for example, the program would have treated A.C.A.-related terms as signs of truth.) For each article, I assigned a ‘0’ for untrue or a ‘1’ for true, and uploaded the data to the EazyML pipeline as a CSV file.
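The sampling step above can be sketched in a few lines of Python. This is a rough reconstruction under assumptions, not the exact procedure I followed: the `year` field, the helper names, and the toy rows are hypothetical, while the `title` and `truth` column names match the CSV I uploaded.

```python
import csv
import io
import random
from collections import defaultdict


def sample_across_years(rows, per_year, seed=0):
    """Draw up to `per_year` rows from each year so no period dominates."""
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    by_year = defaultdict(list)
    for row in rows:
        by_year[row["year"]].append(row)
    sample = []
    for year in sorted(by_year):
        group = by_year[year]
        sample.extend(rng.sample(group, min(per_year, len(group))))
    return sample


def to_csv(rows):
    """Write only the 'title' and 'truth' columns as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["title", "truth"], extrasaction="ignore"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Toy data: 10 placeholder headlines per year, labels alternating 0 and 1.
rows = [
    {"title": f"headline {year}-{i}", "year": year, "truth": i % 2}
    for year in (2015, 2016, 2017)
    for i in range(10)
]
sample = sample_across_years(rows, per_year=4)
csv_text = to_csv(sample)
```

Capping the number of rows per year is one simple way to keep a single news cycle from dominating the vocabulary the model sees.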

Once I had uploaded the data, moving through the pipeline to create the model was quite smooth. The entire process was guided by prompts from the EazyML platform. The program asked me which columns were the predictors and which ones were to be predicted. In this case, just the ‘title’ column was used to predict a ‘truth’ value of either ‘0’ or ‘1’. When I then chose to make a predictive model, EazyML offered four different predictors I could derive from the headlines: sentiment analysis, GloVe, topic extraction, and concept extraction. I decided to use GloVe, which represents words as distributed vectors. The GloVe vectors were used to detect which words, and words related to them, appeared more often in fake versus real news headlines. Once the GloVe vectors were generated, EazyML built several predictive models, the most accurate of which was the “K-Nearest Neighbor” model with 92% accuracy. I was extremely impressed with the resulting model, which could sniff out most fake news headlines as intended.

What makes EazyML most impressive is its linear design and use. Despite having little to no background in machine learning, at no point during the process was I confused about the next step. EazyML presents its users with a series of simple, clear choices, which demystifies the machine learning process and let me build a powerful predictive model. If I could change one thing about the platform, I would redesign the user interface, which is functional but looks a bit dated and clunky, and so belies the program’s modernity and elegance.
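For readers curious what a GloVe-plus-nearest-neighbor model does in principle, here is a toy Python sketch. It is not EazyML's implementation, which I never saw: the three-dimensional vectors are made up and merely stand in for real GloVe embeddings, and the averaging, cosine similarity, and choice of k = 3 are my own assumptions.

```python
import math
from collections import Counter

# Made-up 3-dimensional vectors standing in for real GloVe embeddings.
TOY_VECTORS = {
    "shocking": (0.9, 0.1, 0.0),
    "must": (0.8, 0.2, 0.1),
    "read": (0.7, 0.3, 0.1),
    "exposed": (0.9, 0.0, 0.2),
    "senate": (0.1, 0.9, 0.2),
    "passes": (0.0, 0.8, 0.3),
    "budget": (0.1, 0.9, 0.1),
    "bill": (0.2, 0.8, 0.2),
}


def embed(title):
    """Average the vectors of known words; unknown words are skipped."""
    vectors = [TOY_VECTORS[w] for w in title.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return (0.0, 0.0, 0.0)
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def knn_predict(title, training, k=3):
    """Label a headline by majority vote among its k most similar neighbors."""
    query = embed(title)
    ranked = sorted(
        training, key=lambda item: cosine(query, embed(item[0])), reverse=True
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy training set: (headline, truth) with 0 = fake and 1 = real.
training = [
    ("shocking must read", 0),
    ("exposed shocking", 0),
    ("must read exposed", 0),
    ("senate passes budget", 1),
    ("senate budget bill", 1),
    ("passes bill", 1),
]

fake_guess = knn_predict("shocking exposed read", training)
real_guess = knn_predict("budget bill passes", training)
```

The idea is that headlines built from similar words end up close together in vector space, so a new headline can borrow the label of its nearest neighbors.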

After using the platform, I have realized that EazyML can be useful for many people. This newly available technology can help businesses leverage their data to increase profits. In financial services, EazyML can be used to analyze customer transaction and engagement data. In retail, EazyML can look at buying trends across geographies, timespans, and demographics to help businesses anticipate their customers' needs and decide which items to feature prominently and how to price inventory. In real estate, the platform can either predict house prices (House Price Regression) from data points such as changes in crime rate and number of rooms, or predict changes in the value of real estate in a neighborhood by analyzing insurance data. EazyML can also look at customer data and help gauge future risk.

EazyML is especially relevant to those who do not have a background in programming. The platform can give students and professionals in many fields access to powerful analytic tools that are applicable to nearly all disciplines. Machine learning methods are already being used in economics, where computer algorithms can draw intricate and comprehensive insights from large volumes of data. EazyML is the best tool on the market for someone looking to learn machine learning quickly without having to acquire a deep background in programming and linear algebra.

While there are a few automated data science tools on the market, my personal opinion is that EazyML is by far the easiest to use and the most accessible to individuals. Tools like DataRobot or RapidMiner can be very powerful in the right hands, but neither is tailor-made for data-science novices. Trying your hand with EazyML takes almost no effort: signing up is free, and the platform is intuitive and easy to use.

I look forward to continuing to learn machine learning, and I am glad I started with EazyML. I will definitely take the time to learn the theory and the math behind machine learning, and eventually I will most likely take the time to build my own models from scratch in order to master the craft.

In the meantime, I will continue to explore the possibilities presented by machine learning with EazyML, an easy-to-use and powerful tool at my disposal.
