start end text
|
|
0 6480 Machine learning. Teach a computer how to perform a task, without explicitly programming it to perform said task.
|
|
6620 13420 Instead, feed data into an algorithm to gradually improve outcomes with experience, similar to how organic life learns.
|
|
13580 20400 The term was coined in 1959 by Arthur Samuel at IBM, who was developing artificial intelligence that could play checkers.
|
|
20540 26880 More than half a century later, predictive models are embedded in many of the products we use every day, where they perform two fundamental jobs.
|
|
26880 32040 One is to classify data, like "Is there another car on the road?" or "Does this patient have cancer?"
|
|
32040 38600 The other is to make predictions about future outcomes, like "Will this stock go up?" or "Which YouTube video do you want to watch next?"
|
|
38600 43280 The first step in the process is to acquire and clean up data. Lots and lots of data.
|
|
43480 47780 The better the data represents the problem, the better the results. Garbage in, garbage out.
|
|
47900 52160 The data needs to have some kind of signal to be valuable to the algorithm for making predictions.
|
|
52160 59920 And data scientists perform a job called feature engineering to transform raw data into features that better represent the underlying problem.
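As a rough illustration of feature engineering, here is a minimal sketch in plain Python. The records, the field names (`sold_at`, `price_cents`), and the chosen features are all made up for this example; real projects would pick features that suit their own problem.

```python
from datetime import datetime

# Hypothetical raw records: a timestamp string and a price in cents.
raw_rows = [
    {"sold_at": "2024-01-06 09:30", "price_cents": 1250},
    {"sold_at": "2024-01-08 18:45", "price_cents": 400},
]


def engineer_features(row):
    """Turn one raw record into numeric features a model can consume."""
    sold = datetime.strptime(row["sold_at"], "%Y-%m-%d %H:%M")
    return {
        "hour": sold.hour,                          # time of day
        "is_weekend": int(sold.weekday() >= 5),     # Saturday=5, Sunday=6
        "price_dollars": row["price_cents"] / 100,  # rescaled unit
    }


features = [engineer_features(row) for row in raw_rows]
```

The raw timestamp string carries little signal on its own, while "hour" and "is_weekend" expose patterns an algorithm can actually pick up.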
|
|
60240 64240 The next step is to separate the data into a training set and testing set.
|
|
64460 71800 The training data is fed into an algorithm to build a model, then the testing data is used to validate the accuracy or error of the model.
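The split itself can be sketched in a few lines. This is a simplified, standard-library version; the function name `train_test_split`, the 80/20 ratio, and the fixed seed are assumptions for illustration, and most frameworks ship their own helper for this.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows, then hold out a fraction for testing."""
    shuffled = rows[:]                     # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


data = list(range(10))
train, test = train_test_split(data)
```

Shuffling before splitting matters when the data is ordered, for example by date or by label; otherwise the test set may not resemble the training set.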
|
|
71980 77700 The next step is to choose an algorithm, which might be a simple statistical model like linear or logistic regression,
|
|
77940 81260 or a decision tree that splits the data on the values of its features.
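To make "a simple statistical model" concrete, here is a sketch of one-variable linear regression fitted by ordinary least squares. The helper name `fit_line` and the toy data are assumptions for this example, not part of any particular library.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy training data that follows y = 2x + 1 exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
```

The learned slope is the weight the model assigns to its single feature; with more features, each one gets its own weight.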
|
|
81260 86640 Or you might get fancy with a convolutional neural network, which is an algorithm that assigns
|
|
86640 91300 weights to features as well, but also takes the input data and creates additional features automatically.
|
|
91640 96300 And that's extremely useful for datasets that contain things like images or natural language,
|
|
96420 99020 where manual feature engineering is virtually impossible.
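The idea of creating features automatically can be hinted at with a single convolution step. This is only a sketch: the tiny image and the kernel are made up, and the kernel values are fixed by hand here, whereas a real convolutional network learns them during training.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    As in most deep learning libraries, the kernel is not flipped, so this
    is technically a cross-correlation.
    """
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kernel_h)
                for j in range(kernel_w)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# A 3x4 "image": dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked kernel that responds to dark-to-bright vertical edges.
edge_kernel = [[-1, 1], [-1, 1]]
feature_map = convolve2d(image, edge_kernel)
```

The resulting feature map is nonzero only where the edge sits, which is the kind of derived feature later layers can build on.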
|
|
99260 103960 Every one of these algorithms learns to get better by using an error function to compare its predictions to the actual outcomes.
|
|
104160 109840 If it's a classification problem, like "Is this animal a cat or a dog?" the error function might be based on accuracy.
|
|
109840 115900 If it's a regression problem, like "How much will a loaf of bread cost next year?" then it might be mean absolute error.
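Both measures are short enough to write out by hand. The sketch below uses made-up labels and prices; note that accuracy is a score to maximize, while mean absolute error is an error to minimize.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between predicted and true values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Classification: three of the four labels are predicted correctly.
acc = accuracy(["cat", "dog", "dog", "cat"], ["cat", "dog", "cat", "cat"])

# Regression: absolute errors of 0.5, 0.0 and 1.0, averaged.
mae = mean_absolute_error([3.0, 4.0, 5.0], [2.5, 4.0, 6.0])
```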
|
|
116220 121780 Python is the language of choice among data scientists, but R and Julia are also popular options,
|
|
121920 125320 and there are many supporting frameworks out there to make the process approachable.
|
|
125500 132680 The end result of the machine learning process is a model, which is just a file that takes some input data in the same shape that it was trained on,
|
|
132860 136900 then spits out a prediction that tries to minimize the error that it was optimized for.
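The "model is just a file" idea can be sketched by saving learned parameters to disk and loading them back to make a prediction. The JSON format, the file name `model.json`, and the parameter names are assumptions chosen for this example; real frameworks use their own serialization formats.

```python
import json
import os
import tempfile


def save_model(params, path):
    """Write the learned parameters to a file."""
    with open(path, "w") as f:
        json.dump(params, f)


def load_model(path):
    """Read the parameters back and return a prediction function."""
    with open(path) as f:
        params = json.load(f)

    def predict(x):
        # Input must have the same shape as in training: a single number.
        return params["slope"] * x + params["intercept"]

    return predict


# Hypothetical parameters, as if produced by an earlier training run.
path = os.path.join(tempfile.mkdtemp(), "model.json")
save_model({"slope": 2.0, "intercept": 1.0}, path)

predict = load_model(path)
prediction = predict(3.0)
```

Once saved, the file can be shipped separately from the training code, which is what makes embedding it on a device or serving it from the cloud possible.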
|
|
136900 141980 It can then be embedded on an actual device or deployed to the cloud to build a real-world product.
|
|
142180 144500 This has been Machine Learning in 100 Seconds.
|
|
144580 147160 Like and subscribe if you want to see more short videos like this,
|
|
147320 150500 and leave a comment if you want to see more machine learning content on this channel.
|
|
150620 153040 Thanks for watching, and I will see you in the next one.
|